fix: add support for enabling onnx subfunction ci #42

Closed

vbaddi wants to merge 2 commits into main from add_onnx_subfunctions_ci

Conversation

@vbaddi (Collaborator) commented Nov 3, 2025

Jenkins-Ready

qraniumcitest and others added 2 commits October 29, 2025 10:00
Signed-off-by: qraniumcitest <rmakar@qti.qualcomm.com>
Signed-off-by: vbaddi <quic_vbaddi@quicinc.com>
@vbaddi (Collaborator, Author) commented Nov 3, 2025

Jenkins-Ready

@qraniumcitest (Owner) commented

From GHES (comment) by @qgeniecodeassistant[bot]

Code Assistant

Reviewed Commits: cb7da87, 6b2a01b, bd0d9ef

  • cb7da87: Updated the code to the current syntax; removed the device_group parameter from model.compile()
    Signed-off-by: Sharvari Medhe <smedhe@qti.qualcomm.com>
  • 6b2a01b: Create Mirror_Fork_PRs_to_GHES.yml
    Signed-off-by: qraniumcitest <rmakar@qti.qualcomm.com>
  • bd0d9ef: fix: add support for enabling onnx subfunction
    Signed-off-by: vbaddi <quic_vbaddi@quicinc.com>

PR Overview

This PR introduces significant enhancements to the ONNX export functionality, including support for exporting decoder layers as ONNX functions, custom operation transforms, ONNX graph optimization via onnx-slim, and torch patches for compatibility. It also adds a new GitHub Actions workflow for mirroring fork PRs to GHES.

Files Changed Summary

| File | Lines Changed | Issues Found | Highest Severity |
| --- | --- | --- | --- |
| .github/workflows/Mirror_Fork_PRs_to_GHES.yml | +230 | 2 | Medium |
| QEfficient/__init__.py | +6 | 0 | - |
| QEfficient/base/modeling_qeff.py | +11 | 1 | Low |
| QEfficient/base/onnx_transforms.py | +162 | 1 | Medium |
| QEfficient/transformers/cache_utils.py | +26 | 1 | Low |
| QEfficient/transformers/models/gemma3/modeling_gemma3.py | +1 | 0 | - |
| QEfficient/transformers/models/modeling_auto.py | +14 | 0 | - |
| QEfficient/transformers/models/pytorch_transforms.py | +26 | 0 | - |
| QEfficient/utils/constants.py | +2 | 0 | - |
| QEfficient/utils/patches.py | +121 | 1 | Medium |
| notebooks/QEfficientGPT2.ipynb | +6 | 0 | - |
| notebooks/QEfficientMPT.ipynb | +6 | 0 | - |

Critical Issues

  1. [Security] Hardcoded Secrets in Workflow - The GitHub Actions workflow contains hardcoded repository and organization names that could be security-sensitive
  2. [Functionality] Debug Print Statement Left in Production Code - A debug print statement in cache_utils.py should be removed
  3. [Maintainability] Empty Dictionary Override - The patches.py file intentionally overrides onnx_attrs with an empty dict, which may cause issues
  4. [Error Handling] Missing Error Handling in ONNX Transform - The OnnxSlimTransform lacks proper error handling for onnxslim.slim() failures

[Security - Medium] Hardcoded credentials and sensitive information in GitHub workflow

The GitHub Actions workflow file contains hardcoded repository names, organization names, and base URLs that could be security-sensitive. While these may be intentional for the specific use case, it's better practice to use repository variables or secrets for such configuration.

Fixed Code Snippet:

```yaml
env:
  GHES_BASE_URL: ${{ vars.GHES_BASE_URL }}
  GHES_OWNER: ${{ vars.GHES_OWNER }}
  GHES_REPO: ${{ vars.GHES_REPO }}
  FORK_OWNER: ${{ vars.FORK_OWNER }}
  FORK_REPO: ${{ vars.FORK_REPO }}
```

[Functionality - Medium] Debug print statement left in production code

A debug print statement is present in the production code at line 441 of cache_utils.py. This appears to be leftover debugging code that should be removed before merging to production.

Fixed Code Snippet:

```python
invalid_idx_value = _get_invalid_idx_value()
ctx_indices = torch.where(invalid_mask, invalid_idx_value, ctx_indices)
```
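For context, the torch.where masking pattern used here can be illustrated with a toy tensor (the values and ctx_len below are hypothetical, not taken from cache_utils.py):

```python
import torch

# Hypothetical gather indices; positions at or past ctx_len are invalid.
ctx_indices = torch.tensor([0, 5, 2, 7], dtype=torch.int32)
ctx_len = 4
invalid_mask = ctx_indices >= ctx_len

# Outside ONNX export the sentinel index is simply 0.
invalid_idx_value = 0
safe_indices = torch.where(invalid_mask, invalid_idx_value, ctx_indices)
print(safe_indices.tolist())  # [0, 0, 2, 0]
```

Invalid positions are redirected to a safe index instead of raising an out-of-range error during the subsequent gather.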

[Maintainability - Medium] Intentional empty dictionary override may cause issues

In patches.py at line 40, the code intentionally overrides onnx_attrs with an empty dictionary after retrieving it. The comment indicates this is a fix for a type mismatch issue with transformers v4.55+, but this approach discards potentially important module attributes. This could lead to loss of metadata during ONNX export.

Consider investigating if there's a better way to handle the type mismatch that doesn't discard the attributes entirely, or add comprehensive testing to ensure this doesn't break functionality for models that rely on these attributes.

Fixed Code Snippet:

```python
# FIX: Convert to a compatible type to avoid a mismatch with _jit_pass_onnx_track_scope_attributes
# Observed in transformers v4.55 and above
if onnx_attrs and not isinstance(onnx_attrs, dict):
    onnx_attrs = {}
_C._jit_pass_onnx_track_scope_attributes(graph, onnx_attrs)
```

[Error Handling - Low] Missing error handling in ONNX transform

The OnnxSlimTransform.apply() method at line 132 calls onnxslim.slim(model) without proper error handling. If onnxslim fails (e.g., due to unsupported operations or memory issues), the exception will propagate without context.

Fixed Code Snippet:

```python
if onnx_slim_transform:
    try:
        slimmed_model = onnxslim.slim(model)
        onnx.save(slimmed_model, temp_onnx_path)
    except Exception as e:
        logger.warning(f"ONNX slim optimization failed: {e}. Continuing with original model.")
        return model, False
    return slimmed_model, True
```

[Performance - Low] Redundant environment variable check in _get_invalid_idx_value

The function _get_invalid_idx_value() in cache_utils.py checks the environment variable QEFF_USE_ONNX_FUNCTIONS on every call. Since this is called frequently during cache operations, consider caching this value at module level to avoid repeated environment lookups.

Fixed Code Snippet:

```python
# At module level, so the environment is read once at import time
_USE_ONNX_FUNCTIONS = os.environ.get("QEFF_USE_ONNX_FUNCTIONS", "false").lower() == "true"


def _get_invalid_idx_value():
    """
    Get the appropriate invalid index value for CtxGather operations.
    """
    if torch.onnx.is_in_onnx_export():
        if _USE_ONNX_FUNCTIONS:
            return 0
        return torch.iinfo(torch.int32).max
    return 0
```
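A module-level constant cannot be changed by tests once the module is imported. An alternative is to memoize the lookup with functools.lru_cache, which keeps the single-read behavior while giving tests an explicit reset hook via cache_clear(). A minimal sketch, assuming the same QEFF_USE_ONNX_FUNCTIONS variable (use_onnx_functions is a hypothetical helper name, not part of the PR):

```python
import os
from functools import lru_cache


@lru_cache(maxsize=1)
def use_onnx_functions() -> bool:
    # The environment is read on the first call; later calls hit the cache.
    return os.environ.get("QEFF_USE_ONNX_FUNCTIONS", "false").lower() == "true"


os.environ["QEFF_USE_ONNX_FUNCTIONS"] = "true"
first = use_onnx_functions()       # True (reads the environment)
os.environ["QEFF_USE_ONNX_FUNCTIONS"] = "false"
second = use_onnx_functions()      # still True (cached result)
use_onnx_functions.cache_clear()   # tests can reset the cache explicitly
third = use_onnx_functions()       # False (re-reads the environment)
print(first, second, third)
```

The trade-off is one extra function-call indirection per use, in exchange for testability.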

**Version 1.3.6**


Source: https://github.qualcomm.com/qranium/efficient-transformers/pull/43#issuecomment-1009227
