Add SplitTensorsTransform to QEFFAutoModel to prevent >2GB protobuf export issue #950

Open
quic-rishinr wants to merge 1 commit into quic:main from quic-rishinr:auto_model_proto_export

@quic-rishinr
Contributor

@quic-rishinr quic-rishinr commented Apr 28, 2026

Add SplitTensorsTransform to QEFFAutoModel to prevent >2GB protobuf exports

FP16ClipTransform inlines external weights, causing large embedding
models (e.g. BAAI/bge-reranker-v2-m3) to exceed the 2GB ModelProto
parser limit in the AIC compiler.
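
To make the size problem concrete, here is a rough back-of-the-envelope check. It is only a sketch: the 2 GiB figure is the general protobuf message size limit, and the parameter count used in the usage lines is a hypothetical round number, not a measurement of any specific model.

```python
# Rough estimate of whether inlined weights would push a single
# ModelProto past the protobuf message size limit (2**31 - 1 bytes).
PROTOBUF_LIMIT_BYTES = 2**31 - 1


def inline_weight_bytes(num_params: int, bytes_per_param: int) -> int:
    """Bytes needed if every parameter is stored inline in the proto."""
    return num_params * bytes_per_param


def exceeds_protobuf_limit(num_params: int, bytes_per_param: int = 4) -> bool:
    """True if the inlined weights alone are larger than the limit."""
    return inline_weight_bytes(num_params, bytes_per_param) > PROTOBUF_LIMIT_BYTES


# Hypothetical 600M-parameter model (illustrative figure only).
print(exceeds_protobuf_limit(600_000_000, bytes_per_param=4))  # fp32 storage
print(exceeds_protobuf_limit(600_000_000, bytes_per_param=2))  # fp16 storage
```

The estimate ignores graph structure and metadata, so a real export can cross the limit slightly earlier than the weights alone suggest.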

Adding SplitTensorsTransform to _onnx_transforms spills large
initializers to sidecar *.onnx.data files. Updated existing tests
and added regression tests to verify external data spilling behavior.

…xports

Signed-off-by: Rishin Raj <rishinr@qti.qualcomm.com>
@quic-rishinr quic-rishinr requested review from asmigosw and vbaddi April 28, 2026 08:02
@quic-rishinr
Contributor Author

CI-Ready

Contributor

@vbaddi vbaddi left a comment

@quic-rishinr as discussed, let's disable the split transform and fix the onnx.save thing. Thanks.

2 participants