
Mllama single qpc support added #258

Merged
quic-rishinr merged 8 commits into mllama_vision from mllama_single_qpc
Feb 4, 2025
Conversation

quic-amitraj (Contributor) commented Feb 3, 2025

  1. Mllama single qpc support added
  2. Simplified generate inputs for single and dual qpc

Comment thread QEfficient/base/onnx_transforms.py Outdated
Comment thread QEfficient/transformers/models/mllama/modeling_mllama.py Outdated
Comment thread QEfficient/transformers/models/mllama/modeling_mllama.py Outdated
# Out-of-place Scatter new into old
# out-of-place is important so the original tensor is not affected,
# otherwise leads to same operations in both graphs
indices = (torch.arange(bsz),)
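The distinction the comment draws can be illustrated with a small, self-contained sketch (shapes and names here are illustrative, not the actual mllama cache layout): `Tensor.index_put` returns a fresh tensor, while its in-place variant `index_put_` mutates the operand, so tracing the out-of-place form leaves the original cache tensor intact for the other graph to read.

```python
import torch

# Illustrative shapes only; the real cache layout comes from the model config.
bsz, seq_len, head_dim = 2, 4, 3
old_cache = torch.zeros(bsz, seq_len, head_dim)
new_values = torch.ones(bsz, 1, head_dim)  # broadcasts over seq_len

indices = (torch.arange(bsz),)  # select every batch row, as in the snippet

# Out-of-place scatter: returns a new tensor and leaves old_cache untouched,
# so a second traced graph reading old_cache still sees the original values.
updated = old_cache.index_put(indices, new_values)

print(old_cache.sum().item(), updated.sum().item())  # → 0.0 24.0
```

Had the in-place `index_put_` been traced instead, both graphs would observe the mutated tensor, which is the duplicated-operations problem the comment warns about.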
Contributor

Add brief documentation on why these changes are required for single QPC and how the graph is created.

Contributor Author


Sure, will update in final version.

return outputs

def generate_mllama_single(self, processor):
Contributor


This is just required for the ONNX export, right?

Contributor Author


Yes, it is. Since the processor output varies from model to model, this function helps obtain the model-specific processor output. I have now also removed the processor dependency by creating dummy inputs, making this generic for both single and dual QPCs.
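A minimal sketch of the dummy-input idea described above (tensor names and shapes here are hypothetical, chosen for illustration; the actual code derives them from the model's config and processor rather than hard-coding them):

```python
import torch

def make_dummy_inputs(bsz=1, seq_len=32, num_tiles=4, img_size=448):
    # Hypothetical names/shapes for a vision-language model export;
    # the real QEfficient code builds these from the model config.
    return {
        "input_ids": torch.zeros(bsz, seq_len, dtype=torch.int64),
        "attention_mask": torch.ones(bsz, seq_len, dtype=torch.int64),
        "pixel_values": torch.zeros(bsz, 1, num_tiles, 3, img_size, img_size),
    }

dummy = make_dummy_inputs()
print(sorted(dummy))  # → ['attention_mask', 'input_ids', 'pixel_values']
```

Because the inputs no longer depend on a processor, the same dictionary can feed a single-QPC export directly or be split across the vision and language graphs for the dual-QPC path.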

Comment thread QEfficient/transformers/models/modeling_auto.py Outdated
asmigosw and others added 6 commits February 3, 2025 19:17
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
@quic-rishinr quic-rishinr merged commit 8624eec into mllama_vision Feb 4, 2025
quic-amitraj added a commit that referenced this pull request Feb 10, 2025
1. Mllama single qpc support added
2. Simplified generate inputs for single and dual qpc

---------

Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Co-authored-by: asmigosw <asmigosw@qti.qualcomm.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
@ochougul ochougul deleted the mllama_single_qpc branch February 18, 2025 06:59
4 participants