Mllama(single + dual) + InternVL(single) + Llava (single)#267
Mllama(single + dual) + InternVL(single) + Llava (single)#267
Conversation
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com> Signed-off-by: Rishin Raj <quic_rishinr@quicinc.com> Co-authored-by: Amit Raj <quic_amitraj@quicinc.com> Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Rishin Raj <quic_rishinr@quicinc.com> Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
1. Mllama single qpc support added 2. Simplified generate inputs for single and dual qpc --------- Signed-off-by: Amit Raj <quic_amitraj@quicinc.com> Co-authored-by: asmigosw <asmigosw@qti.qualcomm.com> Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Rishin Raj <quic_rishinr@quicinc.com> Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Added support for Laava model single QPC Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com> Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
b9c0bc1 to
b3a5d22
Compare
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
|
|
||
| logger = logging.getLogger(__file__) | ||
|
|
||
| class QEFFTransformersBase(QEFFBaseModel): |
There was a problem hiding this comment.
why is this entire part showing up as diff? possible to clean up the PR?
There was a problem hiding this comment.
because we moved the QEFFAutoModelForCausalLM class to the bottom of the file from top as we wanted to support Intern via that.
As we discussed should I chnage it to use QEFFAutoModelImageTextTOText?
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
24fad68 to
81cea10
Compare
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
quic-hemagnih
left a comment
There was a problem hiding this comment.
Can we please add the description about newly added argument model
2de6982 to
f5efdae
Compare
Signed-off-by: Amit Raj <quic_amitraj@quicinc.com>
f5efdae to
ad594d7
Compare
| config = AutoConfig.from_pretrained(pretrained_model_name_or_path, trust_remote_code=True) | ||
| config._attn_implementation = "eager" | ||
| config.vision_config.use_flash_attn = "false" | ||
| model = cls._hf_auto_class.from_pretrained(pretrained_model_name_or_path, config, *args, **kwargs) |
There was a problem hiding this comment.
Why we are explicitly fetching the config and directly calling the from_pretrained() of AutoModelForImageTextToText. We could have called super.from_pretrained() and then internally it would have handled this
| logger.warning("Updating low_cpu_mem_usage=False") | ||
|
|
||
| kwargs.update({"attn_implementation": "eager", "low_cpu_mem_usage": False}) | ||
| model = cls._hf_auto_class.from_pretrained(pretrained_model_name_or_path, *args, **kwargs) |
There was a problem hiding this comment.
Instead of making these changes in derived class from_pretained() function shouldn't we make these changes in Base class and then call it from here, it might be good from design perspective
There was a problem hiding this comment.
We think we can remove QEFFTransformerBaseClass we will need a bigger discussion to decide this.
| *args, | ||
| **kwargs, | ||
| ): | ||
| if kwargs.get("attn_implementation", None) not in {None, "eager"}: |
There was a problem hiding this comment.
why we are not calling super.from_pretained() here?, it already has this peice of code
There was a problem hiding this comment.
because we need to change config options in this method that are not needed for other auto clases.
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com>
Adding generalized infrastructure to support VLMs with Dual/single QPC approaches --------- Signed-off-by: Amit Raj <quic_amitraj@quicinc.com> Signed-off-by: Rishin Raj <quic_rishinr@quicinc.com> Signed-off-by: Onkar Chougule <quic_ochougul@quicinc.com> Co-authored-by: Rishin Raj <quic_rishinr@quicinc.com> Co-authored-by: Amit Raj <quic_amitraj@quicinc.com> Co-authored-by: Amit Raj <168538872+quic-amitraj@users.noreply.github.com> Co-authored-by: asmigosw <asmigosw@qti.qualcomm.com> Signed-off-by: Hem Agnihotri <quic_hemagnih@quicinc.com>
Adding generalized infrastructure to support VLMs with Dual/single QPC approaches