VLM Pipeline for Model Onboarding through QEff #261
qcdipankar wants to merge 4 commits into quic:main
Conversation
Signed-off-by: Dipankar Sarkar <quic_dipankar@quicinc.com>
Signed-off-by: Dipankar Sarkar <quic_dipankar@quicinc.com>
Signed-off-by: Dipankar Sarkar <quic_dipankar@quicinc.com>
import torch.nn.functional as F
import torch.utils.checkpoint
import transformers
from einops import rearrange
Not needed, please remove
# if repl_module := cls._module_mapping.get(type(module)):
if repl_module := cls._module_mapping.get(module.__class__.__name__):
    module.__class__ = repl_module
    # Handling the __init__ calls in the models
    if hasattr(module, "__qeff_init__"):
        module.__qeff_init__()
    transformed = True
create a new transform named like:

from typing import Dict, Type

import torch.nn as nn


class ModuleMappingViaStringAndClassMatchTransform:
    _module_mapping_via_class: Dict[Type[nn.Module], Type[nn.Module]]
    _module_mapping_via_string: Dict[str, Type[nn.Module]]

    @classmethod
    def apply(cls, model):
        transformed = False
        for module in model.modules():
            if repl_module := cls._module_mapping_via_class.get(type(module)):
                # replace the class here
                module.__class__ = repl_module
                transformed = True
            elif repl_module := cls._module_mapping_via_string.get(module.__class__.__name__):
                # replace the class here
                module.__class__ = repl_module
                transformed = True
        return model, transformed

Create two different dicts basically.
And write a test that makes sure the keys of the two dicts don't overlap.
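A minimal sketch of such a test, assuming the two dicts are class attributes exactly as named above (the test name is illustrative):

def test_module_mapping_keys_do_not_overlap():
    # Class-keyed entries are nn.Module subclasses; string-keyed entries are class names.
    # The same module must not be reachable through both mappings.
    class_key_names = {
        klass.__name__
        for klass in ModuleMappingViaStringAndClassMatchTransform._module_mapping_via_class
    }
    string_keys = set(ModuleMappingViaStringAndClassMatchTransform._module_mapping_via_string)
    assert class_key_names.isdisjoint(string_keys)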
def get_num_layers_vlm(config):
    if hasattr(config, "architectures") and "LlavaForConditionalGeneration" in config.architectures:
        num_layers = config.text_config.num_hidden_layers
        return num_layers
Can't we reuse the existing method named get_num_layers_from_config and pass model.config.text_config to it?
In some models it is text_config, in others it is llm_config, txt_config, etc. Hence adding it as a new function for the VLM architectures.
Let's keep it separate to avoid cluttering the existing function with multiple conditions?
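For illustration, a hedged sketch of keeping the per-architecture lookup inside the new VLM helper instead of extending get_num_layers_from_config; the llm_config branch is an assumption based on the comment above, not code from this PR:

def get_num_layers_vlm(config):
    # Each VLM family nests its language-model config under a different attribute name.
    if hasattr(config, "text_config"):   # e.g. LlavaForConditionalGeneration
        return config.text_config.num_hidden_layers
    if hasattr(config, "llm_config"):    # e.g. InternVL-style configs (assumed name)
        return config.llm_config.num_hidden_layers
    raise ValueError(f"Unsupported VLM config: {type(config).__name__}")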
def get_padding_shape_vlm(config, batch_size=1):
    if hasattr(config, "architectures") and "LlavaForConditionalGeneration" in config.architectures:
        n_heads = config.text_config.num_key_value_heads
        d_head = config.text_config.hidden_size // config.text_config.num_attention_heads
        padding_shape = [batch_size, n_heads, Constants.CTX_LEN_VLM, d_head]
        return padding_shape
Same comment as above: is this new method required? Can't we reuse the existing one?
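For context, a hedged sketch of how such a padding shape is typically consumed when pre-allocating KV-cache inputs; the exact call sites in this PR may differ:

import torch

padding_shape = get_padding_shape_vlm(config, batch_size=1)  # [batch, n_heads, CTX_LEN_VLM, d_head]
past_key_values = [
    (torch.zeros(padding_shape), torch.zeros(padding_shape))  # one (key, value) pair per decoder layer
    for _ in range(get_num_layers_vlm(config))
]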
# InternVL
"InternVLChatModel": QEffInternVLChatModel,
"InternVisionEmbeddings": QEffInternVisionEmbeddings,
Please create a different transform as mentioned above and separate out this dict.
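A hedged sketch of the suggested split, reusing the entries from the diff above (the transformers-side example in the class-keyed dict is a placeholder, not code from this PR):

_module_mapping_via_string = {
    # Models not shipped in transformers are matched by class name.
    "InternVLChatModel": QEffInternVLChatModel,
    "InternVisionEmbeddings": QEffInternVisionEmbeddings,
}

_module_mapping_via_class = {
    # Modules importable from transformers keep the existing type-keyed mapping,
    # e.g. LlamaAttention: QEffLlamaAttention.
}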
model_config["n_layer_text"] = 1
model_config["n_layer_vision"] = 1
This should not go in library code; it is allowed only in tests.
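A minimal sketch of moving the single-layer override into a test, assuming a pytest-style fixture (the fixture and base-config names are hypothetical):

import pytest

@pytest.fixture
def tiny_vlm_model_config(base_vlm_model_config):
    model_config = dict(base_vlm_model_config)
    model_config["n_layer_text"] = 1    # shrink the text tower so tests stay fast
    model_config["n_layer_vision"] = 1  # shrink the vision tower so tests stay fast
    return model_config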
Signed-off-by: Dipankar Sarkar <quic_dipankar@quicinc.com>
Already addressed in #267
Features Added
1. Original modeling files removed for Intern; generic solution for models that are not part of transformers.
2. Used a model wrapper inside the modeling files to hold the generate_inputs functions; calls are dispatched based on the model loaded from pretrained (a hedged sketch follows this list).
3. Constants file updated.
4. Removed PyTorch generate from modeling_auto.
5. General clean-up of the code.
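A minimal sketch of the wrapper idea from item 2, assuming name-based dispatch similar to the string-keyed module mapping discussed above; the class, method, and builder names are illustrative, not the PR's actual API:

class ModelWrapper:
    """Wraps a loaded pretrained model and exposes model-specific input generation."""

    def __init__(self, model):
        self.model = model

    def generate_inputs(self, processor, prompt, image):
        # Dispatch on the wrapped model's class name.
        arch = self.model.__class__.__name__
        builder = getattr(self, f"_inputs_{arch.lower()}", None)
        if builder is None:
            raise NotImplementedError(f"No input generator registered for {arch}")
        return builder(processor, prompt, image)

    def _inputs_llavaforconditionalgeneration(self, processor, prompt, image):
        # Standard HF processor call for LLaVA-style models.
        return processor(images=image, text=prompt, return_tensors="pt")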
Tested and Verified on
TODO