Ft experimental v1 from quic-akuruvil#97
Open
quic-akuruvil wants to merge 33 commits into qraniumcitest:main from
Conversation
Force-pushed from e9e7a7f to c5b43e5
In this PR, I have created 3 test pipelines: dummy model execution, few-layers execution, and full-layer model execution.

Signed-off-by: Abukhoyer Shaik <abukhoye@qti.qualcomm.com>
Model: Qwen/Qwen3-VL-30B-A3B-Instruct

Adding changes to fix the disagg mode 3 QPC output issue.

Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
…code/Vision/Encoder/Embedding) (quic#904)

## Summary
The backend compiler team requested a new specializations.json format where each entry carries a meaningful graph name (e.g. "Prefill", "Decode").

## Changes
- **`QEfficient/utils/_utils.py`** — new `_infer_specialization_name()` and `to_named_specializations()` helpers
- **`QEfficient/base/modeling_qeff.py`** — `_compile()` uses the new format
- **`QEfficient/compile/qnn_compiler.py`** — QNN path uses the new format
- **`QEfficient/compile/compile_helper.py`** — legacy `create_and_dump_specializations()` uses the new format

## Name inference rules
| Keys present | Assigned name |
|---|---|
| `vision_size` / `img_size` / `grid_*`, no `seq_len` | `Vision` |
| `encoder_ctx_len`, no `seq_len` | `Encoder` |
| `sequence_length`, no `seq_len` | `Embedding` |
| `seq_len != 1` | `Prefill` |
| `seq_len == 1` | `Decode` |
| anything else | `Graph_N` |

## Testing
21 unit tests added to `tests/unit_test/models/test_model_quickcheck.py` covering causal LM, continuous batching, VLM vision/language, Whisper, encoder/decoder, text embedding, and an end-to-end JSON roundtrip.

cc: @anujgupt-github @quic-rishinr

Signed-off-by: vbaddi <vbaddi@qti.qualcomm.com>
Signed-off-by: Rishin Raj <rishinr@qti.qualcomm.com>
Co-authored-by: Rishin Raj <rishinr@qti.qualcomm.com>
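The name-inference rules described in this commit can be sketched as a small helper. This is an illustrative reconstruction from the rules table only; the function name, dictionary-based signature, and key handling are assumptions, and the real helper is `_infer_specialization_name()` in `QEfficient/utils/_utils.py`:

```python
def infer_specialization_name(spec: dict, index: int) -> str:
    """Hypothetical sketch of the graph-name inference rules."""
    has_seq_len = "seq_len" in spec
    if not has_seq_len:
        # Vision graphs carry image/grid dimensions instead of seq_len.
        if any(k in spec for k in ("vision_size", "img_size")) or any(
            k.startswith("grid_") for k in spec
        ):
            return "Vision"
        if "encoder_ctx_len" in spec:
            return "Encoder"
        if "sequence_length" in spec:
            return "Embedding"
    elif spec["seq_len"] != 1:
        return "Prefill"
    else:
        return "Decode"
    # Fallback for specializations that match no known key pattern.
    return f"Graph_{index}"
```

The fallback keeps specializations.json valid even for graphs the rules do not recognize, at the cost of a non-descriptive name.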
…bfunctions for cached text models (quic#928)

## Summary
This change moves layer-invariant RoPE cos/sin indexing out of repeated decoder-layer subfunctions and into model-level forward paths. For cached decoder models, we were repeatedly doing:

```
cos = cos[position_ids].unsqueeze(1)
sin = sin[position_ids].unsqueeze(1)
```

inside each decoder attention block. With ONNX subfunctions enabled, that indexing becomes part of the exported repeated subfunction body and contributes to the on-device regression we observed after the single-subfunction RoPE fix work (quic#880). This patch hoists that work once per forward pass and passes the already-shaped cos/sin tensors into each decoder layer.

## What changed
Applied the refactor to the applicable QEff model families that thread static cached RoPE tensors through repeated decoder layers, including:
- Llama
- Llama SwiftKV
- Gemma
- Gemma2
- Mistral
- Falcon
- GPT-OSS
- Granite
- GraniteMoE
- Mllama text path
- Mixtral
- Olmo2
- Phi3
- Qwen2
- Qwen3
- Qwen3 MoE
- Qwen2.5 VL text path
- Qwen3 VL text path
- Qwen3 VL MoE text path

For the Qwen VL text towers, the same idea is applied to the indexed/interleaved MRoPE preparation: the already-indexed cos/sin tensors are prepared once before the decoder-layer loop and reused across layers.

## Tests
Added a TinyLlama regression test to assert that export with subfunctions still produces a single decoder-layer ONNX function. Verified with:

```
python -m pytest -q tests/unit_test/models/test_model_quickcheck.py -n auto
```

Signed-off-by: vbaddi <vbaddi@qti.qualcomm.com>
Signed-off-by: Rishin Raj <rishinr@qti.qualcomm.com>
Co-authored-by: Rishin Raj <rishinr@qti.qualcomm.com>
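The hoisting pattern this commit describes can be illustrated in miniature. This is a NumPy sketch under assumed names (`forward_hoisted`, a list of layer callables); the real models use PyTorch tensors and `unsqueeze`, but the structural point is the same: the table lookup happens once per forward pass, not once per layer.

```python
import numpy as np

def forward_hoisted(hidden, position_ids, cos_table, sin_table, layers):
    # Layer-invariant RoPE indexing, hoisted out of the decoder loop:
    # done once here instead of inside every decoder attention block.
    cos = np.expand_dims(cos_table[position_ids], 1)
    sin = np.expand_dims(sin_table[position_ids], 1)
    for layer in layers:
        # Every layer reuses the already-shaped cos/sin tensors.
        hidden = layer(hidden, cos, sin)
    return hidden
```

With ONNX subfunction export, keeping the indexing outside the repeated layer body means it appears once in the graph rather than N times.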
Force-pushed from c5b43e5 to 357d671
… v4.57.3 compatibility (quic#933)

## Problem
After the transformers v4.57.3 rebase (commit 7acb860), the following error occurs:

```
TypeError: QEffLlama4VisionModel.forward() got an unexpected keyword argument 'vision_feature_layer'
```

## Root Cause Analysis
1. In transformers v4.57.3, `Llama4ForConditionalGeneration.get_image_features()` now passes additional parameters (`vision_feature_layer` and `vision_feature_select_strategy`) to the vision model's forward method via `**kwargs`.
2. The call chain:
   - `QEffLlama4EncoderWrapper.forward()` calls `self.model.get_image_features()`
   - `get_image_features()` (from the transformers library) calls `self.vision_model(**kwargs)`
   - `QEffLlama4VisionModel.forward()` was not accepting `**kwargs`
3. Since `QEffLlama4VisionModel` overrides the `forward()` method but did not accept `**kwargs`, it raised a TypeError when these unexpected arguments were passed.

## Solution
Added a `**kwargs` parameter to the `QEffLlama4VisionModel.forward()` method signature. This allows the method to accept the `vision_feature_layer` and `vision_feature_select_strategy` parameters from the parent class's `get_image_features()` method, even though they are not used in the QEff implementation.

## Impact
- Backward-compatible change
- Fixes ONNX export failures for Llama4 vision models
- Maintains compatibility with the transformers v4.57.3 API

Tested with: examples/image_text_to_text/models/llama4/single_image.py

Signed-off-by: sudheepm <sudheepm@qti.qualcomm.com>
Co-authored-by: sudheepm <sudheepm@qti.qualcomm.com>
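The failure mode this commit fixes is a general Python pattern: an overriding method with a strict signature breaks when the caller starts forwarding extra keyword arguments. A minimal standalone sketch (class and method names here are illustrative, not the actual QEff classes):

```python
class VisionModelStrict:
    # Mirrors the broken state: no **kwargs, so new caller-side
    # keywords raise TypeError.
    def forward(self, pixel_values):
        return {"pixel_values": pixel_values}

class VisionModelFixed:
    # Mirrors the fix: absorb and ignore unexpected keywords that a
    # newer caller (e.g. get_image_features) passes through.
    def forward(self, pixel_values, **kwargs):
        return {"pixel_values": pixel_values, "ignored": sorted(kwargs)}
```

Accepting `**kwargs` in an override is the standard forward-compatibility hedge when the base library may grow its pass-through parameter set between versions.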
Signed-off-by: Abukhoyer Shaik <abukhoye@qti.qualcomm.com>
- Added a logger which logs to both console and file; this code is similar to the existing QEff finetuning logger code.
- Also added dist_utils, which serves as utility code for distributed training.
- Added logger test cases for sanity checks.

Signed-off-by: Meet Patel <meetkuma@qti.qualcomm.com>
Signed-off-by: Sharvari Medhe <smedhe@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
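A console-plus-file logger like the one described can be sketched with the standard `logging` module. The function name and format string below are assumptions for illustration, not the actual QEff implementation:

```python
import logging
import os
import sys
import tempfile

def build_logger(name: str, log_file: str) -> logging.Logger:
    """Sketch of a logger that writes every record to both stdout and a file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    fmt = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    # One StreamHandler for the console, one FileHandler for persistence.
    for handler in (logging.StreamHandler(sys.stdout), logging.FileHandler(log_file)):
        handler.setFormatter(fmt)
        logger.addHandler(handler)
    return logger
```

Attaching two handlers to one logger is the idiomatic way to fan a single `log.info(...)` call out to multiple sinks.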
Cherry-picking PRs 697, 658, 667, 666, 656, 652, 647, 649, 645.

Signed-off-by: Meet Patel <meetkuma@qti.qualcomm.com>
Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com>
Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Signed-off-by: Sharvari Medhe <smedhe@qti.qualcomm.com>
Signed-off-by: Swati Allabadi <sallabad@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
…c#872) We are only cherry-picking PRs 787, 791, 813, and 795 (skipping the rebase of PR 785), and cherry-picking experimental-related branches from PRs 692 and 747.

Signed-off-by: Swati Allabadi <sallabad@qti.qualcomm.com>
Signed-off-by: Sharvari Medhe <smedhe@qti.qualcomm.com>
Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Co-authored-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Adding a config file to support the style remix dataset.

Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
…ntation (quic#893)

1) Added unit test cases for Pipeline Parallelism
2) Added documentation on how to run these tests
3) Created a constants file

Signed-off-by: Swati Allabadi <sallabad@qti.qualcomm.com>
Co-authored-by: Swati Allabadi <sallabad@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Added a test case to compare loss and metrics for different SDKs against the stable SDK.

Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com>
Updating the PP CLI command as per the latest changes in the config manager. In the future, this command should also be updated if any changes are made to the single-SoC CLI command.

Signed-off-by: Swati Allabadi <sallabad@qti.qualcomm.com>
Co-authored-by: Swati Allabadi <sallabad@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Added the following support for easy visualization of training and validation statistics:
1. A train_logger callback function which captures the per-epoch time, per-epoch loss metric, and per-epoch perplexity.
2. This function also captures the number of trainable parameters and the number of samples in the training and eval datasets.
3. All of these are logged into a log file whose path the user can set via the --log_file_path flag in the input config .yaml file.

Signed-off-by: abhamidi <abhamidi@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
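The per-epoch statistics capture described above can be sketched as a small callback. Class and method names here are hypothetical; the only substantive assumption is the standard relationship perplexity = exp(mean cross-entropy loss):

```python
import math

class TrainLoggerCallback:
    """Illustrative per-epoch stats collector (not the actual QEff callback)."""

    def __init__(self):
        self.history = []

    def on_epoch_end(self, epoch: int, epoch_loss: float, epoch_time_s: float):
        # Perplexity is derived from the mean cross-entropy loss for the epoch.
        self.history.append({
            "epoch": epoch,
            "loss": epoch_loss,
            "perplexity": math.exp(epoch_loss),
            "time_s": epoch_time_s,
        })
```

Each entry in `history` is one line of the kind a file logger would emit, which keeps the visualization step a simple iteration over recorded dicts.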
Signed-off-by: Anusha V.S Bhamidipati <abhamidi@qti.qualcomm.com>
…ults in trainer config Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Force-pushed from c5b6bd4 to 3ff5aca