[QEff.finetune] Test finetune #826
Merged
quic-akuruvil merged 60 commits into quic:ft_experimental on Mar 6, 2026

Conversation
Carried over patch quic#693. Signed-off-by: Onkar Chougule <ochougul@qti.qualcomm.com>
Signed-off-by: Mohit Soni <mohisoni@qti.qualcomm.com> Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com> Co-authored-by: Mohit Soni <mohisoni@qti.qualcomm.com> Co-authored-by: vtirumal <vtirumal@qti.qualcomm.com>
Signed-off-by: Vahid Janfaza <vjanfaza@qti.qualcomm.com>
Updated the README and the custom script for 2-layer instructions for Wan. Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
Added step-by-step instructions for multi-node fine-tuning. --------- Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Add support for multi-node Distributed Data Parallel (DDP) training to the QEfficient finetuning pipeline. This enables scaling training across multiple nodes while keeping the existing single-node behavior unchanged.

Commands for DDP across 2 servers.

For the master address (primary machine), use node-rank 0:

QAIC_VISIBLE_DEVICES=0,1,2,3 torchrun --nnodes=2 --nproc-per-node=4 --seed 0 --node-rank=0 --master_addr=<MASTER_NODE_IP> --master_port=8000 -m QEfficient.cloud.finetune --device qaic --enable_ddp --model_name "meta-llama/Llama-3.2-1B" --dataset alpaca_dataset --train_batch_size 1 --val_batch_size 1 --num_epochs 1 --max_train_step 200 --max_eval_step 50

For node 1, use node-rank 1:

QAIC_VISIBLE_DEVICES=0,1,2,3 torchrun --nnodes=2 --nproc-per-node=4 --seed 0 --node-rank=1 --master_addr=<MASTER_NODE_IP> --master_port=8000 -m QEfficient.cloud.finetune --device qaic --enable_ddp --model_name "meta-llama/Llama-3.2-1B" --dataset alpaca_dataset --train_batch_size 1 --val_batch_size 1 --num_epochs 1 --max_train_step 200 --max_eval_step 50

--------- Signed-off-by: Sharvari Medhe <smedhe@qti.qualcomm.com>
QEfficient should ignore providing `-mdp-load-partition-config` when `-mdp-dump-partition-config` is provided in compiler_options of compile API. --------- Signed-off-by: Asmita Goswami <asmigosw@qti.qualcomm.com>
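The option-dropping described above can be sketched as follows. This is a hypothetical illustration, not the actual QEfficient helper; the function name and the dictionary-style representation of `compiler_options` are assumptions.

```python
# Hypothetical sketch (names assumed; not the actual QEfficient code):
# when a dump-partition-config option is present, any load-partition-config
# option is dropped so the two flags are never passed to the compiler together.
def filter_partition_options(compiler_options: dict) -> dict:
    opts = dict(compiler_options)  # work on a copy, leave the caller's dict intact
    if "mdp-dump-partition-config" in opts:
        opts.pop("mdp-load-partition-config", None)
    return opts

opts = {
    "mdp-dump-partition-config": "partitions.json",
    "mdp-load-partition-config": "partitions.json",
    "aic-num-cores": 16,
}
filtered = filter_partition_options(opts)
assert "mdp-load-partition-config" not in filtered
assert filtered["aic-num-cores"] == 16  # unrelated options pass through
```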
Signed-off-by: Ann Kuruvilla <quic_akuruvil@quicinc.com>
Handled the edge case where the number of samples in a dataset is less than 20. Corrected the dataset link in grammar_dataset.py. Signed-off-by: Sharvari Medhe <smedhe@qti.qualcomm.com>
Since CCL is deactivated by default, the CCL lists (ccl_prefill and ccl_decode) should default to None. In the infer.py script these lists did not default to None, which caused CCL to be activated by default. This PR addresses that issue. --------- Signed-off-by: Vahid Janfaza <vjanfaza@qti.qualcomm.com>
In this PR: 1) modified the code to support PP+DDP on multi-server setups; 2) added a preprocessing file for the grammar dataset; 3) modified the naming convention for the output dir to include the node rank of the server. --------- Signed-off-by: Sharvari Medhe <smedhe@qti.qualcomm.com>
Added a default NPI file for Gemma3, eliminating the need for the user to provide an NPI file as an extra argument in the example script. --------- Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com> Signed-off-by: Ann Kuruvilla <quic_akuruvil@quicinc.com>
Signed-off-by: Abukhoyer Shaik <abukhoye@qti.qualcomm.com> Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com> Signed-off-by: Amit Raj <amitraj@qti.qualcomm.com> Co-authored-by: Abukhoyer Shaik <abukhoye@qti.qualcomm.com> Co-authored-by: Amit Raj <amitraj@qti.qualcomm.com>
…WQ and FP8 models. (quic#735) Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Removed the OpenGVLab/InternVL2_5-1B and OpenGVLab/InternVL3_5-1B tests due to a compiler issue, to unblock the CI. --------- Signed-off-by: Rishin Raj <rishinr@qti.qualcomm.com>
Updated QEff version to mainline. --------- Signed-off-by: Rishin Raj <rishinr@qti.qualcomm.com>
Reverts quic#741 Signed-off-by: Rishin Raj <rishinr@qti.qualcomm.com>
Signed-off-by: Abhishek Kumar Singh <sabhis@qti.qualcomm.com>
Signed-off-by: abhishek-singh591 <sabhis@qti.qualcomm.com>
Signed-off-by: Abhishek Kumar Singh <sabhis@qti.qualcomm.com> Signed-off-by: abhishek-singh591 <sabhis@qti.qualcomm.com> Signed-off-by: Abhishek kumar singh <sabhis@qti.qualcomm.com>
The decode-only GPT-OSS model was failing when executing subfunctions because a dynamic dim value was being picked up during the reduce-sum calculation. This caused incorrect tensor reduction and resulted in compilation errors. The fix replaces the reduction logic with an einsum-based computation, ensuring stable and deterministic summation regardless of dimension shape. --------- Signed-off-by: asmigosw <asmigosw@qti.qualcomm.com>
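The idea of swapping a reduce-sum for an einsum can be illustrated with NumPy. This is only a sketch of the technique, not the actual GPT-OSS model code; the `[batch, experts, hidden]` tensor layout and the function name are assumptions. Because the einsum subscripts name every axis explicitly, the reduction axis is fixed by the equation string rather than inferred from a possibly dynamic dimension.

```python
import numpy as np

# Illustrative sketch (tensor layout assumed; not the actual QEfficient code):
# a reduce-sum over one axis expressed as an einsum contraction.
def sum_over_experts(weighted: np.ndarray) -> np.ndarray:
    # Contract away the expert axis 'e' of a [batch, experts, hidden] tensor;
    # mathematically identical to weighted.sum(axis=1).
    return np.einsum("beh->bh", weighted)

x = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
out = sum_over_experts(x)
assert out.shape == (2, 4)
assert np.allclose(out, x.sum(axis=1))
```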
Updated the random-sampling gold text and IDs for InternVL2_5-1B. Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
Support skipping export and compilation if the QPC already exists. Updated Flux and Wan configs and pipelines with qpc_path changes. --------- Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
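The skip-if-exists check can be sketched as below. This is a hypothetical illustration, not the actual pipeline logic: the function name is invented, and the check is a bare existence test on the qpc_path rather than whatever validation the real code performs.

```python
import tempfile
from pathlib import Path

# Hypothetical sketch (not the actual QEfficient logic): skip the export and
# compile steps when a QPC artifact already exists at the expected path.
def needs_compile(qpc_path: str) -> bool:
    return not Path(qpc_path).exists()

with tempfile.TemporaryDirectory() as tmp:
    existing = Path(tmp) / "qpc"
    existing.mkdir()
    assert not needs_compile(str(existing))        # reuse the cached QPC
    assert needs_compile(str(Path(tmp) / "fresh")) # no artifact yet: compile
```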
The sliding-window (SW) issue occurred when prompt length + generation length exceeded the SW size. Fix: 1. the cache is updated with HybridSlidingWindowCache in cache utils. --------- Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
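The sliding-window behavior the fix relies on can be sketched in a few lines. This is not HybridSlidingWindowCache itself, only an illustration of the principle: the cache retains at most the last `window` positions, so total sequence length may exceed the window without the cache growing.

```python
# Illustrative sketch (class name and structure invented; not the actual
# HybridSlidingWindowCache): keep only the last `window` entries on update.
class SlidingWindowCache:
    def __init__(self, window: int):
        self.window = window
        self.keys: list = []

    def update(self, key) -> list:
        self.keys.append(key)
        if len(self.keys) > self.window:
            # Evict the oldest positions beyond the window.
            self.keys = self.keys[-self.window:]
        return self.keys

cache = SlidingWindowCache(window=3)
for t in range(5):          # 5 positions, window of 3
    kept = cache.update(t)
assert kept == [2, 3, 4]    # only the 3 most recent positions survive
```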
Signed-off-by: Ann Kuruvilla <quic_akuruvil@quicinc.com>
Fixed Gemma3 to support continuous batching (CB) with the new SW code. Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
This PR fixes subfunction-based export issues for the following models: 1. `bigcode/starcoder` 2. `ibm-granite/granite-20b-code-base-8k` 3. `ibm-granite/granite-20b-code-instruct-8k` 4. `Qwen3-30B-A3B-Instruct-2507` 5. `Mixtral-8x7B` In addition, it updates the Causal LM subfunction test file to make it more robust and resilient across models. --------- Signed-off-by: Abhishek Kumar Singh <sabhis@qti.qualcomm.com>
Updated the mainline version to 1.22.0.dev0 Signed-off-by: Rishin Raj <rishinr@qti.qualcomm.com>
qaic-exec is going to be deprecated. Updated the code to use the new qaic-compile for the compile API. --------- Signed-off-by: Asmita Goswami <asmigosw@qti.qualcomm.com>
Skip subfunction handling in export utils for diffusers; this is handled in the export() of diffuser models. --------- Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com> Signed-off-by: Abhishek Kumar Singh <sabhis@qti.qualcomm.com> Co-authored-by: Abhishek Kumar Singh <sabhis@qti.qualcomm.com>
Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com>
Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com>
Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com>
Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com>
Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com>
@@ -21,9 +21,10 @@ model:
# Dataset configuration
dataset:
  dataset_type: "sft_dataset"
  dataset_name: "yahma/alpaca-cleaned"  # Dataset name from Hugging Face Hub
  prompt_func: "QEfficient.finetune.experimental.preprocessing.alpaca_func:create_alpaca_prompt"  # Function to create prompt from dataset fields
Contributor: Keep another sft_single_device_alpaca_config.yaml as well.
### Step-by-Step Guide to run a fine-tuning job
Contributor: Why is this added again? Is it not already present?

Contributor (Author): Oh, it came in twice while resolving conflicts; I removed it.
Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com>
Contributor: Looks good!
Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com>
Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com>
**Single device using yaml file**
```bash
python finetune_experimental.py configs/sft_single_device_config.yaml
QAIC_VISIBLE_DEVICES=1 python QEfficient/cloud/finetune_experimental.py configs/sft_single_device_gsm8k_config.yaml
```
Contributor: For the configs, also give the full path starting from QEfficient.

Contributor: And always use QAIC_VISIBLE_DEVICES starting from 0.
**Distributed (Using Accelerate)**
```bash
accelerate launch --num_processes 4 finetune_experimental.py configs/sft_ddp_config.yaml
QAIC_VISIBLE_DEVICES=1,2,3,4 accelerate launch --num_processes 4 -m QEfficient.cloud.finetune_experimental configs/sft_ddp_config.yaml
```
Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com>
smedhe pushed a commit to smedhe/QEff_Sharvari that referenced this pull request on Mar 8, 2026:
Modified test_finetune.py; changed optimizer names. --------- Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com> Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com> Signed-off-by: Sharvari Medhe <smedhe@qti.qualcomm.com>
Modified test_finetune.py; changed optimizer names.