Merge ft_experimental_v1 branch to main #887

Open
quic-akuruvil wants to merge 28 commits into main from ft_experimental_v1

@quic-akuruvil quic-akuruvil commented Mar 25, 2026

Merge the HF Trainer-based fine-tuning (FT) code base into main.

Main API file: QEfficient/cloud/finetune_experimental.py

Documentation:
docs/source/hf_finetune.md – End‑to‑end fine‑tuning pipeline details and sample commands to start a run
docs/source/config.md – All training hyperparameters and how to set them through config.yaml

@quic-akuruvil quic-akuruvil changed the title Ft experimental v1 Merge ft_experimental_v1 branch to main Mar 25, 2026
@quic-akuruvil quic-akuruvil force-pushed the ft_experimental_v1 branch 5 times, most recently from 3e76f60 to fb3fb86 Compare March 30, 2026 06:42
@quic-akuruvil quic-akuruvil force-pushed the ft_experimental_v1 branch 3 times, most recently from aff10af to 357d671 Compare April 22, 2026 08:49
smedhe and others added 5 commits April 27, 2026 10:39
- Added a logger that writes to both the console and a file. The code is
similar to the existing QEff fine-tuning logger code.
- Also added dist_utils, which provides utility code for distributed
training.
- Added logger test cases for sanity checks.
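
For readers unfamiliar with the pattern, a logger that writes to both console and file in a distributed run can be sketched roughly as below. This is only an illustration and is not the code added in this PR: `build_logger` and `get_rank` are hypothetical names, and reading the rank from the `RANK` environment variable (as set by launchers such as torchrun) and writing the file only on rank 0 are assumptions.

```python
import logging
import os


def get_rank() -> int:
    """Hypothetical helper: read the process rank from RANK (0 if unset)."""
    return int(os.environ.get("RANK", "0"))


def build_logger(name: str, log_file: str, level: int = logging.INFO) -> logging.Logger:
    """Return a logger that writes to the console and, on rank 0, to log_file."""
    logger = logging.getLogger(name)
    if logger.handlers:
        # Already configured; avoid attaching duplicate handlers.
        return logger

    logger.setLevel(level)
    logger.propagate = False  # keep messages out of the root logger
    fmt = logging.Formatter("%(asctime)s [%(levelname)s] %(name)s: %(message)s")

    console = logging.StreamHandler()
    console.setFormatter(fmt)
    logger.addHandler(console)

    # Assumption: only rank 0 writes the file, so workers do not interleave output.
    if get_rank() == 0:
        file_handler = logging.FileHandler(log_file, encoding="utf-8")
        file_handler.setFormatter(fmt)
        logger.addHandler(file_handler)

    return logger
```

Calling `build_logger("finetune", "train.log").info("starting")` would then emit the message on the console and, on rank 0, append it to `train.log`.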

---------

Signed-off-by: Meet Patel <meetkuma@qti.qualcomm.com>
Signed-off-by: Sharvari Medhe <smedhe@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Cherry-picking PRs 697, 658, 667, 666, 656, 652, 647, 649, 645

---------

Signed-off-by: Meet Patel <meetkuma@qti.qualcomm.com>
Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com>
Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Signed-off-by: Sharvari Medhe <smedhe@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Swati Allabadi <sallabad@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
We are cherry-picking only PRs 787, 791, 813, and 795, skipping the rebase of
PR 785, and cherry-picking the experimental-related branches from PRs 692 and 747.

---------

Signed-off-by: Swati Allabadi <sallabad@qti.qualcomm.com>
Signed-off-by: Sharvari Medhe <smedhe@qti.qualcomm.com>
Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Co-authored-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Adding a config file to support the style remix dataset

---------

Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Ann Kuruvilla and others added 18 commits April 27, 2026 10:39
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
…ntation (#893)

1) Added unit test cases for Pipeline Parallelism
2) Added documentation on how to run these tests
3) Created a constants file

Signed-off-by: Swati Allabadi <sallabad@qti.qualcomm.com>
Co-authored-by: Swati Allabadi <sallabad@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Added a test case that compares loss and metrics from different SDKs against the stable SDK
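
A comparison of this kind can be sketched as a tolerance check against baseline values. The function name, the metric names, and the 5% relative tolerance below are all assumptions made for illustration; the actual test in this PR may be structured differently.

```python
import math


def compare_to_baseline(candidate: dict, baseline: dict, rel_tol: float = 0.05) -> dict:
    """Return {metric: (candidate_value, baseline_value)} for metrics out of tolerance.

    A metric missing from `candidate` is reported with a candidate value of None.
    """
    mismatches = {}
    for name, expected in baseline.items():
        actual = candidate.get(name)
        if actual is None or not math.isclose(actual, expected, rel_tol=rel_tol):
            mismatches[name] = (actual, expected)
    return mismatches
```

An empty result means every baseline metric was reproduced within tolerance, so a test could simply assert that the returned dictionary is empty.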

Signed-off-by: Tanisha Chawada <tchawada@qti.qualcomm.com>
Updated the PP CLI command to match the latest changes in the config manager.
In future, this command should also be updated whenever the single SOC CLI
command changes.

Signed-off-by: Swati Allabadi <sallabad@qti.qualcomm.com>
Co-authored-by: Swati Allabadi <sallabad@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Added the following support for easy visualization of training and
validation statistics:

1. A train_logger callback function that captures the per-epoch time,
per-epoch loss, and per-epoch perplexity.
2. The same function also captures the number of trainable parameters and
the number of samples in the training and eval datasets.
3. All of these are written to a log file whose path the user can provide
by setting the flag --log_file_path in the input config .yaml file.
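
The statistics listed above can be illustrated with a small framework-free sketch. It is not the train_logger callback from this PR: the class and method names are hypothetical (loosely modeled on epoch begin/end hooks), perplexity is assumed to be `exp(mean loss)`, and parameters are represented as plain `(num_elements, requires_grad)` pairs rather than real tensors.

```python
import math
import time


class EpochStatsLogger:
    """Hypothetical callback that appends per-epoch statistics to a log file."""

    def __init__(self, log_file_path: str):
        self.log_file_path = log_file_path
        self._epoch_start = None

    def on_epoch_begin(self, epoch: int) -> None:
        # Record the start time so the epoch duration can be computed later.
        self._epoch_start = time.perf_counter()

    def on_epoch_end(self, epoch: int, mean_loss: float) -> float:
        elapsed = time.perf_counter() - self._epoch_start
        # Assumption: perplexity is the exponential of the mean loss.
        perplexity = math.exp(mean_loss)
        line = (
            f"epoch={epoch} time_s={elapsed:.2f} "
            f"loss={mean_loss:.4f} perplexity={perplexity:.4f}"
        )
        with open(self.log_file_path, "a", encoding="utf-8") as f:
            f.write(line + "\n")
        return perplexity


def count_trainable(params) -> int:
    """Sum element counts over (num_elements, requires_grad) pairs that require grad."""
    return sum(n for n, requires_grad in params if requires_grad)
```

With a real model, the pairs passed to `count_trainable` would come from the model's parameters; the sketch keeps them as plain tuples so it runs without any deep-learning framework installed.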

Signed-off-by: abhamidi <abhamidi@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
@quic-akuruvil quic-akuruvil force-pushed the ft_experimental_v1 branch 2 times, most recently from c85f4e8 to 13382d8 Compare April 27, 2026 16:38
Signed-off-by: Anusha V.S Bhamidipati <abhamidi@qti.qualcomm.com>
Ann Kuruvilla added 4 commits April 27, 2026 16:55
…ults in trainer config

Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
Signed-off-by: Ann Kuruvilla <akuruvil@qti.qualcomm.com>
5 participants