
[QEff. Finetuning] Adding finetune_experiemental.py and related files#731

Closed
quic-swatia wants to merge 5 commits intoquic:ft_experimentalfrom
quic-swatia:HF-Trainer-main
Conversation

@quic-swatia
Contributor

@quic-swatia quic-swatia commented Jan 16, 2026

  1. Added FinetuningPipeline (finetune_experiemental.py), which integrates all the components added for the HF trainer and enables running fine-tuning through it.
  2. Added files to handle PEFT and training config.
  3. Made changes in the config_manager and callbacks files.
  4. Added unit tests for the FinetuningPipeline (test_finetune.py)
  5. Updated tests in test_callback and test_config_manager based on above changes.

Signed-off-by: Swati Allabadi <sallabad@qti.qualcomm.com>
@quic-swatia quic-swatia changed the title Adding finetune_experiemental.py and related files [QEff. Finetuning] Adding finetune_experiemental.py and related files Jan 16, 2026
Contributor

@quic-akuruvil quic-akuruvil left a comment


Check and verify the functionality of python -m Qefficient.finetune_experimental.py with the new stack.

Comment thread QEfficient/cloud/finetune_experimental.py Outdated
Signed-off-by: Swati Allabadi <sallabad@qti.qualcomm.com>
@quic-swatia quic-swatia force-pushed the HF-Trainer-main branch 2 times, most recently from 0043eaa to d0d3251 on January 28, 2026 21:29
Signed-off-by: Swati Allabadi <sallabad@qti.qualcomm.com>
Signed-off-by: Swati Allabadi <sallabad@qti.qualcomm.com>
Comment thread QEfficient/cloud/finetune_experimental.py
Comment thread QEfficient/cloud/finetune_experimental.py Outdated
Comment thread QEfficient/cloud/finetune_experimental.py Outdated
Contributor

@quic-akuruvil quic-akuruvil left a comment


--enable_pp, pipeline parallelism support is also missing. Please add that too.

Contributor

@quic-meetkuma quic-meetkuma left a comment


Looks cleaner. Tests might need proper refactoring. Let us try to close this at the earliest.

PS: add description to the PR.

try:
    import torch_qaic  # noqa: F401
except ImportError as e:
    logger.log_rank_zero(
Contributor


Here we are passing and not blocking. That is fine here, but if the user has provided device="qaic" and torch_qaic fails to load, then we need to break the execution. This has to happen either here or inside ConfigManager. I believe this kind of validation, and all other validations on the config, should reside inside ConfigManager.

CC: @tchawada
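The check being asked for above can be sketched as follows. This is a hypothetical illustration only: the `ConfigManager` class shape, the config field names, and the error message are assumptions, not the PR's actual API.

```python
# Hypothetical sketch of the suggested validation: fail hard when the user
# requests device="qaic" but torch_qaic cannot be imported. Class and field
# names are assumptions for illustration, not the PR's real interface.
import importlib.util


class ConfigManager:
    def __init__(self, config):
        self.config = config

    def validate_config(self):
        # Probe for torch_qaic without importing it; find_spec returns
        # None when the module is not installed.
        if self.config.get("device") == "qaic":
            if importlib.util.find_spec("torch_qaic") is None:
                raise RuntimeError(
                    "device='qaic' was requested but torch_qaic is not available"
                )
        return True
```

Keeping the check inside `validate_config()` means every entry point that builds a pipeline gets the same hard failure, rather than each call site deciding whether a missing backend is fatal.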

Contributor


Added in new PR

Comment thread QEfficient/cloud/finetune_experimental.py Outdated
Comment thread QEfficient/finetune/experimental/core/callbacks.py
Comment thread QEfficient/finetune/experimental/core/callbacks.py Outdated
Comment thread QEfficient/finetune/experimental/core/callbacks.py Outdated
Comment thread QEfficient/cloud/finetune_experimental.py Outdated
Comment thread QEfficient/cloud/finetune_experimental.py Outdated
Comment thread QEfficient/finetune/experimental/core/callbacks.py
Comment thread QEfficient/finetune/experimental/tests/test_finetune.py Outdated
Comment thread QEfficient/finetune/experimental/tests/test_finetune.py
@quic-akuruvil
Contributor

Please check whether the trl library is in requirements, and add it if not already present.

…th recently merged PRs. Made changes in the test files as well accordingly.

Signed-off-by: Swati Allabadi <sallabad@qti.qualcomm.com>
@quic-swatia
Contributor Author

Check and verify the functionality of python -m Qefficient.finetune_experimental.py with the new stack.

It's working with this PR.

@quic-swatia
Contributor Author

--enable_pp, pipeline parallelism support is also missing. Please add that too.

PP enablement will be done iteratively in the subsequent PR.

# callback_config.callbacks is a dictionary of callback configurations
for callback_name, callback_kwargs in callback_config["callbacks"].items():
    try:
        callback_instance = create_callbacks(callback_name, **callback_kwargs)
Contributor


Currently this does not handle the case in which callback_kwargs is None; can you look into it once?
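One way to handle the None case the reviewer raises is to default missing kwargs to an empty dict before unpacking. This is a minimal sketch based only on the quoted snippet; the `create_callbacks` factory here is a stand-in, and the config shape is an assumption.

```python
# Sketch of None-tolerant callback construction. A callback entry with no
# arguments (e.g. "early_stopping:" in YAML) deserializes as None, and
# unpacking None with ** raises TypeError, so normalize it first.
def create_callbacks(name, **kwargs):
    # Stand-in factory for illustration: just returns what it was given.
    return (name, kwargs)


def build_callbacks(callback_config):
    instances = []
    for callback_name, callback_kwargs in callback_config["callbacks"].items():
        # Treat a None value as "no arguments" rather than crashing.
        callback_kwargs = callback_kwargs or {}
        instances.append(create_callbacks(callback_name, **callback_kwargs))
    return instances
```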

Execute the complete fine-tuning pipeline.
"""
# Validate configuration
self.config_manager.validate_config()
Contributor


Okay

try:
    import torch_qaic  # noqa: F401
except ImportError as e:
    logger.log_rank_zero(
Contributor


Added in new PR

@quic-swatia
Contributor Author

These changes have been added in PR #791, which has been merged.

