Added use_liger flag to Trainer #32889
Conversation
Nice work @shimizust! Very excited to have this in the transformers Trainer API! Could you fix the CI tests by running `make style` and/or rebasing from main, and add a few simple tests to check that the layers were indeed replaced and that we are able to train without errors (see `TrainerIntegrationTest` in the `test_trainer.py` file)?
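A minimal, self-contained sketch of what such a layer-replacement test might check. All names here (`DummyModel`, `LigerMLP`, `apply_liger`) are hypothetical stand-ins, not the real transformers Trainer or Liger-Kernel API:

```python
# Hypothetical sketch: verify that a patching helper actually swaps a
# model's layer for the "Liger" implementation. The classes below are
# illustrative placeholders, not the real library.

class OriginalMLP:
    """Stand-in for a model's stock MLP layer."""

class LigerMLP(OriginalMLP):
    """Stand-in for Liger's efficient drop-in replacement."""

class DummyModel:
    def __init__(self):
        self.mlp = OriginalMLP()

def apply_liger(model):
    # In the real PR this happens via Liger-Kernel's monkey patching;
    # here we just swap the attribute directly to show the assertion shape.
    model.mlp = LigerMLP()

def test_layers_replaced():
    model = DummyModel()
    apply_liger(model)
    # The test the reviewer asks for boils down to checks like this one.
    assert isinstance(model.mlp, LigerMLP)
```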
> Currently, the kernel supports the Llama, Gemma, Mistral, and Mixtral model architectures. A list of supported models is defined [here](https://github.com/linkedin/Liger-Kernel/blob/main/src/liger_kernel/transformers/monkey_patch.py#L7). When `use_liger` is set to `True`, the corresponding layers in the original model will be patched with Liger's efficient implementation, so you don't need to do anything extra other than setting the argument value.
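A rough sketch of the per-architecture dispatch this describes: a registry maps each supported `model_type` to a patch function, and unsupported types are left untouched. The names below (`LIGER_PATCHES`, `apply_liger_kernel`, the `applied` list) are illustrative, not Liger-Kernel's actual internals:

```python
# Hypothetical registry-based dispatch: each supported model_type maps to
# a patch function; unsupported architectures are simply skipped.

applied = []  # records which architectures were patched, for illustration

def patch_llama():
    applied.append("llama")

def patch_mistral():
    applied.append("mistral")

LIGER_PATCHES = {
    "llama": patch_llama,
    "mistral": patch_mistral,
}

def apply_liger_kernel(model_type):
    """Patch the given architecture if supported; no-op otherwise."""
    patch_fn = LIGER_PATCHES.get(model_type)
    if patch_fn is None:
        return False  # unsupported architecture: leave the model as-is
    patch_fn()
    return True
```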
Maybe it is better to link the README and list the supported models there? Right now you are linking to a specific line in the monkey_patch file, and I'm not sure whether that will change in the future.
```python
model_type = getattr(model, "config", None) and getattr(model.config, "model_type", None)
if model_type:
    # Monkey patch the model with liger kernels. Use the default kernel configurations.
    _apply_liger_kernel(model_type=model_type)
```
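The `getattr` chain above short-circuits to `None` when the model has no `config` (or the config has no `model_type`), so no `AttributeError` is raised. A quick self-contained illustration, with `Dummy*` classes as hypothetical stand-ins:

```python
# Demonstrates the short-circuiting getattr chain from the snippet above.
# The classes here are minimal stand-ins, not real transformers models.

class DummyConfig:
    model_type = "llama"

class DummyModel:
    config = DummyConfig()

class NoConfigModel:
    pass

def resolve_model_type(model):
    # `and` returns the first falsy operand, so a missing config yields
    # None instead of raising on model.config.model_type.
    return getattr(model, "config", None) and getattr(model.config, "model_type", None)
```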
Quick question @shimizust, I see that there is already another PR opened here. Which one should we focus on?
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
Hi @SunMarc, thanks for the review! I'm going to incorporate your suggestions, merge them into the original PR, and close this one.

Using this PR instead: #32860
What does this PR do?
Fixes # (issue)
Before submitting
- [ ] Did you read the contributor guideline, Pull Request section?
- [ ] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.