Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
This PR is stale because it has been open 15 days with no activity. Remove the stale label or comment, or this will be closed in 5 days.
Force-pushed from b744417 to 1166877.
@JingyaHuang there seems to be an issue with SD/SDXL models when bumping the transformers version.
Starting from transformers 4.54, there is an error when compiling Qwen2.5-0.5B with a sequence length of 128. This is a very unlikely configuration, and not one we want to cache. The pipeline code is therefore modified to align with default values that are actually tested in the NeuronModelForCausalLM export tests.
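For context, the failure can also be sidestepped on the caller side by requesting shapes that the export tests actually cover instead of relying on pipeline defaults. A minimal sketch of such an explicit export (the shape and compiler values below are illustrative, not the new defaults):

```python
from optimum.neuron import NeuronModelForCausalLM

# Export with shapes covered by the NeuronModelForCausalLM export tests,
# rather than an untested combination such as sequence_length=128.
# Shape and compiler values are illustrative, not the actual pipeline defaults.
model = NeuronModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B",
    export=True,
    batch_size=1,
    sequence_length=4096,
    num_cores=2,
    auto_cast_type="bf16",
)
model.save_pretrained("qwen2.5-0.5b-neuron")
```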
Force-pushed from 224226f to 5701183.
CLIP models used in SD pipelines do not specify return_dict in their config, but tracing fails if return_dict is True, which is now the default in transformers.
In the latest transformers version, it is not done automatically anymore.
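One way to handle this explicitly when tracing the text encoder is to turn return_dict off on its config so the forward pass returns plain tensors. A minimal sketch, assuming a plain torch.jit.trace of a standalone CLIP text encoder (the model id and prompt are illustrative):

```python
import torch
from transformers import CLIPTextModel, CLIPTokenizer

model_id = "openai/clip-vit-base-patch32"  # illustrative; SD pipelines ship their own text encoder
tokenizer = CLIPTokenizer.from_pretrained(model_id)
text_encoder = CLIPTextModel.from_pretrained(model_id).eval()

# Recent transformers default to return_dict=True, which makes the forward
# pass return a ModelOutput object and breaks jit tracing; force tuples instead.
text_encoder.config.return_dict = False

inputs = tokenizer(
    "a photo of an astronaut",
    padding="max_length",
    max_length=tokenizer.model_max_length,
    return_tensors="pt",
)
traced = torch.jit.trace(text_encoder, (inputs["input_ids"],))
```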
From v4.53.3, transformers removed some APIs used by Granite here. Won't this be a problem for the Granite model, @tengomucho? I came across something like:
I do not see where they have been removed; I still see them on v4.43:
The latest T5Block layer in transformers no longer expects the past_key_value to be returned by T5Attention.
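For custom modeling code that still unpacks the old tuple layout, one way to stay compatible across transformers releases is a small version-gated helper. A minimal sketch (the helper and the 4.54.0 cutoff are assumptions for illustration, not the actual change in this PR):

```python
import transformers
from packaging import version

# Assumed cutoff: releases before it return
# (hidden_states, present_key_value, position_bias, ...), while newer ones drop
# present_key_value and return (hidden_states, position_bias, ...).
_LEGACY_T5_ATTENTION = version.parse(transformers.__version__) < version.parse("4.54.0")


def unpack_t5_attention_outputs(outputs):
    """Return (hidden_states, position_bias) from a T5Attention output tuple."""
    if _LEGACY_T5_ATTENTION:
        return outputs[0], outputs[2]
    return outputs[0], outputs[1]
```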
JingyaHuang left a comment:
LGTM, thanks for the feature and for fixing the compatibility with transformers!
| "granite": "hf-internal-testing/tiny-random-GraniteForCausalLM", | ||
| "phi3": "yujiepan/phi-4-tiny-random", | ||
| "mixtral": "dacorvo/Mixtral-tiny", | ||
| "smollm3": "HuggingFaceTB/SmolLM3-3B", |
What does this PR do?
This adds support for the SmolLM3 model.
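A minimal usage sketch, assuming SmolLM3 goes through the standard NeuronModelForCausalLM export path like the other decoder models (shape and compiler values are illustrative):

```python
from transformers import AutoTokenizer
from optimum.neuron import NeuronModelForCausalLM

model_id = "HuggingFaceTB/SmolLM3-3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Compile the model for Neuron; shape and compiler values are illustrative.
model = NeuronModelForCausalLM.from_pretrained(
    model_id,
    export=True,
    batch_size=1,
    sequence_length=4096,
    num_cores=2,
    auto_cast_type="bf16",
)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```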
This required the following packages to be updated: