Bump transformers to 4.56.1 (#136)
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Looks like we need to fix some stuff 😂
```python
)
# Fall back gracefully for configs that don't define num_key_value_heads (non-GQA
# models) or an explicit head_dim.
num_heads = getattr(config, "num_key_value_heads", config.num_attention_heads)
head_dim = getattr(config, "head_dim", config.hidden_size // config.num_attention_heads)
self.early_initialization(
```
(In the PR description)
The linked PR description says that cache_position will also be removed. Is it talking about the cache maintaining cache_position itself? Something to keep track of.
Oh, it's the part on early_initialization - basically you need to call this to initialize some of the attributes you need on each cache layer.
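For context, here is a minimal stand-in sketch of that pattern; the toy `DemoCacheLayer` and the exact `early_initialization` signature are assumptions for illustration, not the actual transformers 4.56 API:

```python
# Sketch only: a stand-in for the per-layer cache initialization described
# above. The real transformers Cache API differs in details.
import torch


class DemoCacheLayer:
    """Toy cache layer: its buffers exist only after early_initialization."""

    def early_initialization(self, batch_size, num_heads, head_dim, dtype, device):
        # Pre-allocate the per-layer attributes that attention code reads later.
        self.keys = torch.zeros(batch_size, num_heads, 0, head_dim, dtype=dtype, device=device)
        self.values = torch.zeros_like(self.keys)


# Derive shapes the same way the diff hunk above does, falling back when a
# config lacks num_key_value_heads or an explicit head_dim.
class Cfg:  # placeholder config for the sketch
    num_attention_heads = 12
    hidden_size = 768


config = Cfg()
num_heads = getattr(config, "num_key_value_heads", config.num_attention_heads)
head_dim = getattr(config, "head_dim", config.hidden_size // config.num_attention_heads)

layer = DemoCacheLayer()
layer.early_initialization(1, num_heads, head_dim, torch.float32, torch.device("cpu"))
```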
```python
args=(),
kwargs={
    "input_ids": input_ids,
    "cache_position": cache_position,
},
```
This changed from positional args to kwargs in transformers.
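Assuming the call site is `torch.export.export` (a guess; the snippet only shows the args/kwargs change, not the surrounding call), the shape of the fix looks like this:

```python
# Sketch, assuming torch.export.export is the call site; the model and
# inputs here are placeholders, not the PR's real values.
import torch
from torch.export import export


class TinyModel(torch.nn.Module):
    def forward(self, input_ids, cache_position):
        return input_ids + cache_position


model = TinyModel()
input_ids = torch.zeros(1, 4, dtype=torch.long)
cache_position = torch.arange(4)

# Before: inputs were passed positionally via args. After the transformers
# bump, the traced forward expects them by name, so they move into kwargs
# while args stays empty.
exported = export(
    model,
    args=(),
    kwargs={
        "input_ids": input_ids,
        "cache_position": cache_position,
    },
)
```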
OK, this looks largely benign, as in things that have to be done to make it work. But I would suggest you put comments around the changes to make them easier to review. I don't know why all the changes are needed.
Some changes needed for the bump: