Move get_mask_sizes from Cache to masking_utils and remove use of get_seq_length.#39142

Closed
manueldeprada wants to merge 4 commits into huggingface:main from manueldeprada:cache-move-mask-sizes-out

@manueldeprada commented Jul 1, 2025

This PR depends on #39106

Look at the last commit, f09e0cd:

I think moving get_mask_sizes out of the cache makes much more sense. There is only one extra change:

past_seen_tokens = cache_position.shape[0] if cache_position.shape[0] > 1 else cache_position[0] + 1

It replaces past_seen_tokens = past_key_values.get_seq_length(), which depends on cache information that might be hard to compute (e.g., for QuantizedCaches). What we would like to compute is

past_seen_tokens = cache_position[-1]

but that is not compatible with torch.export.

The new solution is torch.export friendly and works both when cache_position = torch.tensor([0, 1, 2, 3, 4, 5, 6]) (prefill phase) and when cache_position = torch.tensor([16]) (single-token decoding).
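
As a quick sanity check, the expression can be evaluated in eager mode on the two example inputs mentioned above. This is only a minimal sketch: the helper name `past_seen_tokens_from_positions` is hypothetical and not part of the PR, and the snippet does not exercise torch.export itself.

```python
import torch


def past_seen_tokens_from_positions(cache_position: torch.Tensor):
    """Hypothetical wrapper around the expression proposed in this PR."""
    # More than one position: the result is the number of positions given.
    # Exactly one position p: the result is p + 1 (returned as a 0-dim tensor).
    if cache_position.shape[0] > 1:
        return cache_position.shape[0]
    return cache_position[0] + 1


prefill_positions = torch.tensor([0, 1, 2, 3, 4, 5, 6])  # prefill example from above
decode_positions = torch.tensor([16])                    # single-position example from above

print(int(past_seen_tokens_from_positions(prefill_positions)))  # 7
print(int(past_seen_tokens_from_positions(decode_positions)))   # 17
```

The branch depends only on the tensor's shape, not on its values, which is presumably what keeps the expression compatible with torch.export. The sketch covers only the two cases named above; other inputs are not examined here.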

…yeredCache (huggingface#38077)

- Introduces CacheLayer and Cache base classes
- Ports Static, Dynamic, Offloaded, Quantized, Hybrid, etc. to use layers
- Implements method/attr dispatch across layers to reduce boilerplate
- Adds CacheProcessor hooks for offloading, quantization, etc.
- Updates and passes tests
@manueldeprada force-pushed the cache-move-mask-sizes-out branch from 6b6314d to f09e0cd on July 1, 2025 08:26
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@manueldeprada force-pushed the cache-move-mask-sizes-out branch 4 times, most recently from c030aa2 to b78affa on July 2, 2025 15:08
@manueldeprada force-pushed the cache-move-mask-sizes-out branch from b78affa to 16a6624 on July 2, 2025 15:17