What feature would you like to request?
The Qwen3 models will need last-token pooling along these lines (taken from the Qwen3 example):
import torch
from torch import Tensor


def last_token_pool(last_hidden_states: Tensor,
                    attention_mask: Tensor) -> Tensor:
    # Left padding: every sequence ends with a real token, so take the last position.
    left_padding = (attention_mask[:, -1].sum() == attention_mask.shape[0])
    if left_padding:
        return last_hidden_states[:, -1]
    else:
        # Right padding: index each sequence's last non-padded token.
        sequence_lengths = attention_mask.sum(dim=1) - 1
        batch_size = last_hidden_states.shape[0]
        return last_hidden_states[torch.arange(batch_size, device=last_hidden_states.device), sequence_lengths]
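To make clear which position this pooling selects, here is a minimal pure-Python sketch of the same index logic on plain lists. The helper name `last_token_indices` and the toy masks are hypothetical and only for illustration; they are not part of the Qwen3 example or any library.

```python
def last_token_indices(attention_mask):
    """Return, per sequence, the position of the last real (unpadded) token."""
    seq_len = len(attention_mask[0])
    # Left padding: every row ends with a real token, so the last position is used.
    if all(row[-1] == 1 for row in attention_mask):
        return [seq_len - 1 for _ in attention_mask]
    # Right padding: the last real token sits at (number of real tokens - 1).
    return [sum(row) - 1 for row in attention_mask]


# Toy attention masks (1 = real token, 0 = padding); illustrative values only.
left_padded = [[0, 0, 1, 1], [1, 1, 1, 1]]
right_padded = [[1, 1, 0, 0], [1, 1, 1, 1]]

print(last_token_indices(left_padded))   # [3, 3]
print(last_token_indices(right_padded))  # [1, 3]
```

With left padding the embedding always comes from the final position; with right padding it comes from a different position per sequence, which is why the tensor version needs the per-row gather.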
Is there any additional information you would like to provide?
No response