
fix: resolve dropout type error in DogeDecoder #40022

Open
wubingheng111 wants to merge 2 commits into huggingface:main from wubingheng111:model_doge_fix

Conversation


@wubingheng111 wubingheng111 commented Aug 8, 2025

Fix: #40079
Fixed a TypeError where dropout() received a tuple instead of a Tensor in DogeDecoderLayer when using the MoE configuration. For MoE layers, the MLP forward method returns a tuple (hidden_states, router_logits), but the subsequent dropout operation expects a single Tensor.

  • Extract hidden_states from tuple before dropout when using MoE
  • Ensure consistent tensor handling in both MLP and MoE configurations

Fixes the issue where model.generate() failed with:
TypeError: dropout(): argument 'input' (position 1) must be Tensor, not tuple
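
For illustration, the unpacking step described above can be sketched with a toy layer. This is a minimal sketch, not the actual Doge code: `ToyMoE`, `ToyDecoderLayer`, and their parameters (`hidden_size`, `num_experts`, `is_moe`, `hidden_dropout`) are hypothetical names chosen for this example. Only the tuple return shape (hidden_states, router_logits) and the dropout call come from the description.

```python
import torch
import torch.nn.functional as F
from torch import nn


class ToyMoE(nn.Module):
    """Hypothetical stand-in for a MoE block: returns (hidden_states, router_logits)."""

    def __init__(self, hidden_size: int, num_experts: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, hidden_size)
        self.router = nn.Linear(hidden_size, num_experts)

    def forward(self, hidden_states: torch.Tensor):
        return self.proj(hidden_states), self.router(hidden_states)


class ToyDecoderLayer(nn.Module):
    """Hypothetical decoder layer showing the tuple handling before dropout."""

    def __init__(self, hidden_size=8, num_experts=4, is_moe=True, hidden_dropout=0.1):
        super().__init__()
        # Dense path returns a Tensor; MoE path returns a tuple.
        self.mlp = ToyMoE(hidden_size, num_experts) if is_moe else nn.Linear(hidden_size, hidden_size)
        self.hidden_dropout = hidden_dropout

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        residual = hidden_states
        mlp_output = self.mlp(hidden_states)
        # Extract hidden_states from the tuple so dropout always sees a Tensor.
        if isinstance(mlp_output, tuple):
            hidden_states, _router_logits = mlp_output
        else:
            hidden_states = mlp_output
        hidden_states = F.dropout(hidden_states, p=self.hidden_dropout, training=self.training)
        return residual + hidden_states
```

Passing the raw tuple to `F.dropout` is what triggers the error quoted above; unpacking first keeps the dense and MoE paths on the same code after the MLP call.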
@ArthurZucker @gante @LoserCheems

Contributor

github-actions Bot commented Aug 8, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: doge

Collaborator

@ArthurZucker ArthurZucker left a comment

Thanks! Let's push this a little bit: it's weird that no tests caught this! Can you make sure this is tested in tests/models/doge/test_modeling_doge.py?


Development

Successfully merging this pull request may close these issues.

TypeError in DogeDecoderLayer with MoE Configuration when using dropout()

2 participants