fix: add mapping of deepseek_v32 model type #42767
mpashkovskii wants to merge 3 commits into huggingface:main
|
I think the tests are failing because of an unrelated ResNet precision error. |
Force-pushed from 0cbfa6e to 2f72c0f
|
I noticed that deepseek-ai/DeepSeek-V3.2 uses DeepSeek's native sparse attention. Does the current deepseek_v3 architecture support this? I don't see the Indexer or selector in the code here, so I wonder whether this mapping is safe: transformers/src/transformers/models/deepseek_v3/modeling_deepseek_v3.py, lines 430 to 461 at 8ebfd84 |
|
Yes, I don't think we can just map the new model to the old architecture! |
|
[For maintainers] Suggested jobs to run (before merge) run-slow: auto, deepseek_v32 |
|
Hi @huzama and @Rocketknight1, thanks for pointing that out. I’ve added the initial DeepSeek v3.2 implementation, but it still needs more testing and validation. I’d appreciate any feedback you have. Do you know if anyone else is actively working on this? If so, does it make sense to complete the implementation in this PR? |
|
View the CircleCI Test Summary for this PR: https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=42767&sha=f17882 |
|
@mpashkovskii, I’m implementing an indexer and top-k selection for a personal project, but the code still needs some minor changes before it is ready for a pull request. You can try writing the DSA Indexer yourself. Alternatively, once I have a well-drafted version, I can push the changes. |
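For readers unfamiliar with the idea, the top-k selection step mentioned above can be illustrated roughly as follows. This is a toy sketch only, not the DeepSeek indexer and not code from this PR: the function name `select_top_k`, the flat list of per-key scores, and the tie-breaking rule are all assumptions made for illustration. It assumes that some indexer has already produced one relevance score per key position, and that a query may only attend to positions at or before its own.

```python
def select_top_k(index_scores, query_pos, k):
    """Toy top-k selector (hypothetical helper, not a transformers API).

    index_scores: one relevance score per key position, assumed to come
                  from some indexer.
    query_pos:    position of the query token; keys after it are masked out.
    k:            maximum number of key positions to keep.
    Returns the kept key positions in ascending order.
    """
    # Causal constraint: only keys at positions <= query_pos are candidates.
    candidates = range(query_pos + 1)
    # Rank by score, highest first; ties go to the more recent position.
    ranked = sorted(candidates, key=lambda j: (index_scores[j], j), reverse=True)
    # Keep at most k positions and return them in sequence order.
    return sorted(ranked[:k])


scores = [0.1, 0.9, 0.3, 0.7, 0.5]
print(select_top_k(scores, query_pos=3, k=2))
```

Attention would then be computed only over the returned positions instead of over every earlier token, which is the part that a plain deepseek_v3 attention layer does not do.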
|
Hello @mpashkovskii @huzama |
|
@mpashkovskii hello |
|
@freedom-cui the model is not implemented yet as of the last commit. If you only need inference, please check out the vLLM library! |
Thank you very much for your reply. Is there a schedule for supporting DeepSeek V3.2 at this time? |
|
Please see #41251 (comment) cc @ArthurZucker |
What does this PR do?
Adds the missing mapping for model type deepseek_v32 to the deepseek_v3 model and DeepseekV3Config.
Fixes #42590
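As a rough illustration of what such a mapping amounts to, the sketch below uses a plain dictionary from model type to config class name. It is not the actual transformers registry code: `CONFIG_NAMES` and `resolve_config_name` are hypothetical names chosen for this example, and the real change in the PR may touch different tables.

```python
from collections import OrderedDict

# Hypothetical, simplified registry: model_type string -> config class name.
CONFIG_NAMES = OrderedDict([
    ("deepseek_v3", "DeepseekV3Config"),
])


def resolve_config_name(model_type, registry):
    """Look up the config class name for a model type (illustrative helper)."""
    try:
        return registry[model_type]
    except KeyError:
        raise ValueError(f"Unrecognized model type: {model_type!r}") from None


# Without an entry, a checkpoint whose config declares "deepseek_v32" fails to resolve.
try:
    resolve_config_name("deepseek_v32", CONFIG_NAMES)
except ValueError as err:
    print(err)

# The aliasing described above: point the new model type at the existing config.
CONFIG_NAMES["deepseek_v32"] = "DeepseekV3Config"
print(resolve_config_name("deepseek_v32", CONFIG_NAMES))
```

An alias like this only makes the lookup succeed. As raised in the conversation, it does not add the sparse-attention Indexer, so the resolved architecture may not match the checkpoint.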
Who can review?
@Cyrilvallez could you please review the changes?