Change the order of grid dim in bwd convert kernel to avoid overlimit #173

Merged
beginlner merged 1 commit into deepseek-ai:main from uchihatmtkinu:feat/fix_super_long_seq_error
Mar 31, 2026

Conversation

@uchihatmtkinu
Collaborator

Fix the CUTLASS internal error that occurs when the sequence length is very large (>1M). The cause is that gridDim.z stores seq_len/block_size, which exceeds the gridDim.z limit of 65535 when seq_len is too large. This PR moves that count to gridDim.x, which allows up to 2^31 - 1 blocks.
#171
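
To make the limit concrete, here is a minimal sketch of the arithmetic. It is not the kernel code from this PR. The function names, the `block_size` of 16, and the two other grid dimensions (`other_a`, `other_b`) are assumptions for illustration only. The limits are the standard CUDA ones for compute capability 3.0 and later.

```python
# CUDA grid-dimension limits (compute capability 3.0 and later).
MAX_GRID_X = 2**31 - 1
MAX_GRID_Y = 65535
MAX_GRID_Z = 65535


def ceil_div(a, b):
    """Number of blocks of size b needed to cover a elements."""
    return (a + b - 1) // b


def grid_fits(grid):
    """Return True if an (x, y, z) grid is within the CUDA launch limits."""
    x, y, z = grid
    return (1 <= x <= MAX_GRID_X
            and 1 <= y <= MAX_GRID_Y
            and 1 <= z <= MAX_GRID_Z)


def grid_before(seq_len, block_size, other_a, other_b):
    # Hypothetical old layout: the sequence-block count sits in z.
    return (other_a, other_b, ceil_div(seq_len, block_size))


def grid_after(seq_len, block_size, other_a, other_b):
    # Hypothetical new layout: the sequence-block count sits in x.
    # The kernel would also have to read blockIdx.x instead of blockIdx.z.
    return (ceil_div(seq_len, block_size), other_a, other_b)


if __name__ == "__main__":
    seq_len = 2**20      # 1M tokens
    block_size = 16      # assumed value, not taken from the PR
    print(grid_fits(grid_before(seq_len, block_size, 8, 4)))
    print(grid_fits(grid_after(seq_len, block_size, 8, 4)))
```

With these assumed values the sequence-block count is 65536, one more than gridDim.z permits, so the old layout fails the check and the new one passes.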

@beginlner beginlner merged commit 71c7379 into deepseek-ai:main Mar 31, 2026
@uchihatmtkinu uchihatmtkinu deleted the feat/fix_super_long_seq_error branch March 31, 2026 09:37

2 participants