
Fix multi-parallelism (TP+DP or PP+DP)#2

Merged
Jeronymous merged 3 commits into main from parallelism
Feb 20, 2026
Conversation

@Jeronymous
Member

Also starts to implement context parallelism (when the VLLM version permits it), but unfortunately that is still failing in our env with VLLM 0.15.1:

  File ".../vllm/v1/worker/gpu_worker.py", line 412, in initialize_from_config
    self.model_runner.initialize_kv_cache(kv_cache_config)
  File ".../vllm/v1/worker/gpu_model_runner.py", line 5874, in initialize_kv_cache
    self.initialize_attn_backend(kv_cache_config)
  File ".../vllm/v1/worker/gpu_model_runner.py", line 5225, in initialize_attn_backend
    check_attention_cp_compatibility(self.vllm_config)
  File ".../vllm/v1/worker/cp_utils.py", line 39, in check_attention_cp_compatibility
    assert layer_impl.supports_pcp, (
AssertionError: PCP requires attention impls' support, but the impl FlashAttentionImpl does not support PCP.
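The "when the version permits it" gating mentioned above could be sketched as a simple runtime check. This is an illustrative assumption, not code from the PR: the helper names (`parse_version`, `supports_context_parallelism`) are hypothetical, and only the minimum-version threshold (0.15) comes from the discussion.

```python
# Sketch: only enable context parallelism when the installed vLLM is
# recent enough. Helper names are illustrative, not part of vLLM's API.
from importlib.metadata import version, PackageNotFoundError


def parse_version(v: str) -> tuple:
    """Parse a dotted version string into a comparable tuple of ints."""
    parts = []
    for piece in v.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def supports_context_parallelism(min_version: str = "0.15") -> bool:
    """Return True if the installed vLLM is at least `min_version`."""
    try:
        installed = version("vllm")
    except PackageNotFoundError:
        return False  # vLLM not installed at all
    return parse_version(installed) >= parse_version(min_version)
```

Note that, as the traceback above shows, a passing version check is necessary but not sufficient: the selected attention backend (here `FlashAttentionImpl`) must also support PCP.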

Jeronymous merged commit 1167c70 into main on Feb 20, 2026
Jeronymous deleted the parallelism branch on February 20, 2026 at 15:21
