Commit 788697b

Natfii and claude committed
docs(uber-kernel): C2 diagnostic spec — β-coop vs legacy under PIECEWISE+graphs
Spec for a one-shot diagnostic harness that disambiguates the C2 migration's PIECEWISE+graphs failure: is β-coop's kernel broken under graph replay, or is the consume-gate op pattern at fault? Implementation is deferred to the next session on a fresh branch, diag/c2-beta-coop-vs-legacy.

Decisions captured (Option 1, shape (a), strategy (i), probe (P2)):

- Backend-only opaque ops modeled on cute_phase_e_dispatch (no framework op signature change, no kernel work in the probe itself).
- Diagnose first, design second: the B-fix (514b88c, reverted) was already shape (a) and broke under graphs, so we need to know whether the kernel or the op pattern is the culprit before committing to a redesign.
- Probe = comparison + dump on divergence at qwen3_5.py:466-476 in dual-fire under PIECEWISE+graphs. A stashed companion eager-replay harness is available via CUTE_C2_DIAG_EAGER in case the primary probe is inconclusive.
- Sanity rung 0 (paged-only + NVFP4 + PIECEWISE+graphs) added to rule out NVFP4+graphs as a confound before the main run.

Includes:

- 5 design sections (architecture, components, data flow, error handling, testing) plus host-safety bounds (no flashinfer-autotune, no .item() syncs, no infinite loops, no OOM, no driver-wedge paths).
- Container baseline snapshot at design time (PIECEWISE+graphs+dual-fire is healthy in production right now → diagnostic premise is testable).
- Upstream-issue check (vllm-project/vllm #35659, #38208, #37060) confirms none match our kernel stack: the symptom class differs (crash vs gibberish), and the GEMM/attention backends differ.
- Open questions surfaced for probe-wiring time (CUTE_FUSION_DISABLE env name verification, _fusion_bound semantic).

Refs:
  docs/research/uber_kernel_migration/2026-04-26-consume-gate-dce-and-graph-capture.md
  commit 514b88c (B-fix attempt, reverted in 3ffcf87)
  memory: project_uber_kernel_migration, project_beta_coop_residual_solo_bug, feedback_mutates_args_not_dce_safe, feedback_item_breaks_cuda_graphs

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
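
The "comparison + dump on divergence" probe described above could look roughly like the sketch below. It is an illustration only: the real probe would compare GPU tensors at qwen3_5.py:466-476 and has to respect the host-safety bounds (in particular no .item() syncs), whereas this version works on plain Python float sequences on the host. The names `max_abs_diff`, `probe`, the `atol` threshold, and the dump file name are all hypothetical and do not come from the spec.

```python
import json
import math
from pathlib import Path


def max_abs_diff(legacy, coop):
    """Largest elementwise |legacy - coop|.

    A NaN difference (e.g. NaN in either input) is treated as
    infinite divergence so it can never pass the tolerance check.
    """
    if len(legacy) != len(coop):
        raise ValueError("outputs must have the same length")
    worst = 0.0
    for a, b in zip(legacy, coop):
        d = abs(a - b)
        if math.isnan(d):
            return math.inf
        worst = max(worst, d)
    return worst


def probe(legacy, coop, dump_dir, step, atol=1e-3):
    """Compare both paths' outputs; write a JSON dump only on divergence.

    Returns the dump path when the outputs diverge, otherwise None.
    `atol` is an illustrative threshold, not a value from the spec.
    """
    diff = max_abs_diff(legacy, coop)
    if diff <= atol:
        return None
    path = Path(dump_dir) / f"c2_divergence_step{step}.json"
    record = {
        "step": step,
        "max_abs_diff": diff,
        "legacy": list(legacy),
        "coop": list(coop),
    }
    path.write_text(json.dumps(record))
    return path
```

Writing nothing on agreement keeps a healthy run free of artifacts, so the presence of a dump file is itself the divergence signal.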
1 parent 90b06d5 commit 788697b

1 file changed

Lines changed: 429 additions & 0 deletions
