glm5 fix precision key name #976

claude · 2026-03-29T05:47:26Z

🔴 The precision rename from nvfp4 to fp4 breaks B200 benchmark execution because the corresponding shell script was not renamed. Both runners/launch_b200-nb.sh and runners/launch_b200-dgxc.sh dynamically construct the benchmark script path using the PRECISION env var, so they will now look for benchmarks/single_node/glm5_fp4_b200.sh, which does not exist — only glm5_nvfp4_b200.sh is present. All GLM-5 NVFP4 B200 benchmark jobs will fail at runtime with "No such file or directory"; the fix is to rename glm5_nvfp4_b200.sh to glm5_fp4_b200.sh (or create the new file) in the same PR.

Extended reasoning...

What the bug is and how it manifests

The PR renames the field value from to for the benchmark configuration in . However, the corresponding benchmark shell script that is expected to exist under was not renamed or created to match the new value. At runtime, the CI/CD runners will attempt to execute a script that does not exist, resulting in a hard failure for every GLM-5 NVFP4 B200 benchmark job.

The specific code path that triggers it

Both (line 20) and (line 39) construct the benchmark script path dynamically:

The environment variable is populated directly from the field in the YAML configuration (confirmed by line 83). With derived from the model prefix (e.g., ), evaluates to , so the full resolved path becomes .

Why existing code doesn't prevent it

There is no guard or existence check before the call — the script simply tries to execute the resolved path directly. The YAML config field and the benchmark shell script filenames are tightly coupled by convention but not enforced at commit or workflow-definition time. The rename of the YAML value is therefore a breaking change unless the shell script file is also renamed.

What the impact would be

Every CI run that triggers the benchmark (on both and runner types) will fail immediately at the script execution step with . No benchmark data will be collected, wasting runner time and blocking any dependent reporting or dashboards.

How to fix it

Rename Error: The following required environment variables are not set:

MODEL

TP

CONC

ISL

OSL

RANDOM_RANGE_RATIO

RESULT_FILENAME

EP_SIZE to in this same PR. Alternatively, create with equivalent content. The YAML rename alone is incomplete without the corresponding script rename.

Step-by-step proof

PR changes : →

CI workflow reads this YAML and sets in the runner environment ( line 83).

line 20 executes:

benchmarks/single_node/glm5_fp8_b200.sh
benchmarks/single_node/glm5_nvfp4_b200.sh shows only — is absent.

Shell exits with error: . Job fails.

claude · 2026-03-29T05:47:26Z

🔴 This PR modifies .github/configs/nvidia-master.yaml (changing precision: nvfp4 → precision: fp4 for glm5-nvfp4-b200-sglang) but does not update perf-changelog.yaml. Per the repo's own policy in .github/workflows/claude-pr-review.yml lines 122–126, this is a 🔴 BLOCKING issue. A changelog entry documenting the precision key rename must be added to perf-changelog.yaml with a reference to PR #976 before this PR can merge.

Extended reasoning...

What the bug is and how it manifests

The repo enforces a policy (defined in .github/workflows/claude-pr-review.yml, lines 122–126) that whenever .github/configs/nvidia-master.yaml or .github/configs/amd-master.yaml is edited in a PR, perf-changelog.yaml must also be updated in the same PR. The automated Claude PR-review bot will read the changed-file list and flag the absence as a 🔴 BLOCKING issue, leaving a comment that prevents the PR from being considered ready to merge.

The specific code path that triggers it

The diff for this PR touches exactly one file: .github/configs/nvidia-master.yaml (changing precision: nvfp4 → precision: fp4 for the glm5-nvfp4-b200-sglang config key at lines 1812–1816). The perf-changelog.yaml file is not in the PR's changed-file list.

Why existing code doesn't prevent it

There is no automated CI check that hard-blocks the merge via a status check. The enforcement is done by the Claude PR-review GitHub Actions bot, which posts a blocking review comment. Because it is bot-enforced rather than a required status check, the PR can technically be merged by a maintainer who overrides it — but the policy is explicit and the bot will flag it immediately on review.

Impact

The omission violates the repo's documented change-tracking policy and will cause the automated reviewer to block the PR on its first pass. This adds friction and delays the merge. More broadly, perf-changelog.yaml is the historical record used to audit what configuration parameters changed and when; skipping an entry means this rename (from the incorrect nvfp4 key to the correct fp4 key) will be untracked.

How to fix it

Add an entry to perf-changelog.yaml following the established format seen in other entries (PR #919, #857, etc.):

- config-keys: - glm5-nvfp4-b200-sglang description: - "Fix precision key name from 'nvfp4' to 'fp4' for GLM-5 NVFP4 B200 SGLang config" pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/976

Step-by-step proof

.github/workflows/claude-pr-review.yml lines 122–126 state: "If either master config file was edited AND perf-changelog.yaml was NOT edited in the same PR: This is a 🔴 BLOCKING issue."

The PR's changed-files list contains only .github/configs/nvidia-master.yaml.

perf-changelog.yaml is absent from the changed-files list.

Condition (master config edited) = true; Condition (perf-changelog edited) = false.

Therefore the policy is triggered and the bot will block the PR.

claude · 2026-03-29T05:47:26Z

🟡 The config key glm5-nvfp4-b200-sglang still embeds nvfp4 while precision: fp4 after this fix, breaking the established {model-prefix}-{precision}-{hardware}-{framework} naming pattern. Consider renaming the key to glm5-fp4-b200-sglang and updating the corresponding reference in perf-changelog.yaml.

Extended reasoning...

What the bug is and how it manifests

The PR correctly changes precision: nvfp4 to precision: fp4, but leaves the YAML key itself as glm5-nvfp4-b200-sglang. After the fix the key name and the precision field are inconsistent: the key says nvfp4 while the precision field says fp4.

The specific code path that triggers it

All other NVFP4-quantized model configs in nvidia-master.yaml use fp4 in their key names, not nvfp4:

kimik2.5-fp4-b200-vllm — model: nvidia/Kimi-K2.5-NVFP4, precision: fp4

dsr1-fp4-gb200-dynamo-trt — model: nvidia/DeepSeek-R1-0528-NVFP4-v2, precision: fp4

glm5-nvfp4-b200-sglang is the only entry that bakes the vendor tag nvfp4 into the key while the canonical precision field says fp4.

Additionally, perf-changelog.yaml (line 1210) references the old key name and would be a dangling reference if the key is ever renamed downstream.

Why existing code doesn't prevent it

The precision field is authoritative for benchmark script discovery (per CONFIGS.md line 33), so full-sweep workflows continue to work. The convention mismatch is therefore not caught by any hard failure — it's a silent inconsistency.

Impact

Wildcard key filtering such as test-config --config-keys '*-fp4-*' will miss this entry, reducing discoverability. Developers searching for all fp4 configs by key pattern will silently omit this one.

Addressing the refutation

A refutation notes that CONFIGS.md line 25 says the naming convention is "not required" and that no existing tooling mechanically parses key names. Both points are valid: this is not a functional breakage, which is why the severity is nit. However, every other NVFP4 model entry already follows the fp4 convention in the key name, making this the sole outlier after the precision field was corrected. Consistency in human-readable identifiers still has value, especially when perf-changelog.yaml has a cross-reference that would need to stay in sync regardless.

How to fix

Rename the key from glm5-nvfp4-b200-sglang to glm5-fp4-b200-sglang in .github/configs/nvidia-master.yaml, and update the matching reference in perf-changelog.yaml to keep cross-references consistent.

Step-by-step proof

After this PR: key = glm5-nvfp4-b200-sglang, precision field = fp4.

A developer runs test-config --config-keys '*-fp4-b200-sglang' to find all fp4 B200 SGLang configs.

The glob *-fp4-b200-sglang does not match glm5-nvfp4-b200-sglang, so this entry is silently skipped.

Compare: kimik2.5-fp4-b200-vllm correctly matches *-fp4-b200-vllm because its key was set to fp4 from the start.

-Original file line number
+Diff line change
@@ Expand Up / @@ -1809,10 +1809,10 @@ @@
     glm5-nvfp4-b200-sglang:
       image: lmsysorg/sglang:nightly-dev-cu13-20260328-a27651d5
       model: nvidia/GLM-5-NVFP4
       model-prefix: glm5
       runner: b200
-      precision: nvfp4
+      precision: fp4
       framework: sglang
       multinode: false
       seq-len-configs:
@@ Expand Down @@

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

glm5 fix precision key name #976

Uh oh!

Diff view

Diff view

There are no files selected for viewing

Claude / Claude Code Review

claude Bot Mar 29, 2026

Uh oh!

Claude / Claude Code Review

claude Bot Mar 29, 2026

Uh oh!

Claude / Claude Code Review

claude Bot Mar 29, 2026

Uh oh!

glm5 fix precision key name #976

Uh oh!

glm5 fix precision key name #976

Uh oh!

Uh oh!

Diff view

Diff view

There are no files selected for viewing

Claude / Claude Code Review

claude Bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

Claude / Claude Code Review

claude Bot Mar 29, 2026

Choose a reason for hiding this comment

What the bug is and how it manifests

The specific code path that triggers it

Why existing code doesn't prevent it

Impact

How to fix it

Step-by-step proof

Uh oh!

Claude / Claude Code Review

claude Bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!