glm5 fix precision key name#976
Conversation
|
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you |
| model-prefix: glm5 | ||
| runner: b200 | ||
| precision: nvfp4 | ||
| precision: fp4 |
There was a problem hiding this comment.
🔴 The precision rename from nvfp4 to fp4 breaks B200 benchmark execution because the corresponding shell script was not renamed. Both runners/launch_b200-nb.sh and runners/launch_b200-dgxc.sh dynamically construct the benchmark script path using the PRECISION env var, so they will now look for benchmarks/single_node/glm5_fp4_b200.sh, which does not exist — only glm5_nvfp4_b200.sh is present. All GLM-5 NVFP4 B200 benchmark jobs will fail at runtime with "No such file or directory"; the fix is to rename glm5_nvfp4_b200.sh to glm5_fp4_b200.sh (or create the new file) in the same PR.
Extended reasoning...
What the bug is and how it manifests
The PR renames the field value from to for the benchmark configuration in . However, the corresponding benchmark shell script that is expected to exist under was not renamed or created to match the new value. At runtime, the CI/CD runners will attempt to execute a script that does not exist, resulting in a hard failure for every GLM-5 NVFP4 B200 benchmark job.
The specific code path that triggers it
Both (line 20) and (line 39) construct the benchmark script path dynamically:
The environment variable is populated directly from the field in the YAML configuration (confirmed by line 83). With derived from the model prefix (e.g., ), evaluates to , so the full resolved path becomes .
Why existing code doesn't prevent it
There is no guard or existence check before the call — the script simply tries to execute the resolved path directly. The YAML config field and the benchmark shell script filenames are tightly coupled by convention but not enforced at commit or workflow-definition time. The rename of the YAML value is therefore a breaking change unless the shell script file is also renamed.
What the impact would be
Every CI run that triggers the benchmark (on both and runner types) will fail immediately at the script execution step with . No benchmark data will be collected, wasting runner time and blocking any dependent reporting or dashboards.
How to fix it
Rename Error: The following required environment variables are not set:
- MODEL
- TP
- CONC
- ISL
- OSL
- RANDOM_RANGE_RATIO
- RESULT_FILENAME
- EP_SIZE to in this same PR. Alternatively, create with equivalent content. The YAML rename alone is incomplete without the corresponding script rename.
Step-by-step proof
- PR changes : →
- CI workflow reads this YAML and sets in the runner environment ( line 83).
- line 20 executes:
- benchmarks/single_node/glm5_fp8_b200.sh
benchmarks/single_node/glm5_nvfp4_b200.sh shows only — is absent. - Shell exits with error: . Job fails.
| model: nvidia/GLM-5-NVFP4 | ||
| model-prefix: glm5 | ||
| runner: b200 | ||
| precision: nvfp4 | ||
| precision: fp4 | ||
| framework: sglang |
There was a problem hiding this comment.
🔴 This PR modifies .github/configs/nvidia-master.yaml (changing precision: nvfp4 → precision: fp4 for glm5-nvfp4-b200-sglang) but does not update perf-changelog.yaml. Per the repo's own policy in .github/workflows/claude-pr-review.yml lines 122–126, this is a 🔴 BLOCKING issue. A changelog entry documenting the precision key rename must be added to perf-changelog.yaml with a reference to PR #976 before this PR can merge.
Extended reasoning...
What the bug is and how it manifests
The repo enforces a policy (defined in .github/workflows/claude-pr-review.yml, lines 122–126) that whenever .github/configs/nvidia-master.yaml or .github/configs/amd-master.yaml is edited in a PR, perf-changelog.yaml must also be updated in the same PR. The automated Claude PR-review bot will read the changed-file list and flag the absence as a 🔴 BLOCKING issue, leaving a comment that prevents the PR from being considered ready to merge.
The specific code path that triggers it
The diff for this PR touches exactly one file: .github/configs/nvidia-master.yaml (changing precision: nvfp4 → precision: fp4 for the glm5-nvfp4-b200-sglang config key at lines 1812–1816). The perf-changelog.yaml file is not in the PR's changed-file list.
Why existing code doesn't prevent it
There is no automated CI check that hard-blocks the merge via a status check. The enforcement is done by the Claude PR-review GitHub Actions bot, which posts a blocking review comment. Because it is bot-enforced rather than a required status check, the PR can technically be merged by a maintainer who overrides it — but the policy is explicit and the bot will flag it immediately on review.
Impact
The omission violates the repo's documented change-tracking policy and will cause the automated reviewer to block the PR on its first pass. This adds friction and delays the merge. More broadly, perf-changelog.yaml is the historical record used to audit what configuration parameters changed and when; skipping an entry means this rename (from the incorrect nvfp4 key to the correct fp4 key) will be untracked.
How to fix it
Add an entry to perf-changelog.yaml following the established format seen in other entries (PR #919, #857, etc.):
- config-keys:
- glm5-nvfp4-b200-sglang
description:
- "Fix precision key name from 'nvfp4' to 'fp4' for GLM-5 NVFP4 B200 SGLang config"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/976Step-by-step proof
.github/workflows/claude-pr-review.ymllines 122–126 state: "If either master config file was edited ANDperf-changelog.yamlwas NOT edited in the same PR: This is a 🔴 BLOCKING issue."- The PR's changed-files list contains only
.github/configs/nvidia-master.yaml. perf-changelog.yamlis absent from the changed-files list.- Condition (master config edited) = true; Condition (perf-changelog edited) = false.
- Therefore the policy is triggered and the bot will block the PR.
| model: nvidia/GLM-5-NVFP4 | ||
| model-prefix: glm5 | ||
| runner: b200 | ||
| precision: nvfp4 | ||
| precision: fp4 | ||
| framework: sglang | ||
| multinode: false | ||
| seq-len-configs: |
There was a problem hiding this comment.
🟡 The config key glm5-nvfp4-b200-sglang still embeds nvfp4 while precision: fp4 after this fix, breaking the established {model-prefix}-{precision}-{hardware}-{framework} naming pattern. Consider renaming the key to glm5-fp4-b200-sglang and updating the corresponding reference in perf-changelog.yaml.
Extended reasoning...
What the bug is and how it manifests
The PR correctly changes precision: nvfp4 to precision: fp4, but leaves the YAML key itself as glm5-nvfp4-b200-sglang. After the fix the key name and the precision field are inconsistent: the key says nvfp4 while the precision field says fp4.
The specific code path that triggers it
All other NVFP4-quantized model configs in nvidia-master.yaml use fp4 in their key names, not nvfp4:
kimik2.5-fp4-b200-vllm— model: nvidia/Kimi-K2.5-NVFP4, precision: fp4dsr1-fp4-gb200-dynamo-trt— model: nvidia/DeepSeek-R1-0528-NVFP4-v2, precision: fp4
glm5-nvfp4-b200-sglang is the only entry that bakes the vendor tag nvfp4 into the key while the canonical precision field says fp4.
Additionally, perf-changelog.yaml (line 1210) references the old key name and would be a dangling reference if the key is ever renamed downstream.
Why existing code doesn't prevent it
The precision field is authoritative for benchmark script discovery (per CONFIGS.md line 33), so full-sweep workflows continue to work. The convention mismatch is therefore not caught by any hard failure — it's a silent inconsistency.
Impact
Wildcard key filtering such as test-config --config-keys '*-fp4-*' will miss this entry, reducing discoverability. Developers searching for all fp4 configs by key pattern will silently omit this one.
Addressing the refutation
A refutation notes that CONFIGS.md line 25 says the naming convention is "not required" and that no existing tooling mechanically parses key names. Both points are valid: this is not a functional breakage, which is why the severity is nit. However, every other NVFP4 model entry already follows the fp4 convention in the key name, making this the sole outlier after the precision field was corrected. Consistency in human-readable identifiers still has value, especially when perf-changelog.yaml has a cross-reference that would need to stay in sync regardless.
How to fix
Rename the key from glm5-nvfp4-b200-sglang to glm5-fp4-b200-sglang in .github/configs/nvidia-master.yaml, and update the matching reference in perf-changelog.yaml to keep cross-references consistent.
Step-by-step proof
- After this PR: key =
glm5-nvfp4-b200-sglang, precision field =fp4. - A developer runs
test-config --config-keys '*-fp4-b200-sglang'to find all fp4 B200 SGLang configs. - The glob
*-fp4-b200-sglangdoes not matchglm5-nvfp4-b200-sglang, so this entry is silently skipped. - Compare:
kimik2.5-fp4-b200-vllmcorrectly matches*-fp4-b200-vllmbecause its key was set tofp4from the start.
No description provided.