glm5 fix precision key name by functionstackx · Pull Request #976 · SemiAnalysisAI/InferenceX

functionstackx · 2026-03-29T05:33:31Z

No description provided.

github-actions · 2026-03-29T05:33:39Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

claude · 2026-03-29T05:47:26Z

  model-prefix: glm5
  runner: b200
-  precision: nvfp4
+  precision: fp4


🔴 The precision rename from nvfp4 to fp4 breaks B200 benchmark execution because the corresponding shell script was not renamed. Both runners/launch_b200-nb.sh and runners/launch_b200-dgxc.sh dynamically construct the benchmark script path using the PRECISION env var, so they will now look for benchmarks/single_node/glm5_fp4_b200.sh, which does not exist — only glm5_nvfp4_b200.sh is present. All GLM-5 NVFP4 B200 benchmark jobs will fail at runtime with "No such file or directory"; the fix is to rename glm5_nvfp4_b200.sh to glm5_fp4_b200.sh (or create the new file) in the same PR.

Extended reasoning...

What the bug is and how it manifests

The PR renames the field value from to for the benchmark configuration in . However, the corresponding benchmark shell script that is expected to exist under was not renamed or created to match the new value. At runtime, the CI/CD runners will attempt to execute a script that does not exist, resulting in a hard failure for every GLM-5 NVFP4 B200 benchmark job.

The specific code path that triggers it

Both (line 20) and (line 39) construct the benchmark script path dynamically:

The environment variable is populated directly from the field in the YAML configuration (confirmed by line 83). With derived from the model prefix (e.g., ), evaluates to , so the full resolved path becomes .

Why existing code doesn't prevent it

There is no guard or existence check before the call — the script simply tries to execute the resolved path directly. The YAML config field and the benchmark shell script filenames are tightly coupled by convention but not enforced at commit or workflow-definition time. The rename of the YAML value is therefore a breaking change unless the shell script file is also renamed.

What the impact would be

Every CI run that triggers the benchmark (on both and runner types) will fail immediately at the script execution step with . No benchmark data will be collected, wasting runner time and blocking any dependent reporting or dashboards.

How to fix it

Rename Error: The following required environment variables are not set:

MODEL

TP

CONC

ISL

OSL

RANDOM_RANGE_RATIO

RESULT_FILENAME

EP_SIZE to in this same PR. Alternatively, create with equivalent content. The YAML rename alone is incomplete without the corresponding script rename.

Step-by-step proof

PR changes : →

CI workflow reads this YAML and sets in the runner environment ( line 83).

line 20 executes:

benchmarks/single_node/glm5_fp8_b200.sh
benchmarks/single_node/glm5_nvfp4_b200.sh shows only — is absent.

Shell exits with error: . Job fails.

claude · 2026-03-29T05:47:26Z

  model: nvidia/GLM-5-NVFP4
  model-prefix: glm5
  runner: b200
-  precision: nvfp4
+  precision: fp4
  framework: sglang


🔴 This PR modifies .github/configs/nvidia-master.yaml (changing precision: nvfp4 → precision: fp4 for glm5-nvfp4-b200-sglang) but does not update perf-changelog.yaml. Per the repo's own policy in .github/workflows/claude-pr-review.yml lines 122–126, this is a 🔴 BLOCKING issue. A changelog entry documenting the precision key rename must be added to perf-changelog.yaml with a reference to PR #976 before this PR can merge.

Extended reasoning...

What the bug is and how it manifests

The repo enforces a policy (defined in .github/workflows/claude-pr-review.yml, lines 122–126) that whenever .github/configs/nvidia-master.yaml or .github/configs/amd-master.yaml is edited in a PR, perf-changelog.yaml must also be updated in the same PR. The automated Claude PR-review bot will read the changed-file list and flag the absence as a 🔴 BLOCKING issue, leaving a comment that prevents the PR from being considered ready to merge.

The specific code path that triggers it

The diff for this PR touches exactly one file: .github/configs/nvidia-master.yaml (changing precision: nvfp4 → precision: fp4 for the glm5-nvfp4-b200-sglang config key at lines 1812–1816). The perf-changelog.yaml file is not in the PR's changed-file list.

Why existing code doesn't prevent it

There is no automated CI check that hard-blocks the merge via a status check. The enforcement is done by the Claude PR-review GitHub Actions bot, which posts a blocking review comment. Because it is bot-enforced rather than a required status check, the PR can technically be merged by a maintainer who overrides it — but the policy is explicit and the bot will flag it immediately on review.

Impact

The omission violates the repo's documented change-tracking policy and will cause the automated reviewer to block the PR on its first pass. This adds friction and delays the merge. More broadly, perf-changelog.yaml is the historical record used to audit what configuration parameters changed and when; skipping an entry means this rename (from the incorrect nvfp4 key to the correct fp4 key) will be untracked.

How to fix it

Add an entry to perf-changelog.yaml following the established format seen in other entries (PR #919, #857, etc.):

- config-keys: - glm5-nvfp4-b200-sglang description: - "Fix precision key name from 'nvfp4' to 'fp4' for GLM-5 NVFP4 B200 SGLang config" pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/976

Step-by-step proof

.github/workflows/claude-pr-review.yml lines 122–126 state: "If either master config file was edited AND perf-changelog.yaml was NOT edited in the same PR: This is a 🔴 BLOCKING issue."

The PR's changed-files list contains only .github/configs/nvidia-master.yaml.

perf-changelog.yaml is absent from the changed-files list.

Condition (master config edited) = true; Condition (perf-changelog edited) = false.

Therefore the policy is triggered and the bot will block the PR.

claude · 2026-03-29T05:47:26Z

  model: nvidia/GLM-5-NVFP4
  model-prefix: glm5
  runner: b200
-  precision: nvfp4
+  precision: fp4
  framework: sglang
  multinode: false
  seq-len-configs:


🟡 The config key glm5-nvfp4-b200-sglang still embeds nvfp4 while precision: fp4 after this fix, breaking the established {model-prefix}-{precision}-{hardware}-{framework} naming pattern. Consider renaming the key to glm5-fp4-b200-sglang and updating the corresponding reference in perf-changelog.yaml.

Extended reasoning...

What the bug is and how it manifests

The PR correctly changes precision: nvfp4 to precision: fp4, but leaves the YAML key itself as glm5-nvfp4-b200-sglang. After the fix the key name and the precision field are inconsistent: the key says nvfp4 while the precision field says fp4.

The specific code path that triggers it

All other NVFP4-quantized model configs in nvidia-master.yaml use fp4 in their key names, not nvfp4:

kimik2.5-fp4-b200-vllm — model: nvidia/Kimi-K2.5-NVFP4, precision: fp4

dsr1-fp4-gb200-dynamo-trt — model: nvidia/DeepSeek-R1-0528-NVFP4-v2, precision: fp4

glm5-nvfp4-b200-sglang is the only entry that bakes the vendor tag nvfp4 into the key while the canonical precision field says fp4.

Additionally, perf-changelog.yaml (line 1210) references the old key name and would be a dangling reference if the key is ever renamed downstream.

Why existing code doesn't prevent it

The precision field is authoritative for benchmark script discovery (per CONFIGS.md line 33), so full-sweep workflows continue to work. The convention mismatch is therefore not caught by any hard failure — it's a silent inconsistency.

Impact

Wildcard key filtering such as test-config --config-keys '*-fp4-*' will miss this entry, reducing discoverability. Developers searching for all fp4 configs by key pattern will silently omit this one.

Addressing the refutation

A refutation notes that CONFIGS.md line 25 says the naming convention is "not required" and that no existing tooling mechanically parses key names. Both points are valid: this is not a functional breakage, which is why the severity is nit. However, every other NVFP4 model entry already follows the fp4 convention in the key name, making this the sole outlier after the precision field was corrected. Consistency in human-readable identifiers still has value, especially when perf-changelog.yaml has a cross-reference that would need to stay in sync regardless.

How to fix

Rename the key from glm5-nvfp4-b200-sglang to glm5-fp4-b200-sglang in .github/configs/nvidia-master.yaml, and update the matching reference in perf-changelog.yaml to keep cross-references consistent.

Step-by-step proof

After this PR: key = glm5-nvfp4-b200-sglang, precision field = fp4.

A developer runs test-config --config-keys '*-fp4-b200-sglang' to find all fp4 B200 SGLang configs.

The glob *-fp4-b200-sglang does not match glm5-nvfp4-b200-sglang, so this entry is silently skipped.

Compare: kimik2.5-fp4-b200-vllm correctly matches *-fp4-b200-vllm because its key was set to fp4 from the start.

glm5 fix precision key name

92b3036

functionstackx requested a review from a team March 29, 2026 05:33

functionstackx requested review from jgangani and kedarpotdar-nv as code owners March 29, 2026 05:33

github-project-automation Bot added this to InferenceMAX Board Mar 29, 2026

functionstackx merged commit 758dd7f into main Mar 29, 2026
5 checks passed

functionstackx deleted the fix-glm5-precision-key branch March 29, 2026 05:33

github-project-automation Bot moved this to Done in InferenceMAX Board Mar 29, 2026

claude Bot reviewed Mar 29, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

glm5 fix precision key name#976

glm5 fix precision key name#976
functionstackx merged 1 commit intomainfrom
fix-glm5-precision-key

functionstackx commented Mar 29, 2026

Uh oh!

github-actions Bot commented Mar 29, 2026

Uh oh!

Uh oh!

claude Bot Mar 29, 2026

Uh oh!

claude Bot Mar 29, 2026

Uh oh!

claude Bot Mar 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

functionstackx commented Mar 29, 2026

Uh oh!

github-actions Bot commented Mar 29, 2026

Uh oh!

Uh oh!

claude Bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot Mar 29, 2026

Choose a reason for hiding this comment

What the bug is and how it manifests

The specific code path that triggers it

Why existing code doesn't prevent it

Impact

How to fix it

Step-by-step proof

Uh oh!

claude Bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant