From 54e6cde9ce577c3493d83f0d520974b0076ea71e Mon Sep 17 00:00:00 2001
From: "claude[bot]" <41898282+claude[bot]@users.noreply.github.com>
Date: Thu, 5 Feb 2026 00:14:28 +0000
Subject: [PATCH] Add benchmark script code style guidelines to PR review
 workflow

Add guidelines for:
- Formatting sglang.launch_server and vllm serve commands on separate lines
- Requiring --use-chat-template flag for MTP benchmark scripts

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
---
 .github/workflows/claude-pr-review.yml | 55 ++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/.github/workflows/claude-pr-review.yml b/.github/workflows/claude-pr-review.yml
index 2a1cbc80d..5c8302a40 100644
--- a/.github/workflows/claude-pr-review.yml
+++ b/.github/workflows/claude-pr-review.yml
@@ -204,3 +204,58 @@ jobs:
                - **Base Lines:** [base count]
                - **Change:** [+/-][difference] lines
             5. Use 📈 for additions, 📉 for removals, ➡️ for no change
+
+            ## Benchmark Script Code Style Guidelines:
+            When reviewing changes to `benchmarks/*.sh` files, verify the following code style requirements:
+
+            ### 1. Server Launch Command Formatting:
+            All `sglang.launch_server` and `vllm serve` commands MUST have their arguments formatted on separate lines for readability.
+
+            **Invalid format (all arguments on one line):**
+            ```bash
+            python -m sglang.launch_server --model $MODEL_PATH --tp 8 --ep 1 --port $PORT --quantization fp8 --kv-cache-dtype fp8_e4m3
+            ```
+
+            **Valid format (arguments on separate lines):**
+            ```bash
+            python -m sglang.launch_server \
+                --model $MODEL_PATH \
+                --tp 8 \
+                --ep 1 \
+                --port $PORT \
+                --quantization fp8 \
+                --kv-cache-dtype fp8_e4m3
+            ```
+
+            The same applies to `vllm serve` commands:
+            ```bash
+            vllm serve $MODEL_PATH \
+                --tensor-parallel-size 8 \
+                --port $PORT \
+                --quantization fp8
+            ```
+
+            If a benchmark script has server launch commands on a single line:
+            - This is a 🟡 **WARNING** issue
+            - Comment: "Server launch commands (`sglang.launch_server` or `vllm serve`) should have each argument on a separate line for better readability and easier code review. Please reformat using line continuations (`\`)."
+
+            ### 2. MTP (Multi-Token Prediction) Benchmark Requirements:
+            When reviewing MTP benchmark scripts (files containing `mtp` in the name or using EAGLE speculative decoding):
+
+            **Required: `--use-chat-template` flag**
+            MTP benchmarks MUST include the `--use-chat-template` flag in the benchmark client configuration.
+
+            **Why it matters:**
+            - Chat templates ensure proper tokenization for speculative decoding
+            - Without this flag, MTP performance may be incorrect or suboptimal
+            - Ensures consistent behavior across different model configurations
+
+            **Validation:**
+            Check that any benchmark script with MTP/EAGLE speculative decoding includes `--use-chat-template`:
+            ```bash
+            benchmark_client ... --use-chat-template
+            ```
+
+            If an MTP benchmark script is missing `--use-chat-template`:
+            - This is a 🔴 **BLOCKING** issue
+            - Comment: "MTP benchmark scripts MUST include `--use-chat-template` flag in the benchmark client configuration. This ensures proper tokenization for speculative decoding. Please add `--use-chat-template` to the benchmark command."