Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
55 changes: 55 additions & 0 deletions .github/workflows/claude-pr-review.yml
Original file line number Diff line number Diff line change
Expand Up @@ -204,3 +204,58 @@ jobs:
- **Base Lines:** [base count]
- **Change:** [+/-][difference] lines
5. Use 📈 for additions, 📉 for removals, ➡️ for no change

## Benchmark Script Code Style Guidelines:
When reviewing changes to `benchmarks/*.sh` files, verify the following code style requirements:

### 1. Server Launch Command Formatting:
All `sglang.launch_server` and `vllm serve` commands MUST have their arguments formatted on separate lines for readability.

**Invalid format (all arguments on one line):**
```bash
python -m sglang.launch_server --model $MODEL_PATH --tp 8 --ep 1 --port $PORT --quantization fp8 --kv-cache-dtype fp8_e4m3
```

**Valid format (arguments on separate lines):**
```bash
python -m sglang.launch_server \
--model $MODEL_PATH \
--tp 8 \
--ep 1 \
--port $PORT \
--quantization fp8 \
--kv-cache-dtype fp8_e4m3
```

The same applies to `vllm serve` commands:
```bash
vllm serve $MODEL_PATH \
--tensor-parallel-size 8 \
--port $PORT \
--quantization fp8
```

If a benchmark script has server launch commands on a single line:
- This is a 🟡 **WARNING** issue
- Comment: "Server launch commands (`sglang.launch_server` or `vllm serve`) should have each argument on a separate line for better readability and easier code review. Please reformat using line continuations (`\`)."

### 2. MTP (Multi-Token Prediction) Benchmark Requirements:
When reviewing MTP benchmark scripts (files containing `mtp` in the name or using EAGLE speculative decoding):

**Required: `--use-chat-template` flag**
MTP benchmarks MUST include the `--use-chat-template` flag in the benchmark client configuration.

**Why it matters:**
- Chat templates ensure proper tokenization for speculative decoding
- Without this flag, MTP performance may be incorrect or suboptimal
- Ensures consistent behavior across different model configurations

**Validation:**
Check that any benchmark script with MTP/EAGLE speculative decoding includes `--use-chat-template`:
```bash
benchmark_client ... --use-chat-template
```

If an MTP benchmark script is missing `--use-chat-template`:
- This is a 🔴 **BLOCKING** issue
- Comment: "MTP benchmark scripts MUST include `--use-chat-template` flag in the benchmark client configuration. This ensures proper tokenization for speculative decoding. Please add `--use-chat-template` to the benchmark command."