From 88c1a16405dc0aa5d2bb06db75ea16cc28f744ba Mon Sep 17 00:00:00 2001
From: "claude[bot]" <41898282+claude[bot]@users.noreply.github.com>
Date: Tue, 17 Feb 2026 04:10:11 +0000
Subject: [PATCH] Add STP/MTP terminology definitions to agent docs

Define STP (Single Token Prediction) and MTP (Multi-Token Prediction)
in AGENTS.md and workflow prompt configs so agents understand that
STP means standard autoregressive decoding with no speculative
decoding or MTP.

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
---
 .github/workflows/claude-pr-review.yml | 4 ++++
 .github/workflows/claude.yml           | 2 ++
 AGENTS.md                              | 5 +++++
 3 files changed, 11 insertions(+)

diff --git a/.github/workflows/claude-pr-review.yml b/.github/workflows/claude-pr-review.yml
index a21b2afe6..1b5e3f96e 100644
--- a/.github/workflows/claude-pr-review.yml
+++ b/.github/workflows/claude-pr-review.yml
@@ -125,6 +125,10 @@ jobs:
             - The perf-changelog entry should document what changed in the config and include the PR link
             - Format: "Master config files were modified but `perf-changelog.yaml` was not updated. When changing `.github/configs/amd-master.yaml` or `.github/configs/nvidia-master.yaml`, you must add a corresponding entry to `perf-changelog.yaml` documenting the changes."
 
+            ## Terminology:
+            - **STP (Single Token Prediction)**: Standard autoregressive decoding, one token per forward pass. No speculative decoding or MTP. Benchmarks labeled "STP only" use vanilla decoding.
+            - **MTP (Multi-Token Prediction)**: Predicts multiple tokens per forward pass using speculative decoding (e.g., EAGLE, NEXTN).
+
             Remember: Silence is golden. No comment is better than a low-value comment.
 
             ## Container Image Accessibility Validation:
diff --git a/.github/workflows/claude.yml b/.github/workflows/claude.yml
index 4614556d5..8b4f81663 100644
--- a/.github/workflows/claude.yml
+++ b/.github/workflows/claude.yml
@@ -227,4 +227,6 @@ jobs:
 
             ### Additional Knowledge
             - MI355 is gfx950 not gfx1201
+            - **STP (Single Token Prediction)**: Standard autoregressive decoding, one token per forward pass. No speculative decoding or MTP. Benchmarks labeled "STP only" use vanilla decoding.
+            - **MTP (Multi-Token Prediction)**: Predicts multiple tokens per forward pass using speculative decoding (e.g., EAGLE, NEXTN).
 
diff --git a/AGENTS.md b/AGENTS.md
index ecc1862f8..8ed144c81 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -46,6 +46,11 @@ InferenceMAX is an open-source, automated benchmarking system that continuously
 └── perf-changelog.yaml      # Triggers benchmarks on changes
 ```
 
+## Terminology
+
+- **STP (Single Token Prediction)**: Standard autoregressive decoding where one token is generated per forward pass. No speculative decoding or MTP (Multi-Token Prediction) is used. A benchmark labeled "STP only" uses vanilla decoding without any speculation.
+- **MTP (Multi-Token Prediction)**: A technique in which the model predicts multiple tokens per forward pass, typically using speculative decoding methods such as EAGLE or NEXTN.
+
 ## Key Technologies
 
 - **Python 3.13**: Core automation and config generation