SD²: Steering Pretrained Drafters During Speculative Decoding

Frédéric Berdoz · Peer Rheinboldt · Roger Wattenhofer

Accepted at FPI @ NeurIPS 2025, SPIGM @ NeurIPS 2025 and at AAAI 2026

Overview

SD² is a framework for steering pretrained drafters during speculative decoding to further increase alignment between drafter and verifier. This repo contains:

Evaluation tools
Training scripts
Synthetic dataset generation

Configuration files and checkpoints used in the paper can be found in this Hugging Face collection.
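
To make "alignment between drafter and verifier" concrete: under the standard speculative sampling acceptance rule, a drafted token x is accepted with probability min(1, p(x)/q(x)), where q is the drafter distribution and p is the verifier distribution. The chance that a single drafted token is accepted is then the sum over the vocabulary of min(p(x), q(x)). The sketch below computes that quantity for toy distributions. It illustrates the general technique only and is not taken from this repository's code; the function name and the numbers are made up for illustration.

```python
def expected_acceptance(verifier_probs, drafter_probs):
    """Probability that one drafted token is accepted under the standard
    speculative sampling rule: sum over tokens of min(p(x), q(x))."""
    if len(verifier_probs) != len(drafter_probs):
        raise ValueError("distributions must cover the same vocabulary")
    return sum(min(p, q) for p, q in zip(verifier_probs, drafter_probs))


# Toy 3-token vocabulary (illustrative numbers, not measured results).
verifier = [0.7, 0.2, 0.1]
well_aligned_drafter = [0.6, 0.3, 0.1]
poorly_aligned_drafter = [0.1, 0.2, 0.7]

print(expected_acceptance(verifier, well_aligned_drafter))
print(expected_acceptance(verifier, poorly_aligned_drafter))
```

With these toy numbers the well-aligned drafter has an acceptance probability of about 0.9 and the poorly aligned one about 0.4, which is why closer drafter/verifier agreement translates into more accepted tokens per verification step.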

Getting Started

Create a new Python environment with Python 3.12

python -m venv ./.venv
source ./.venv/bin/activate

Install dependencies

pip install -r requirements.txt

Note that you will have to log in to Hugging Face and have access to the Llama models. Additionally, training requires Weights & Biases.
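
A quick preflight check can catch missing prerequisites before a long run. The sketch below is not part of this repository; the `preflight` helper is hypothetical. It relies on the `HF_TOKEN` and `WANDB_API_KEY` environment variables, which the Hugging Face and Weights & Biases clients read, but both tools also accept a cached CLI login, so a missing variable is only a hint.

```python
import os
import sys


def preflight(env=None, version=None):
    """Return a list of warnings; an empty list means nothing looked wrong."""
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version)
    warnings = []
    # The setup instructions above ask for Python 3.12.
    if version != (3, 12):
        warnings.append(f"Python 3.12 expected, found {version[0]}.{version[1]}")
    # A cached `huggingface-cli login` also works, so this is only a hint.
    if not env.get("HF_TOKEN"):
        warnings.append("HF_TOKEN is not set (fine if you logged in via the CLI)")
    # Only relevant for training; a cached `wandb login` also works.
    if not env.get("WANDB_API_KEY"):
        warnings.append("WANDB_API_KEY is not set (fine if you ran `wandb login`)")
    return warnings


if __name__ == "__main__":
    for message in preflight():
        print("warning:", message)
```

Access to the gated Llama models cannot be verified this way; that still has to be requested on the model pages.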

Running Evaluation

Open eval.py and add the configurations you'd like to try out to configs
Run the script

python eval.py --out_file 'eval_results.json'
# We recommend piping output to a separate file for later evaluation
python eval.py --pattern "llama" --out_file "llama_results.json"  > llama_out.log
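
The layout of the results file is not documented here, so the following sketch only inspects the top-level structure of whatever JSON `eval.py` wrote, without assuming any particular keys. The helper names `describe` and `describe_file` are hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def describe(data):
    """Summarize the top-level shape of parsed JSON without assuming a schema."""
    if isinstance(data, dict):
        return {"type": "dict", "size": len(data), "keys": sorted(data)}
    if isinstance(data, list):
        return {"type": "list", "size": len(data), "keys": []}
    return {"type": type(data).__name__, "size": 1, "keys": []}


def describe_file(path):
    """Load a JSON file from disk and summarize it with describe()."""
    return describe(json.loads(Path(path).read_text(encoding="utf-8")))


if __name__ == "__main__":
    results_path = Path("llama_results.json")  # file name from the command above
    if results_path.exists():
        print(describe_file(results_path))
```

Once the top-level keys are known, the same loading code can feed a table or plot of the metrics of interest.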

Running Training

Update `configs/experiment.yaml` with the correct data
Generate a synthetic dataset

python create_synth_ds.py --bsz 128 --target_len 256  --use_ultrachat_prompts --model_name 'meta-llama/llama-3.1-8b-instruct'

Run training script

python main.py fit --config configs/experiment.yaml \
  --data.path ./data/synthetic/llama-3.1-8b-instruct-ultrachat-prompts \
  --data.bsz 12 \
  --data.n_val 3000 \
  --trainer.max_epochs 6 \
  --trainer.val_check_interval 2000 \
  --trainer.accumulate_grad_batches 2 \
  --model.method guided-drafter \
  --model.lr_start 0.00001 \
  --model.lr_end 0.000001 \
  --model.warmup_steps 1000 \
  --model.loss_method kl \
  --model.drafter meta-llama/llama-3.2-1b-instruct \
  --model.finetune_drafter full \
  --model.guide_method merged \
  --model.d_layer all \
  --model.verifier meta-llama/llama-3.1-8b-instruct \
  --model.v_layer '[3,16,29]'
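
The `fit` subcommand and the `--trainer.*` flags above resemble PyTorch Lightning's CLI, although this README does not say so explicitly. Assuming that is the case, and assuming `--data.bsz` is the per-device batch size, the number of sequences contributing to each optimizer step can be worked out as below. The function name is hypothetical and the device count is a placeholder to adjust to your setup.

```python
def effective_batch_size(per_device_bsz, accumulate_grad_batches, n_devices=1):
    """Sequences per optimizer step under gradient accumulation.

    Assumes per_device_bsz is what --data.bsz sets and that gradients are
    accumulated over accumulate_grad_batches batches before each step.
    """
    for value in (per_device_bsz, accumulate_grad_batches, n_devices):
        if value < 1:
            raise ValueError("all arguments must be positive integers")
    return per_device_bsz * accumulate_grad_batches * n_devices


# Values from the command above, on a single device.
print(effective_batch_size(12, 2))
```

With the values from the command, a single device sees 24 sequences per optimizer step. If memory forces a smaller `--data.bsz`, raising `--trainer.accumulate_grad_batches` proportionally keeps this product unchanged.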

