
Conversation

@fynnsu (Collaborator) commented on Dec 18, 2025

  • Add acceptance rate metric collection to tests/e2e/vllm/run_vllm.py
  • Add optional acceptance rate asserts to tests/e2e/vllm/utils.py (a rough sketch of the idea is shown below)
  • Add tests/e2e/vllm/test_gen_train_acceptance.py based on examples/data_generation_and_training/llama3_8b_sharegpt_5k.py, which trains a Llama 3.1 8B model on 5k samples from ShareGPT and then checks the acceptance rate on several test prompts. This test uses the functionality added to the above files.
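
A rough sketch of the kind of optional assert this adds (the helper name, metrics dict shape, and threshold handling here are illustrative assumptions, not the actual utils.py code):

```python
# Hypothetical sketch of an optional acceptance-rate assert; names and the
# metrics layout are assumptions, not the actual utils.py implementation.
from typing import Optional


def maybe_assert_acceptance_rate(
    metrics: dict[str, float],
    min_acceptance_rate: Optional[float] = None,
) -> None:
    """Fail if the measured acceptance rate falls below a threshold.

    When ``min_acceptance_rate`` is None the check is skipped, so callers
    that only want the raw metrics are unaffected.
    """
    if min_acceptance_rate is None:
        return
    acceptance_rate = metrics["acceptance_rate"]
    assert acceptance_rate >= min_acceptance_rate, (
        f"Acceptance rate {acceptance_rate:.3f} is below the required "
        f"minimum {min_acceptance_rate:.3f}"
    )
```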

Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
@github-actions (bot) commented on Dec 18, 2025

📦 Build Artifacts Available
The build artifacts (`.whl` and `.tar.gz`) have been successfully generated and are available for download: https://github.com/vllm-project/speculators/actions/runs/20377405084/artifacts/4927042111.
They will be retained for up to 30 days.
Commit: 1f86169

@shanjiaz (Collaborator) left a comment


Looks awesome! Do we want to test regression for different models? Maybe the models in examples?

@brian-dellabetta (Collaborator) left a comment


nice!

@fynnsu (Collaborator, Author) commented on Dec 19, 2025

> Looks awesome! Do we want to test regression for different models? Maybe the models in examples?

Yes, but I think we need to review our compute budget for this, because the current Llama 3.1 8B on 5k samples test already takes half an hour on a single H100 GPU.

@dsikka (Collaborator) left a comment


Just a question, otherwise LGTM.

Inline comment on the extract_metrics hunk:

return parser.parse_args()


def extract_metrics(

Is this different from how Rahul set up extracting metrics from the logs in his benchmarking work? Is there any way we could share that functionality?

@fynnsu (Collaborator, Author) replied:

Yeah, it's a different approach. I got this approach from the vLLM examples/offline_inference/spec_decode.py script.

The challenge is that this system uses vLLM's metrics and works because we're running vLLM through the Python API. Rahul's testing instead uses the CLI to spin up a vLLM instance, which GuideLLM then interacts with. The advantage of the GuideLLM approach is that it lets us simulate slightly more "real world" workloads and measure actual server response times. I can look into whether there's a way to use the metrics system to get acceptance rates instead of the current log-scraping method, but either way I don't think we can easily combine the implementations.
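
For reference, a minimal sketch of the Python-API approach, loosely following the vLLM spec_decode example; the speculative_config keys and counter names below are assumptions about the installed vLLM version, not necessarily the exact code in this PR:

```python
# Minimal sketch: run a speculative-decoding model through vLLM's Python API
# and compute an acceptance rate from the engine's counters. The
# speculative_config keys and the "vllm:spec_decode_*" counter names are
# assumptions and may differ across vLLM versions.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",
    speculative_config={
        "model": "path/to/trained/draft",  # hypothetical path to the trained draft model
        "num_speculative_tokens": 3,
    },
)
llm.generate(
    ["Explain speculative decoding in one paragraph."],
    SamplingParams(temperature=0.0, max_tokens=128),
)

accepted = drafted = 0
for metric in llm.get_metrics():  # available on recent (V1 engine) vLLM releases
    if metric.name == "vllm:spec_decode_num_accepted_tokens":
        accepted = metric.value
    elif metric.name == "vllm:spec_decode_num_draft_tokens":
        drafted = metric.value

acceptance_rate = accepted / drafted if drafted else 0.0
print(f"acceptance rate: {acceptance_rate:.3f}")
```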
