[TRITON] Add Triton Topk Kernel #458

hubertlu-tw · 2025-05-21T22:27:49Z

For the input tensors used in the following workload (hidden_size=128256, K=8):

python3 -m sglang.launch_server --model meta-llama/Meta-Llama-3-8B-Instruct --speculative-algo EAGLE     --speculative-draft lmsys/sglang-EAGLE-LLaMA3-Instruct-8B --speculative-num-steps 5     --speculative-eagle-topk 8 --speculative-num-draft-tokens 64 --dtype float16 --port 30000

python3 -m sglang.bench_serving --backend sglang  --dataset-name random  --random-input 1024  --random-output 1024   --num-prompts 100   --request-rate 4

With the topk kernel change, this workload shows about a 9.63% improvement in request throughput and a 40% improvement in TTFT.
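The PR does not state how these percentages were derived. As a minimal sketch, assuming the usual convention (relative gain for a higher-is-better metric such as request throughput, relative reduction for a lower-is-better metric such as TTFT), the arithmetic would look like the following. The function names and the numbers are hypothetical and are not measurements from this PR.

```python
def throughput_gain(baseline: float, new: float) -> float:
    """Relative gain in percent for a higher-is-better metric (e.g. requests/s)."""
    return (new - baseline) / baseline * 100.0


def latency_reduction(baseline: float, new: float) -> float:
    """Relative reduction in percent for a lower-is-better metric (e.g. TTFT in ms)."""
    return (baseline - new) / baseline * 100.0


# Hypothetical inputs chosen only to illustrate the formulas.
print(round(throughput_gain(4.0, 4.4), 2))
print(round(latency_reduction(200.0, 120.0), 2))
```

Keeping the two directions in separate helpers avoids a sign mistake when a report mixes throughput and latency metrics.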

For the input tensors used in another internal workload (hidden_size=16, K=2):

To run the performance benchmark and analyze the peak memory usage of the kernel for various input shapes and values of k:

aiter/op_tests/op_benchmarks/triton# python bench_topk.py --roofline

To run the roofline model for the kernel for various input shapes and values of k:

aiter/op_tests/op_benchmarks/triton# python bench_topk.py --roofline

To run the unit tests for the kernel:

aiter/op_tests/triton# pytest test_topk.py
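For readers unfamiliar with what a top-k correctness check involves, here is a minimal pure-Python sketch of a reference comparison. It is not the content of test_topk.py and does not call the Triton kernel; `topk_reference` and `check_topk` are hypothetical helpers that only illustrate the semantics (the k largest values per row, largest first, with matching indices).

```python
import heapq


def topk_reference(row, k):
    """Return (values, indices) of the k largest entries of row, largest first."""
    # Pair each value with its index, then keep the k pairs with the largest value.
    pairs = heapq.nlargest(k, enumerate(row), key=lambda pair: pair[1])
    indices = [i for i, _ in pairs]
    values = [v for _, v in pairs]
    return values, indices


def check_topk(row, k, values, indices):
    """Validate a candidate top-k result against the reference.

    Values must match exactly. Indices only need to be distinct and point at
    the reported values, because ties may legitimately resolve differently.
    """
    ref_values, _ = topk_reference(row, k)
    return (
        list(values) == ref_values
        and len(set(indices)) == k
        and all(row[i] == v for i, v in zip(indices, values))
    )


# Illustrative row with a tie at the maximum value.
row = [0.1, 0.9, 0.4, 0.9, 0.2]
values, indices = topk_reference(row, 2)
print(values, indices)
print(check_topk(row, 2, values, indices))
```

Comparing values strictly while allowing any valid index for tied values keeps such a check from failing on results that differ from the reference only in tie-breaking order.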

rahulbatra85 · 2025-05-23T22:15:51Z

@hubertlu-tw Can you please run the black linter tool locally and fix the issues?
pip install black
black [name of the file]

hubertlu-tw · 2025-05-25T21:04:32Z

> @hubertlu-tw Can you please run the black linter tool locally and fix the issues?
> pip install black
> black [name of the file]

Sure. I have run the two linters below on the scripts I added and tested them locally.

pip install black
black [name of the file]

pip install ruff
ruff check bench_topk.py --unsafe-fixes --fix [name of the file]

Thanks!

hubertlu-tw · 2025-05-27T21:44:47Z

@rahulbatra85 and @vgokhale could you please review the PR and let me know what I need to refactor or add? Thanks!

aiter/ops/triton/topk.py

rahulbatra85

Please see my comments.
Thanks!

rahulbatra85 · 2025-06-12T14:40:25Z

@hubertlu-tw Let me know once you have made the changes. Thanks!

hubertlu-tw · 2025-06-12T16:48:59Z

@rahulbatra85 I have refactored the code based on your suggestions. Thank you very much.

hubertlu-tw added 3 commits May 21, 2025 20:15

Add test script for triton topk

bae5549

Add benchmark script for triton topk

b9985dc

Add triton topk kernel

9fcf63f

hubertlu-tw requested review from rahulbatra85 and vgokhale May 21, 2025 22:27

hubertlu-tw self-assigned this May 21, 2025

Fix lint errors

49f33c2

hubertlu-tw changed the title ~~Add Triton Topk Kernel~~ [TRITON] Add Triton Topk Kernel May 22, 2025

Add roofline model for triton topk kernel

a2f30a1

Fix ruff_black errors

d024214

hubertlu-tw and others added 3 commits May 25, 2025 14:04

Merge branch 'main' into topk

8cb8af6

Fix black linter's error

4eed7ce

Merge branch 'main' into topk

1b0ed42