Conversation

@peri044 peri044 commented Jan 22, 2026

This PR fixes a few bugs encountered while running kernel generation and evaluation with a locally hosted model.
Please review.

cc: @simonguozirui

Signed-off-by: Dheeraj Peri <dperi@nvidia.com>

peri044 commented Jan 25, 2026

@simonguozirui @anneouyang Can you take a look at this? Also, I have been trying to generate kernels using https://github.com/ScalingIntelligence/KernelBench?tab=readme-ov-file#run-on-all-problems, but when I evaluate them, I get poor pass@k results even with strong models (e.g., GPT-5.2). The commands are:

# For generation
uv run python scripts/generate_samples.py run_name=test_hf_level_1_gpt5.2_fp16 dataset_src=huggingface level=1 num_workers=50  model_name="azure/openai/gpt-5.2" server_type=openai server_address=localhost server_port=9000  log_prompt=True verbose=True num_samples=50 max_tokens=32768 precision="fp32" reasoning_effort="high" is_reasoning_model=True subset="(1, 10)"

# For evaluation
uv run python scripts/eval_from_generations.py run_name=test_hf_level_1_gpt5.2_fp32_subset10_check_kernel_false dataset_src=local level=1 num_gpu_devices=1 timeout=300 subset="(0, 10)" num_samples_per_problem=50 pass_at_k_values="[1, 3]"

I get syntax errors, lockfile errors, and missing header file errors for level 1 problem solutions. I believe this is unexpected, since level 1 problems are mostly solved? Can you share some insights and a reproducer of your results? Thanks
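For context on how the reported pass@k numbers are typically derived: the standard metric is the unbiased estimator from the Codex paper, which gives the probability that at least one of k samples drawn from n total generations (of which c are correct) passes. This is a minimal sketch of that estimator, not necessarily KernelBench's exact implementation:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator.

    n: total samples generated per problem (e.g. num_samples_per_problem=50)
    c: number of those samples that passed evaluation
    k: k in pass@k (e.g. 1 or 3)
    """
    if n - c < k:
        # Fewer failures than k: any k-subset must contain a correct sample.
        return 1.0
    # P(all k sampled are incorrect) = C(n-c, k) / C(n, k)
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 50 samples, 10 correct
print(round(pass_at_k(50, 10, 1), 4))  # 0.2
```

With n=50 and only a handful of correct samples, pass@1 stays low, so even a few spurious syntax or build errors per problem drag the reported numbers down noticeably.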

peri044 commented Jan 27, 2026

@AffectionateCurry @simonguozirui @anneouyang @nataliakokoromyti — I would appreciate any insights or help on this thread.
