Conversation

@peri044 peri044 commented Jan 22, 2026

This PR fixes a few bugs encountered while running kernel generation and evaluation with a locally hosted model.
Please review.

cc: @simonguozirui

Signed-off-by: Dheeraj Peri <dperi@nvidia.com>

peri044 commented Jan 25, 2026

@simonguozirui @anneouyang Can you take a look at this? Also, I have been trying to generate kernels using https://github.com/ScalingIntelligence/KernelBench?tab=readme-ov-file#run-on-all-problems, but when I evaluate them, I get poor pass@k results even with strong models (e.g., GPT-5.2). The commands are:

# For generation
uv run python scripts/generate_samples.py run_name=test_hf_level_1_gpt5.2_fp16 dataset_src=huggingface level=1 num_workers=50  model_name="azure/openai/gpt-5.2" server_type=openai server_address=localhost server_port=9000  log_prompt=True verbose=True num_samples=50 max_tokens=32768 precision="fp32" reasoning_effort="high" is_reasoning_model=True subset="(1, 10)"

# For evaluation
uv run python scripts/eval_from_generations.py run_name=test_hf_level_1_gpt5.2_fp32_subset10_check_kernel_false dataset_src=local level=1 num_gpu_devices=1 timeout=300 subset="(0, 10)" num_samples_per_problem=50 pass_at_k_values="[1, 3]"

I get syntax errors, lockfile errors, and missing header file errors for level 1 problem solutions. I believe this is unexpected, since level 1 problems are mostly solved? Can you share some insights and a reproducer of your results? Thanks
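For context on how the reported pass@k numbers are typically derived: the standard metric is the unbiased estimator from the Codex paper, which gives the probability that at least one of k samples drawn from n total generations (of which c are correct) passes. This is a minimal sketch of that estimator, not necessarily KernelBench's exact implementation:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator.

    n: total samples generated per problem (e.g. num_samples_per_problem=50)
    c: number of those samples that passed evaluation
    k: k in pass@k (e.g. 1 or 3)
    """
    if n - c < k:
        # Fewer failures than k: any k-subset must contain a correct sample.
        return 1.0
    # P(all k sampled are incorrect) = C(n-c, k) / C(n, k)
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 50 samples, 10 correct
print(round(pass_at_k(50, 10, 1), 4))  # 0.2
```

With n=50 and only a handful of correct samples, pass@1 stays low, so even a few spurious syntax or build errors per problem drag the reported numbers down noticeably.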

peri044 commented Jan 27, 2026

@AffectionateCurry @simonguozirui @anneouyang @nataliakokoromyti — I would appreciate any insights or help on this thread.
