Official implementation of "REASONING COMPILER: LLM-Guided Optimizations for Efficient Model Serving" (NeurIPS 2025)

he-actlab/REASONING_COMPILER

REASONING COMPILER

This is the official implementation of "REASONING COMPILER: LLM-Guided Optimizations for Efficient Model Serving" (NeurIPS 2025).

This project builds on TVM, an open-source compiler stack for deep learning systems released under the Apache-2.0 license.

Quick Start

To run this repo, follow these steps:

  1. Clone this repo, then configure the environment as described in TVM's installation documentation: https://tvm.apache.org/docs/install/index.html
  2. Instead of the default search strategy, create the LLM-guided MCTS search strategy object:
llm_mcts_strategy = MCTSSearchPyFull(
    use_llm=True,
    llm_budget=600,
    llm_model_name="API_MODEL_NAME",
)

To run pure MCTS search without LLM guidance, set use_llm=False.

To tune with tune_tir, pass llm_mcts_strategy as its strategy argument:

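from tvm import meta_schedule as ms

# MyModule is the TIR workload (a tvm.IRModule) that you want to tune.
database = ms.tune_tir(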
    mod=MyModule,
    target="llvm --num-cores=1",
    max_trials_global=64,
    num_trials_per_iter=64,
    work_dir="./tune_tmp",
    strategy=llm_mcts_strategy,
)
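
To compare LLM-guided MCTS against pure MCTS on the same workload, it can help to derive both runs from one base configuration so that they differ only in use_llm. The following is a minimal sketch in plain Python, not part of this repo: StrategyConfig, comparison_runs, the run names, and the per-run work directories are hypothetical helpers introduced for illustration. It also assumes that MCTSSearchPyFull accepts the same three keyword arguments when use_llm=False.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class StrategyConfig:
    """Keyword arguments for MCTSSearchPyFull, as shown in the Quick Start."""
    use_llm: bool = True
    llm_budget: int = 600
    llm_model_name: str = "API_MODEL_NAME"


def comparison_runs(base, root="./tune_tmp"):
    """Return {run_name: (strategy_kwargs, work_dir)} for both search modes.

    The two runs share every setting except use_llm, and each gets its own
    work directory so that tuning records are not mixed (an assumption about
    how you may want to organize results, not a requirement of the repo).
    """
    variants = {
        "llm_mcts": base,
        "pure_mcts": replace(base, use_llm=False),
    }
    return {
        name: (asdict(cfg), f"{root}/{name}")
        for name, cfg in variants.items()
    }


runs = comparison_runs(StrategyConfig())
for name, (kwargs, work_dir) in runs.items():
    print(name, kwargs, work_dir)
    # With this repo installed, each run could then be launched as in the
    # Quick Start:
    #   strategy = MCTSSearchPyFull(**kwargs)
    #   ms.tune_tir(mod=MyModule, target="llvm --num-cores=1",
    #               max_trials_global=64, num_trials_per_iter=64,
    #               work_dir=work_dir, strategy=strategy)
```

Keeping the configuration frozen and deriving the pure-MCTS variant with replace makes it harder to change a second setting between the two runs by accident.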

Citation

@inproceedings{
tang2025reasoning,
title={{REASONING} {COMPILER}: {LLM}-Guided Optimizations for Efficient Model Serving},
author={Annabelle Sujun Tang and Christopher Priebe and Rohan Mahapatra and Lianhui Qin and Hadi Esmaeilzadeh},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025}
}
