[NAACL 2025] Official Implementation of H-STAR: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables.
In this paper, we introduce H-STAR, a novel algorithm that integrates both symbolic and semantic approaches to perform tabular reasoning tasks. H-STAR decomposes table reasoning into two stages: 1) Table Extraction and 2) Adaptive Reasoning.
Create and activate the environment by running
conda create -n hstar python=3.9
conda activate hstar
pip install -r requirements.txt
pip install records==0.5.3
Create a file named <file_name>.pth in the [PATH to Conda]/envs/hstar/lib/python3.9/site-packages/ directory, and paste the project root path [PATH to H-STAR] into it.
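The step above can be sketched as follows. This is a demonstration only: a temporary directory stands in for your environment's site-packages, and the project path is a placeholder, so substitute the real paths on your machine.

```shell
# Demonstration: a temp dir stands in for your conda env's site-packages.
# In practice use [PATH to Conda]/envs/hstar/lib/python3.9/site-packages/.
SITE_PACKAGES=$(mktemp -d)
PROJECT_ROOT="/path/to/H-STAR"   # placeholder: your clone's root
# Python appends every path listed in a site-packages .pth file to sys.path.
echo "$PROJECT_ROOT" > "$SITE_PACKAGES/hstar.pth"
```

With the .pth file in place, modules under the project root become importable from anywhere inside the `hstar` environment.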
Benchmark datasets studied in the paper have been provided in the datasets/ directory.
Apply for and obtain an API key from the OpenAI API, and save the key in key.txt.
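For example, the key file can be written as below; the key value shown is a placeholder, not a real credential.

```shell
# Placeholder value -- replace with your actual OpenAI API key.
printf 'sk-your-openai-key\n' > key.txt
```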
To run the Gemini models, generate an API key from Vertex AI and store it as a .json file in the directory.
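If the Google client libraries do not pick up the JSON file automatically, it can be exposed via the standard `GOOGLE_APPLICATION_CREDENTIALS` environment variable. This is a minimal sketch; the filename `vertex_key.json` is an assumption, so use whatever name you saved the file under.

```python
# Sketch: point Google's client libraries at the Vertex AI
# service-account JSON. "vertex_key.json" is a placeholder filename.
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "vertex_key.json"
```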
Run the H-STAR pipeline for different Large Language Models (LLMs) using:
For OpenAI models:
python run_gpt.py
For Gemini/PaLM models:
python run_gemini.py
The outputs for every intermediate step in the pipeline are saved in the results/ directory.
Evaluate the results for TabFact/WikiTQ using the notebook
evaluate.ipynb
Evaluate FetaQA using the command-line instruction
python fetaqa_score.py --model_name [MODEL_NAME]
Set model_name to the desired LLM.
If you find our paper or the repository helpful, please cite us with
@article{abhyankar2024h,
title={H-STAR: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables},
author={Abhyankar, Nikhil and Gupta, Vivek and Roth, Dan and Reddy, Chandan K},
journal={arXiv preprint arXiv:2407.05952},
year={2024}
}
This implementation is based on Binding Language Models in Symbolic Languages. The work has also benefitted from TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition. Thanks to the authors for releasing their code.
For any questions or issues, you are welcome to open an issue in this repo, or contact us at nikhilsa@vt.edu, keviv9@gmail.com.
