The official repository for our Findings of EMNLP 2025 paper Pi-SQL: Enhancing Text-to-SQL with Fine-Grained Guidance from Pivot Programming Languages.
- Aug 2025: Our paper has been accepted to Findings of EMNLP 2025.
- May 2025: Our code is open-sourced.
conda create -n pisql python=3.10 -y
conda activate pisql
pip install -r requirements.txt

Download dev.zip from the BIRD benchmark site. Extract it and move the resulting dev_20240627/ folder into your project's databases/ directory. Then navigate into databases/dev_20240627/ and unzip dev_databases.zip. This creates a dev_databases/ subdirectory (i.e., databases/dev_20240627/dev_databases/) containing the actual database files.
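Before moving on, it can help to confirm that the extracted files ended up where the instructions above expect them. The following is a minimal sketch, not part of the repository: the helper name `find_databases` is hypothetical, and it assumes the database files carry a `.sqlite` extension somewhere below `databases/dev_20240627/dev_databases/`.

```python
from pathlib import Path


def find_databases(project_root="."):
    """Return the SQLite files found under the expected BIRD dev layout.

    Hypothetical helper for illustration; returns an empty list when the
    expected directory does not exist.
    """
    db_root = Path(project_root) / "databases" / "dev_20240627" / "dev_databases"
    if not db_root.is_dir():
        return []
    # Assumption: each database is stored as a *.sqlite file below db_root.
    return sorted(db_root.rglob("*.sqlite"))


if __name__ == "__main__":
    found = find_databases()
    if not found:
        print("No databases found under databases/dev_20240627/dev_databases/")
    for path in found:
        print(path)
```

If the script reports no databases, the most likely cause is that dev_databases.zip was unzipped in a different directory than databases/dev_20240627/.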
# Pi-SQL needs to construct CSV files from the databases first.
python preprocess.py

Execute the script to generate Python programs with different strategies and their corresponding SQL queries:

bash run_sql_generate.sh

The output, including statistic.json, is saved in the final_{strategy}/ directory, where strategy is one of 'direct', 'filter', or 'merge'.
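To compare the three strategies side by side, the per-strategy statistic.json files can be loaded in one pass. This is a minimal sketch under stated assumptions: the helper `load_statistics` is hypothetical and not part of the repository, the final_{strategy}/ directories are assumed to sit in the current working directory, and nothing is assumed about the keys inside statistic.json, so the content is printed as-is.

```python
import json
from pathlib import Path

# Strategy names as listed in this README.
STRATEGIES = ("direct", "filter", "merge")


def load_statistics(root="."):
    """Map each strategy to the parsed contents of final_{strategy}/statistic.json.

    Hypothetical helper for illustration; strategies whose file is missing
    are skipped rather than treated as an error.
    """
    stats = {}
    for strategy in STRATEGIES:
        path = Path(root) / f"final_{strategy}" / "statistic.json"
        if path.is_file():
            with path.open(encoding="utf-8") as f:
                stats[strategy] = json.load(f)
    return stats


if __name__ == "__main__":
    for strategy, content in load_statistics().items():
        print(f"== {strategy} ==")
        print(json.dumps(content, indent=2))
```

Skipping missing files keeps the sketch usable when only some of the strategies have been run.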
Execute the script to select the final SQL from the generated candidates:
python select_final.sql.py

The results, including statistic.json, are saved in the final_sql/ directory.
@inproceedings{chi-etal-2024-pi,
title = "Pi-SQL: Enhancing Text-to-SQL with Fine-Grained Guidance from Pivot Programming Languages",
author = "Chi, Yongdong and
Wang, Hanqing and
Chen, Yun and
Yang, Yan and
Yang, Jian and
Yang, Zonghan and
Yan, Xiao and
Chen, Guanhua",
editor = "Christodoulopoulos, Christos and
Chakraborty, Tanmoy and
Rose, Carolyn and
Peng, Violet",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2025",
month = nov,
year = "2025",
address = "Suzhou, Jiangsu, China",
publisher = "Association for Computational Linguistics",
}
