Pi-SQL: Enhancing Text-to-SQL with Fine-Grained Guidance from Pivot Programming Languages

The official repository for our Findings of EMNLP 2025 paper Pi-SQL: Enhancing Text-to-SQL with Fine-Grained Guidance from Pivot Programming Languages.

| 🔥 News | 🌈 Method | ⚡️ Quick Start |

|📓 Citation | 📃 Paper |

🔥 News

  • Aug 2025: Our paper has been accepted to Findings of EMNLP 2025.
  • May 2025: Our code is open-sourced.

🌈 Pi-SQL

[Figure: Pi-SQL]

⚡️ Quick Start

1. Environment configuration

conda create -n pisql python=3.10 -y
conda activate pisql
pip install -r requirements.txt

2. Prepare the BIRD dev set

Download dev.zip from the BIRD benchmark site. Extract it and move the resulting dev_20240627/ folder into your project's databases/ directory. Then, navigate into databases/dev_20240627/ and unzip dev_databases.zip. This will create a dev_databases/ subdirectory (i.e., databases/dev_20240627/dev_databases/) containing the actual database files.
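Before moving on, it can help to confirm that the extracted folders ended up where the later steps expect them. The following is a minimal sketch that only checks the two directories named above; the helper name `missing_paths` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Layout taken from the instructions above; adjust if your checkout differs.
DEV_ROOT = Path("databases") / "dev_20240627"


def missing_paths(dev_root: Path) -> list[Path]:
    """Return the expected directories that are not present under dev_root."""
    expected = [dev_root, dev_root / "dev_databases"]
    return [p for p in expected if not p.is_dir()]


if __name__ == "__main__":
    missing = missing_paths(DEV_ROOT)
    if missing:
        for path in missing:
            print(f"missing: {path}")
    else:
        print("BIRD dev set layout looks complete")
```

Run it from the project root. Any path it prints still needs to be created by extracting the corresponding archive.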

3. Prepare csv files

# Pi-SQL first needs to build CSV files from the databases.
python preprocess.py
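For readers who want a feel for what this step involves, here is a rough sketch of dumping every table of one SQLite database to CSV with the standard library. It is an illustration only and not the code in preprocess.py; the function name `export_tables_to_csv` and the one-file-per-table output layout are assumptions.

```python
import csv
import sqlite3
from pathlib import Path


def export_tables_to_csv(sqlite_path: Path, out_dir: Path) -> list[Path]:
    """Write each user table of one SQLite database to out_dir/<table>.csv."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    conn = sqlite3.connect(sqlite_path)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )]
        # Skip SQLite's internal bookkeeping tables.
        tables = [n for n in names if not n.startswith("sqlite_")]
        for table in tables:
            cursor = conn.execute(f'SELECT * FROM "{table}"')
            csv_path = out_dir / f"{table}.csv"
            with csv_path.open("w", newline="", encoding="utf-8") as f:
                writer = csv.writer(f)
                # Header row comes from the cursor's column metadata.
                writer.writerow([col[0] for col in cursor.description])
                writer.writerows(cursor)
            written.append(csv_path)
    finally:
        conn.close()
    return written
```

The actual script may name, place, or post-process the CSV files differently, so treat the output of `python preprocess.py` as the reference.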

4. Generate Python and SQL Programs

Run the script to generate Python programs with each strategy, along with their corresponding SQL queries:

bash run_sql_generate.sh

The output, including statistic.json, is saved in the final_{strategy}/ directory, where {strategy} is one of direct, filter, or merge.
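To compare the strategies side by side, the per-strategy statistic.json files can be loaded in one pass. The sketch below assumes only the directory and file names given above and makes no assumption about the keys inside statistic.json; `load_statistics` is a hypothetical helper.

```python
import json
from pathlib import Path

STRATEGIES = ("direct", "filter", "merge")


def load_statistics(root: Path = Path(".")) -> dict[str, object]:
    """Load final_{strategy}/statistic.json for every strategy that has one."""
    stats = {}
    for strategy in STRATEGIES:
        path = root / f"final_{strategy}" / "statistic.json"
        if path.is_file():
            stats[strategy] = json.loads(path.read_text(encoding="utf-8"))
    return stats


if __name__ == "__main__":
    for strategy, content in load_statistics().items():
        print(strategy)
        print(json.dumps(content, indent=2))
```

Strategies whose output directory does not exist yet are skipped rather than reported as errors.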

5. Select the Final SQL

Execute the script to select the final SQL from the generated candidates:

python select_final.sql.py

The results, including statistic.json, will be saved in the final_sql/ directory.
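This README does not describe how the final SQL is chosen, so the following is not the repository's selection logic. It is a generic sketch of one common way to pick among candidate queries: execute each one and keep a query whose result set is shared by the most candidates. The function name `pick_by_execution_vote` and the order-insensitive result comparison are assumptions made for illustration.

```python
import sqlite3
from collections import Counter


def pick_by_execution_vote(db_path, candidates):
    """Return the first candidate whose result set is the most common one.

    Candidates that fail to execute are ignored; returns None if all fail.
    """
    results = {}
    for sql in candidates:
        conn = sqlite3.connect(db_path)
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error:
            continue  # an invalid candidate gets no vote
        finally:
            conn.close()
        # Compare results as sets of rows, ignoring row order and duplicates.
        results[sql] = frozenset(rows)
    if not results:
        return None
    winner, _ = Counter(results.values()).most_common(1)[0]
    return next(sql for sql in candidates if results.get(sql) == winner)
```

For the method actually used, refer to the paper and to the selection script itself.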

📓 Citation

@inproceedings{chi-etal-2024-pi,
    title = "Pi-SQL: Enhancing Text-to-SQL with Fine-Grained Guidance from Pivot Programming Languages",
    author = "Chi, Yongdong  and
      Wang, Hanqing  and
      Chen, Yun  and
      Yang, Yan  and
      Yang, Jian  and
      Yang, Zonghan  and
      Yan, Xiao  and
      Chen, Guanhua",
    editor = "Christodoulopoulos, Christos  and
      Chakraborty, Tanmoy  and
      Rose, Carolyn  and
      Peng, Violet",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2025",
    month = nov,
    year = "2025",
    address = "Suzhou, Jiangsu, China",
    publisher = "Association for Computational Linguistics",
}
