This repository contains code for the paper "Improving Demonstration Diversity by Human-Free Fusing for Text-to-SQL".
If you use Fused in your work, please cite it as follows:
@article{wang2024improving,
title={Improving Demonstration Diversity by Human-Free Fusing for Text-to-SQL},
author={Wang, Dingzirui and Dou, Longxu and Zhang, Xuanliang and Zhu, Qingfu and Che, Wanxiang},
journal={arXiv preprint arXiv:2402.10663},
year={2024}
}
conda create -n fused python=3.9 -y
conda activate fused
pip install -r requirements.txt
Download the Spider databases and put them in ./dataset/Spider/database.
Set your OpenAI API key in utils/generator.py if you want to use the OpenAI API to generate demonstrations.
Run generate/slurm/generate.bash to synthesize demonstrations with transformers, or generate/slurm/generate.35turbo.bash to synthesize them with the OpenAI API.
The synthesized demonstrations are saved in "./generate/examples/<model>/<scale>/Spider/<turn>/example.filt.json" in the following format:
[
...,
{
"reference": "List[Dict[str, Any]]: demonstrations used for fusing",
"table": "Dict[str, Any]: database used",
"query": "str: synthesized SQL query",
"question": "str: synthesized question"
},
...
]
Use text_to_sql/preprocess.py to process the synthesized demonstrations into the demonstration pool format of ODIS.
Then you can use ODIS to convert user questions into SQL.
It is recommended to evaluate the results with https://github.com/taoyds/test-suite-sql-eval.
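
Before preprocessing, it can help to sanity-check a synthesized example.filt.json file. The following is a minimal sketch, not part of this repository: it assumes only the four keys listed in the format above ("reference", "table", "query", "question"), and the name load_examples is hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict, List

# Keys documented in the example.filt.json format above.
REQUIRED_KEYS = ("reference", "table", "query", "question")


def load_examples(path: str) -> List[Dict[str, Any]]:
    """Load synthesized demonstrations and check that each has the documented keys.

    Hypothetical helper for illustration; it is not provided by this repository.
    """
    with Path(path).open(encoding="utf-8") as f:
        examples = json.load(f)

    # The file is expected to hold a JSON list of demonstration objects.
    if not isinstance(examples, list):
        raise ValueError("expected a JSON list of demonstrations")

    for i, example in enumerate(examples):
        if not isinstance(example, dict):
            raise ValueError(f"example {i} is not a JSON object")
        missing = [key for key in REQUIRED_KEYS if key not in example]
        if missing:
            raise ValueError(f"example {i} is missing keys: {missing}")

    return examples
```

Calling load_examples on the example.filt.json path for your model, scale, and turn returns the list of demonstrations, or raises ValueError that names the first malformed entry.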