Learning to Ask Informative Questions: Enhancing LLMs with Preference Optimization and Expected Information Gain
Code for the EMNLP 2024 paper (Findings).
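For readers unfamiliar with the objective named in the title, here is a minimal, stdlib-only sketch of expected information gain (EIG) for a yes/no question asked over a set of candidate items, using the standard entropy definition. This is illustrative only and is not the paper's exact implementation:

```python
import math


def expected_information_gain(prior, question_mask):
    """EIG of a yes/no question over candidate items.

    prior: dict mapping item -> probability (sums to 1).
    question_mask: dict mapping item -> True if the answer
    for that item would be 'yes'.
    Returns H(prior) minus the expected posterior entropy.
    """
    def entropy(probs):
        return -sum(p * math.log2(p) for p in probs if p > 0)

    p_yes = sum(p for item, p in prior.items() if question_mask[item])
    p_no = 1.0 - p_yes
    h_prior = entropy(prior.values())
    # Renormalized posterior entropy under each possible answer
    h_yes = entropy(p / p_yes for item, p in prior.items()
                    if question_mask[item]) if p_yes > 0 else 0.0
    h_no = entropy(p / p_no for item, p in prior.items()
                   if not question_mask[item]) if p_no > 0 else 0.0
    return h_prior - (p_yes * h_yes + p_no * h_no)


# A question that splits 8 equally likely candidates in half gains 1 bit.
items = range(8)
prior = {i: 1 / 8 for i in items}
mask = {i: i < 4 for i in items}
```

Under this formulation, the most informative yes/no question is the one that splits the remaining probability mass most evenly.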
If you use conda, create the environment for this project by running:

conda env create -f environment.yml

If you use venv, activate your environment and run:
pip install -r requirements.txt

To create the datasets for Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO), insert your huggingface_login credentials and the path where HF models and datasets should be saved in lines 385-386 of bootstrapping.py. Then run:
python scripts/bootstrapping.py

This will populate the data/bootstrapped folder and create a HuggingFace dataset that will be used for DPO (the DPO dataset used in the paper).
To train the base model with SFT, insert the cache_dir and output_dir, then run:

python scripts/SFT.py

The best-performing checkpoint for the SFT model comes after 4k samples (SFT adapter).
For DPO training, insert the cache_dir, output_dir, and huggingface_login, then run:

python scripts/DPO.py

The trained DPO model is available on the HuggingFace Hub (DPO adapter).
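For reference, the per-example DPO objective (Rafailov et al., 2023) that the training script optimizes can be sketched with the standard library alone. This is a hedged illustration of the loss formula, not the code in scripts/DPO.py; the variable names are my own:

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss.

    Inputs are summed token log-probabilities of the chosen and
    rejected responses under the trained policy and the frozen
    reference model. Lower loss means the policy prefers the
    chosen response more strongly than the reference does.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(logits)), written in a numerically stable form
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

When the policy matches the reference exactly, the loss is -log(sigmoid(0)) = log 2; it drops below that as the policy learns to favor the chosen (more informative) question over the rejected one.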
If you find our work useful, please cite our paper as:
@inproceedings{mazzaccara2024learningtoask,
title = "Learning to Ask Informative Questions: Enhancing LLMs with Preference Optimization and Expected Information Gain",
author = "Mazzaccara, Davide and
Testoni, Alberto and
Bernardi, Raffaella",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
year = "2024",
url = "https://aclanthology.org/2024.findings-emnlp.291/",
}