This is the repository for the EMNLP 2023 paper "Background Summarization of Event Timelines" by Adithya Pratapa, Kevin Small and Markus Dreyer. The image below provides an overview of the background summarization task.
The background summarization dataset is available under data, as well as on Hugging Face Datasets.
We experiment with Flan-T5-XL and Long-T5-TGlobal-XL. For Flan-T5-XL, we explore both generic and query-focused setups. See configs/train.conf for supported model configurations.
# example flan-t5-xl training using deepspeed
bash bash_scripts/t5/train.sh flan-t5-xl 8888
For inference, set the checkpoint path in configs/eval.conf and run the evaluation script.
# example flan-t5-xl inference
bash bash_scripts/t5/eval.sh flan-t5-xl
We also experiment with zero-shot inference using GPT-3.5. See configs/gpt.conf for supported model configurations.
bash bash_scripts/gpt/predict.sh gpt-3.5-turbo
We propose a new QA-based evaluation metric, Background Utility Score (BUS), that measures the utility of a background summary for answering questions about a news update. See the illustration below.
See src/bus/bus.py for details on GPT-3.5 and GPT-4 based BUS metrics.
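To make the idea of a QA-based utility score concrete, here is a minimal sketch. It assumes each question about a news update has already been judged as answerable or not answerable from the background summary alone, and it reports the percentage of answerable questions. The function name `background_utility_score` and the toy judgments are hypothetical; the actual prompts and scoring logic live in src/bus/bus.py and may differ.

```python
from typing import Sequence


def background_utility_score(answerable: Sequence[bool]) -> float:
    """Percentage of questions the background summary can answer.

    `answerable` holds one boolean per question about the news update
    (hypothetical input format, not the repository's own).
    """
    if not answerable:
        raise ValueError("need at least one question")
    return 100.0 * sum(answerable) / len(answerable)


# Hypothetical judgments for five questions about a single news update.
judgments = [True, False, True, True, False]
score = background_utility_score(judgments)
print(f"BUS = {score:.1f}")  # prints "BUS = 60.0"
```

Whether the judgments come from human annotators, GPT-3.5 or GPT-4, the aggregation step in this sketch stays the same; only the source of the boolean labels changes.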
The results directory contains the data from our Mechanical Turk and BUS evaluations. For the 1,000 news updates from the test set, it includes human-written and system-generated backgrounds, along with results from best-worst ratings, BUS--human, BUS--GPT-3.5 and BUS--GPT-4.
See src/mturk for details on MTurk setup.
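Best-worst ratings are commonly aggregated with the standard counting rule: a system's score is the number of times it was picked as best, minus the number of times it was picked as worst, divided by the number of times it was shown. The sketch below illustrates that rule only. The tuple layout, the `best_worst_scores` helper and the system names are hypothetical and do not reflect the file format under results or the setup in src/mturk.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# One rating: (systems shown, system picked as best, system picked as worst).
Rating = Tuple[Tuple[str, ...], str, str]


def best_worst_scores(ratings: Iterable[Rating]) -> Dict[str, float]:
    """Score each system as (#best - #worst) / #times shown, in [-1, 1]."""
    shown, best, worst = Counter(), Counter(), Counter()
    for systems, picked_best, picked_worst in ratings:
        shown.update(systems)
        best[picked_best] += 1
        worst[picked_worst] += 1
    return {s: (best[s] - worst[s]) / shown[s] for s in shown}


# Toy ratings with placeholder system names (not real results).
ratings = [
    (("system_a", "system_b", "system_c"), "system_a", "system_c"),
    (("system_a", "system_b", "system_c"), "system_a", "system_b"),
    (("system_a", "system_b", "system_c"), "system_b", "system_c"),
]
scores = best_worst_scores(ratings)
for name in sorted(scores):
    print(f"{name}: {scores[name]:+.2f}")
```

Scores near +1 mean a system was almost always chosen as best, and scores near -1 mean it was almost always chosen as worst.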
To download the model checkpoints and predictions, run:
URL=https://d1f9rvlwrb54wt.cloudfront.net/background-summaries
wget $URL/models-flan-t5.tgz # flan-t5-xl (file size: ~10G)
wget $URL/models-flan-t5-ift.tgz # flan-t5-xl-ift, flan-t5-xl-ift-ents (file size: ~20G)
wget $URL/models-gpt-anns.tgz # gpt-3.5-turbo, gpt-3.5-turbo-cond-ents, human annotators (file size: ~5M)
wget $URL/models-long-t5.tgz # long-t5-tglobal-xl (file size: ~10G)
See CONTRIBUTING for more information.
This project is licensed under the CC-BY-NC-4.0 License. See the LICENSE file.
You can cite our paper as follows:
@inproceedings{pratapa-etal-2023-background,
title = "Background Summarization of Event Timelines",
author = "Pratapa, Adithya and Small, Kevin and Dreyer, Markus",
booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
publisher = "Association for Computational Linguistics",
year="2023"
}

