Background Summarization of Event Timelines (EMNLP 2023)

This is the repository for the EMNLP 2023 paper "Background Summarization of Event Timelines" by Adithya Pratapa, Kevin Small and Markus Dreyer. The image below provides an overview of the background summarization task.

Dataset

The background summarization dataset is available under data, as well as on Hugging Face Datasets.

Training and inference

T5-based systems

We experiment with Flan-T5-XL and Long-T5-TGlobal-XL. For Flan-T5-XL, we explore both generic and query-focused setups. See configs/train.conf for supported model configurations.

# example flan-t5-xl training using deepspeed
bash bash_scripts/t5/train.sh flan-t5-xl 8888

For inference, set the checkpoint path in configs/eval.conf and run the evaluation script.

# example flan-t5-xl inference
bash bash_scripts/t5/eval.sh flan-t5-xl

GPT-based systems

We run zero-shot inference with GPT-3.5. See configs/gpt.conf for supported model configurations.

# example gpt-3.5-turbo zero-shot inference
bash bash_scripts/gpt/predict.sh gpt-3.5-turbo

Background Utility Score (BUS)

We propose a new QA-based evaluation metric that measures the utility of a background summary for answering questions about a news update. See the illustration below.

See src/bus/bus.py for details on the GPT-3.5-based and GPT-4-based BUS metrics.

Human and BUS evaluation data

The results directory contains the data from our Mechanical Turk and BUS evaluations. For the 1,000 news updates from the test set, it includes human-written and system-generated backgrounds, along with results from best-worst ratings, BUS--human, BUS--GPT-3.5 and BUS--GPT-4.

MTurk setup

See src/mturk for details on the MTurk setup.

Model checkpoints and predictions

To download the model checkpoints and predictions, run:

URL=https://d1f9rvlwrb54wt.cloudfront.net/background-summaries
wget $URL/models-flan-t5.tgz # flan-t5-xl (file size: ~10G)
wget $URL/models-flan-t5-ift.tgz # flan-t5-xl-ift, flan-t5-xl-ift-ents (file size: ~20G)
wget $URL/models-gpt-anns.tgz # gpt-3.5-turbo, gpt-3.5-turbo-cond-ents, human annotators (file size: ~5M)
wget $URL/models-long-t5.tgz # long-t5-tglobal-xl (file size: ~10G)
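
The archives are gzipped tarballs, so they need to be unpacked after download. The following is a minimal sketch of one way to fetch and extract them in a loop. The base URL and archive names are taken from the commands above; the helper names `fetch_archive` and `extract_archive` and the `checkpoints` destination directory are assumptions made for this example and are not part of the repository.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the destination directory are hypothetical.
URL=https://d1f9rvlwrb54wt.cloudfront.net/background-summaries
ARCHIVES="models-flan-t5.tgz models-flan-t5-ift.tgz models-gpt-anns.tgz models-long-t5.tgz"

# Download one archive by name, resuming a partial file if one exists (wget -c).
fetch_archive() {
    wget -c "$URL/$1"
}

# Unpack archive $1 into directory $2, creating the directory if needed.
extract_archive() {
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# Example usage (the four archives total roughly 40G):
# DEST=checkpoints   # hypothetical output directory
# for name in $ARCHIVES; do
#     fetch_archive "$name" && extract_archive "$name" "$DEST"
# done
```

Resuming with `wget -c` is a convenience for the larger archives, where an interrupted transfer would otherwise restart from zero. The directory layout inside each archive is not documented here, so check the extracted contents before pointing configs/eval.conf at a checkpoint path.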

Security

See CONTRIBUTING for more information.

License

This project is licensed under the CC-BY-NC-4.0 License. See the LICENSE file.

Reference

You can cite our paper as follows:

@inproceedings{pratapa-etal-2023-background,
    title = "Background Summarization of Event Timelines",
    author = "Pratapa, Adithya and Small, Kevin and Dreyer, Markus",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    publisher = "Association for Computational Linguistics",
    year = "2023"
}
