sommerschield/iphi

I.PHI dataset: ancient Greek inscriptions

Thea Sommerschield*, Yannis Assael*, Brendan Shillingford, Mahyar Bordbar, John Pavlopoulos, Marita Chatzipanagiotou, Ion Androutsopoulos, Jonathan Prag, Nando de Freitas


Χαῖρε! This repository is forked from Pythia. It contains a pipeline that downloads the Packard Humanities Institute database of ancient Greek inscriptions, including its geographical and chronological metadata, and processes it into a machine-actionable format. The processed dataset is referred to as I.PHI.

Dependencies

pip install -r requirements.txt && \
python -m nltk.downloader punkt

Dataset generation

# Download and process PHI (this will take a while)
python -m train.data.iphi_download --connections=1

To enable multi-threaded processing, set --connections=100.
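
As a rough illustration of what a connection count like this typically controls, the sketch below caps the number of worker threads that fetch items in parallel. It is not the code in train.data.iphi_download: the names fetch_all and fake_fetch are hypothetical, and fake_fetch stands in for a real HTTP request so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def fetch_all(
    ids: Iterable[int],
    fetch_one: Callable[[int], str],
    connections: int = 1,
) -> Dict[int, str]:
    """Fetch every id using at most `connections` worker threads.

    Hypothetical helper for illustration; not part of the I.PHI pipeline.
    """
    ids = list(ids)
    with ThreadPoolExecutor(max_workers=connections) as pool:
        # pool.map yields results in input order, so zip pairs each id
        # with its own result even when workers finish out of order.
        return dict(zip(ids, pool.map(fetch_one, ids)))


def fake_fetch(inscription_id: int) -> str:
    # Stand-in for an HTTP request (assumption: a real fetcher would
    # download and parse one inscription page here).
    return f"text-{inscription_id}"


if __name__ == "__main__":
    results = fetch_all(range(1, 6), fake_fetch, connections=3)
    print(len(results), "items fetched")
```

With connections=1 the items are fetched one at a time; larger values let several requests be in flight at once, which is the trade-off the flag above exposes.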

Preprocessed I.PHI dataset uploaded by @Holger.Danske800: link

Reference

When using this dataset, please cite the Packard Humanities Institute database of ancient Greek inscriptions and:

@misc{sommerschield2021iphi,
  title={{I.PHI} dataset: ancient Greek inscriptions},
  author={Sommerschield*, Thea and Assael*, Yannis and Shillingford, Brendan and Bordbar, Mahyar and Pavlopoulos, John and Chatzipanagiotou, Marita and Androutsopoulos, Ion and Prag, Jonathan and de Freitas, Nando},
  howpublished = {\url{https://github.com/sommerschield/iphi}},
  year={2021}
}

Acknowledgements

This project relies on the availability of a high-quality dataset of ancient Greek inscriptions, built through centuries of scholarly collection and decades of digital editorial work. In particular, it draws on the Searchable Greek Inscriptions database made available by the Packard Humanities Institute, generously supported by David Packard: inscriptions.packhum.org. This resource brings together a large proportion of published inscriptions in a searchable digital format. Any use of the dataset generated here should acknowledge and cite the Packard Humanities Institute project, as well as the underlying scholarly contributions on which it depends.

License

Apache License, Version 2.0

Damaged inscription: a decree concerning the Acropolis of Athens (485/4 BCE). IG I³ 4B. (CC BY-SA 3.0, Wikimedia)
