TODO: Travis
Timo Breuer and Philipp Schaer
This is the Docker image for our replicated submission to CENTRE@CLEF2019, conforming to the OSIRRC jig for the Open-Source IR Replicability Challenge (OSIRRC 2019) at SIGIR 2019. The image is available on Docker Hub and has been tested with the jig at commit ca31987 (6/5/2019).
- Supported test collections: core17
- Required training collections: robust04, robust05
- Supported hooks: init, index, search
Use the commands below to get the runs for WCRobust04 and WCRobust0405 as they were replicated in the course of our participation in CENTRE@CLEF19.
The following jig command can be used to index the New York Times corpus and prepare training data for WCRobust04:
python run.py prepare \
--repo osirrc2019/irc-centre2019 \
--collections robust04=/path/to/robust04/=trectext \
core17=/path/to/core17/=trectext \
--opts run="wcrobust04"
The argument run can be customized to run="wcrobust0405" in order to prepare training data for WCRobust0405.
In this case, the robust05 corpus has to be mounted as an additional volume.
python run.py prepare \
--repo osirrc2019/irc-centre2019 \
--collections robust04=/path/to/robust04/=trectext \
robust05=/path/to/robust05/=trectext \
core17=/path/to/core17/=trectext \
--opts run="wcrobust0405"
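The two prepare commands differ only in the mounted collections and the run option. The following is a minimal sketch of a helper that assembles the argument list for either run. The helper name `prepare_command`, the `REQUIRED_COLLECTIONS` table, and the example paths are hypothetical and not part of this repo or the jig; only the flags shown in the commands above are used.

```python
# Hypothetical helper, not part of this repo or the jig.
# Collections per run, as listed in the prepare commands above.
REQUIRED_COLLECTIONS = {
    "wcrobust04": ["robust04", "core17"],
    "wcrobust0405": ["robust04", "robust05", "core17"],
}


def prepare_command(run, paths, repo="osirrc2019/irc-centre2019", fmt="trectext"):
    """Return the argv list for `python run.py prepare` for the given run.

    `paths` maps a collection name to its directory on the host.
    """
    if run not in REQUIRED_COLLECTIONS:
        raise ValueError(f"unknown run: {run!r}")
    names = REQUIRED_COLLECTIONS[run]
    missing = [name for name in names if name not in paths]
    if missing:
        raise ValueError(f"missing collection paths: {', '.join(missing)}")
    # Each collection is passed as name=path=format.
    collections = [f"{name}={paths[name]}={fmt}" for name in names]
    return [
        "python", "run.py", "prepare",
        "--repo", repo,
        "--collections", *collections,
        "--opts", f"run={run}",
    ]


if __name__ == "__main__":
    example_paths = {
        "robust04": "/path/to/robust04/",
        "robust05": "/path/to/robust05/",
        "core17": "/path/to/core17/",
    }
    print(" ".join(prepare_command("wcrobust0405", example_paths)))
```

The returned list can be passed to `subprocess.run` from the jig directory. Requesting wcrobust0405 without a robust05 path raises an error before anything is started.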
The following jig command performs a retrieval run on the New York Times corpus, using the training corpora specified in the prepare step:
python run.py search \
--repo osirrc2019/irc-centre2019 \
--collection core17 \
--topic topics/topics.core17.txt \
--output /path/to/output/ \
--qrels qrels/qrels.core17.txt
TODO: add outcomes
The following is a short summary of what happens in each of the scripts in this repo.
The Dockerfile installs python3, copies the scripts for the corresponding hooks, and creates the required directory. The working directory is set to /work/.
The init script downloads the code from a repository and installs the required Python packages from the requirements.txt file. Depending on the specified run, the scripts for WCRobust04 or WCRobust0405 are prepared.
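As a rough illustration of how a hook script might pick up the run option, the sketch below reads the run name from the hook arguments and builds the pip command for requirements.txt. It is not the actual init script: the `--json` flag and the `opts` key are assumptions about how the jig hands arguments to a hook, and `parse_run` and `install_command` are hypothetical names.

```python
import argparse
import json
import sys

SUPPORTED_RUNS = ("wcrobust04", "wcrobust0405")


def parse_run(argv):
    """Extract the run name from the hook arguments.

    Assumes the arguments arrive as a JSON string after --json,
    with user options under the "opts" key.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--json", required=True, help="hook arguments as JSON")
    args = parser.parse_args(argv)
    opts = json.loads(args.json).get("opts", {})
    run = opts.get("run")
    if run not in SUPPORTED_RUNS:
        raise ValueError(f"run must be one of {SUPPORTED_RUNS}, got {run!r}")
    return run


def install_command(requirements="requirements.txt"):
    """Build the pip command that installs the listed packages."""
    return [sys.executable, "-m", "pip", "install", "-r", requirements]


if __name__ == "__main__":
    example = ["--json", '{"opts": {"run": "wcrobust04"}}']
    print(parse_run(example))
    print(" ".join(install_command()))
```

Validating the run name early means a typo in `--opts` fails in the init step rather than later during indexing or search.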
The index script runs a subprocess which starts indexing.
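Starting a step as a subprocess can be sketched as follows. This is a generic pattern, not the code of the index script: `run_step` is a hypothetical name, and the command in the example is a stand-in for the real indexing command, which is not shown in this README.

```python
import subprocess
import sys


def run_step(cmd):
    """Run `cmd` as a child process and return its standard output.

    check=True raises subprocess.CalledProcessError on a non-zero
    exit code, so a failed step stops the hook instead of passing silently.
    """
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Stand-in command; the real hook would start the indexing script here.
    output = run_step([sys.executable, "-c", "print('indexing started')"])
    print(output, end="")
```

Passing the command as a list avoids shell quoting issues with paths such as the mounted collection directories.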
The search script will start the ranking depending on the previously specified run.
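One simple way to select the ranking by run name is a dispatch table, sketched below. The function names and return values are placeholders for illustration only; the actual ranking code lives in the downloaded scripts and is not reproduced here.

```python
from typing import Callable, Dict


def _rank_wcrobust04() -> str:
    # Placeholder for the WCRobust04 ranking step.
    return "ranking with robust04 training data"


def _rank_wcrobust0405() -> str:
    # Placeholder for the WCRobust0405 ranking step.
    return "ranking with robust04 and robust05 training data"


# Maps the run option to the function that performs the ranking.
RANKERS: Dict[str, Callable[[], str]] = {
    "wcrobust04": _rank_wcrobust04,
    "wcrobust0405": _rank_wcrobust0405,
}


def search(run: str) -> str:
    """Call the ranking function registered for `run`."""
    if run not in RANKERS:
        raise ValueError(f"unknown run: {run!r}")
    return RANKERS[run]()


if __name__ == "__main__":
    print(search("wcrobust04"))
```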