News data entity search
-
Docker 17.12
-
Increase your system mmap count setting by running:
sudo sysctl -w vm.max_map_count=262144
See https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html for more information.
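The `sysctl -w` command changes the value only for the running system. As a minimal sketch, the check below reads the current value and reports whether it needs raising. The helper name `needs_mmap_increase` is hypothetical, and reading `/proc/sys/vm/max_map_count` assumes a Linux host:

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when the current value
# is lower than the required one.
needs_mmap_increase() {
    current="$1"
    required="$2"
    [ "$current" -lt "$required" ]
}

REQUIRED=262144

# /proc/sys/vm/max_map_count only exists on Linux; skip the check elsewhere.
if [ -r /proc/sys/vm/max_map_count ]; then
    CURRENT="$(cat /proc/sys/vm/max_map_count)"
    if needs_mmap_increase "$CURRENT" "$REQUIRED"; then
        echo "vm.max_map_count is $CURRENT; run: sudo sysctl -w vm.max_map_count=$REQUIRED"
    else
        echo "vm.max_map_count is $CURRENT; no change needed"
    fi
fi
```

To keep the setting across reboots, the line `vm.max_map_count=262144` can also be added to `/etc/sysctl.conf`.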
The Elasticsearch index and the Fuseki database need to be set up before the system is operational.
Create the news reader text index using Elasticsearch with the Voikko plugin (for Finnish language support).
You will need to have the news data in ./yledata.
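A quick preflight check can catch a missing dataset before the long indexing run starts. This is only a sketch: the helper name `dir_has_files` is hypothetical, and it checks only that `./yledata` exists and is not empty, not that its contents are valid:

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the directory exists and contains
# at least one entry (including hidden files).
dir_has_files() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

DATA_DIR="./yledata"

if dir_has_files "$DATA_DIR"; then
    echo "Found news data in $DATA_DIR"
else
    echo "No news data found in $DATA_DIR; copy the dataset there before indexing" >&2
fi
```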
Start the Elasticsearch instance:
docker-compose up -d elastic
Once the instance is running, run the indexing script (this takes a long time and needs a lot of RAM if you are indexing the whole dataset):
docker-compose run --rm tasks populate-index.js
Create a directory for the RDF files:
mkdir data
chmod 777 data
Create RDF for the subjects in the news, and load them into the database:
docker-compose run --rm tasks parse-entities.js
docker-compose run --rm fuseki ./load_subjects.sh
Enrich the subjects:
docker-compose up -d fuseki
docker-compose run --rm tasks enrich-entities.js
docker-compose stop fuseki
docker-compose run --rm fuseki ./load_enrichments.sh
Finally, start the system:
docker-compose up -d fuseki elastic api client
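Some of the steps above start a service with `docker-compose up -d` and then immediately run a task against it, so the task may begin before the service is ready. A minimal sketch of a retry helper that could sit between those commands follows. The name `wait_until` is hypothetical, and the example URL assumes Elasticsearch is published on `localhost:9200`, which depends on this repository's docker-compose.yml:

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
wait_until() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example (the port is an assumption, see above):
#   docker-compose up -d elastic
#   wait_until 60 5 curl -fsS http://localhost:9200/_cluster/health
#   docker-compose run --rm tasks populate-index.js
```

The helper takes the probe command as arguments, so the same function could be reused for Fuseki with a different URL.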