@chreman commented Apr 2, 2020

This PR contains a deployable MVP of a new data source integration in a micro-service oriented architecture. The PR has the following main elements:

  • A minimal search example for TRIPLE data (examples/triple)
  • New backend micro-services, running in Docker containers (server/workers)
  • A documentation README and example config files for each service

The following changes have been made to the backend pipeline. For predefined integrations (here TRIPLE), the pipeline now branches off at the document retrieval and map data generation stages (function search in search.php):

  • Instead of executing R scripts on the OS, search.php sends a POST request through a localhost reverse proxy to the API (the Apache2 setup is described in the README), which passes the request to the data source connector (in this case an ElasticSearch API)
  • The connector retrieves documents according to the search query, preprocesses them, and returns input data (metadata and textual content), which is then handed over to the data processing service
  • The data processing service executes the machine learning and NLP components on the input data and returns a map representation
  • The API service takes the map representation and returns it as JSON to the call in search.php.
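The request flow described above can be sketched as three plain functions, one per service. This is only an illustration under stated assumptions: the function names (`connector_search`, `dataprocessing_create_map`, `api_search`), the request field `q`, the in-memory `FAKE_INDEX`, and the "first word as area" rule are all hypothetical stand-ins, not the services, ElasticSearch queries, or ML/NLP components contained in this PR.

```python
import json

# Hypothetical in-memory stand-in for the data source.
# In the PR the connector talks to an ElasticSearch API instead.
FAKE_INDEX = [
    {"id": "doc1", "title": "Open Science in Europe",
     "abstract": "Open access policy and research data."},
    {"id": "doc2", "title": "Knowledge Maps",
     "abstract": "Visual discovery of research topics."},
    {"id": "doc3", "title": "Soil Chemistry",
     "abstract": "Nitrogen cycles in farmland."},
]


def connector_search(query):
    """Connector stage: retrieve matching documents and preprocess them
    into input data (metadata plus textual content)."""
    needle = query.lower()
    hits = [
        doc for doc in FAKE_INDEX
        if needle in (doc["title"] + " " + doc["abstract"]).lower()
    ]
    return [
        {"id": doc["id"],
         "metadata": {"title": doc["title"]},
         "text": doc["abstract"].lower()}
        for doc in hits
    ]


def dataprocessing_create_map(input_data):
    """Data processing stage: placeholder for the ML and NLP components.
    Here each document is simply labelled with the first word of its text."""
    return [
        {"id": item["id"],
         "title": item["metadata"]["title"],
         "area": item["text"].split()[0]}
        for item in input_data
    ]


def api_search(request_body):
    """API stage: parse the POSTed JSON body, run both stages in order,
    and return the map representation as a JSON string."""
    params = json.loads(request_body)
    input_data = connector_search(params["q"])
    map_data = dataprocessing_create_map(input_data)
    return json.dumps(map_data)


if __name__ == "__main__":
    # Plays the role of the POST request sent from search.php.
    print(api_search(json.dumps({"q": "research"})))
```

In a deployed setup each function would sit behind its own container and the calls would be HTTP requests routed through the reverse proxy; collapsing them into one process only makes the order of the stages easy to follow.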

Other functionality in search.php, such as snapshot generation and map persistence to SQLite, works as before.

For an initial setup, please follow the instructions in the README.

@pkraker left a comment

Awesome, thanks! This is a big step forward in terms of our backend development and deployment.

@pkraker merged commit 7c11cac into master Apr 9, 2020
@pkraker deleted the containerization branch June 5, 2020 09:59
chreman pushed a commit to chreman/Headstart that referenced this pull request Oct 13, 2021
…inerization

Containerization

Former-commit-id: 7c11cac