@bodom0015 bodom0015 commented Mar 9, 2021

Problem

The InfluxDB consumer currently just inserts a large JSON string into the database. This structure is not queryable, and therefore not useful.

Approach

Instead, parse the JSON message into the tags and fields that InfluxDB understands. Tags are indexed, so queries that filter on them are faster. At least one field is required; fields are meant to contain all other measurement data.

Noting this passage from the "Discussion" section of the docs:

Tags are optional. You don’t need to have tags in your data structure, but it’s generally a good idea to make use of them because, unlike fields, tags are indexed. This means that queries on tags are faster and that tags are ideal for storing commonly-queried metadata.

Tags (indexed):

  • type
  • category
  • service_name
  • user_id
  • author_id
  • extractor_name

Fields:

  • resource_id
  • dataset_id
  • dataset_name
  • author_name
  • user_name
  • resource_name
  • size
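
For illustration only, the tag/field split above could be sketched roughly as follows. This is not the code from this PR: it assumes the incoming message is a flat JSON object, and the function name `message_to_point` and the measurement name `events` are hypothetical. Only the tag and field key names come from the lists above.

```python
import json

# Key names are taken from the lists in this PR description.
TAG_KEYS = ("type", "category", "service_name",
            "user_id", "author_id", "extractor_name")
FIELD_KEYS = ("resource_id", "dataset_id", "dataset_name",
              "author_name", "user_name", "resource_name", "size")


def message_to_point(raw_message, measurement="events"):
    """Split a flat JSON message into tags and fields (hypothetical helper).

    The measurement name "events" is an assumption made for this sketch.
    """
    message = json.loads(raw_message)

    # InfluxDB tag values are strings, so coerce them; skip absent/null keys.
    tags = {k: str(message[k]) for k in TAG_KEYS
            if message.get(k) is not None}

    # Fields keep their original JSON types (e.g. size stays numeric).
    fields = {k: message[k] for k in FIELD_KEYS
              if message.get(k) is not None}

    # A point with no fields is not valid, so fail early.
    if not fields:
        raise ValueError("at least one field is required")

    return {"measurement": measurement, "tags": tags, "fields": fields}
```

A point built this way can be filtered on any of the tag keys (for example `user_id`) using the index, while values such as `size` remain available as field data.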

How to Test

Prerequisites:

  • RabbitMQ running locally
  • Docker

See other PR for testing steps

@lmarini lmarini merged commit 5e2129b into main Mar 16, 2021
@robkooper robkooper deleted the sync-fields branch April 5, 2021 19:45