Skip to content

Add inline firehose#8056

Merged
fjy merged 3 commits intoapache:masterfrom
ccaominh:feature-656-inline-firehose
Jul 12, 2019
Merged

Add inline firehose#8056
fjy merged 3 commits intoapache:masterfrom
ccaominh:feature-656-inline-firehose

Conversation

@ccaominh
Copy link
Copy Markdown
Contributor

@ccaominh ccaominh commented Jul 10, 2019

Fixes #7910.

Description

To allow users to quickly parsing and schema, add a firehose that reads data that is inlined in its spec.


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths.

To allow users to quickly parsing and schema, add a firehose that reads
data that is inlined in its spec.
@ccaominh ccaominh force-pushed the feature-656-inline-firehose branch from b2a9abe to 99fa288 Compare July 10, 2019 22:16
@vogievetsky
Copy link
Copy Markdown
Contributor

❤️ ❤️ ❤️ ❤️ ❤️

@ccaominh
Copy link
Copy Markdown
Contributor Author

Manual test:
connect
parse-data

@vogievetsky
Copy link
Copy Markdown
Contributor

Man, the Connect step with the inline firehose is hilarious. I have a really king bomber idea of what to do there. Eagerly awaiting this getting merged.

Comment thread core/src/main/java/org/apache/druid/data/input/impl/InlineFirehose.java Outdated
Comment thread core/src/main/java/org/apache/druid/data/input/impl/InlineFirehose.java Outdated
Copy link
Copy Markdown
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just had a comment about the Sonar annotations.

Copy link
Copy Markdown
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 after CI

Copy link
Copy Markdown
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 after CI. Thanks @ccaominh!

Copy link
Copy Markdown
Member

@clintropolis clintropolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall lgtm 👍

I don't consider any of my comments as blockers to merge, just pointing out some things I noticed so feel free to ignore

### SqlFirehose

SqlFirehoseFactory can be used to ingest events residing in RDBMS. The database connection information is provided as part of the ingestion spec. For each query, the results are fetched locally and indexed. If there are multiple queries from which data needs to be indexed, queries are prefetched in the background upto `maxFetchCapacityBytes` bytes.
This Firehose can be used to ingest events residing in RDBMS. The database connection information is provided as part of the ingestion spec. For each query, the results are fetched locally and indexed. If there are multiple queries from which data needs to be indexed, queries are prefetched in the background upto `maxFetchCapacityBytes` bytes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: i know you didn't write this, but maybe it should be "an RDBMS."?

This firehose can be used to combine and merge data from a list of different firehoses.
This can be used to merge data from more than one firehose.
This Firehose can be used to combine and merge data from a list of different Firehoses.
This can be used to merge data from more than one Firehose.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing about not writing, but this 2nd sentence seems redundant

import java.util.function.Predicate;
import java.util.stream.Collectors;

public class FirehoseModuleTest
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand the point of this test, is it just seeing .. if jackson works I guess for the things that are defined in the module? It doesn't really test all firehose factories since some are registered in extensions, am I missing something?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that this test fails if the newInlineFirehoseFactory is not registered in FirehoseModule, for example (or any other FirehoseFactories in that package). Are there tests elsewhere that would catch an error like that?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I guess in the case of tests you've added, no nothing else would be testing this. I thought that generally the 'serde' tests are usually setup to catch this, but looking closer this isn't terribly consistent across the codebase either.

@fjy fjy merged commit da3d141 into apache:master Jul 12, 2019
@ccaominh ccaominh deleted the feature-656-inline-firehose branch July 12, 2019 16:08
@clintropolis clintropolis added this to the 0.16.0 milestone Aug 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add an InlineFirehose that contains the data to be ingested in its spec (for examples and demos)

6 participants