add note on consistency of results for sys.segments queries#7034

Merged
jihoonson merged 7 commits into apache:master from surekhasaharan:sys-segments-doc-update
Feb 19, 2019

@surekhasaharan

For sys.segments queries, it seems the Broker randomly chooses one of the replicas. So if a segment has more than one replica, fields like size, num_rows, etc. can have different values depending on which realtime replica the Broker queries. The results will eventually be consistent once the segment is served by a Historical. Adding this note to the docs. This may not be a problem once this issue is addressed.
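
To illustrate the behavior described above, here is a minimal sketch in Python. It is a toy simulation, not Druid code: the task names, the row counts, and the `observed_num_rows` helper are all hypothetical, and the selection rule (prefer a Historical copy, otherwise pick any one replica task) is an assumption taken from the description in this PR.

```python
import random

# Hypothetical row counts that two replica ingestion tasks report for the
# same not-yet-published segment; one task is slightly behind the other.
REPLICA_NUM_ROWS = {"task_a": 1200, "task_b": 1150}


def observed_num_rows(replicas, historical_rows=None, rng=random):
    """Return the num_rows value one metadata lookup would see.

    Assumption: a Historical copy is preferred when it exists; otherwise one
    replica task is picked, with no guarantee it is the same one each time.
    """
    if historical_rows is not None:
        return historical_rows
    task = rng.choice(sorted(replicas))
    return replicas[task]


rng = random.Random(0)  # seeded so the sketch is repeatable

# While only ingestion tasks serve the segment, repeated lookups can disagree.
realtime_samples = {observed_num_rows(REPLICA_NUM_ROWS, rng=rng) for _ in range(50)}

# Once a Historical serves the segment, every lookup sees the same value.
published_samples = {
    observed_num_rows(REPLICA_NUM_ROWS, historical_rows=1200, rng=rng)
    for _ in range(50)
}

print(sorted(realtime_samples))
print(sorted(published_samples))
```

The set of values seen during the realtime phase can contain more than one number, while the set seen after publication collapses to a single value, which is the inconsistency the doc note is meant to warn about.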

Comment thread docs/content/querying/sql.md Outdated

### SEGMENTS table
Segments table provides details on all Druid segments, whether they are published yet or not.
Note that if a segment is served by more than one realtime tasks(multiple realtime replicas), then the results may vary between the sys.segments queries for columns such as `size`, `num_rows` etc., until the segment is served by a historical eventually.

Contributor

There should be a space between tasks and (multiple

Author

This got changed.

Comment thread docs/content/querying/sql.md Outdated

### SEGMENTS table
Segments table provides details on all Druid segments, whether they are published yet or not.
Note that if a segment is served by more than one realtime tasks(multiple realtime replicas), then the results may vary between the sys.segments queries for columns such as `size`, `num_rows` etc., until the segment is served by a historical eventually.

Contributor @jihoonson, Feb 7, 2019

The purpose of this note is to make people less confused, so it should be as detailed as possible.

Please add more details about when this can happen and why, and which columns can vary. I think it's worth adding a new section for this caveat.

Author

Sometimes more details can be more confusing :)). Tried to add more details, let me know if it's less confusing. Not sure if it needs its own section or what the title of that section should be. Added a caveat subheading.

Comment thread docs/content/querying/sql.md Outdated
Segments table provides details on all Druid segments, whether they are published yet or not.

#### CAVEAT
Note that a segment can be served by more than one realtime or historical servers, in that case it would have multiple replicas. These replicas are weakly consistent with each other when served by multiple realtime tasks, until a segment is eventually served by a historical, at that point the segment is immutable. And broker prefers to query a segment from historical over realtime server. But if a segment has multiple realtime replicas, for eg. kafka index tasks, and one task is slower than other, then the sys.segments query results can vary for the duration of the tasks. The columns of segments table that can have inconsistent values during this period include `size`, `num_replicas`, `num_rows`.

Contributor

There are no such things as realtime or historical servers

Contributor

  • please ensure consistent capitalization for Historicals, Brokers, etc

Author

> There are no such things as realtime or historical servers

I see mentions of Historical Node and Real-time Node in the docs. So what should I write: Historical node? Process?

Contributor @gianm, Feb 12, 2019

IMO, "Historical process" and "stream ingestion tasks"

Author

corrected the capitalization and changed to correct terminology

Comment thread docs/content/querying/sql.md Outdated
Segments table provides details on all Druid segments, whether they are published yet or not.

#### CAVEAT
Note that a segment can be served by more than one realtime or historical servers, in that case it would have multiple replicas. These replicas are weakly consistent with each other when served by multiple realtime tasks, until a segment is eventually served by a historical, at that point the segment is immutable. And broker prefers to query a segment from historical over realtime server. But if a segment has multiple realtime replicas, for eg. kafka index tasks, and one task is slower than other, then the sys.segments query results can vary for the duration of the tasks. The columns of segments table that can have inconsistent values during this period include `size`, `num_replicas`, `num_rows`.

Contributor

Would you explain why size and num_replicas vary? It looks like they are not obtained from segmentMetadataQuery.

Contributor

Please add why this happens. The root cause is that the system schema uses segmentMetadataQuery to retrieve some information, and the Broker randomly picks one of the realtime tasks for query processing if there are no published segments, so it's not guaranteed that the same task serves segmentMetadataQuery every time.

I think it's worth linking #5915 here too.

Author

Hmm, should there be mention of segmentMetadataQuery and RandomServerSelectorStrategy in user-facing docs? I tried to explain without adding internal code details. I feel such details should be in GitHub issues or in Javadocs. And do we generally link to GitHub issues in user documentation? Are there any similar examples in the Druid docs?

Contributor

SegmentMetadataQuery is a documented query type (http://druid.io/docs/latest/querying/segmentmetadataquery.html). I don't think it's worth mentioning the class name RandomServerSelectorStrategy, but the configuration for it is also documented (http://druid.io/docs/latest/configuration/index.html#query-prioritization).

Well, my comment above about random selection may not be appropriate because it can give users the wrong intuition. It's probably better not to mention random selection at all. But I think it's still necessary to say that only one of the realtime tasks is selected if multiple replicas are running.

Author

> Would you explain why size and num_replicas vary? It looks like they are not obtained from segmentMetadataQuery.

I think size would not vary between ingestion tasks, since they would all show 0, but it can vary if a segment is queried from a Historical vs. a realtime task. But given that the Broker prefers Historicals, maybe size is not an issue. For num_replicas, it can change if a segment gets added or removed via TimelineServerView.TimelineCallback in DruidSchema, and its value can vary between queries.

Contributor

Hmm. For num_replicas, it sounds like a valid result because it reflects changes that actually happened. I think it's different from the varying num_rows and doesn't have to be noted here.

Author

In that case, it seems num_rows is the only col affected.

@jon-wei added this to the 0.14.0 milestone Feb 13, 2019

Contributor @jihoonson left a comment

Thanks for the update!

Comment thread docs/content/querying/sql.md Outdated
Segments table provides details on all Druid segments, whether they are published yet or not.

#### CAVEAT
Note that a segment can be served by more than one stream ingestion tasks or Historical processes, in that case it would have multiple replicas. These replicas are weakly consistent with each other when served by multiple ingestion tasks, until a segment is eventually served by a Historical, at that point the segment is immutable. Broker prefers to query a segment from Historical over a ingestion task. But if a segment has multiple realtime replicas, for eg. kafka index tasks, and one task is slower than other, then the sys.segments query results can vary for the duration of the tasks because only one of the ingestion tasks is queried by the Broker and it is not gauranteed that the same task gets picked everytime. The columns of segments table that can have inconsistent values during this period include `num_replicas` and `num_rows`. There is an open [issue](https://github.com/apache/incubator-druid/issues/5915) about this inconsistency with stream ingestion tasks.

Contributor

a ingestion task -> an ingestion task.

Author

changed

@jihoonson
Contributor

@surekhasaharan thanks! LGTM.

@jihoonson jihoonson merged commit 2b04e6d into apache:master Feb 19, 2019
surekhasaharan pushed a commit to surekhasaharan/druid that referenced this pull request Feb 20, 2019
)

* add doc

* change docs

* PR comments

* few more changes
fjy pushed a commit that referenced this pull request Feb 20, 2019
…7101)

* add doc

* change docs

* PR comments

* few more changes
@surekhasaharan surekhasaharan deleted the sys-segments-doc-update branch February 20, 2019 18:33