Place __time in signatures according to sort order. #16958

Merged
gianm merged 7 commits into apache:master from gianm:signature-time-ordering
Aug 27, 2024

Conversation

Contributor

@gianm gianm commented Aug 24, 2024

Updates a variety of places to put __time in row signatures according to its position in the sort order, rather than always first, including:

  • InputSourceSampler.
  • ScanQueryEngine (in the default signature when "columns" is empty).
  • Various StorageAdapters, which also changes the column order in segmentMetadata query results, and therefore in SQL schemas as well.

This all helps users understand when their reorderings of __time in the segment sort order are having an effect. The sampler changes are also helpful for making the web console flow support reordering __time.
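To illustrate the intended ordering, here is a minimal sketch. It is not the code from this PR: `SignatureOrderSketch` and `orderColumns` are hypothetical names, and the rule that non-sort columns follow in their original order is an assumption made for the example. It shows __time landing wherever the sort order places it, with sort-order columns that do not exist being omitted.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not a Druid class or API.
final class SignatureOrderSketch {

    /**
     * Returns column names ordered as: columns named in the sort order
     * (in sort order, skipping any that do not exist), followed by the
     * remaining existing columns in their original order (an assumption
     * of this sketch).
     */
    public static List<String> orderColumns(List<String> sortOrder, List<String> existingColumns) {
        Set<String> existing = new LinkedHashSet<>(existingColumns);
        Set<String> ordered = new LinkedHashSet<>();

        // Sort-order columns come first; nonexistent ones are omitted.
        for (String column : sortOrder) {
            if (existing.contains(column)) {
                ordered.add(column);
            }
        }

        // Everything else keeps its original relative order.
        ordered.addAll(existing);
        return new ArrayList<>(ordered);
    }

    public static void main(String[] args) {
        List<String> columns = List.of("__time", "country", "page", "added");

        // __time first in the sort order: same as the old "always first" behavior.
        System.out.println(orderColumns(List.of("__time", "country", "page"), columns));
        // prints [__time, country, page, added]

        // __time second in the sort order: it moves to the second position.
        System.out.println(orderColumns(List.of("country", "__time", "page"), columns));
        // prints [country, __time, page, added]
    }
}
```

With an ordering like this, a signature that starts with something other than __time is a visible sign that the segment is not sorted by time first.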

Follow-up to #16849.

Also updates the warning message for out-of-order __time to be a little clearer.

@github-actions github-actions Bot added the labels "Area - Batch Ingestion" and "Area - MSQ" (For multi stage queries - https://github.com/apache/druid/issues/12262) Aug 25, 2024
Contributor

@vogievetsky vogievetsky left a comment

I built this branch locally and tested some APIs, and it all works great.

[three screenshots attached]

@gianm gianm merged commit 5d2ed33 into apache:master Aug 27, 2024
@gianm gianm deleted the signature-time-ordering branch August 27, 2024 04:45
@gianm gianm added this to the 31.0.0 milestone Aug 27, 2024
hevansDev pushed a commit to hevansDev/druid that referenced this pull request Aug 29, 2024
* Place __time in signatures according to sort order.

Updates a variety of places to put __time in row signatures according
to its position in the sort order, rather than always first, including:

- InputSourceSampler.
- ScanQueryEngine (in the default signature when "columns" is empty).
- Various StorageAdapters, which also have the effect of reordering
  the column order in segmentMetadata queries, and therefore in SQL
  schemas as well.

Follow-up to apache#16849.

* Fix compilation.

* Additional fixes.

* Fix.

* Fix style.

* Omit nonexistent columns from the row signature.

* Fix tests.
edgar2020 pushed a commit to edgar2020/druid that referenced this pull request Sep 5, 2024

Labels

  • Area - Batch Ingestion
  • Area - Ingestion
  • Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262)
  • Area - Segment Format and Ser/De

3 participants