
[Backport] Druid automated quickstart #13551

Closed
findingrish wants to merge 148 commits into apache:25.0.0 from findingrish:backport_druid_quickstart

@findingrish
Contributor

Backports #13365

rishabh singh and others added 30 commits November 15, 2022 11:08
* Firehose migration doc

* Update migrate-from-firehose-ingestion.md

* Updated with review comments and suggestions

* Update migrate-from-firehose-ingestion.md

* Update migrate-from-firehose-ingestion.md

* Update migrate-from-firehose-ingestion.md
* Add sketch fetching framework

* Refactor code to support sequential merge

* Update worker sketch fetcher

* Refactor sketch fetcher

* Refactor sketch fetcher

* Add context parameter and threshold to trigger sequential merge

* Fix test

* Add integration test for non sequential merge

* Address review comments

* Address review comments

* Address review comments

* Resolve maxRetainedBytes

* Add new classes

* Renamed key statistics information class

* Rename fetchStatisticsSnapshotForTimeChunk function

* Address review comments

* Address review comments

* Update documentation and add comments

* Resolve build issues

* Resolve build issues

* Change worker APIs to async

* Address review comments

* Resolve build issues

* Add null time check

* Update integration tests

* Address review comments

* Add log messages and comments

* Resolve build issues

* Add unit tests

* Add unit tests

* Fix timing issue in tests
supervise script changes to process java opts array
use argparse, leave free memory, logging
…stry. (apache#13403)

* Attach IO error to parse error when we can't contact Avro schema registry.

The change in apache#12080 lost the original exception context. This patch
adds it back.

* Add hamcrest-core.

* Fix format string.
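
The first bullet of the commit above describes restoring the original exception as context on the parse error. As a rough illustration of that general pattern only (written in Python rather than Druid's Java, with the hypothetical names `ParseError`, `fetch_schema`, `parse_record`, and `describe_failure`), chaining the I/O failure as the cause keeps the root error visible to callers:

```python
class ParseError(Exception):
    """Hypothetical stand-in for a parse exception; not Druid's class."""


def fetch_schema(schema_id):
    # Hypothetical registry lookup that always fails, simulating an unreachable registry.
    raise OSError(f"cannot reach schema registry for schema id {schema_id}")


def parse_record(schema_id):
    try:
        return fetch_schema(schema_id)
    except OSError as e:
        # "from e" stores the I/O error as __cause__, so its context is not lost.
        raise ParseError(f"failed to fetch Avro schema id {schema_id}") from e


def describe_failure(schema_id):
    # Show that a caller can still reach the underlying I/O error.
    try:
        parse_record(schema_id)
    except ParseError as e:
        return f"{e} (caused by: {e.__cause__})"
    return "ok"
```

Without the `from e` clause and with the original error suppressed, a caller would see only the parse failure, which is the kind of lost context the commit message refers to.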
* Prepare master branch for next release, 26.0.0

* Use docker image for druid 24.0.1

* Fix version in druid-it-cases pom.xml
* Suppress jackson-databind CVE-2022-42003 and CVE-2022-42004
(cherry picked from commit 1f4d892)
* Suppress CVEs
(cherry picked from commit ed55baa)
* Suppress vulnerabilities from druid-website package
(cherry picked from commit c0fb364)
* Add more suppressions for website package
(cherry picked from commit 9bba569)
clintropolis and others added 29 commits December 6, 2022 15:52
…verview.type=http (apache#13499)

* fix issue with http server inventory view blocking data node http server shutdown with long polling

* adjust

* fix test inspections
* Processors for Window Processing

This is an initial take on how to use Processors
for Window Processing.  A Processor is an interface
that transforms RowsAndColumns objects.
RowsAndColumns objects are essentially combinations
of rows and columns.

The intention is that these Processors are the start
of a set of operators that more closely resemble what
DB engineers would be accustomed to seeing.

* Wire up windowed processors with a query type that
can run them end-to-end.  This code can be used to
actually run a query, so yay!

* Some SQL tests for window functions. Added wikipedia 
data to the indexes available to the
SQL queries and tests validating the windowing
functionality as it exists now.

Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
Changes:
- Limit max batch size in `SegmentAllocationQueue` to 500
- Rename `batchAllocationMaxWaitTime` to `batchAllocationWaitTime` since the actual
wait time may exceed this configured value.
- Replace usage of `SegmentInsertAction` in `TaskToolbox` with `SegmentTransactionalInsertAction`
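
The batch-size cap in the first item can be pictured with a small sketch. This is an assumption-laden Python illustration, not the `SegmentAllocationQueue` implementation: `take_batch` and the `pending` queue are hypothetical, and only the limit of 500 comes from the commit message.

```python
from collections import deque

MAX_BATCH_SIZE = 500  # the cap named in the commit message


def take_batch(pending, max_batch_size=MAX_BATCH_SIZE):
    """Remove and return up to max_batch_size requests from the front of the queue."""
    batch = []
    while pending and len(batch) < max_batch_size:
        batch.append(pending.popleft())
    return batch


# Usage: 1200 queued requests are drained in batches of at most 500.
pending = deque(range(1200))
batch_sizes = []
while pending:
    batch_sizes.append(len(take_batch(pending)))
```

Capping the batch bounds the work done per allocation round, at the cost of leaving the remaining requests for a later batch.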
…3486)

* add padding and keywords

* add arrayOfDoubles

* Update docs/development/extensions-core/datasketches-tuple.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/development/extensions-core/datasketches-tuple.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/development/extensions-core/datasketches-tuple.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/development/extensions-core/datasketches-tuple.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/development/extensions-core/datasketches-tuple.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* partition int

* fix docs

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update to native ingestion doc

* Update docs/ingestion/native-batch.md

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

* Update native-batch.md

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
* Remove stray reference to fix OOM while merging sketches

* Update future to add result from executor service

* Update tests and address review comments

* Address review comments

* Moved mock

* Close threadpool on teardown

* Remove worker task cancel
* improve compaction status display

* even more accurate

* fix snapshot
…e#13525)

1) Edited the TooManyBuckets error message to mention PARTITIONED BY
   instead of segmentGranularity.

2) Added error-code-specific anchors in the docs.

3) Add information to various error codes in the docs about common
   causes and solutions.
* Enhanced MSQ table functions
* HTTP, LOCALFILES and INLINE table functions powered by
catalog metadata.
* Documentation
…ache#13537)

The planner sets sqlInsertSegmentGranularity in its context when using
PARTITIONED BY, which sets it on every native query in the stack (as all
native queries for a SQL query typically have the same context).
QueryKit would interpret that as a request to configure bucketing for
all native queries. This isn't useful, as bucketing is only used for
the penultimate stage in INSERT / REPLACE.

So, this patch modifies QueryKit to only look at sqlInsertSegmentGranularity
on the outermost query.

As an additional change, this patch switches the static ObjectMapper to
use the processwide ObjectMapper for deserializing Granularities. Saves
an ObjectMapper instance, and ensures that if there are any special
serdes registered for Granularity, we'll pick them up.
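
The behavior change described above, reading `sqlInsertSegmentGranularity` only on the outermost query, can be sketched as follows. This is a hedged Python illustration, not QueryKit's Java code: the dict-based query shape, `bucketing_granularity`, `plan`, and the placeholder value `"DAY"` are all assumptions made for the example.

```python
GRANULARITY_KEY = "sqlInsertSegmentGranularity"  # context key named in the commit message


def bucketing_granularity(query, is_outermost):
    """Return the granularity to bucket by, or None when bucketing does not apply."""
    if not is_outermost:
        # Inner queries carry the same context but must ignore the key.
        return None
    return query.get("context", {}).get(GRANULARITY_KEY)


def plan(query, is_outermost=True):
    """Walk a nested query (hypothetical dict shape) and record the decision per level."""
    decisions = [(query["name"], bucketing_granularity(query, is_outermost))]
    inner = query.get("inner")
    if inner is not None:
        decisions.extend(plan(inner, is_outermost=False))
    return decisions


# Usage: both levels share one context, but only the outer query buckets.
shared_context = {GRANULARITY_KEY: "DAY"}  # "DAY" is an illustrative placeholder value
nested = {
    "name": "outer",
    "context": shared_context,
    "inner": {"name": "inner", "context": shared_context},
}
decisions = plan(nested)
```

Passing an explicit `is_outermost` flag, rather than stripping the key from inner contexts, leaves the shared context untouched while still confining bucketing to one level.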