
[Backport] Druid quickstart: Update task memory #13570

Closed
findingrish wants to merge 34 commits into apache:master from
findingrish:backport_update_task_memory

Conversation

@findingrish
Contributor

Backports #13563

kfaraz and others added 30 commits November 21, 2022 20:39
* Fix web-console snapshots

* Revert changes to package and package-lock.json
* Add sketch fetching framework

* Refactor code to support sequential merge

* Update worker sketch fetcher

* Refactor sketch fetcher

* Refactor sketch fetcher

* Add context parameter and threshold to trigger sequential merge

* Fix test

* Add integration test for non-sequential merge

* Address review comments

* Address review comments

* Address review comments

* Resolve maxRetainedBytes

* Add new classes

* Renamed key statistics information class

* Rename fetchStatisticsSnapshotForTimeChunk function

* Address review comments

* Address review comments

* Update documentation and add comments

* Resolve build issues

* Resolve build issues

* Change worker APIs to async

* Address review comments

* Resolve build issues

* Add null time check

* Update integration tests

* Address review comments

* Add log messages and comments

* Resolve build issues

* Add unit tests

* Add unit tests

* Fix timing issue in tests
* Backport firehose PR 12981

* Update migrate-from-firehose-ingestion.md
* Suppress jackson-databind CVE-2022-42003 and CVE-2022-42004
(cherry picked from commit 1f4d892)
* Suppress CVEs
(cherry picked from commit ed55baa)
* Suppress vulnerabilities from druid-website package
(cherry picked from commit c0fb364)
* Add more suppressions for website package
(cherry picked from commit 9bba569)

Co-authored-by: Rohan Garg <7731512+rohangarg@users.noreply.github.com>
…e#13438)

* fixes BlockLayoutColumnarLongs close method to nullify internal buffer.

* fixes other BlockLayoutColumnar supplier close methods to nullify internal buffers.

* fix spotbugs

(cherry picked from commit b091b32)
apache#13421)

* we can read where we want to
we can leave your bounds behind
'cause if the memory is not there
we really don't care
and we'll crash this process of mine
* Update and document experimental features
(cherry picked from commit ccbf3ab)
* Updated
(cherry picked from commit d7b8fae)
* Update experimental-features.md
* Updated after review
(cherry picked from commit 975ae24)
* Updated
(cherry picked from commit eb8268e)
* Update materialized-view.md
(cherry picked from commit 53c3bde)
* Update experimental-features.md
(cherry picked from commit 77148f7)
* Update nested columns docs

* Update nested-columns.md
…13445)

Detects self-redirects, redirect loops, long redirect chains, and redirects to unknown servers.
Treat all of these cases as an unavailable service, retrying if the retry policy allows it.

Previously, some of these cases would lead to a prompt, unretryable error. This caused
clients contacting an Overlord during a leader change to fail with error messages like:

org.apache.druid.rpc.RpcException: Service [overlord] redirected too many times

Additionally, a slight refactor of callbacks in ServiceClientImpl improves readability of
the flow through onSuccess.

Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
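The redirect handling described above can be sketched as follows. This is a hypothetical illustration, not the actual ServiceClientImpl code: the function names, the `next_hop` callback, and the `known_servers` set are all assumptions made for the example. The key point it demonstrates is that every bad-redirect case raises the same *retryable* error instead of a terminal one.

```python
class ServiceUnavailable(Exception):
    """Retryable: the caller's retry policy decides whether to try again."""

def resolve_redirects(start_url, next_hop, known_servers, max_hops=5):
    """Follow redirects, treating self-redirects, loops, overly long
    chains, and redirects to unknown servers as an unavailable service.

    next_hop(url) returns the redirect target for url, or None if the
    response is not a redirect.
    """
    seen = {start_url}
    url = start_url
    for _ in range(max_hops):
        target = next_hop(url)
        if target is None:
            return url                      # final, non-redirected location
        if target == url:                   # self-redirect
            raise ServiceUnavailable(f"self-redirect at {url}")
        if target in seen:                  # redirect loop
            raise ServiceUnavailable(f"redirect loop via {target}")
        if target not in known_servers:     # redirect to unknown server
            raise ServiceUnavailable(f"unknown server {target}")
        seen.add(target)
        url = target
    raise ServiceUnavailable("redirected too many times")  # long chain
```

Under this scheme, a client that hits a redirect loop between two Overlord candidates during a leader change sees a retryable failure rather than an immediate `RpcException`.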
…s to parse exception in MSQ (apache#13366) (apache#13454)

* initial commit

* fix test

* push the json changes

* reduce the area of the try..catch

* Trigger Build

* review
…ache#13459) (apache#13464)

* Fix an issue with WorkerSketchFetcher not terminating on shutdown

* Change threadpool name
* add ability to make inputFormat part of the example datasets (apache#13402)

* Web console: Index spec dialog (apache#13425)

* add index spec dialog

* add snapshot

* Web console: be more robust to aux queries failing and improve kill tasks (apache#13431)

* be more robust to aux queries failing

* feedback fixes

* remove empty block

* fix spelling

* remove killAllDataSources from the console

* don't render duration if aggregated (apache#13455)
* Update LDAP configuration docs

(cherry picked from commit e74bd89)

* Updated after review

(cherry picked from commit 882e0b2)

* Update auth-ldap.md

Updated.

(cherry picked from commit d4f0797)

* Update auth-ldap.md

(cherry picked from commit fbec7b2)

* Updated spelling file

(cherry picked from commit ef5316b)

* Update docs/operations/auth-ldap.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
(cherry picked from commit 1a9b42a)

* Update docs/operations/auth-ldap.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
(cherry picked from commit 1018d9a)

* Update docs/operations/auth-ldap.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
(cherry picked from commit dd81b3f)

* Update auth-ldap.md

(cherry picked from commit f0655cf)
) (apache#13493)

In a cluster with a large number of streaming tasks (~1000), SegmentAllocateActions
on the Overlord can often take a very long time to finish, causing spikes
in the `task/action/run/time` metric. This may result in lag building up while a task
waits for a segment to be allocated.

The root causes are:
- large number of metadata calls made to the segments and pending segments tables
- `giant` lock held in `TaskLockbox.tryLock()` to acquire task locks and allocate segments

Since the contention typically arises when several tasks of the same datasource try
to allocate segments for the same interval/granularity, the allocation run times can be
improved by batching the requests together.

Changes
- Add flags
   - `druid.indexer.tasklock.batchSegmentAllocation` (default `false`)
   - `druid.indexer.tasklock.batchAllocationMaxWaitTime` (in millis) (default `1000`)
- Add methods `canPerformAsync` and `performAsync` to `TaskAction`
- Submit each allocate action to a `SegmentAllocationQueue`, and add to correct batch
- Process batch after `batchAllocationMaxWaitTime`
- Acquire `giant` lock just once per batch in `TaskLockbox`
- Reduce metadata calls by batching statements together and updating query filters
- Except for batching, retain the whole behaviour (order of steps, retries, etc.)
- Respond to leadership changes and fail items in queue when not leader
- Emit batch and request level metrics
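As a sketch, enabling the batching behavior described above would look roughly like this in the Overlord's runtime properties, using the property names and defaults introduced in this change (the values shown for illustration are assumptions, not recommendations):

```properties
# Enable batched segment allocation on the Overlord (default: false)
druid.indexer.tasklock.batchSegmentAllocation=true

# Max time (millis) an allocate action may wait to be batched (default: 1000)
druid.indexer.tasklock.batchAllocationMaxWaitTime=500
```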
…he#13495)

* Update docs for useBatchedSegmentSampler
* Update docs for round-robin assignment
* Update to native ingestion doc

(cherry picked from commit aba83f2)

* Update native-batch.md

* Update native-batch.md
…verview.type=http (apache#13499) (apache#13515)

* fix issue with http server inventory view blocking data node http server shutdown with long polling

* adjust

* fix test inspections
…pache#13517)

Changes:
- Limit max batch size in `SegmentAllocationQueue` to 500
- Rename `batchAllocationMaxWaitTime` to `batchAllocationWaitTime` since the actual
wait time may exceed this configured value.
- Replace usage of `SegmentInsertAction` in `TaskToolbox` with `SegmentTransactionalInsertAction`
… (apache#13529)

* Remove stray reference to fix OOM while merging sketches

* Update future to add result from executor service

* Update tests and address review comments

* Address review comments

* Moved mock

* Close threadpool on teardown

* Remove worker task cancel
…ache#13537) (apache#13542)

The planner sets sqlInsertSegmentGranularity in its context when using
PARTITIONED BY, which sets it on every native query in the stack (as all
native queries for a SQL query typically have the same context).
QueryKit would interpret that as a request to configure bucketing for
all native queries. This isn't useful, as bucketing is only used for
the penultimate stage in INSERT / REPLACE.

So, this patch modifies QueryKit to only look at sqlInsertSegmentGranularity
on the outermost query.

As an additional change, this patch switches the static ObjectMapper to
use the process-wide ObjectMapper for deserializing Granularities. This saves
an ObjectMapper instance, and ensures that if there are any special
serdes registered for Granularity, we'll pick them up.

(cherry picked from commit 5581488)

Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
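The fix above boils down to honoring a context parameter only at one level of a nested query stack. A minimal sketch of that idea, with hypothetical names (this is not QueryKit's actual code, and the `dict`-based query shape is an assumption for the example):

```python
# A SQL planner may copy sqlInsertSegmentGranularity into the context of
# every native query in the stack. Bucketing should only be configured from
# the outermost query, so inner stages must ignore the parameter.

def bucketing_granularity(query, is_outermost):
    """Return the granularity to bucket by, or None if no bucketing applies.

    query: a dict with a "context" dict, mirroring a native query context.
    """
    if not is_outermost:
        return None  # inner queries ignore sqlInsertSegmentGranularity
    return query.get("context", {}).get("sqlInsertSegmentGranularity")
```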
* Web console: add arrayOfDoublesSketch and other small fixes (apache#13486)
* add padding and keywords
* add arrayOfDoubles
* Update docs/development/extensions-core/datasketches-tuple.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/development/extensions-core/datasketches-tuple.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/development/extensions-core/datasketches-tuple.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/development/extensions-core/datasketches-tuple.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Update docs/development/extensions-core/datasketches-tuple.md
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* partition int
* fix docs
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
* Web console: improve compaction status display (apache#13523)
* improve compaction status display
* even more accurate
* fix snapshot
* MSQ: Improve TooManyBuckets error message, improve error docs. (apache#13525)
1) Edited the TooManyBuckets error message to mention PARTITIONED BY
   instead of segmentGranularity.
2) Added error-code-specific anchors in the docs.
3) Add information to various error codes in the docs about common
   causes and solutions.
* update error anchors (apache#13527)
* update snapshot
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
…attening (apache#13519) (apache#13546)

* add protobuf flattener, direct to plain java conversion for faster flattening, nested column tests
findingrish and others added 4 commits December 13, 2022 11:31
* Zero-copy local deep storage.

This is useful for local deep storage, since it reduces disk usage and
makes Historicals able to load segments instantaneously.

Two changes:

1) Introduce "druid.storage.zip" parameter for local storage, which defaults
   to false. This changes default behavior from writing an index.zip to writing
   a regular directory. This is safe to do even during a rolling update, because
   the older code actually already handled unzipped directories being present
   on local deep storage.

2) In LocalDataSegmentPuller and LocalDataSegmentPusher, use hard links
   instead of copies when possible. (Generally this is possible when the
   source and destination directory are on the same filesystem.)

Co-authored-by: Gian Merlino <gianmerlino@gmail.com>
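The hard-link-when-possible behavior in change 2 can be sketched like this. This is an assumed illustration of the technique, not the actual LocalDataSegmentPusher code; the function name is hypothetical:

```python
import os
import shutil

def push_segment_file(src, dest):
    """Place src at dest, preferring a zero-copy hard link.

    A hard link shares the same inode, so it uses no extra disk space and
    completes instantly; it only works when src and dest are on the same
    filesystem, so fall back to a real copy otherwise.
    """
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    try:
        os.link(src, dest)       # zero-copy: same inode, no data written
    except OSError:              # cross-device link, or links unsupported
        shutil.copy2(src, dest)  # fall back to an ordinary copy
    return dest
```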
…pache#13567)

* Add validation checks to worker chat handler apis

* Merge things and polishing the error messages.

* Minor error message change

* Fixing race and adding some tests

* Fixing controller fetching stats from wrong workers
* Fixing race
* Changing default mode to Parallel
* Adding logging
* Fixing exceptions not propagated properly

* Changing to kernel worker count

* Added a better logic to figure out assigned worker for a stage.

* Nits

* Moving to existing kernel methods

* Adding more coverage

Co-authored-by: cryptoe <karankumar1100@gmail.com>
(cherry picked from commit 2b605aa)

Co-authored-by: Adarsh Sanjeev <adarshsanjeev@gmail.com>
Changes:
* Use 80% of memory specified for running services (versus 50% earlier).
* Tasks get either 512m / 1024m or 2048m now (versus 512m or 2048m earlier). 
* Add direct memory for router.
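The arithmetic behind these changes can be illustrated as follows. This is only a sketch of the sizing logic described above; the budget-to-size thresholds are hypothetical and do not reproduce the quickstart scripts' exact rules:

```python
def usable_memory_mb(total_mb, fraction=0.8):
    """Memory handed to Druid services: 80% of the machine's memory now,
    versus 50% before this change."""
    return int(total_mb * fraction)

def task_heap_mb(per_task_budget_mb):
    """Pick a task size from the new 512m / 1024m / 2048m ladder
    (previously only 512m or 2048m). The mapping from budget to size
    shown here is an assumed example, not the scripts' actual rule."""
    for size in (2048, 1024, 512):
        if per_task_budget_mb >= size:
            return size
    return 512
```

For example, a 16 GiB machine now contributes about 13 GiB to Druid services instead of 8 GiB, and a per-task budget that previously fell back to 512m can land on the intermediate 1024m size.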


9 participants