Global limit for MSQ controller tasks implemented by nozjkoitop · Pull Request #16889 · apache/druid

nozjkoitop · 2024-08-13T07:34:51Z

Improvements

Implemented a way to limit the number of query controller tasks running at the same time. This limit specifies what percentage or number of task slots can be allocated to query controllers. If the limit is reached, new controller tasks wait for resources instead of potentially blocking the execution of other tasks (and failing after a timeout).

Rationale

There is no mechanism in Druid to prevent the cluster from being overloaded with controller tasks. Currently, it could cause a significant slowdown in processing and may lead to temporary deadlock situations.

Introduced new configuration options

druid.indexer.queue.controllerTaskSlotRatio - optional value that defines the proportion of available task slots that can be allocated to MSQ controller tasks. This is a floating-point value between 0 and 1. Defaults to null.
druid.indexer.queue.maxControllerTaskSlots - optional value that specifies the maximum number of task slots that can be allocated to MSQ controller tasks. This is an integer value that acts as a hard limit. Defaults to null.
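
To make the two options concrete, here is a minimal sketch of how a ratio and an absolute cap could be combined into one effective limit. The class and method names are hypothetical and are not taken from the PR. The rule that the smaller of the two limits wins when both are set is an assumption, as is rounding the ratio down.

```java
// Hypothetical helper for illustration only; not the PR's actual code.
final class ControllerSlotLimit
{
  /**
   * Returns the maximum number of task slots that controller tasks may occupy.
   * A null ratio or a null maxSlots means "not configured".
   * Assumption: when both are set, the smaller limit applies.
   */
  static int effectiveLimit(int totalSlots, Double ratio, Integer maxSlots)
  {
    int limit = totalSlots;
    if (ratio != null) {
      if (ratio < 0 || ratio > 1) {
        throw new IllegalArgumentException("ratio must be in [0, 1]");
      }
      // Assumption: fractional slots are rounded down.
      limit = Math.min(limit, (int) Math.floor(totalSlots * ratio));
    }
    if (maxSlots != null) {
      if (maxSlots < 0) {
        throw new IllegalArgumentException("maxSlots must be >= 0");
      }
      limit = Math.min(limit, maxSlots);
    }
    return limit;
  }

  public static void main(String[] args)
  {
    System.out.println(effectiveLimit(10, 0.5, null)); // ratio only
    System.out.println(effectiveLimit(10, 0.5, 3));    // both set, smaller wins
    System.out.println(effectiveLimit(10, null, null)); // nothing configured
  }
}
```

With both options left at null, the sketch imposes no restriction, which matches the documented defaults.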

This PR has:

been self-reviewed.
added documentation for new or modified features or behaviors.
a release note entry in the PR description.
added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
added or updated version, license, or notice information in licenses.yaml
added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
added integration tests.
been tested in a test Druid cluster.

asdf2014

Remove useless tail

Co-authored-by: Benedict Jin <asdf2014@apache.org>

nozjkoitop · 2024-08-14T10:03:45Z

Remove useless tail

Done, thanks

kfaraz

Currently, it could cause a significant slowdown in processing and may lead to temporary deadlock situations.

@nozjkoitop, rather than proceeding with a limit, I think we should try to figure out what is causing the slowdown in processing and/or the deadlock. Can you elaborate on this?

Once we have done an analysis of exactly what goes wrong when we have too many controller tasks and we have decided to impose a limit, it should not be done through the TaskQueue as done in this PR. Instead, it should be similar to the implementation of parallelIndexTaskSlotRatio and the config should most likely live in WorkerTaskRunnerConfig.

druid/indexing-service/src/main/java/org/apache/druid/indexing/overlord/config/WorkerTaskRunnerConfig.java

Lines 37 to 50 in 73ff9f9

    
    /**
     * The number of task slots that a parallel indexing task can take is restricted using this config as a multiplier
     *
     * A value of 1 means no restriction on the number of slots ParallelIndexSupervisorTasks can occupy (default behaviour)
     * A value of 0 means ParallelIndexSupervisorTasks can occupy no slots.
     * Deadlocks can occur if the all task slots are occupied by ParallelIndexSupervisorTasks,
     * as no subtask would ever get a slot. Set this config to a value < 1 to prevent deadlocks.
     *
     * @return ratio of task slots available to a parallel indexing task at a worker level
     */
    public double getParallelIndexTaskSlotRatio()
    {
      return parallelIndexTaskSlotRatio;
    }

cryptoe

I think this is a nice feature to have. Left some comments.
Thanks @nozjkoitop for taking this up.

cryptoe · 2024-09-08T02:07:17Z

            continue;
          }
-          if (taskIsReady) {
+          if (taskIsReady && !isControllerTaskLimitReached(task.getType(), true)) {


Should it be limited to just MSQ controllers, or should it cover other job types as well? Maybe take a JSON map as input, where the key is the taskType and the value is the limit/float.

Also, what does the user see if the task is pending launch because of the limit?

Can we communicate to the user why the task is not launching?

If the cluster is totally starved, does the controller eventually time out and get removed from the queue?

nozjkoitop · 2024-09-12T15:34:25Z

Thanks @kfaraz, @cryptoe for your comments
The most trivial deadlock scenario occurs when we queue a group of controller tasks but have no task slots available for the actual workers. This results in tasks hanging and eventually timing out. Thanks for highlighting WorkerTaskRunnerConfig; it seems like a great place for this configuration. I've updated the behavior and merged it with parallelIndexTaskSlotRatio, so it's now more flexible. I've also added logging to inform the user why a task is pending.

kfaraz

We need to ensure that we do not remove any of the existing properties or config fields.

kfaraz · 2024-09-23T08:31:17Z

  @JsonProperty
-  private double parallelIndexTaskSlotRatio = 1;
+  @JsonDeserialize(using = CustomJobTypeLimitsDeserializer.class)
+  private Map<String, Number> customJobTypeLimits = new HashMap<>();


Why do we need a custom deserializer?
Can't we just have a map from String to Double?

It was added mostly for validation, as the idea is to accept both double and integer values so that one can specify not only a ratio but also an absolute limit.
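
For illustration, the validation described in this reply could be sketched without Jackson as a plain check over a `Map<String, Number>`. The class name is hypothetical, and the rule (an `Integer` must be >= 0, a `Double` must be in [0, 1]) is taken from the documentation row proposed later in this thread. The actual `CustomJobTypeLimitsDeserializer` is not shown on this page and may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical validator for illustration; not the PR's deserializer.
final class TaskSlotLimitValidator
{
  /** Throws IllegalArgumentException if any limit is out of range or of an unsupported type. */
  static void validate(Map<String, Number> limits)
  {
    for (Map.Entry<String, Number> entry : limits.entrySet()) {
      Number value = entry.getValue();
      if (value instanceof Integer) {
        // Integers are treated as absolute slot counts.
        if (value.intValue() < 0) {
          throw new IllegalArgumentException("Count for " + entry.getKey() + " must be >= 0");
        }
      } else if (value instanceof Double) {
        // Doubles are treated as ratios of the available slots.
        double ratio = value.doubleValue();
        if (ratio < 0.0 || ratio > 1.0) {
          throw new IllegalArgumentException("Ratio for " + entry.getKey() + " must be in [0, 1]");
        }
      } else {
        throw new IllegalArgumentException("Unsupported limit type for " + entry.getKey());
      }
    }
  }

  /** Convenience wrapper that reports validity instead of throwing. */
  static boolean isValid(Map<String, Number> limits)
  {
    try {
      validate(limits);
      return true;
    }
    catch (IllegalArgumentException e) {
      return false;
    }
  }

  public static void main(String[] args)
  {
    Map<String, Number> limits = new HashMap<>();
    limits.put("compact", 2);            // absolute limit, as in the example in this thread
    limits.put("query_controller", 0.5); // illustrative key; ratio of available slots
    System.out.println(isValid(limits));
  }
}
```

A plain `Map<String, Double>` could not distinguish `1` (one slot) from `1.0` (all slots), which is presumably why the value type matters here.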

cryptoe · 2024-09-24T08:33:08Z

      @JsonProperty("worker") Worker worker,
      @JsonProperty("currCapacityUsed") int currCapacityUsed,
      @JsonProperty("currParallelIndexCapacityUsed") int currParallelIndexCapacityUsed,
+      @JsonProperty("currTypeSpecificCapacityUsed") Map<String, Integer> typeSpecificCapacityMap,


This should be nullable, no?

You're right, thanks

cryptoe · 2024-09-24T08:33:37Z

    return workerParallelIndexCapacity;
  }

+  public boolean canRunTask(Task task, Map<String, Number> taskLimits)


Can you please add Javadocs for this method?

cryptoe · 2024-09-24T08:37:09Z

  }

+  @JsonProperty("currTypeSpecificCapacityUsed")
+  public Map<String, Integer> getCurrTypeSpecificCapacityUsed()


I thought we had deprecated the ZK-based runner in favour of HTTP.
@kfaraz Does this change still make sense?

Yes, ZK-based task runner is deprecated and we should not support the new feature with ZK.

cryptoe · 2024-09-24T08:38:51Z

 |`druid.indexer.runner.taskAssignmentTimeout`|How long to wait after a task has been assigned to a Middle Manager before throwing an error.|`PT5M`|
 |`druid.indexer.runner.minWorkerVersion`|The minimum Middle Manager version to send tasks to. The version number is a string. This affects the expected behavior during certain operations like comparison against `druid.worker.version`. Specifically, the version comparison follows dictionary order. Use ISO8601 date format for the version to accommodate date comparisons. |"0"|
 | `druid.indexer.runner.parallelIndexTaskSlotRatio`| The ratio of task slots available for parallel indexing supervisor tasks per worker. The specified value must be in the range `[0, 1]`. |1|
+|`druid.indexer.runner.taskSlotLimits`| A map where each key is a task type, and the corresponding value represents the limit on the number of task slots that a task of that type can occupy on a worker. The key is a `String` that specifies the task type. The value can either be a Double or Integer. A `Double` in the range [0, 1], representing a ratio of the available task slots that tasks of this type can occupy. An `Integer` that is greater than or equal to 0, representing an absolute limit on the number of task slots that tasks of this type can occupy.|Empty map|


Could you please provide an example as well?

How does this interact with compaction slots?

Example added! Good catch on the compaction slots. I'll need to test that, but based on the code, it looks like compaction slot availability will be checked twice (in the duty and during worker selection) if a related entry is included in the taskSlotLimits map. If there's a conflict, some tasks might end up in a pending state for a while.

Confirmed that the number of submitted tasks is linked to the compaction task slot limits. However, execution might be delayed if the custom limit is smaller than the one set for compaction.

For example here, taskSlotsMax = 3, but in the overlord configuration, I have druid.indexer.runner.taskSlotLimits={"compact": 2}.

I think the limit on compaction tasks (or kill tasks for that matter) should not be a concern.
This is a runtime property, typically controlled by an admin.
So, if an admin wants to restrict the number of concurrent compaction tasks, it is fair to honor that irrespective of the value of compactionTaskSlotRatio or maxCompactionTaskSlots set in the coordinator dynamic configs.

We just need to call it out clearly in the release notes and the docs of the new property.

nozjkoitop · 2024-10-10T10:38:17Z

Hey @cryptoe, what do you think about this solution? I've used SelectWorkerStrategies to utilize dynamic configuration of global limits, and it seems like a good option to me.

cryptoe · 2024-11-21T13:40:45Z

Apologies, I have been meaning to get to this PR. Will try to finish it by EOW.

github-actions · 2025-01-21T00:20:25Z

This pull request has been marked as stale due to 60 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If you think
that's incorrect or this pull request should instead be reviewed, please simply
write any comment. Even if closed, you can still revive the PR at any time or
discuss it on the dev@druid.apache.org list.
Thank you for your contributions.

nozjkoitop · 2025-01-21T08:21:38Z

Hi @cryptoe, will you have some time to review these changes?

cryptoe · 2025-03-04T15:21:29Z

@nozjkoitop Could you please rebase this PR? I will take a look again. Overall LGTM.
cc @kfaraz

nozjkoitop · 2025-03-05T11:13:29Z

@cryptoe Done

kfaraz · 2025-03-05T11:36:07Z

Thanks for your work on this, @nozjkoitop !
I will try to finish the review of this PR later this week.

cryptoe · 2025-03-05T12:47:05Z

    return workerParallelIndexCapacity;
  }

+  public Map<String, Integer> incrementTypeSpecificCapacity(String type, int capacityToAdd)


Should there be a corresponding decrement?

cryptoe · 2025-03-05T12:48:20Z

      @JsonProperty("worker") Worker worker,
      @JsonProperty("currCapacityUsed") int currCapacityUsed,
      @JsonProperty("currParallelIndexCapacityUsed") int currParallelIndexCapacityUsed,
+      @JsonProperty("currCapacityUsedByTaskType") Map<String, Integer> currCapacityUsedByTaskType,


Or this should be immutable, no?

It is immutable. incrementTypeSpecificCapacity doesn't change the fields. This typeSpecificCapacity is managed exactly like parallelIndexCapacityUsed, which is passed as immutableWorker.getCurrParallelIndexCapacityUsed() + parallelIndexTaskCapacity in the ImmutableWorkerInfo constructor arguments, but here the incremented value is created in ImmutableWorkerInfo itself. With regard to decrement, I don't see one for currParallelIndexCapacityUsed or currCapacityUsed either; if I'm not mistaken, it's managed by the provisioning strategy.
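
The copy-on-increment pattern described in this reply can be sketched as follows. This is a standalone illustration with a hypothetical class name, not the body of `ImmutableWorkerInfo.incrementTypeSpecificCapacity`; it only shows how an "increment" can return a new map while leaving the original untouched.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not the PR's implementation.
final class TypeCapacity
{
  /**
   * Returns a new unmodifiable map in which the capacity used by the given
   * task type is increased by capacityToAdd. The input map is not modified.
   */
  static Map<String, Integer> incremented(Map<String, Integer> current, String type, int capacityToAdd)
  {
    Map<String, Integer> copy = new HashMap<>(current);
    // merge() inserts the value if the key is absent, otherwise sums old and new.
    copy.merge(type, capacityToAdd, Integer::sum);
    return Collections.unmodifiableMap(copy);
  }

  public static void main(String[] args)
  {
    Map<String, Integer> before = Collections.singletonMap("compact", 1);
    Map<String, Integer> after = incremented(before, "compact", 2);
    System.out.println(before + " -> " + after);
  }
}
```

Because every change produces a fresh snapshot, no decrement is needed in this sketch: a later snapshot built from the worker's actual running tasks simply replaces the old one.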

cryptoe · 2025-03-06T05:04:14Z

Thanks for the changes.
Let's wait for @kfaraz's review and then we can get this merged.

kfaraz

Left some suggestions.

kfaraz

Thanks for the changes, @nozjkoitop !

Changes
---------
- Add field `taskLimits` to the following worker select strategies: `equalDistribution`, `equalDistributionWithCategorySpec`, `fillCapacityWithCategorySpec`, `fillCapacity`
- Add sub-fields `maxSlotCountByType` and `maxSlotRatioByType` to `taskLimits`
- Apply these limits per worker when assigning new tasks

---------

Co-authored-by: sviatahorau <mikhail.sviatahorau@deep.bi>
Co-authored-by: Benedict Jin <asdf2014@apache.org>
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
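
As a rough sketch of what applying these limits per worker could look like, the class below checks whether a worker may accept one more task of a given type. Only the names `maxSlotCountByType` and `maxSlotRatioByType` come from the commit message; the class, the method signature, the rounding, and the rule that the stricter of the two limits wins are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch; the merged implementation may differ.
final class TaskLimitsSketch
{
  private final Map<String, Integer> maxSlotCountByType;
  private final Map<String, Double> maxSlotRatioByType;

  TaskLimitsSketch(Map<String, Integer> maxSlotCountByType, Map<String, Double> maxSlotRatioByType)
  {
    this.maxSlotCountByType = maxSlotCountByType;
    this.maxSlotRatioByType = maxSlotRatioByType;
  }

  /**
   * Returns true if a task of the given type needing requiredSlots can be
   * assigned to a worker that already uses usedByType slots for that type.
   */
  boolean canRunTask(String taskType, int requiredSlots, int usedByType, int workerCapacity)
  {
    int limit = workerCapacity; // no entry for the type means no extra restriction
    Integer maxCount = maxSlotCountByType.get(taskType);
    if (maxCount != null) {
      limit = Math.min(limit, maxCount);
    }
    Double maxRatio = maxSlotRatioByType.get(taskType);
    if (maxRatio != null) {
      // Assumption: the ratio applies to the worker's total capacity, rounded down.
      limit = Math.min(limit, (int) Math.floor(maxRatio * workerCapacity));
    }
    return usedByType + requiredSlots <= limit;
  }

  public static void main(String[] args)
  {
    TaskLimitsSketch limits = new TaskLimitsSketch(
        Collections.singletonMap("compact", 2),
        Collections.singletonMap("query_controller", 0.5) // illustrative task type key
    );
    System.out.println(limits.canRunTask("compact", 1, 1, 10));
    System.out.println(limits.canRunTask("compact", 1, 2, 10));
  }
}
```

A task that fails this check on every worker would stay pending until a slot of its type frees up, which is the behavior discussed earlier in the thread.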

Changes
---------
- Add field `taskLimits` to the following worker select strategies: `equalDistribution`, `equalDistributionWithCategorySpec`, `fillCapacityWithCategorySpec`, `fillCapacity`
- Add sub-fields `maxSlotCountByType` and `maxSlotRatioByType` to `taskLimits`
- Apply these limits per worker when assigning new tasks

---------

Co-authored-by: sviatahorau <mikhail.sviatahorau@deep.bi>
Co-authored-by: Benedict Jin <asdf2014@apache.org>
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>

(cherry picked from commit 8b56824)
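The per-worker check described in the change list can be pictured roughly as follows. This is a minimal sketch, not the code merged in this PR. The class `TaskLimitsSketch` and its method `canRunTask` are hypothetical, and so are three assumptions it makes: that a ratio is converted to a slot count by flooring, that the stricter of the two limits wins, and that `query_controller` is the task-type key. Only the field names `maxSlotCountByType` and `maxSlotRatioByType` come from the description above.

```java
import java.util.Map;

// Hypothetical illustration of per-worker task limits; not Druid's actual implementation.
final class TaskLimitsSketch
{
  private final Map<String, Integer> maxSlotCountByType;
  private final Map<String, Double> maxSlotRatioByType;

  TaskLimitsSketch(Map<String, Integer> maxSlotCountByType, Map<String, Double> maxSlotRatioByType)
  {
    this.maxSlotCountByType = maxSlotCountByType;
    this.maxSlotRatioByType = maxSlotRatioByType;
  }

  /**
   * Returns true if one more task of the given type may be assigned to a worker
   * that has workerCapacity slots and already runs runningOfType tasks of that type.
   */
  boolean canRunTask(String taskType, int runningOfType, int workerCapacity)
  {
    // With no limit configured for the type, only the worker capacity applies.
    int limit = workerCapacity;

    Integer maxCount = maxSlotCountByType.get(taskType);
    if (maxCount != null) {
      limit = Math.min(limit, maxCount);
    }

    Double maxRatio = maxSlotRatioByType.get(taskType);
    if (maxRatio != null) {
      // Assumption: the ratio is turned into whole slots by rounding down.
      limit = Math.min(limit, (int) Math.floor(maxRatio * workerCapacity));
    }

    return runningOfType < limit;
  }
}
```

With a count limit of 3 and a ratio of 0.5 for `query_controller`, a worker with 10 slots would accept a third controller task but not a fourth under these assumptions, while task types without an entry are bounded only by the worker capacity.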

Global msq limit implemented

b8c6f23

github-actions Bot added Area - Documentation Area - Ingestion labels Aug 13, 2024

nozjkoitop added 2 commits August 13, 2024 09:43

fix post-merge compilation issues

1b75e47

fix tests compilation

580999f

github-advanced-security AI found potential problems Aug 13, 2024


Comment thread indexing-service/src/main/java/org/apache/druid/indexing/overlord/config/TaskQueueConfig.java Fixed

Comment thread indexing-service/src/test/java/org/apache/druid/indexing/overlord/SimpleTaskRunner.java Fixed

address spell check failure

87db3da

asdf2014 reviewed Aug 14, 2024


Comment thread docs/configuration/index.md Outdated

asdf2014 reviewed Aug 14, 2024


Update docs/configuration/index.md

c2287eb

Co-authored-by: Benedict Jin <asdf2014@apache.org>

kfaraz requested a review from cryptoe August 16, 2024 02:50

kfaraz reviewed Sep 6, 2024


cryptoe reviewed Sep 8, 2024


nozjkoitop and others added 4 commits September 12, 2024 16:54

Merge branch 'apache:master' into feature-global-msq-controller-task-limit

b7889cc

Revert "fix tests compilation"

02a195a

This reverts commit 580999f.

Revert "fix post-merge compilation issues"

ea3ad87

This reverts commit 1b75e47.

Addressed comments moved configs to the WorkerTaskRunnerConfig and merged behavior with parallelIndexTaskSlotRatio

9cc11f5

nozjkoitop requested review from cryptoe and kfaraz September 12, 2024 15:37

nozjkoitop added 2 commits September 13, 2024 11:42

Fix failing checks

19a1a52

Checkstyle fix

5d45ef4

kfaraz requested changes Sep 23, 2024


Rollback parallelIndexRatio removal

1a09ca6

nozjkoitop requested a review from kfaraz September 24, 2024 07:23

cryptoe reviewed Sep 24, 2024


nozjkoitop added 2 commits September 24, 2024 14:33

Address review comments

ef7ae81

Address spellchecks

23e1c36

nozjkoitop requested a review from cryptoe September 24, 2024 14:02

nozjkoitop added 3 commits October 10, 2024 12:09

Global limit with dynamic config implemented using select strategies

6834b52

conflicts resolved

fcaaa04

Fix compilation and spellcheck failures

7be4502

nozjkoitop requested a review from cryptoe October 10, 2024 10:38

Trigger the checks

6e1501f

github-actions Bot added the stale label Jan 21, 2025

github-actions Bot removed the stale label Jan 22, 2025

Merge remote-tracking branch 'upstream/master' into feature-global-msq-controller-task-limit

89150ff

cryptoe reviewed Mar 5, 2025


TODO's resolved

7f5f982

nozjkoitop requested a review from cryptoe March 5, 2025 14:55

cryptoe approved these changes Mar 6, 2025


kfaraz reviewed Mar 6, 2025


Comment thread docs/configuration/index.md Outdated

nozjkoitop added 2 commits March 6, 2025 12:01

Comments addressed, LimiterUtils logic moved to TaskLimits class

832f62a

Checkstyle fix

72f79b7

kfaraz approved these changes Mar 6, 2025


kfaraz merged commit 8b56824 into apache:master Mar 6, 2025

kgyrtkirk added this to the 33.0.0 milestone Apr 14, 2025

	/**
	 * The number of task slots that a parallel indexing task can take is restricted using this config as a multiplier.
	 *
	 * A value of 1 means no restriction on the number of slots ParallelIndexSupervisorTasks can occupy (default behaviour).
	 * A value of 0 means ParallelIndexSupervisorTasks can occupy no slots.
	 * Deadlocks can occur if all task slots are occupied by ParallelIndexSupervisorTasks,
	 * as no subtask would ever get a slot. Set this config to a value < 1 to prevent deadlocks.
	 *
	 * @return ratio of task slots available to a parallel indexing task at a worker level
	 */
	public double getParallelIndexTaskSlotRatio()
	{
		return parallelIndexTaskSlotRatio;
	}
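
To make the multiplier in the Javadoc above concrete, the following sketch converts a ratio into a slot count for one worker. The class `ParallelIndexSlotSketch` and its method `maxParallelIndexSlots` are hypothetical, and rounding down is an assumption; Druid's own rounding and minimum-slot rules may differ.

```java
// Hypothetical helper showing the arithmetic only; not part of Druid.
final class ParallelIndexSlotSketch
{
  /**
   * Number of slots on one worker that parallel indexing supervisor tasks may occupy,
   * assuming the ratio is applied to the worker capacity and rounded down.
   */
  static int maxParallelIndexSlots(int workerCapacity, double parallelIndexTaskSlotRatio)
  {
    if (parallelIndexTaskSlotRatio < 0 || parallelIndexTaskSlotRatio > 1) {
      throw new IllegalArgumentException("ratio must be between 0 and 1");
    }
    return (int) Math.floor(workerCapacity * parallelIndexTaskSlotRatio);
  }
}
```

Under these assumptions a ratio of 1 leaves all 10 slots of a 10-slot worker available, a ratio of 0.5 leaves 5, and a ratio of 0 leaves none, which matches the two boundary cases the Javadoc describes.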

Conversation

nozjkoitop commented Aug 13, 2024

asdf2014 left a comment

nozjkoitop commented Aug 14, 2024

kfaraz left a comment

cryptoe left a comment

nozjkoitop commented Sep 12, 2024

kfaraz left a comment

nozjkoitop Sep 24, 2024

nozjkoitop commented Oct 10, 2024

cryptoe commented Nov 21, 2024

github-actions Bot commented Jan 21, 2025

nozjkoitop commented Jan 21, 2025

cryptoe commented Mar 4, 2025

nozjkoitop commented Mar 5, 2025

kfaraz commented Mar 5, 2025

nozjkoitop Mar 5, 2025