By default don't allow index_hadoop tasks to run on a cluster, forcing operators to acknowledge that they are using a deprecated feature #18239

Merged
cryptoe merged 6 commits into apache:master from capistrant:block-hadoop-tasks on Jul 18, 2025

@capistrant capistrant commented Jul 11, 2025

Description

Druid 34 publicly deprecates index_hadoop tasks. The eventual removal of index_hadoop is a big change that will force any users of the task type to substantially change their cluster operations. This PR aims to ensure no operator is caught off guard by the deprecation and planned removal. It fails index_hadoop tasks with an error explaining why, unless the operator updates their runtime configs to allow index_hadoop.
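
To illustrate the behavior described above, here is a minimal sketch of an opt-in guard. It is not the code in this PR: the class name HadoopTaskGateSketch, the method checkAllowed, and the error wording are hypothetical, and only the property name comes from the release note.

```java
import java.util.Locale;

// Hypothetical sketch: names and message text are illustrative, not Druid's actual code.
final class HadoopTaskGateSketch
{
  // Property name taken from the release note of this PR.
  static final String ALLOW_PROPERTY = "druid.indexer.task.allowHadoopTaskExecution";

  // Defaults to false in the scenario described: operators must opt in.
  private final boolean allowHadoopTaskExecution;

  HadoopTaskGateSketch(boolean allowHadoopTaskExecution)
  {
    this.allowHadoopTaskExecution = allowHadoopTaskExecution;
  }

  // Throws with an explanatory message unless the operator has opted in.
  void checkAllowed(String taskId)
  {
    if (!allowHadoopTaskExecution) {
      throw new IllegalStateException(
          String.format(
              Locale.ROOT,
              "Task [%s] has type index_hadoop, which is deprecated and disabled by default. "
              + "Set [%s] to true to allow it.",
              taskId,
              ALLOW_PROPERTY
          )
      );
    }
  }

  public static void main(String[] args)
  {
    // Opted in: the check passes silently.
    new HadoopTaskGateSketch(true).checkAllowed("example_task");

    // Not opted in: the check fails and names the property to set.
    try {
      new HadoopTaskGateSketch(false).checkAllowed("example_task");
    }
    catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The point of failing with a message that names the property is that an operator who hits the error learns both why the task failed and how to re-enable it.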

Alternate approach

I'm sure we could shift the failure left and fail before the overlord even tries to submit the task.

Release note

Druid cluster operators must opt in to using the now-deprecated index_hadoop task type in their Druid clusters. To continue submitting index_hadoop tasks, set the following runtime property to true: druid.indexer.task.allowHadoopTaskExecution

Note that this property needs to be set in the local context of your running ingest task. The easiest way to achieve this is to set it in common.runtime.properties.
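
As a rough illustration of the opt-in, the sketch below parses a properties line of the form druid.indexer.task.allowHadoopTaskExecution=true and treats a missing value as false. It uses plain java.util.Properties only for demonstration; Druid binds its runtime configs through its own mechanism, and the class and method names here are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch: shows the opt-in semantics only, not how Druid loads its configs.
final class AllowHadoopPropertySketch
{
  // Property name taken from the release note of this PR.
  static final String ALLOW_PROPERTY = "druid.indexer.task.allowHadoopTaskExecution";

  // Parses properties-file text, e.g. the contents of common.runtime.properties.
  static Properties parse(String fileContents)
  {
    Properties props = new Properties();
    try {
      props.load(new StringReader(fileContents));
    }
    catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;
  }

  // An absent property means the operator has not opted in.
  static boolean isAllowed(Properties props)
  {
    return Boolean.parseBoolean(props.getProperty(ALLOW_PROPERTY, "false"));
  }

  public static void main(String[] args)
  {
    Properties optedIn = parse(ALLOW_PROPERTY + "=true\n");
    System.out.println(isAllowed(optedIn));          // prints "true"
    System.out.println(isAllowed(new Properties())); // prints "false"
  }
}
```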


Key changed/added classes in this PR
  • TaskConfig
  • HadoopIndexTask

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

… operators to acknowledge that they are using a deprecated feature
}
log.warn("Running deprecated index_hadoop task [%s]. "
+ "Hadoop indexing framework is deprecated and will be removed in a future release. "
+ "Please migrate to the new indexing framework.",

@capistrant capistrant Jul 11, 2025

trusted autocomplete from copilot for this msg 😆 should have read it closer. "the new indexing framework" is not useful. Will update

😛

@capistrant capistrant added this to the 34.0.0 milestone Jul 11, 2025
Comment on lines +50 to +59
final HadoopIndexTask task = new HadoopIndexTask(
    null,
    new HadoopIngestionSpec(
        DataSchema.builder()
                  .withDataSource("foo")
                  .withGranularity(
                      new UniformGranularitySpec(
                          Granularities.DAY,
                          null,
                          ImmutableList.of(Intervals.of("2010-01-01/P1D"))

Check notice: Code scanning / CodeQL

Deprecated method or constructor invocation (Note, test)

Invoking Builder.withObjectMapper should be avoided because it has been deprecated.

@cryptoe cryptoe merged commit 2d563de into apache:master Jul 18, 2025
139 of 142 checks passed
capistrant added a commit to capistrant/incubator-druid that referenced this pull request Jul 18, 2025
… operators to acknowledge that they are using a deprecated feature (apache#18239)

* By default don't allow index_hadoop tasks to run on a cluster, forcing operators to acknowledge that they are using a deprecated feature

* update unclear recommendation from log

* Fixup codeql warning

* fix UT
capistrant added a commit that referenced this pull request Jul 19, 2025
… operators to acknowledge that they are using a deprecated feature (#18239) (#18290)

ashibhardwaj pushed a commit to ashibhardwaj/druid that referenced this pull request Jul 23, 2025
… operators to acknowledge that they are using a deprecated feature (apache#18239)
