
Add limit to task payload size #16512

Merged
georgew5656 merged 11 commits into apache:master from georgew5656:addTaskPayloadLimit
May 31, 2024

Conversation

@georgew5656
Contributor

Currently it's possible to submit a really big task payload and have it OOM the overlord if the request fails and logs the whole thing in SQLMetadataStorageActionHandler.insertEntryWithHandle.

If the request happens to be larger than max_allowed_packet for a mysql metadata store it will always fail. Since really large task payloads seem to cause overlord instability in general I think it makes sense to limit the size of task payloads at the task queue level.

Description

Add a new config that sets a limit for task payload sizes, and throw an exception if the limit is exceeded.

The default limit of 60 MB is based on the 64 MB default value of max_allowed_packet in MySQL 8+.

I would ideally like to use the HTTP request Content-Length header to calculate the size of the task payload rather than re-serializing it in memory, but we also call taskQueue.add directly from supervisors, so that would bypass the check. If others think this is acceptable and it would be better to check Content-Length in OverlordResource, I am fine with changing this logic.
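The check described above can be sketched roughly as follows. This is a hedged illustration, not the actual Druid code: the class and method names (PayloadLimitSketch, validateTaskPayloadSize) are invented for this sketch, and the real check lives in TaskQueue behind the druid.indexer.queue.maxTaskPayloadSize config.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the guardrail described above: measure the serialized
// task payload and reject it before it ever reaches the metadata store.
// All names here are invented for illustration, not the merged implementation.
public class PayloadLimitSketch
{
  // Mirrors the PR's default of 60 MB, chosen to stay under MySQL's
  // 64 MB default max_allowed_packet.
  static final long DEFAULT_MAX_PAYLOAD_BYTES = 60L * 1024 * 1024;

  /** Throws if the serialized task payload exceeds maxBytes. */
  static void validateTaskPayloadSize(String taskId, String serializedTask, long maxBytes)
  {
    long size = serializedTask.getBytes(StandardCharsets.UTF_8).length;
    if (size > maxBytes) {
      throw new IllegalArgumentException(
          "Task[" + taskId + "] has payload size[" + size + "] exceeding the limit[" + maxBytes + "]."
          + " Reduce the payload size or increase the configured limit."
      );
    }
  }
}
```

Measuring the re-serialized payload (rather than trusting Content-Length) is what lets the same check cover tasks submitted directly via taskQueue.add from supervisors.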

Release note

  • Added a new config, druid.indexer.queue.maxTaskPayloadSize, as a guardrail against submitting tasks with overly large payloads
Key changed/added classes in this PR
  • TaskQueue

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@kfaraz
Contributor

kfaraz commented May 29, 2024

@georgew5656 , thanks for the changes!

It makes sense to fail early on large payloads with a clear error message, but I have some concerns.

  • Such tasks already fail on insertion into the metadata store. Do you think the current error message in this case is not clear enough? Instead of a new check/config, should we try to improve the existing error message by maybe catching it and throwing a cleaner DruidException?
  • Or do you feel that there are other challenges with large task payloads too (say, Overlord facing a memory crunch)?
  • One problem with introducing this new config with a default of 64MiB is that some clusters would already be using large task payloads and would have set their max_allowed_packet to large enough values that allow such payloads. And if we go with a default that is large enough to accommodate all clusters, then we would probably defeat the purpose of the config itself.
  • There is also the MSQ case to consider where the user just submits a SQL and the task payload is actually generated by the system. How does the user control the task payload size in that case? Maybe we don't fail in the case of msq task types?

One compromise could be to keep the check and just warn about large task payloads for the time being, with a log message and maybe also an alert event.

Let me know what you think!

@georgew5656
Contributor Author

georgew5656 commented May 29, 2024


the direct problem seems to be that the error message thrown when the write to the DB fails is a little TOO descriptive in this case (it tries to log the whole payload, and this can easily cause memory issues).

i don't think the config totally loses its purpose with a higher default, since you can still choose to lower it to limit the blast radius of large payloads.

maybe it would make sense to just truncate that error message and throw a warn in OverlordResource when a task payload is very large?

I personally don't think a task payload above a certain size makes sense. for msq, would it really try to generate that large of a payload? 60 MB is huge for metadata
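The truncation idea floated in this comment could look something like the minimal sketch below. The class and helper names are invented for illustration; this is not the actual Druid code.

```java
// Hypothetical sketch of truncating a giant payload before logging it, so a
// failed metadata-store insert does not dump tens of MB of JSON into the logs.
// Names are invented for illustration, not the merged implementation.
public class LogTruncationSketch
{
  /** Returns payload unchanged if short enough, else a truncated copy with a marker. */
  static String truncateForLog(String payload, int maxChars)
  {
    if (payload.length() <= maxChars) {
      return payload;
    }
    return payload.substring(0, maxChars)
        + "... [truncated " + (payload.length() - maxChars) + " more characters]";
  }
}
```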

@kfaraz
Contributor

kfaraz commented May 29, 2024

maybe it would make sense to just truncate that error message and throw a warn in OverlordResource when a task payload is very large?

Yes, we should definitely do these two items even if we decide not to do the rest.

i don't think the config totally loses its purpose with a higher default, since you can still choose to lower it to limit the blast radius of large payloads.

True, but I don't think most cluster operators would think of updating this config when they encounter a problem. They would be more likely to increase the max_allowed_packet unless we update the error message thrown for max_allowed_packet to suggest using the new config.

I personally don't think a task payload above a certain size makes sense. for msq, would it really try to generate that large of a payload? 60 MB is huge for metadata

I agree that 60MiB is large enough, but I do recall some cases where users had to resort to increasing max_allowed_packet. For example:

Could not send query: query size is >= to max_allowed_packet (67108864)

In conclusion, we could do the following:

  • Truncate the error message thrown when we exceed max_allowed_packet and include info about using the new config.
  • Log a warning when task payload crosses say 80% of max limit.
  • Log a stronger warning and raise an alert when task payload crosses 100% of max limit, maybe also mentioning that such task payloads will be rejected in future Druid releases.
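The tiered check proposed above could be sketched as follows. This is a hedged illustration under the thresholds kfaraz suggests (warn at 80%, alert at 100%); the names are invented and this is not the merged implementation.

```java
// Hypothetical sketch of the tiered guardrail proposed above: OK below 80%
// of the limit, WARN between 80% and 100%, ALERT above the limit.
// Names and structure are invented for illustration.
public class PayloadGuardrailSketch
{
  enum Verdict { OK, WARN, ALERT }

  static Verdict check(long payloadBytes, long maxBytes)
  {
    if (payloadBytes > maxBytes) {
      return Verdict.ALERT;                // stronger warning plus an alert event
    }
    if (payloadBytes * 10 > maxBytes * 8) { // crossed 80% of the limit
      return Verdict.WARN;                 // plain warning in the logs
    }
    return Verdict.OK;
  }
}
```

Keeping the check advisory (logging and alerting rather than rejecting) preserves existing clusters that rely on large payloads, while still surfacing the problem to operators.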

@georgew5656 georgew5656 force-pushed the addTaskPayloadLimit branch from 6c2211b to 1a3f95c on May 29, 2024 17:13
Comment on lines +519 to +525
log.warn("Received a task payload > [%d] with id [%s]. and datasource [%s]" +
" There may be downstream issues caused by managing this large payload." +
"Increase druid.indexer.queue.maxTaskPayloadSize to ignore this warning.",
config.getMaxTaskPayloadSize(),
task.getId(),
task.getDataSource()
);
Contributor


Suggested change
log.warn("Received a task payload > [%d] with id [%s]. and datasource [%s]" +
" There may be downstream issues caused by managing this large payload." +
"Increase druid.indexer.queue.maxTaskPayloadSize to ignore this warning.",
config.getMaxTaskPayloadSize(),
task.getId(),
task.getDataSource()
);
log.warn(
    "Task[%s] of datasource[%s] has payload size[%d] larger than the recommended maximum[%d]." +
    " Large task payloads may cause stability issues in the Overlord and may fail while persisting to the metadata store." +
    " Use smaller task payloads or increase 'druid.indexer.queue.maxTaskPayloadSize' to suppress this warning.",
    task.getId(), task.getDataSource(), payload.length(), config.getMaxTaskPayloadSize()
);

@georgew5656 georgew5656 requested a review from kfaraz May 30, 2024 19:21
georgew5656 and others added 3 commits May 31, 2024 06:16
Contributor

@kfaraz kfaraz left a comment


Minor nitpicks, otherwise looks good.
Please update the release note in the PR description mentioning the usage of the new config.

} else if (payload.length() > TASK_SIZE_WARNING_THRESHOLD) {
log.warn(
"Task[%s] of datasource[%s] has payload size[%d] larger than the recommended maximum[%d]. " +
"Large task payloads may cause stability issues in the Overlord and may fail while persisting to the metadata store." +
Copy link
Copy Markdown
Contributor


Typo: missing space

Suggested change
"Large task payloads may cause stability issues in the Overlord and may fail while persisting to the metadata store." +
"Large task payloads may cause stability issues in the Overlord and may fail while persisting to the metadata store. " +

georgew5656 and others added 2 commits May 31, 2024 08:19
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
…ord/TaskQueue.java

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
@georgew5656 georgew5656 merged commit 0936798 into apache:master May 31, 2024
@georgew5656
Contributor Author

The IntelliJ inspection failures look unrelated to me.

@kfaraz kfaraz added this to the 31.0.0 milestone Oct 4, 2024
