Skip to content

Fix mark segment unused when overshadowed by zero replica segment#16181

Merged
kfaraz merged 1 commit intoapache:masterfrom
kfaraz:fix_mark_unused
Mar 21, 2024
Merged

Fix mark segment unused when overshadowed by zero replica segment#16181
kfaraz merged 1 commit intoapache:masterfrom
kfaraz:fix_mark_unused

Conversation

@kfaraz
Copy link
Copy Markdown
Contributor

@kfaraz kfaraz commented Mar 21, 2024

Bug

In the MarkOvershadowedSegmentsAsUnused duty, the coordinator marks a segment as unused
if it is overshadowed by a segment currently being served by a historical or broker.
But it is possible to have segments that are eligible for a load rule but require zero replicas to be loaded.
(Such segments can be queried only using the MSQ engine).
If such a zero-replica segment overshadows any other segment, the overshadowed segment will never be
marked as unused and will continue to exist in the metadata store as a dangling segment.

Changes

  • In a coordinator run, keep track of segments that are eligible for a load rule but require zero replicas
  • Allow the zero-replicas segments to overshadow old segments and hence mark the latter as unused
  • Add simulation test to verify new behaviour. This test fails with the current code.
  • Clean up javadocs

Release note

Fix bug in the MarkOvershadowedSegmentsAsUnused coordinator duty to also consider segments that are overshadowed by a segment that requires zero replicas.


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@AmatyaAvadhanula
Copy link
Copy Markdown
Contributor

the coordinator marks a segment as unused if it is overshadowed by a segment currently being served by a historical or broker.

Could you please share when should a segment only on the brokers, but not on any of the historicals, overshadow segments?

@kfaraz
Copy link
Copy Markdown
Contributor Author

kfaraz commented Mar 21, 2024

Could you please share when should a segment only on the brokers, but not on any of the historicals, overshadow segments?

This could be possible for a broadcast segment that is missing from the historicals because of some intermittent issue. The logic regarding the brokers is pre-existing and is not being modified in this PR.

Just to clarify, the segments being considered here are already overshadowed. We are not changing their overshadowed status in any way.
The code in MarkOvershadowedSegmentsAsUnused just decides which overshadowed segments can be marked as unused.

Copy link
Copy Markdown
Contributor

@AmatyaAvadhanula AmatyaAvadhanula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @kfaraz! LGTM

@kfaraz kfaraz merged commit 3529021 into apache:master Mar 21, 2024
@kfaraz
Copy link
Copy Markdown
Contributor Author

kfaraz commented Mar 21, 2024

Thanks a lot for the prompt review, @AmatyaAvadhanula , @adarshsanjeev !

@kfaraz kfaraz deleted the fix_mark_unused branch March 21, 2024 07:27
if (DateTimes.nowUtc().isBefore(coordinatorStartTime.plus(delayMillis))) {
log.info(
"Skipping MarkAsUnused until [%s] have elapsed after coordinator start [%s].",
"Skipping MarkAsUnused until [%s] have elapsed after coordinator start time[%s].",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Skipping MarkAsUnused until [%s] have elapsed after coordinator start time[%s].",
"Skipping MarkAsUnused until [%s] have elapsed after coordinator start time [%s] ms.",

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, @cryptoe , I saw your comment after the PR was merged. I will try to include this refactor in a later PR.

@adarshsanjeev adarshsanjeev added this to the 30.0.0 milestone May 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants