Skip to content

Fix Cannot mark an unqueryable datasource's segments used / unused#16127

Merged
abhishekrb19 merged 4 commits intoapache:masterfrom
zachjsh:mark-used-unqueryable-datasource
Mar 15, 2024
Merged

Fix Cannot mark an unqueryable datasource's segments used / unused#16127
abhishekrb19 merged 4 commits intoapache:masterfrom
zachjsh:mark-used-unqueryable-datasource

Conversation

@zachjsh
Copy link
Copy Markdown
Contributor

@zachjsh zachjsh commented Mar 14, 2024

Fixes #16125

Description

This fixes an issue in which a datasource that does not have any segments loaded on historicals, cannot have its segments mark used or unused. Such a datasource may arise because all of it's segments were previously marked as unused, or if all of its segments are cold. This bug effectively did not allow a user to recover or hard-delete a datasource in this state unless they used the apis to mark ALL segments used / unused.

The issue was that there was a check that short circuits the work to mark segments used / unused if it thinks that the datasource does not exist; this check was equated to whether the datasource had any segments presently loaded on any historicals. The issue was fixed by removing this short circuit check.

Release note

Fixed bug that dissallowed marking segment by id / specific interval as used or unused if the datasource doest not have any segments loaded on historicals.


Key changed/added classes in this PR

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

Comment thread server/src/main/java/org/apache/druid/server/http/DataSourcesResource.java Outdated
Comment thread server/src/main/java/org/apache/druid/server/http/DataSourcesResource.java Outdated
@abhishekrb19
Copy link
Copy Markdown
Contributor

Good catch 👍 It's also probably worth mentioning in the PR description that this bug exists only when users are marking segments as used by interval or by segment ids for a datasource that contains no used segments. In other words, users can still recover a datasource by marking all segments as used, via markAsUsedAllNonOvershadowedSegments, which doesn't have the short circuit logic.

@zachjsh zachjsh changed the title FIX Cannot mark an unqueryable datasource's segments used Fix Cannot mark an unqueryable datasource's segments used / unused Mar 15, 2024
@zachjsh zachjsh requested review from abhishekrb19 and kfaraz March 15, 2024 17:23
return 0;
}
if (payload == null || !payload.isValid()) {
log.warn("Invalid request payload: [%s]", payload);

Check notice

Code scanning / CodeQL

Use of default toString()

Default toString(): MarkDataSourceSegmentsPayload inherits toString() from Object, and so is not suitable for printing.
abhishekrb19 added a commit to abhishekrb19/incubator-druid that referenced this pull request Mar 15, 2024
Assert.assertEquals(ImmutableMap.of("numChangedSegments", 0), response.getEntity());
EasyMock.verify(segmentsMetadataManager);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a similar test for markAsUnused API where there are no used segments or only used segments in the deep storage, and the operation still proceeds now that we have removed the short circuit? If not, I can add a test in this patch #16136

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added.

@abhishekrb19 abhishekrb19 merged commit f3d77fe into apache:master Mar 15, 2024
@zachjsh zachjsh deleted the mark-used-unqueryable-datasource branch March 16, 2024 00:34
@adarshsanjeev adarshsanjeev added this to the 30.0.0 milestone May 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cannot mark an unqueryable datasources segments used

5 participants