Skip to content

Conversation

@SaketaChalamchala
Copy link
Contributor

@SaketaChalamchala SaketaChalamchala commented Jun 18, 2025

What changes were proposed in this pull request?

To provide visibility into the Key Deleting Service progress, added a "Deletion Progress" section to the OM Web UI displaying the cumulative deletion progress (no. of keys and size) in the last 1 day as well as the KDS service status, Last run details and the next run information.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-12887

How was this patch tested?

Unit Tests for metrics
final-om-gc-progress-ui

@jojochuang jojochuang added the snapshot https://issues.apache.org/jira/browse/HDDS-6517 label Jun 24, 2025
@jojochuang
Copy link
Contributor

Unable to open the screenshot.

<th>Reclaimed Size</th>
<th>#Reclaimed Keys</th>
<th>#Iterated Keys</th>
<th>#NotReclaimable Keys</th>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NotReclaimable keys makes sense in context of snapshots. We could change it to

Suggested change
<th>#NotReclaimable Keys</th>
<th>#NotReclaimable Keys (Referred by snapshots)</th>

@SaketaChalamchala
Copy link
Contributor Author

This PR might need to be merged after #8677

@jojochuang jojochuang requested a review from sadanand48 July 7, 2025 15:31
@jojochuang
Copy link
Contributor

Will merge #8677 today to unblock this one.

@SaketaChalamchala SaketaChalamchala marked this pull request as ready for review July 30, 2025 19:07
Copilot AI review requested due to automatic review settings July 30, 2025 19:07
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a "Deletion Progress" section to the OM Web UI to provide visibility into the Key Deleting Service progress, displaying cumulative deletion metrics for the last 24 hours along with service status and run information.

Key changes include:

  • Added comprehensive deletion metrics tracking for both Active Object Store (AOS) and snapshot operations
  • Enhanced the OM Web UI with a new deletion progress section showing metrics and service status
  • Extended the Key Deleting Service to track and report detailed deletion statistics

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
TestKeyDeletingService.java Added unit tests for deletion metrics and updated existing test methods
ozoneManager.js Added utility functions for data formatting and JMX query for DeletingServiceMetrics
om-overview.html Added new UI section displaying deletion progress metrics and service status
KeyDeletingService.java Enhanced to track deletion statistics and update metrics after task completion
PendingKeysDeletion.java Extended to include key block replicated size and not reclaimable key count
KeyManagerImpl.java Updated to populate additional metrics data for pending deletions
DeletingServiceMetrics.java Added new metrics for 24-hour intervals, service state, and last run statistics
BackgroundService.java Added task completion callback to support metrics updates
Comments suppressed due to low confidence (1)

hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/service/TestKeyDeletingService.java:825

  • The test is passing null for the keyBlockReplicatedSize parameter which means the deletion size metrics won't be properly tested. Consider providing a mock map to verify size tracking functionality.
        keyDeletingService.processKeyDeletes(blockGroups, keysToModify, renameEntriesToBeDeleted, null, null, null);

// Add JMX query to fetch DeletingServiceMetrics data
$http.get("jmx?qry=Hadoop:service=OzoneManager,name=DeletingServiceMetrics")
.then(function (result) {
if (result.data.beans && result.data.beans.length > 0) {
Copy link

Copilot AI Jul 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing closing brace for the function that contains this if statement. The function appears to be incomplete and will cause a syntax error.

Copilot uses AI. Check for mistakes.
this.kdsCurRunTimestamp.set(timestamp);
}

private void resetMetrics() {
Copy link
Contributor

@jojochuang jojochuang Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why reset metrics after 24hours? usually metrics are reset when restart, and that's good enough.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already collect deletion metrics since OM uptime. The thought was on OM UI it showing the progress of deletion over the last 24h would be more interesting for someone who has performed a delete and wants to track how the delete is proceeding.

Copy link
Contributor

@jojochuang jojochuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a quick look and I think this is ready to go.

Comment on lines 430 to 436
aosDeletionStats.updateDeletionStats(purgeResult.getKey().getKey(), purgeResult.getKey().getValue(),
keyBlocksList.size() + pendingKeysDeletion.getNotReclaimableKeyCount(),
pendingKeysDeletion.getNotReclaimableKeyCount());
} else {
snapshotDeletionStats.updateDeletionStats(purgeResult.getKey().getKey(), purgeResult.getKey().getValue(),
keyBlocksList.size() + pendingKeysDeletion.getNotReclaimableKeyCount(),
pendingKeysDeletion.getNotReclaimableKeyCount());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems these two cases are almost exactly the same except the object itself is aosDeletionStats or snapshotDeletionStats. A refactoring is recommended though not required.

Copy link
Contributor

@jojochuang jojochuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

loos good to me.
This PR should ideally split into two parts due to its complexity: metric implementation, and frontend code.

Will merge later.

@jojochuang jojochuang merged commit 37589a4 into apache:master Aug 1, 2025
1 check passed
@jojochuang
Copy link
Contributor

Thanks @SaketaChalamchala !

swamirishi pushed a commit to swamirishi/ozone that referenced this pull request Dec 3, 2025
…apache#8655)

Co-authored-by: Wei-Chiu Chuang <jojochuang@gmail.com>
(cherry picked from commit 37589a4)

 Conflicts:
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/DeletingServiceMetrics.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java
	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/service/KeyDeletingService.java
	hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/service/TestKeyDeletingService.java

Change-Id: Ic874a0bb516712100b5bc5a584a2a4a54858d1da
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

snapshot https://issues.apache.org/jira/browse/HDDS-6517

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants