HDDS-14121. Parallelize NSSummary Tree rebuild. #9473
    break;
  } catch (Exception e) {
    LOG.error("{}: Error in flush loop", taskName, e);
    // Continue processing other batches
Errors and task failures are not reported here. We need a mechanism to report the failure if the DB has an issue.
Hi @sumitagrawl,
The async flusher now tracks a FAILED state. On a DB write error, or any other error, it records the exception and stops processing.
Worker threads check flusher health before processing each record and stop within milliseconds if a failure is detected.
The queue also rejects new batches immediately after failure, and close() propagates the original DB exception so the main task fails cleanly.
Result: No wasted work, fast failure detection, protected queue, and clear errors with the original DB issue kept.
sumitagrawl left a comment
LGTM
  </description>
</property>

Please avoid whitespace-only changes.
What changes were proposed in this pull request?
Previously, Recon rebuilt the NSSummary tree using a single thread that wrote directly to the DB, which was very slow for large namespaces. This change makes the rebuild parallel, faster, and safer.
During a rebuild, Recon splits the OM DB tables into ranges and processes them in parallel using multiple iterator and worker threads. Workers scan records and build in-memory summary updates but never read from Recon DB, keeping them fast and avoiding contention.
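The range-splitting idea can be illustrated with a minimal sketch. It is not the code in this PR: the class `RangeSplitSketch` and its methods are hypothetical, an in-memory array stands in for an OM DB table, and plain index ranges stand in for key ranges. Each worker only reads its own range and keeps a local partial result, so workers never contend with each other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; names are not taken from the patch.
final class RangeSplitSketch {

  /** Splits [0, total) into at most `parts` contiguous, non-overlapping ranges. */
  static List<int[]> splitRanges(int total, int parts) {
    List<int[]> ranges = new ArrayList<>();
    if (total <= 0 || parts <= 0) {
      return ranges;
    }
    int chunk = (total + parts - 1) / parts; // ceiling division
    for (int start = 0; start < total; start += chunk) {
      ranges.add(new int[] {start, Math.min(start + chunk, total)});
    }
    return ranges;
  }

  /** Sums file sizes by scanning each range in its own worker task. */
  static long sumSizesInParallel(long[] sizes, int workers) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(workers);
    try {
      List<Future<Long>> partials = new ArrayList<>();
      for (int[] range : splitRanges(sizes.length, workers)) {
        partials.add(pool.submit(() -> {
          long local = 0; // worker-local accumulator, no shared state
          for (int i = range[0]; i < range[1]; i++) {
            local += sizes[i];
          }
          return local;
        }));
      }
      long total = 0;
      for (Future<Long> partial : partials) {
        total += partial.get(); // rethrows any worker failure
      }
      return total;
    } finally {
      pool.shutdown();
    }
  }
}
```

For example, `RangeSplitSketch.splitRanges(10, 3)` yields the ranges [0, 4), [4, 8), and [8, 10).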
When workers accumulate enough updates, they send batches to a single background async flusher through a bounded queue. The flusher is the only component that writes to Recon DB. It merges updates, propagates file sizes and counts up the directory tree, and commits everything using batched DB writes.
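A rough sketch of the single-writer pattern described above, under stated assumptions: `SingleFlusherSketch` is a hypothetical class, a `HashMap` stands in for Recon DB, and a batch is simply a map from directory ID to a size delta. The bounded `ArrayBlockingQueue` gives backpressure, because `put` blocks a worker when the flusher falls behind.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration only; names are not taken from the patch.
final class SingleFlusherSketch {
  // Sentinel batch (compared by identity) that tells the flusher to exit.
  private static final Map<Long, Long> POISON = new HashMap<>();

  private final BlockingQueue<Map<Long, Long>> queue;
  // Stand-in for Recon DB; only the flusher thread writes to it.
  private final Map<Long, Long> store = new HashMap<>();
  private final Thread flusher;

  SingleFlusherSketch(int capacity) {
    this.queue = new ArrayBlockingQueue<>(capacity);
    this.flusher = new Thread(this::flushLoop, "flusher");
    this.flusher.start();
  }

  /** Called by workers; blocks while the queue is full. */
  void submit(Map<Long, Long> batch) throws InterruptedException {
    queue.put(batch);
  }

  private void flushLoop() {
    try {
      while (true) {
        Map<Long, Long> batch = queue.take();
        if (batch == POISON) {
          return;
        }
        // Merge the deltas of this batch into the stored totals.
        batch.forEach((dirId, delta) -> store.merge(dirId, delta, Long::sum));
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  /** Drains remaining batches, stops the flusher, and returns the totals. */
  Map<Long, Long> close() throws InterruptedException {
    queue.put(POISON);
    flusher.join(); // join makes the flusher's writes visible to the caller
    return store;
  }
}
```

Keeping all writes on one thread means the store needs no locking, which is the main appeal of the design; the cost is that the flusher can become the bottleneck, hence the bounded queue.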
For FSO, the rebuild runs in two phases: first the directory phase to build the directory structure, then the file phase to apply file size and count updates. Each phase has its own flusher so file updates never depend on missing directories.
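To show why the ordering matters, here is a small single-threaded sketch, assuming a hypothetical `TwoPhaseSketch` class with in-memory maps in place of Recon DB. Phase one records parent links for every directory; phase two adds each file's size and count to its parent and walks up the recorded links. If a file were applied before its ancestor directories were recorded, the walk would stop early and ancestors would miss the update.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; names are not taken from the patch.
final class TwoPhaseSketch {
  final Map<Long, Long> parentOf = new HashMap<>(); // directory ID -> parent ID
  final Map<Long, Long> sizeOf = new HashMap<>();   // directory ID -> total bytes below it
  final Map<Long, Long> countOf = new HashMap<>();  // directory ID -> total files below it

  /** Phase 1: record the directory structure. */
  void addDirectory(long dirId, long parentId) {
    parentOf.put(dirId, parentId);
  }

  /**
   * Phase 2: apply a file to its parent directory and to every ancestor.
   * Assumes the parent links form a tree (no cycles), and that the
   * top-level directory has no entry in parentOf.
   */
  void addFile(long parentDirId, long size) {
    Long dir = parentDirId;
    while (dir != null) {
      sizeOf.merge(dir, size, Long::sum);
      countOf.merge(dir, 1L, Long::sum);
      dir = parentOf.get(dir); // null once we pass the top-level directory
    }
  }
}
```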
If a DB write fails, the flusher immediately marks itself as failed. Workers detect this quickly and stop processing, new batches are rejected, and the original error is propagated so the task fails cleanly.
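The fail-fast behavior could look roughly like the sketch below. It is an assumption-laden illustration, not the PR's implementation: `FailFastFlusherSketch` and `BatchWriter` are hypothetical names, batches are plain strings, and the poll timeout of 10 ms is arbitrary. Workers would call `isHealthy()` before each record, `offer` refuses batches once a failure is recorded, and `close()` rethrows with the original exception attached as the cause.

```java
import java.io.IOException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration only; names are not taken from the patch.
final class FailFastFlusherSketch {
  /** Stand-in for a batched DB write. */
  interface BatchWriter {
    void write(String batch) throws IOException;
  }

  private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(4);
  private final AtomicReference<Throwable> failure = new AtomicReference<>();
  private volatile boolean closing;
  private final BatchWriter writer;
  private final Thread thread;

  FailFastFlusherSketch(BatchWriter writer) {
    this.writer = writer;
    this.thread = new Thread(this::run, "flusher");
    this.thread.start();
  }

  /** Workers check this before processing each record. */
  boolean isHealthy() {
    return failure.get() == null;
  }

  /** Returns false instead of queueing once the flusher has failed. */
  boolean offer(String batch) throws InterruptedException {
    while (isHealthy()) {
      if (queue.offer(batch, 10, TimeUnit.MILLISECONDS)) {
        // A batch accepted just before a failure may never be written;
        // close() still reports that failure to the caller.
        return true;
      }
    }
    return false;
  }

  private void run() {
    try {
      while (true) {
        String batch = queue.poll(10, TimeUnit.MILLISECONDS);
        if (batch != null) {
          writer.write(batch);
        } else if (closing) {
          return; // queue is empty and no more batches are coming
        }
      }
    } catch (Throwable t) {
      failure.compareAndSet(null, t); // keep the first error only
      queue.clear();                  // unblock workers waiting on a full queue
    }
  }

  /** Call after all workers have finished; rethrows any recorded failure. */
  void close() throws IOException, InterruptedException {
    closing = true;
    thread.join();
    Throwable t = failure.get();
    if (t != null) {
      throw new IOException("flusher failed", t);
    }
  }
}
```

Recording only the first failure keeps the root cause visible even if later errors are side effects of the shutdown.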
Overall, this approach significantly reduces rebuild time for large namespaces while keeping DB writes controlled, consistent, and correct.
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-14121
How was this patch tested?
Tested locally. The results below compare sequential iteration (old approach) with parallel iteration (new approach):