@prashantwason prashantwason commented Dec 1, 2025

Describe the issue this Pull Request addresses

After a job completes, its job group and description continue to appear in the Spark History Server UI for subsequent jobs that do not set their own job status. This is confusing because stale job descriptions persist in the UI.
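
To illustrate the problem, here is a minimal, self-contained sketch. `FakeSparkContext` and `StaleDescriptionDemo` are hypothetical stand-ins written for this example, not Spark or Hudi classes; the sketch only assumes that the description stays in effect until it is overwritten or cleared.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Spark context; NOT the real Spark API.
class FakeSparkContext {
    // Assumption: the description stays in effect until overwritten or cleared.
    private String jobDescription;
    // Each entry models one row of the Jobs tab: "<job> -> <description>".
    private final List<String> uiRows = new ArrayList<>();

    void setJobDescription(String description) {
        this.jobDescription = description;
    }

    void runJob(String jobName) {
        // A job is recorded with whatever description is current when it runs.
        uiRows.add(jobName + " -> " + jobDescription);
    }

    List<String> uiRows() {
        return uiRows;
    }
}

class StaleDescriptionDemo {
    public static void main(String[] args) {
        FakeSparkContext sc = new FakeSparkContext();

        sc.setJobDescription("Listing partitions");
        sc.runJob("job-0");

        // An unrelated job that never sets its own description
        // inherits the previous one.
        sc.runJob("job-1");

        sc.uiRows().forEach(System.out::println);
    }
}
```

In this model, `job-1` is recorded with the description that was set for `job-0`, which is the stale status this PR removes.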

Summary and Changelog

Added a clearJobStatus() API to HoodieEngineContext, along with calls to it after parallel operations complete.

Changes:

  • Added abstract clearJobStatus() method to HoodieEngineContext
  • Added implementations in HoodieSparkEngineContext (clears via javaSparkContext.setJobDescription(null)) and no-op implementations in HoodieFlinkEngineContext, HoodieJavaEngineContext, and HoodieLocalEngineContext
  • Added clearJobStatus() calls after parallel operations in:
    • HoodieBloomIndex.java - after range pruning
    • CleanPlanActionExecutor.java - after partition listing and file slice generation
    • BaseSparkCommitActionExecutor.java - after clustering handle, write operations, workload profile, and commit
    • FSUtils.java - after parallel path listing
    • FileSystemBackedTableMetadata.java - after partition and file listing
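
The shape of the change can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `EngineContextSketch`, `SparkLikeContextSketch`, `LocalContextSketch`, and `ClearJobStatusDemo` are hypothetical names, the method signatures are simplified, and the Spark-like context only models the description as a field instead of calling Spark.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical, simplified stand-in for HoodieEngineContext.
abstract class EngineContextSketch {
    abstract void setJobStatus(String activeModule, String activityDescription);

    // The new API: resets whatever setJobStatus() made visible.
    abstract void clearJobStatus();

    // Simplified stand-in for a parallel operation such as path listing.
    <I, O> List<O> map(List<I> data, Function<I, O> func) {
        return data.parallelStream().map(func).collect(Collectors.toList());
    }
}

// Models the Spark engine: the description is sticky until cleared.
class SparkLikeContextSketch extends EngineContextSketch {
    private String jobDescription;

    @Override
    void setJobStatus(String activeModule, String activityDescription) {
        jobDescription = activeModule + ": " + activityDescription;
    }

    @Override
    void clearJobStatus() {
        // The PR clears via javaSparkContext.setJobDescription(null);
        // here a plain field is reset instead.
        jobDescription = null;
    }

    String currentDescription() {
        return jobDescription;
    }
}

// Models an engine without a job UI: both calls do nothing.
class LocalContextSketch extends EngineContextSketch {
    @Override
    void setJobStatus(String activeModule, String activityDescription) { }

    @Override
    void clearJobStatus() { }
}

class ClearJobStatusDemo {
    // Call-site pattern: set the status, run the parallel work, then clear.
    static List<Integer> listPathLengths(EngineContextSketch context, List<String> paths) {
        context.setJobStatus("FSUtils", "Listing paths in parallel");
        List<Integer> lengths = context.map(paths, String::length);
        context.clearJobStatus();
        return lengths;
    }

    public static void main(String[] args) {
        SparkLikeContextSketch context = new SparkLikeContextSketch();
        System.out.println(listPathLengths(context, Arrays.asList("/a", "/b/c")));
        System.out.println(context.currentDescription());
    }
}
```

The sketch places the clear call directly after the operation, as the list above describes. Wrapping the operation in try/finally would also clear the status when the operation throws; whether the PR does this is not stated here.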

Impact

Enhancement to the Spark History Server UI. Job descriptions will now be properly cleared after operations complete, preventing stale status from appearing on subsequent jobs.

Risk Level

Low. This is a cosmetic change limited to the Spark History Server UI display.

  • No functional changes to data processing
  • Only cleanup operations added after existing parallel operations
  • Existing unit and integration tests should pass

Documentation Update

None required. This is an internal implementation detail that improves UI clarity.

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

@prashantwason prashantwason changed the title Add clearJobStatus() calls after setJobStatus() operations [MINIR] Add clearJobStatus() calls after setJobStatus() operations Dec 1, 2025
@prashantwason prashantwason changed the title [MINIR] Add clearJobStatus() calls after setJobStatus() operations [MINOR] Add clearJobStatus() calls after setJobStatus() operations Dec 1, 2025
@github-actions github-actions bot added the size:S PR with lines of changes in (10, 100] label Dec 1, 2025
@prashantwason prashantwason force-pushed the pw_oss_commit_porting_2 branch from aae6c95 to 43fae98 Compare December 4, 2025 19:22
@prashantwason prashantwason changed the title [MINOR] Add clearJobStatus() calls after setJobStatus() operations fix(spark): Add clearJobStatus() calls after setJobStatus() operations Dec 4, 2025
@nsivabalan

Hey @prashantwason: can you rebase onto the latest master?

@prashantwason prashantwason force-pushed the pw_oss_commit_porting_2 branch from 43fae98 to b1c04f9 Compare January 5, 2026 19:53
@github-actions github-actions bot added size:XL PR with lines of changes > 1000 and removed size:S PR with lines of changes in (10, 100] labels Jan 5, 2026
After a job completes, its status (the description shown in the UI) should be cleared. Otherwise, any later job that does not set its own job status will show the older job's status, which is confusing in the UI.

Currently, setting and clearing job status is implemented only for Spark, which shows the status in the Jobs tab of the Spark History Server (SHS) UI.
@prashantwason prashantwason force-pushed the pw_oss_commit_porting_2 branch from b1c04f9 to 753f516 Compare January 5, 2026 20:05
@github-actions github-actions bot added size:S PR with lines of changes in (10, 100] and removed size:XL PR with lines of changes > 1000 labels Jan 5, 2026
@hudi-bot

hudi-bot commented Jan 5, 2026

CI report:

Bot commands

@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build


@yihua yihua left a comment

@prashantwason could you check if the PR #9899 is still useful after your change?


@yihua yihua left a comment

LGTM

@yihua yihua merged commit 67d4b05 into apache:master Jan 8, 2026
73 of 74 checks passed

Labels

size:S PR with lines of changes in (10, 100]

4 participants