Ignore exceptions when pushing task logs to ensure task success#18210

Merged
FrankChen021 merged 2 commits into apache:master from GWphua:ignore-tasklog-exception
Jul 11, 2025
@GWphua GWphua commented Jul 7, 2025

Description

During execution, a task can finish its work without disruption and still fail to complete successfully. The error logs show that the task fails while pushing its task logs to HDFS. This PR allows tasks to succeed even if they cannot push their logs to deep storage.

2024-12-26T12:41:40,943 INFO [forking-task-runner-3] org.apache.druid.indexing.overlord.ForkingTaskRunner - Exception caught during execution
org.apache.hadoop.ipc.RemoteException: Router <REDACTED> is in safe mode and cannot handle WRITE requests
at org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.checkSafeMode(RouterRpcServer.java:563)    
at org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.checkOperation(RouterRpcServer.java:548)    
at org.apache.hadoop.hdfs.server.federation.router.RouterClientProtocol.create(RouterClientProtocol.java:383)    
at org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.create(RouterRpcServer.java:660)    
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:522)    
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)    
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:637)    
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:605)    
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589)    
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1146)    
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1249)    
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1147)    
at java.security.AccessController.doPrivileged(Native Method)    
at javax.security.auth.Subject.doAs(Subject.java:422)    
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:2005)    
at org.apache.hadoop.ipc.DeepHandlerManager$SurfaceHandler.run(DeepHandlerManager.java:269)

Exceptions thrown while pushing task logs are no longer left unhandled; they are now caught and logged at WARN level.
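The pattern described above can be sketched as follows. This is a minimal illustration, not the actual `ForkingTaskRunner` change: the `TaskLogPushSketch` class, the `LogPusher` interface, and the `pushQuietly` and `finishTask` methods are all hypothetical names, and `java.util.logging` stands in for Druid's own logger. The idea is that a failure to push logs is recorded as a warning and does not alter the status the task itself reported.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

class TaskLogPushSketch {
    private static final Logger LOG = Logger.getLogger(TaskLogPushSketch.class.getName());

    /** Hypothetical stand-in for a deep-storage log pusher; not Druid's real interface. */
    @FunctionalInterface
    interface LogPusher {
        void push(String taskId) throws IOException;
    }

    /**
     * Attempts to push the logs for a task.
     * Returns true if the push succeeded, false if it failed and was only logged.
     */
    static boolean pushQuietly(LogPusher pusher, String taskId) {
        try {
            pusher.push(taskId);
            return true;
        } catch (Exception e) {
            // Log at WARNING instead of rethrowing, so the caller is not disrupted.
            LOG.log(Level.WARNING, "Failed to push logs for task " + taskId + "; ignoring", e);
            return false;
        }
    }

    /** Returns the status the task reported, regardless of whether the log push worked. */
    static String finishTask(String taskId, String taskStatus, LogPusher pusher) {
        pushQuietly(pusher, taskId);
        return taskStatus;
    }

    public static void main(String[] args) {
        // Simulates deep storage rejecting writes, as in the stack trace above.
        LogPusher failing = id -> {
            throw new IOException("Router is in safe mode and cannot handle WRITE requests");
        };
        System.out.println(finishTask("task-1", "SUCCESS", failing));
    }
}
```

The trade-off of this approach is that a task can be reported as successful while its logs are missing from deep storage, so the warning message is the only trace of the failed push.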

Release note

Tasks can now complete successfully even if there are problems pushing task logs and reports to deep storage.


Key changed/added classes in this PR
  • ForkingTaskRunner

This PR has:

  • been self-reviewed.
  • a release note entry in the PR description.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

@FrankChen021 FrankChen021 left a comment

LGTM

@FrankChen021 FrankChen021 merged commit fa562a2 into apache:master Jul 11, 2025
76 checks passed
@cecemei cecemei added this to the 35.0.0 milestone Oct 21, 2025
@GWphua GWphua deleted the ignore-tasklog-exception branch November 18, 2025 01:56