HDDS-7098. Provide a way for admin to identify all unhealthy container replicas #4443
required int64 firstSeenTime = 2;
required int64 lastSeenTime = 3;
required int64 bcsId = 4;
required string state = 5;
One idea was to reuse the enum from https://github.com/mladjan-gadzic/ozone/blob/HDDS-7098/hadoop-hdds/interface-server/src/main/proto/ScmServerDatanodeHeartbeatProtocol.proto#L213 here, so that the same enum is not maintained in two places, but I could not get it to work. Because of that I decided not to use an enum at all and to stick with a string instead.
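For readers following the string-versus-enum trade-off: with a string on the wire, the consumer can still recover a typed value on the Java side. The sketch below is only an illustration and is not code from this patch. The class `ReplicaStateParser`, the enum `ReplicaState`, and its three values are hypothetical stand-ins, not copied from the proto linked above.

```java
import java.util.Locale;
import java.util.Optional;

public class ReplicaStateParser {

  // Hypothetical, illustrative subset of replica states (an assumption,
  // not the actual enum defined in the proto file).
  public enum ReplicaState { OPEN, CLOSED, UNHEALTHY }

  // Converts the string carried in the message into a typed state.
  // Unknown, empty, or null input yields Optional.empty() instead of an exception.
  public static Optional<ReplicaState> parse(String raw) {
    if (raw == null || raw.trim().isEmpty()) {
      return Optional.empty();
    }
    try {
      return Optional.of(ReplicaState.valueOf(raw.trim().toUpperCase(Locale.ROOT)));
    } catch (IllegalArgumentException e) {
      // A newer sender may report a state this reader does not know yet.
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    System.out.println(parse("UNHEALTHY").isPresent());
    System.out.println(parse("not-a-state").isPresent());
  }
}
```

The cost of this approach is that typos in state names are only caught at runtime, which is part of why reusing the enum would have been preferable.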
Will it be backward compatible if it's a required field, or is backward compatibility not a concern here?
I'd say backward compatibility is not a concern here.
It would break rolling upgrade, or any deployment where Recon and SCM run different versions.
I see that rolling upgrade is not yet supported, so I am okay with it.
There needs to be a strong reason for making a proto field required. I would change this to optional. If the code depends on the field, the code should handle its absence rather than having the proto layer throw an exception on parsing.
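A minimal sketch of what handling it in code could look like, assuming the field becomes `optional string state`. For proto2 optional fields, the generated Java class exposes a `hasState()` presence check next to `getState()`. To keep the example self-contained, `ReplicaHistory` below is a hypothetical hand-written stand-in for the generated message, and the `UNKNOWN` fallback is an assumption, not something this PR defines.

```java
public class OptionalStateDemo {

  // Hypothetical stand-in for a generated proto2 message with an
  // optional string field named state.
  public static final class ReplicaHistory {
    private final String state; // null models "field not set on the wire"

    public ReplicaHistory(String state) {
      this.state = state;
    }

    public boolean hasState() {
      return state != null;
    }

    // Generated proto2 getters return the default ("") for an unset string.
    public String getState() {
      return state == null ? "" : state;
    }
  }

  // Assumed fallback label for messages from senders that predate the field.
  public static final String UNKNOWN = "UNKNOWN";

  // The reader decides what a missing state means, instead of parsing failing.
  public static String stateOrDefault(ReplicaHistory history) {
    return history.hasState() ? history.getState() : UNKNOWN;
  }

  public static void main(String[] args) {
    System.out.println(stateOrDefault(new ReplicaHistory("CLOSED")));
    System.out.println(stateOrDefault(new ReplicaHistory(null)));
  }
}
```

With a required field, a message from an older sender that omits `state` would be rejected when parsed. With an optional field, the same message parses and the caller chooses the fallback.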
cc @adoroszlai can you please take a look?
It would be ideal to reuse the State enum type.
Would the answer to this Stack Overflow question help address the import problem? https://stackoverflow.com/questions/66524793/how-to-use-the-enum-of-one-proto-file-in-another-file
@jojochuang thanks for the review. I like that idea as well. I've tried it, but there is an issue with backward compatibility: |
Does the compatibility check also fail if the field is |
Yes, it does. Backward compatibility check fails for |
errose28
left a comment
Not being able to use the enum is somewhat unfortunate, but there is a precedent for using the string state in hdds.proto from SCMContainerReplicaProto, which likely had to use a string for the same reason. I have one question inline, but overall LGTM if others are satisfied with the current state of the protos.
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ReconContainerManager.java
errose28
left a comment
Thanks for the update @mladjan-gadzic, LGTM. Does anyone involved in the earlier proto discussion have outstanding concerns? @kerneltime @jojochuang @adoroszlai
@errose28 thanks for the review!
Failure of
Thanks @mladjan-gadzic for the patch, @devmadhuu, @errose28, @jojochuang, @kerneltime for the review.
What changes were proposed in this pull request?
Add a state field to the container replica response of the Recon API.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-7098
How was this patch tested?