-
Notifications
You must be signed in to change notification settings - Fork 594
HDDS-11268. Add --table mode for OM/SCM Roles CLI #7016
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thanks @slfan1989 the table looks nice. In the description the |
Thank you for your attention to this pr! I have tried to output verbose information using After completing this pr, I can submit a new pr to add support for the verbose option in ozone admin om/scm, including support for tabular-style output. |
|
Thanks @slfan1989 for the patch. Please check acceptance test failure: https://github.com/apache/ozone/actions/runs/10236952464/job/28319907708?pr=7016#step:6:9 |
|
@errose28 @adoroszlai Can you help me review this pr again? Thank you very much! |
| Should Contain ${output_without_id_passed} no Ozone Manager service ID specified | ||
|
|
||
| List om roles as TABLE with OM service ID passed | ||
| ${output_without_id_passed} = Execute ozone admin om roles --table |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@slfan1989 , this should be "output_with_id_passed". BTW, the first command "ozone admin om roles --table" doesn't have service ID set, why it's under this "with OM service ID passed" category?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for helping review the code. The issue was due to my oversight, and I have updated the unit tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ChenSammi Can you help review this pr again? thank you very much!
whbing
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @slfan1989 for working on this. Overall looks good, left a few comments.
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/utils/FormattingCLIUtils.java
Show resolved
Hide resolved
|
Seems some related ci failures, such as @slfan1989 From my point, maybe we don't need to make so many changes. How about simplifying the codes similar to the |
|
@slfan1989 , thanks for improving the patch. For this scmHaEnabled, can you explain a little bit about its purpose? |
|
|
||
| // Determine which header to use based on whether Ratis is enabled or not. | ||
| if (isRatisEnabled) { | ||
| formattingCLIUtils.addHeaders(RATIS_SCM_ROLES_HEADER); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@slfan1989 , thanks for improving the patch. For this scmHaEnabled, can you explain a little bit about its purpose?
Thank you very much for reviewing this PR! I’d like to explain why I added the scmRatisEnable:
When using --table to output SCM information, we need to first determine the length of the table headers.
For non-HA SCMs, we will display it in the following format:
+-----------------------------------------------+
| Storage Container Manager Roles |
+---------------------------------+-------------+
| Host Name | Port |
+---------------------------------+-------------+
For HA SCMs, we will display it in the following format:
+---------------------------------------------------------------------------------------------------------------+
| Storage Container Manager Roles |
+---------------------------------+------------+----------+--------------------------------------+--------------+
| Host Name | Ratis Port | Role | Node ID | Host Address |
+---------------------------------+------------+----------+--------------------------------------+--------------+
I need a clear method to determine if SCM has Ratis enabled.
I am considering two approaches:
Approach 1: Use the ratisRole length of the returned data for judgment.
Approach 2: Add a return attribute that explicitly indicates whether Ratis is being used.
From my perspective, Approach 2 is better, as the code looks clearer.
@whbing Thank you for your question! I believe using a clear method to indicate whether SCM is using Ratis would be better. While a JSON output approach is also feasible, determining whether Ratis is used based on the size of the length might not be very reliable. |
4845692 to
e7b81f3
Compare
| * while `false` indicates that it is in STANDALONE mode. | ||
| * @throws IOException an I/O exception of some sort has occurred. | ||
| */ | ||
| boolean isSCMRatisEnable() throws IOException; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isSCMRatisEnable -> isScmRatisEnable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for reviewing the code! I will improve this part of the code.
| ScmInfo.Builder builder = new ScmInfo.Builder() | ||
| .setClusterId(resp.getClusterId()) | ||
| .setScmId(resp.getScmId()) | ||
| .setScmRatisEnabled(resp.getScmRatisEnabled()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here to be backward compatible, we should first check whether the response has "scmRatisEnabled" field. If it doesn't have, which means the SCM server is an old version server, then we need to set this ScmRatisEnabled field in other way, for example, the size of RolesList.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here to be backward compatible, we should first check whether the response has "scmRatisEnabled" field.
Thank you for the reminder! It's indeed very necessary. I will check if ScmRatisEnabled exists.
If it doesn't have, which means the SCM server is an old version server, then we need to set this ScmRatisEnabled field in other way, for example, the size of RolesList.
We use the size of RolesList to determine if Ratis is enabled. If RolesList.size > 1, we consider Ratis enabled. If RolesList.size <= 1, we consider Ratis not enabled ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is one case that SCM ratis is ended, but only one SCM instance in the raft group. In this case, we can check whether the role string has LEADER or FOLLOWER key word, if neither key words exist, then ratis is not enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your suggestion! I will continue to improve the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ChenSammi Could you review this PR again? Thank you very much!
9ae14fa to
b5e7a3a
Compare
| .setRatisPeerRoles(resp.getPeerRolesList()); | ||
|
|
||
| return builder.build(); | ||
| // By default, we assume that SCM Ratis is not enabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have considered whether the ScmRatisEnabled field exists in the proto. If the field is not present, we determine whether Ratis is enabled based on whether peerRolesList includes leader or follower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ChenSammi Kindly ping, Can you review this PR again? Thank you very much!
whbing
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @slfan1989 for update. Overall looks good. Some comments left.
| Should Not Be Equal ${leader} ${EMPTY} | ||
| Assert Leader Present in TABLE | ||
| [Arguments] ${output} | ||
| Should Match Regexp ${output} \|\s+om\d+\s+\|\s+LEADER\s+\| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
om hostis a user-defined string and does not necessarily start withom, right ?- NIT: string don't look aligned.
- I verified this RE, and it seems that any format will pass. So, this RE looks like it needs to be changed.
You can install robot in python and execute it locally:
om-roles-test.robot:
*** Settings ***
Documentation Smoke test for listing om roles.
Library OperatingSystem
Test Timeout 5 minutes
*** Keywords ***
Assert Leader Present in TABLE
[Arguments] ${output}
Should Match Regexp ${output} \|\s+om\d+\s+\|\s+LEADER\s+\|
*** Test Cases ***
List om roles as TABLE without OM service ID passed
${result}= Set Variable | host32 | om32 | OTHER
Should Be Equal ${result} | host32 | om32 | OTHER
Assert Leader Present in TABLE ${result}run: robot om-roles-test.robot
==============================================================================
Om-Roles-Test :: Smoke test for listing om roles.
==============================================================================
List om roles as TABLE without OM service ID passed | PASS |
------------------------------------------------------------------------------
Om-Roles-Test :: Smoke test for listing om roles. | PASS |
1 test, 1 passed, 0 failed
==============================================================================
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
om host is a user-defined string and does not necessarily start with om, right ?
This regex matches strings containing "om" followed by digits, specifically formatted with "LEADER" surrounded by pipe delimiters.
If we replace it with other characters (like "scm"), it should not match.
NIT: string don't look aligned.
This should be fixed, and I will modify the format issue.
I verified this RE, and it seems that any format will pass. So, this RE looks like it needs to be changed.
You can install robot in python and execute it locally:
Sorry, I don't think there's an issue with this regex.
Test1 regex test
The regex validation indicates that, based on the provided example, it cannot match the regex.
I tested the regex functionality in VSCode, and everything is working fine. I'm not sure if Robot has issues recognizing the regular expression, so I changed it to ensure that it works properly with Robot.
==============================================================================
Om-Roles
==============================================================================
List om roles as TABLE without OM service ID passed | FAIL |
'| host32 | om32 | OTHER' does not match '\|.*LEADER.*'
------------------------------------------------------------------------------
Om-Roles | FAIL |
1 test, 0 passed, 1 failed
==============================================================================
Test Passed.
Assert Leader Present in TABLE
[Arguments] ${output}
Should Match Regexp ${output} \\|.*LEADER.*
*** Test Cases ***
List om roles as TABLE without OM service ID passed
${result}= Set Variable | host32 | om32 | LEADER
Log ${result}
Assert Leader Present in TABLE ${result}
==============================================================================
Om-Roles
==============================================================================
List om roles as TABLE without OM service ID passed | PASS |
------------------------------------------------------------------------------
Om-Roles | PASS |
1 test, 1 passed, 0 failed
==============================================================================
|
@whbing @ChenSammi Could you please help review this PR again? Thank you very much! |
| * This is used to check if 'leader' or 'follower' exists, | ||
| * in order to confirm whether we have enabled Ratis. | ||
| */ | ||
| private final List<String> scmRatisRolesToCheck = Arrays.asList("leader", "follower"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable scmRatisRolesToCheck is only referenced in one place, and I'm not sure if defining it as a global variable is a good idea. Additionally, from my understanding, using static might be better if it needs to be used. @ChenSammi Please help give some suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@errose28 @adoroszlai May I take up some of your time to ask for some advice? Thank you very much!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@whbing How should I modify this? Should I change the variable to uppercase and make it static?
|
@whbing @ChenSammi This PR is a small feature, but I believe it is valuable. We have already deployed it internally for 2 months, and I use this feature whenever switching SCM or OM. It should also make the information display more user-friendly for other users. If there are no major issues, I hope we can merge this PR into the master branch. cc: @errose28 @adoroszlai |
|
@adoroszlai @errose28 @whbing @ChenSammi I'm sorry to disturb everyone, but I really need this pr to assist with some subsequent work, such as #7266 (HDDS-11463. Track and display failed DataNode storage locations in SCM). I think we should provide a way to present this in a table format. Currently, I'm not sure what issues this PR has. |
adoroszlai
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @slfan1989 for the patch. I have some minor code comments, otherwise looks good.
| private final String clusterId; | ||
| private final String scmId; | ||
| private final List<String> peerRoles; | ||
| private final Boolean scmRatisEnabled; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intentionally Boolean instead of boolean? Since its value is set in the constructor from a boolean parameter, it doesn't seem to be ever set to null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for helping review the code! I will improve this part of the code.
...pache/hadoop/hdds/scm/protocolPB/StorageContainerLocationProtocolClientSideTranslatorPB.java
Outdated
Show resolved
Hide resolved
...-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisRolesSubcommand.java
Outdated
Show resolved
Hide resolved
...-ozone/tools/src/main/java/org/apache/hadoop/ozone/admin/scm/GetScmRatisRolesSubcommand.java
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| // In STANDALONE mode, output only the Host Name and Port information. | ||
| if (roleItems.length == 2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we check that roleItems.length matches the value expected according to isRatisEnabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your revision looks great. We can simplify it by removing the array size check, since we'll standardize it to formattingCLIUtils.addLine(roleItems);.
…/scm/GetScmRatisRolesSubcommand.java Co-authored-by: Doroszlai, Attila <6454655+adoroszlai@users.noreply.github.com>
…/scm/GetScmRatisRolesSubcommand.java Co-authored-by: Doroszlai, Attila <6454655+adoroszlai@users.noreply.github.com>
|
@adoroszlai Could you help review this PR again? Thank you very much! |
adoroszlai
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @slfan1989 for updating the patch. Two previous minor items still apply, but no need to push another commit just for these.
@ChenSammi @whbing do you have any other comments?
...pache/hadoop/hdds/scm/protocolPB/StorageContainerLocationProtocolClientSideTranslatorPB.java
Outdated
Show resolved
Hide resolved
...pache/hadoop/hdds/scm/protocolPB/StorageContainerLocationProtocolClientSideTranslatorPB.java
Outdated
Show resolved
Hide resolved
…/protocolPB/StorageContainerLocationProtocolClientSideTranslatorPB.java Co-authored-by: Doroszlai, Attila <6454655+adoroszlai@users.noreply.github.com>
…/protocolPB/StorageContainerLocationProtocolClientSideTranslatorPB.java Co-authored-by: Doroszlai, Attila <6454655+adoroszlai@users.noreply.github.com>
|
@adoroszlai Thank you very much for reviewing the code. Friendly ping. @whbing @ChenSammi |
|
Sorry for the late reply due to the holiday. I'll have a look ASAP. |
|
Thanks for @slfan1989 for the update, +1. |
|
@adoroszlai Thank you very much for reviewing this PR! Regarding whbing, he/she has agreed to this PR. Regarding ChenSammi , he/she may have been quite busy recently and hasn’t responded to this PR in the past month. If he/she has any feedback later, I’m willing to make improvements to the code at any time. Can we merge this PR into the master branch? so that I can continue working on HDDS-11463(#7266) as discussed with Ethan. |
|
Thanks @slfan1989 for the patch, @ChenSammi, @errose28, @whbing for the review. |
|
@adoroszlai Thank you again for your help with this PR! |



What changes were proposed in this pull request?
When viewing roles of OM/SCM via CLI, a lengthy string is typically displayed, which lacks user-friendly formatting. We aim to improve its presentation. I have added a
tableoption to allow displaying this data in a tabular format.om-current/bin/ozone admin om roles
om-current/bin/ozone admin om roles --table
scm-current/bin/ozone admin scm roles
scm-current/bin/ozone admin scm roles --table
What is the link to the Apache JIRA
JIRA: HDDS-11268. [CLI] Improve CLI Display OM/SCM Roles.
How was this patch tested?
To validate this in a production environment.