Skip to content

KAFKA-13914: Add command line tool kafka-metadata-quorum.sh#12469

Merged
hachikuji merged 9 commits intoapache:trunkfrom
dengziming:KAFKA-13914
Aug 20, 2022
Merged

KAFKA-13914: Add command line tool kafka-metadata-quorum.sh#12469
hachikuji merged 9 commits intoapache:trunkfrom
dengziming:KAFKA-13914

Conversation

@dengziming
Copy link
Copy Markdown
Member

@dengziming dengziming commented Aug 2, 2022

More detailed description of your change,
if necessary. The PR title and PR message become
the squashed commit message, so use a separate
comment to ping reviewers.

Add MetadataQuorumCommand to describe quorum status, I'm trying to use arg4j style command format, currently, we only support one sub-command which is "describe" and we can specify 2 arguments which are --status and --replication.

# describe quorum status
kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --replication

ReplicaId	LogEndOffset	Lag	LastFetchTimeMs	LastCaughtUpTimeMs	Status  	
0        	10          	        0  	-1             	        -1                	                 Leader  	
1        	10          	        0  	-1             	        -1                	                 Follower	
2        	10          	        0  	-1             	        -1                	                 Follower	

kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --status
ClusterId:                             fMCL8kv1SWm87L_Md-I2hg
LeaderId:                             3002
LeaderEpoch:                      2
HighWatermark:                  10
MaxFollowerLag:                 0
MaxFollowerLagTimeMs:   -1
CurrentVoters:                    [3000,3001,3002]
CurrentObservers:              [0,1,2]

# specify AdminClient properties
kafka-metadata-quorum.sh --bootstrap-server localhost:9092 --command-config config.properties describe

Summary of testing strategy (including rationale)
for the feature or bug fix. Unit and/or integration
tests are expected for any behaviour change and
system tests should be considered for larger changes.

MetadataQuorumCommandTest and MetadataQuorumCommandErrorTest

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dengziming. I am hoping to persuade @jsancio to let us get this into 3.3.

Comment thread core/src/main/scala/kafka/admin/MetadataQuorumCommand.scala Outdated
Comment thread core/src/main/scala/kafka/admin/MetadataQuorumCommand.scala Outdated
Copy link
Copy Markdown
Member Author

@dengziming dengziming left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your review @hachikuji , There are some test failures: "Found 1 unexpected threads during @BeforeAll".
I will spend some time solving this failure and resolving your comments.

Comment thread core/src/main/scala/kafka/admin/MetadataQuorumCommand.scala Outdated
Comment thread core/src/main/scala/kafka/admin/MetadataQuorumCommand.scala Outdated
Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good. Left some small comments.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This raises an error if there is only one voter. I think we can just get rid of the filter? Then the max follower lag is just 0 if there is only one voter. We need another adjustment for maxFollowerLagTimeMs so that it also ends up as 0 when there is only one voter.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I also added a method testOnlyOneBrokerAndOneController for this case.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it's conventional to leave off parenthesis for getters in Scala. there are a few of these in here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially we could use TransactionsCommand.prettyPrintTable to do the formatting.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TransactionsCommand is in a separate module, I moved this method to Utils in the clients module and adjusted the method.

Copy link
Copy Markdown
Member Author

@dengziming dengziming left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your suggestions @hachikuji , and I also updated the MetadataQuorumCommandTest to test more cases, PTAL again.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I also added a method testOnlyOneBrokerAndOneController for this case.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TransactionsCommand is in a separate module, I moved this method to Utils in the clients module and adjusted the method.

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, left a few more comments. Seems like there are some test failures as well.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative might be to put this in server-common instead. We would have to add it as a dependency for tools, but it might be kind of nice to keep it out of clients if it's not really needed there.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if you saw my comment. I do think it is a little nicer to raise an IllegalStateException. It gives us a little protection from the future in case we break something in the parsing logic.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about NodeId instead of ReplicaId to go along with the node.id configuration?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe rename voter since this function is also handling observers

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it would be better not to expose this here if it is not supported by both implementations. We can still cast the ClusterInstance to RaftClusterInstance to get access to the method.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is it necessary to specify controllers when the type is ZK?

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM.

@hachikuji
Copy link
Copy Markdown
Contributor

@dengziming It looks like we need this patch in order to stabilize the new tests: #12517. I will try to merge it today and trigger a rebuild.

@dengziming dengziming force-pushed the KAFKA-13914 branch 3 times, most recently from 94ecc08 to cf8378e Compare August 19, 2022 09:24
@hachikuji
Copy link
Copy Markdown
Contributor

@dengziming I've disabled MetadataQuorumCommandTest for now so that we can get this checked in. I see a similar issue in #12469, but I am unable to reproduce it locally. Let's follow up to try and get to the bottom of it so that we can enable the test.

@hachikuji
Copy link
Copy Markdown
Contributor

hachikuji commented Aug 19, 2022

@dengziming I reverted the commit to disable the test. I think we found the cause of the flakiness and fixed here: d1936ab.

@hachikuji hachikuji merged commit 150fd5b into apache:trunk Aug 20, 2022
hachikuji pushed a commit that referenced this pull request Aug 20, 2022
 Add `MetadataQuorumCommand` to describe quorum status, I'm trying to use arg4j style command format, currently, we only support one sub-command which is "describe" and we can specify 2 arguments which are --status and --replication.

```
# describe quorum status
kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --replication

ReplicaId	LogEndOffset	Lag	LastFetchTimeMs	LastCaughtUpTimeMs	Status  	
0        	10          	        0  	-1             	        -1                	                 Leader  	
1        	10          	        0  	-1             	        -1                	                 Follower	
2        	10          	        0  	-1             	        -1                	                 Follower	

kafka-metadata-quorum.sh --bootstrap-server localhost:9092 describe --status
ClusterId:                             fMCL8kv1SWm87L_Md-I2hg
LeaderId:                             3002
LeaderEpoch:                      2
HighWatermark:                  10
MaxFollowerLag:                 0
MaxFollowerLagTimeMs:   -1
CurrentVoters:                    [3000,3001,3002]
CurrentObservers:              [0,1,2]

# specify AdminClient properties
kafka-metadata-quorum.sh --bootstrap-server localhost:9092 --command-config config.properties describe --status
```

Reviewers: Jason Gustafson <jason@confluent.io>
@dengziming
Copy link
Copy Markdown
Member Author

Thanks for your patience @hachikuji

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Kafka Broker tools

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants