KAFKA-17018: update MetadataVersion for the Kafka release 3.9#16841
KAFKA-17018: update MetadataVersion for the Kafka release 3.9#16841cmccabe merged 1 commit intoapache:trunkfrom
Conversation
- Mark 3.9-IV0 as stable. Metadata version 3.9-IV0 should return Fetch version 17. - Move ELR to 4.0-IV0. Remove 3.9-IV1 since it's no longer needed. - Create a new 4.0-IV1 MV for KIP-848.
| * IT CANNOT BE CHANGED.</strong> | ||
| */ | ||
| public static final MetadataVersion LATEST_PRODUCTION = IBP_3_8_IV0; | ||
| public static final MetadataVersion LATEST_PRODUCTION = IBP_3_9_IV0; |
There was a problem hiding this comment.
Is KIP-1005 done? Could we set latestVersionUnstable to false for ListOffsetRequest in 3.9?
KIP-1005 is completed and backport to 3.9, so yes latestVersionUnstable should be false now.
There was a problem hiding this comment.
@junrao : yes, KIP-1005 will ship in AK 3.9.
There was a problem hiding this comment.
do you plan to update
I assumed @cmccabe will update it in this PR 😄
Will file a jira and PR for trunk and 3.9 later
There was a problem hiding this comment.
Yes -- sorry I was imprecise. If the api version is advertised by a MV that is stable it sould also be stable.
I didn't realize ListOffset was being used in the replication path. I guess this is just for tiered storage.
There was a problem hiding this comment.
However, by default, v9 of ListOffset is not supported with latestVersionUnstable set to true. So, I still don't understand why some existing tests don't fail. We probably need to understand the testing gap. If we have sufficient test coverage, we probably don't need additional protocols.
yes, you are right. we should have a test for replica path. I just write a IT which can produce following error when LIST_OFFSET has unstable version:
[2024-08-14 03:27:13,119] ERROR Closing socket for 127.0.0.1:36885-127.0.0.1:48854-3 because of error (kafka.network.Processor:76)
org.apache.kafka.common.errors.InvalidRequestException: Received request api key LIST_OFFSETS with version 9 which is not enabled
Also, it seems we create IT with unstable APIs always [0] and I guess that is why we don't see failed tests.
Summary
- broker-to-broker: the LIST_OFFSET version is based on MV [1], and we use IT to protect us from updating MV with an unstable version (https://issues.apache.org/jira/browse/KAFKA-17336)
- new-client-to-old-broker: the builder of LIST_OFFSET will check the timestamp and its matched version (https://issues.apache.org/jira/browse/KAFKA-17331)
- old-client-to-new-broker: the broker will check the timestamp and its matched version (https://issues.apache.org/jira/browse/KAFKA-17331)
I didn't realize ListOffset was being used in the replication path. I guess this is just for tiered storage.
@jolshan not really, the request is used to truncate in the replication path. I have attached a sample code in https://issues.apache.org/jira/browse/KAFKA-17336. we will complete it tomorrow.
[0]
[1]
There was a problem hiding this comment.
Thanks. https://issues.apache.org/jira/browse/KAFKA-17336 should be good to not release with unstable versions that break api usage.
There was a problem hiding this comment.
Also, it seems we create IT with unstable APIs always [0] and I guess that is why we don't see failed tests.
Thanks for the investigation, @chia7712. Maybe we just need to disable unstable API/MV in a few existing integration tests.
There was a problem hiding this comment.
Maybe we just need to disable unstable API/MV in a few existing integration tests.
yep, we are looking for the similar tests
| // TEST_2 released right before MV 3.9-IVO was released, and it depends on this metadata version | ||
| TEST_2(2, MetadataVersion.IBP_3_9_IV0, Collections.singletonMap(MetadataVersion.FEATURE_NAME, MetadataVersion.IBP_3_9_IV0.featureLevel())); | ||
| // TEST_2 released right before MV 4.0-IVO was released, and it depends on this metadata version | ||
| TEST_2(2, MetadataVersion.IBP_4_0_IV0, Collections.singletonMap(MetadataVersion.FEATURE_NAME, MetadataVersion.IBP_4_0_IV0.featureLevel())); |
There was a problem hiding this comment.
@jolshan : What MV should TEST_2 depend on? Do we need to keep changing it with new MVs? It would be useful to add a comment to document this.
There was a problem hiding this comment.
I'm sorry I haven't gotten a chance to update this. I was hoping to have one of the test versions be the latest unstable version, but I believe I updated the tests for them to work either way now. Let me double check.
If so we can just keep at 3.9.
There was a problem hiding this comment.
Ok. This is the test that will fail if we don't change the version to 4.0. https://github.com/apache/kafka/blame/98cdf9717049e87ba34bb5161276577fcb8bd1c4/server-common/src/test/java/org/apache/kafka/server/common/FeaturesTest.java#L137
There was a problem hiding this comment.
We have a similar test here for real features, so if we think this is sufficient, we can remove the test I linked above. https://github.com/apache/kafka/blame/98cdf9717049e87ba34bb5161276577fcb8bd1c4/server-common/src/test/java/org/apache/kafka/server/common/FeaturesTest.java#L51
There was a problem hiding this comment.
Thanks, Justine. Could we just use MetadataVersion.latestTesting() to avoid having to keep updating this with new MVs?
There was a problem hiding this comment.
LatestTesting doesn't quite work since it is not guaranteed that latestTesting is unstable right?
There was a problem hiding this comment.
Well, the only case when LatestTesting is stable is that there is no unstable MV, right?
There was a problem hiding this comment.
Right -- in that case this test will fail. https://github.com/apache/kafka/blame/98cdf9717049e87ba34bb5161276577fcb8bd1c4/server-common/src/test/java/org/apache/kafka/server/common/FeaturesTest.java#L137
My understanding is that we don't create a new MV when we mark one as stable, so for example, if all the MVs were marked stable, the test would fail.
There was a problem hiding this comment.
Alternatively, I could write the test to check if the latest MV is stable, and if so, skip the check.
There was a problem hiding this comment.
We should probably just document that the last MV should always be unstable. I don't think it's ever been stable since "unstable" was created (we don't seem to stop developing features...)
Should we do this in a follow-on so that we can unblock 3.9?
| */ | ||
| @ParameterizedTest | ||
| @ValueSource(strings = {"3.4-IV0", "3.5-IV2", "3.6-IV0", "3.7-IV2", "3.9-IV1"}) | ||
| @ValueSource(strings = {"3.4-IV0", "3.5-IV2", "3.6-IV0", "3.7-IV2", "4.0-IV0"}) |
There was a problem hiding this comment.
I had a followup comment in #16347 (comment). Do we need to add 4.0-IV0 for testNoLeaderEpochBumpOnEmptyTargetIsr?
There was a problem hiding this comment.
It would not make sense to add 4.0-IV0 to testNoLeaderEpochBumpOnEmptyTargetIsr, since in 4.0 an empty target ISR actually WILL trigger a leader epoch bump.
The reason is because once KIP-966 is implemented, empty ISRs will actually be possible, whereas they are not prior to KIP-966
junrao
left a comment
There was a problem hiding this comment.
@cmccabe : Thanks for the explanation. LGTM assuming we will have a followup PR to address #16841 (comment).
will file PR later (https://issues.apache.org/jira/browse/KAFKA-17319) |
- Mark 3.9-IV0 as stable. Metadata version 3.9-IV0 should return Fetch version 17. - Move ELR to 4.0-IV0. Remove 3.9-IV1 since it's no longer needed. - Create a new 4.0-IV1 MV for KIP-848. Reviewers: Jun Rao <junrao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Justine Olshan <jolshan@confluent.io>
…#16901) After some discussion on: #16841 (comment) We decided it is best for test version to always map to MetadataVersion.latestTesting. We should always have one unstable MV (either because there is a feature being worked on OR when we mark the latest version stable we create a new one). Reviewers: Jun Rao <junrao@apache.org>
Mark 3.9-IV0 as stable. Metadata version 3.9-IV0 should return Fetch version 17.
Move ELR to 4.0-IV0. Remove 3.9-IV1 since it's no longer needed.
Create a new 4.0-IV1 MV for KIP-848.