KAFKA-17011: Fix a bug preventing features from supporting v0 #16421
jolshan merged 14 commits into apache:trunk
As part of KIP-584, brokers expose a range of supported versions for each feature. For example, metadata.version might be supported from 1 to 21. (Note that feature level ranges are always inclusive, so this would include both level 1 and 21.)
These supported ranges are supposed to be able to include 0. For example, it should be possible for a broker to support a kraft.version between 0 and 1. However, in older software versions, there is an assertion in org.apache.kafka.common.feature.SupportedVersionRange that prevents this. This causes problems when the older software attempts to deserialize an ApiVersionsResponse containing such a range.
In order to resolve this dilemma, this PR bumps the version of ApiVersionsRequest from 3 to 4. Clients which send v4 promise to be able to handle ranges including 0. Clients which send v3 will not be exposed to these ranges; the feature will simply be omitted from the response. This work is part of KIP-1022.
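To make the incompatibility concrete, here is a minimal sketch of the two behaviors described above. `VersionRangeSketch`, `legacy`, and `relaxed` are hypothetical names used only for illustration; the real check lives in org.apache.kafka.common.feature.SupportedVersionRange and may differ in detail.

```java
// Hypothetical stand-in for illustration; not the real SupportedVersionRange.
final class VersionRangeSketch {
    final short min;
    final short max;

    private VersionRangeSketch(short min, short max) {
        this.min = min;
        this.max = max;
    }

    // Assumed behavior of older software: a minimum below 1 is rejected.
    static VersionRangeSketch legacy(short min, short max) {
        if (min < 1 || max < min) {
            throw new IllegalArgumentException(
                "Expected min >= 1 and max >= min, got min=" + min + ", max=" + max);
        }
        return new VersionRangeSketch(min, max);
    }

    // Assumed behavior of newer software: a minimum of 0 is accepted.
    static VersionRangeSketch relaxed(short min, short max) {
        if (min < 0 || max < min) {
            throw new IllegalArgumentException(
                "Expected min >= 0 and max >= min, got min=" + min + ", max=" + max);
        }
        return new VersionRangeSketch(min, max);
    }

    // Ranges are inclusive at both ends.
    boolean contains(short level) {
        return level >= min && level <= max;
    }

    public static void main(String[] args) {
        VersionRangeSketch kraft = relaxed((short) 0, (short) 1);
        System.out.println("relaxed range contains 0: " + kraft.contains((short) 0));
        try {
            legacy((short) 0, (short) 1);
        } catch (IllegalArgumentException e) {
            System.out.println("legacy reader rejects the range: " + e.getMessage());
        }
    }
}
```

A reader that behaves like `legacy` fails as soon as it sees a range starting at 0, which is why the server must avoid sending such ranges to clients that have not opted in.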
Did we check the kraft upgrade system tests? I can kick off a job. Looks like there may also be some issue with the build.
> It seems simpler to just pass the ApiVersionsResponse version as a parameter to the Supplier (hence making it a Function, I guess)
Updated.
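For readers following along, the Supplier-to-Function change suggested above can be sketched roughly as follows. All names here are hypothetical, a `String` stands in for the response object, and the "v4 and later" rule is an assumption taken from the PR description rather than from the actual code.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical names throughout; the String result stands in for a response object.
final class ResponseFactorySketch {
    // Before: a Supplier has no input, so it cannot react to the request version.
    static String fromSupplier(Supplier<String> responseSupplier) {
        return responseSupplier.get();
    }

    // After: a Function receives the request version and can shape the response.
    static String fromFunction(short requestVersion, Function<Short, String> responseFunction) {
        return responseFunction.apply(requestVersion);
    }

    // Assumption for this sketch: v4 and later may see feature level 0.
    static String describe(short requestVersion) {
        return requestVersion >= 4 ? "level 0 visible" : "level 0 hidden";
    }

    public static void main(String[] args) {
        System.out.println(fromSupplier(() -> "same response for every version"));
        System.out.println(fromFunction((short) 3, ResponseFactorySketch::describe));
        System.out.println(fromFunction((short) 4, ResponseFactorySketch::describe));
    }
}
```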
Let's take a look at
It looks like the test was checking the ApiVersionsRequest header version against 3, so it needed to be updated to 4.
The failing tests pass locally.
junrao left a comment
@cmccabe : Thanks for the updated PR. A couple of comments.
Also, I asked a follow-up question on the mailing list about whether it's better to suppress V0 features completely or just exclude V0 from such features. It would be useful to reach consensus on that.
junrao left a comment
@cmccabe : Thanks for the updated PR. The code LGTM. There was still some discussion on the mailing list for KIP-1022 about the best approach for addressing this issue. We probably need to reach some consensus there first.
Also, are the following 2 test failures related?
Build / JDK 8 and Scala 2.12 / kafka.admin.ListOffsetsIntegrationTest."testThreeRecordsInOneBatchHavingDifferentCompressionTypeWithServer(String).quorum=kraft"
Build / JDK 8 and Scala 2.12 / org.apache.kafka.tools.GetOffsetShellTest.testTopicPartitionsArgWithInternalIncluded [2] Type=Raft-Isolated, MetadataVersion=4.0-IV0,Security=PLAINTEXT
#16183 will wait for this PR in order to not break tests.
| { "name": "SupportedFeatures", "type": "[]SupportedFeatureKey", "ignorable": true, | ||
| "versions": "3+", "tag": 0, "taggedVersions": "3+", | ||
| "about": "Features supported by the broker.", | ||
| "about": "Features supported by the broker. Note: in v0-v3, features with MinSupportedVersion = 0 must be left out.", |
MinSupportedVersion -> MinVersion
> Note: in v0-v3, features with MinSupportedVersion = 0 must be left out.
This needs to be adjusted accordingly since we now keep the feature but return 1 as the minSupportedVersion.
> This needs to be adjusted accordingly since we now keep the feature but returns 1 as the minSupportedVersion.
done
| def testSendV4ApiVersionsRequest(quorum: String): Unit = { | ||
| val response = sendApiVersionsRequest(4) | ||
| if (quorum.equals("kraft")) { | ||
| assertFeatureHasMinVersion("group.version", response.data().supportedFeatures(), 0) |
| return new BrokerRegistrationRequest(data, version); | ||
| if (version < 4) { | ||
| // Workaround for KAFKA-17011: for BrokerRegistrationRequest versions older than 4, | ||
| // exclude support version ranges that begin with 0. |
I think we still need to finalize/agree here about omission vs. using 1 for older versions of the API.
Right now we omit for broker registration but use version 1 for ApiVersions. We also have the option to use 1 for both or omit for both. Note: In 3.8, we omit for both requests.
I think there is some benefit in omitting a feature whose range starts at 0 for older controllers rather than reporting 1, but I can also understand that it is confusing to have different approaches for different APIs.
@junrao I know you had some opinions here.
It seems that it's simpler for the old BrokerRegistrationRequest to be consistent with the older version of ApiVersionsResponse, i.e., to include the feature but set the minSupportedVersion to 1 instead of 0.
OK. Just like in ApiVersionsResponse, we will translate min version = 0 to min version = 1 here.
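A rough sketch of the translation agreed on above, for anyone skimming the thread. `RegistrationTranslationSketch`, `Feature`, and `featuresFor` are hypothetical names, not the generated BrokerRegistrationRequestData classes, and the "older than 4" cutoff is taken from the diff comment quoted earlier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical types for illustration; not the generated request data classes.
final class RegistrationTranslationSketch {
    static final class Feature {
        final String name;
        final short minSupportedVersion;
        final short maxSupportedVersion;

        Feature(String name, short minSupportedVersion, short maxSupportedVersion) {
            this.name = name;
            this.minSupportedVersion = minSupportedVersion;
            this.maxSupportedVersion = maxSupportedVersion;
        }
    }

    // Returns the features to send for the given request version.
    // For request versions older than 4, a minimum of 0 is reported as 1.
    // Note: this sketch does not handle a range whose maximum is also 0.
    static List<Feature> featuresFor(short requestVersion, List<Feature> supported) {
        List<Feature> out = new ArrayList<>();
        for (Feature f : supported) {
            short min = f.minSupportedVersion;
            if (requestVersion < 4 && min == 0) {
                min = 1;
            }
            out.add(new Feature(f.name, min, f.maxSupportedVersion));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Feature> supported = Arrays.asList(new Feature("kraft.version", (short) 0, (short) 1));
        for (short version = 3; version <= 4; version++) {
            Feature f = featuresFor(version, supported).get(0);
            System.out.println("request v" + version + ": " + f.name
                + " min=" + f.minSupportedVersion + " max=" + f.maxSupportedVersion);
        }
    }
}
```

Building a new list instead of mutating the input keeps the broker's own view of its supported range intact; whether the real code copies or mutates is not something this sketch tries to mirror.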
| }, | ||
| { "name": "Features", "type": "[]Feature", | ||
| "about": "The features on this broker", "versions": "0+", "fields": [ | ||
| "about": "The features on this broker. Note: in v0-v3, features with MinSupportedVersion = 0 must be left out.", "versions": "0+", "fields": [ |
> Note: in v0-v3, features with MinSupportedVersion = 0 must be left out.
This needs to be adjusted accordingly since we now keep the feature but return 1 as the minSupportedVersion.
| })) | ||
| } | ||
| if (unstableFeatureVersionsEnabled) { | ||
| features.put("kraft.version", new SupportedVersionRange(1)) |
Should we add kraft.version to PRODUCTION_FEATURES, which is supposed to be the place that tracks all features?
I will do that in a follow-on PR.
I'm a little confused here. This has one argument? And is kraft.version part of this PR?
> I'm a little confused here. This has one argument?
There's a constructor that creates a range 0-N based on just being given N. I will change it to be the two-argument constructor since I guess that's confusing.
> And is kraft.version part of this PR?
I want to have it here for test purposes. I will create it for real in a follow-on PR, if this PR ever concludes.
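The constructor confusion in this thread is easier to see in code. The class below is a hypothetical illustration of the two shapes discussed (a one-argument form that implies a minimum of 0, and an explicit two-argument form); it is not the real SupportedVersionRange.

```java
// Hypothetical class for illustration; only the constructor shapes mirror the discussion.
final class RangeConstructorSketch {
    final short min;
    final short max;

    // Two-argument form: both bounds are explicit at the call site.
    RangeConstructorSketch(short min, short max) {
        if (min < 0 || max < min) {
            throw new IllegalArgumentException("invalid range " + min + "-" + max);
        }
        this.min = min;
        this.max = max;
    }

    // One-argument form: the minimum is implicitly 0, which is easy to misread.
    RangeConstructorSketch(short max) {
        this((short) 0, max);
    }

    public static void main(String[] args) {
        RangeConstructorSketch implicit = new RangeConstructorSketch((short) 1);
        RangeConstructorSketch explicit = new RangeConstructorSketch((short) 0, (short) 1);
        System.out.println("implicit: " + implicit.min + "-" + implicit.max);
        System.out.println("explicit: " + explicit.min + "-" + explicit.max);
    }
}
```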
I guess I was confused because I didn't see it in the file, but 🤷‍♀️
| private Map<String, Short> finalizedFeatures = null; | ||
| private long finalizedFeaturesEpoch = 0; | ||
| private boolean zkMigrationEnabled = false; | ||
| private boolean suppressFeatureLevel0 = false; |
suppressFeatureLevel0 => alterFeatureLevel0 ?
| return this; | ||
| } | ||
| public Builder setSuppressFeatureLevel0(boolean suppressFeatureLevel0) { |
suppressFeatureLevel0 => alterFeatureLevel0 ?
| final boolean zkMigrationEnabled | ||
| private static SupportedFeatureKeyCollection maybeFilterSupportedFeatureKeys( | ||
| Features<SupportedVersionRange> latestSupportedFeatures, | ||
| boolean suppressV0 |
| @ParameterizedTest | ||
| @ValueSource(booleans = {false, true}) | ||
| public void testSuppressV0Features(boolean suppressV0Features) { |
testSuppressV0Features => testAlterV0Features ?
suppressV0Features => alterV0Features ?
| def enabledApis: collection.Set[ApiKeys] | ||
| def apiVersionResponse(throttleTimeMs: Int): ApiVersionsResponse | ||
| def apiVersionResponse(throttleTimeMs: Int, suppressFeatureLevel0: Boolean): ApiVersionsResponse |
suppressFeatureLevel0 => alterFeatureLevel0 ?
| override def apiVersionResponse(throttleTimeMs: Int): ApiVersionsResponse = { | ||
| override def apiVersionResponse( | ||
| throttleTimeMs: Int, | ||
| suppressFeatureLevel0: Boolean |
suppressFeatureLevel0 => alterFeatureLevel0 ?
| val supportedFeatures = brokerFeatures.supportedFeatures | ||
| override def apiVersionResponse( | ||
| throttleTimeMs: Int, | ||
| suppressFeatureLevel0: Boolean |
suppressFeatureLevel0 => alterFeatureLevel0 ?
Looks like there are still quite a few failures for ApiVersionsRequestTest and BrokerFeaturesTest.
| // Change expected message to reflect latest MetadataVersion (SupportedMaxVersion increases when adding a new version) | ||
| assertEquals("Feature: kraft.version\tSupportedMinVersion: 0\t" + | ||
| "SupportedMaxVersion: 1\tFinalizedVersionLevel: 0\t", outputWithoutEpoch(features.get(0))); |
How does this test work? Do we explicitly set the FinalizedVersionLevel for kraft.version to 0 somewhere?
> Do we explicitly set the FinalizedVersionLevel for kraft.version to 0 somewhere?
Anything not enabled is at level 0.
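In other words, level 0 is the implicit default rather than something stored. A minimal sketch of that lookup, assuming the finalized levels are held in a plain map (the class and method names are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not the actual Kafka lookup code.
final class FinalizedLevelSketch {
    // A feature with no finalized entry is treated as being at level 0.
    static short finalizedLevel(Map<String, Short> finalizedFeatures, String featureName) {
        return finalizedFeatures.getOrDefault(featureName, (short) 0);
    }

    public static void main(String[] args) {
        Map<String, Short> finalized = new HashMap<>();
        finalized.put("metadata.version", (short) 21);
        System.out.println("metadata.version: " + finalizedLevel(finalized, "metadata.version"));
        System.out.println("kraft.version: " + finalizedLevel(finalized, "kraft.version"));
    }
}
```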
The context here is that we’re adding kraft.version in 3.9. But not in this PR. I did add kraft.version to supportedFeatures, simply so that I could verify that the RPC results looked like what I wanted. This will cause no compatibility problems since there is no way to set kraft.version yet.
jolshan left a comment
LGTM as long as tests pass. (For the follow-up that adds kraft.version, let's try to follow the conventions for new features so we don't hardcode everything 👍)
Test failures are unrelated. I will merge.
…#16421)
As part of KIP-584, brokers expose a range of supported versions for each feature. For example, metadata.version might be supported from 1 to 21. (Note that feature level ranges are always inclusive, so this would include both level 1 and 21.)
These supported ranges are supposed to be able to include 0. For example, it should be possible for a broker to support a kraft.version between 0 and 1. However, in older software versions, there is an assertion in org.apache.kafka.common.feature.SupportedVersionRange that prevents this. This causes problems when the older software attempts to deserialize an ApiVersionsResponse containing such a range.
In order to resolve this dilemma, this PR bumps the version of ApiVersionsRequest from 3 to 4. Clients which send v4 promise to be able to handle ranges including 0. Clients which send v3 will not be exposed to these ranges. The feature will show up as having a minimum version of 1 instead. This work is part of KIP-1022.
Similarly, this PR also introduces a new version of BrokerRegistrationRequest, and specifies that the older versions of that RPC cannot handle supported version ranges including 0. Therefore, 0 is translated to 1 in the older requests.
Reviewers: Jun Rao <junrao@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>, Justine Olshan <jolshan@confluent.io>
| iter.hasNext(); ) { | ||
| BrokerRegistrationRequestData.Feature feature = iter.next(); | ||
| if (feature.minSupportedVersion() == 0) { | ||
| feature.setMinSupportedVersion((short) 1); |
There is another issue caused by changing the version from 0 to 1. #15685 added a check which assumes the "default version" is 0.
Hence, features having min=1 can NOT pass the check. See the following error message:
[2024-09-05 11:22:22,146] INFO [QuorumController id=2] registerBroker: event failed with UnsupportedVersionException in 127 microseconds. Exception message: Unable to register because the broker does not support version 0 of group.version. It wants a version between 1 and 1, inclusive. (org.apache.kafka.controller.QuorumController)
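The failure above can be reproduced in miniature. The sketch below is a guess at the shape of the check, based only on the error message: the controller requires that the default level (assumed to be 0 here) falls inside the range the broker advertises. `DefaultLevelCheckSketch` and `checkCanRegister` are hypothetical names, and `IllegalStateException` stands in for Kafka's UnsupportedVersionException so the example stays self-contained.

```java
// Hypothetical check for illustration; the real logic lives in the controller.
final class DefaultLevelCheckSketch {
    // Throws if the default level lies outside the broker's inclusive supported range.
    static void checkCanRegister(String feature, short defaultLevel, short brokerMin, short brokerMax) {
        if (defaultLevel < brokerMin || defaultLevel > brokerMax) {
            throw new IllegalStateException(
                "Unable to register because the broker does not support version "
                + defaultLevel + " of " + feature + ". It wants a version between "
                + brokerMin + " and " + brokerMax + ", inclusive.");
        }
    }

    public static void main(String[] args) {
        // A range starting at 0 passes when the default level is 0.
        checkCanRegister("group.version", (short) 0, (short) 0, (short) 1);
        try {
            // After 0 is translated to 1, the same default level is out of range.
            checkCanRegister("group.version", (short) 0, (short) 1, (short) 1);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```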
@chia7712 Thanks. Is there a way to reproduce this (e.g. an existing test)?
> Is there a way to reproduce this (e.g. an existing test)?
Run the following command with #17084:
TC_PATHS="tests/kafkatest/tests/core/kraft_upgrade_test.py::TestKRaftUpgrade.test_combined_mode_upgrade" \
/bin/bash tests/docker/run_tests.sh
Originally I thought that we could not set version 0 with the storage or update tools since it would treat it as disabling the feature. But @dajac pointed out to me that this is happening by setting the MV, which picks the default features.
However, the min version of 1 should only apply to older request versions. I thought we fixed this so that we wouldn't have the requirement for version 4 and for 4.0, where group version is introduced.
When I last talked to Colin about this, I believe he said we could not disable the feature if the min version is > 0.
> When I last talked to Colin about this, I believe he said we could not disable the feature if the min version is > 0.
@cmccabe : Could you confirm if this is the case? How would people upgrade from an old release where a feature is not turned on?
I would suspect you have to upgrade the feature before the upgrade to the new release, but we can let Colin confirm.
Checked with Colin offline. To summarize, if we do increase minVersion in the future, we need to bump up the default version for that feature. An old release may need to first upgrade to a bridge release where the new default version could be set. Given that, we don't need to change the logic in ClusterControlManager.
> Given that, we don't need to change the logic in ClusterControlManager.
I will sync it to KAFKA-17429 and #17128