Skip to content

KAFKA-10817; Add clusterId validation to Fetch handling#10129

Merged
hachikuji merged 5 commits intoapache:trunkfrom
dajac:KAFKA-10817
Feb 19, 2021
Merged

KAFKA-10817; Add clusterId validation to Fetch handling#10129
hachikuji merged 5 commits intoapache:trunkfrom
dajac:KAFKA-10817

Conversation

@dajac
Copy link
Copy Markdown
Member

@dajac dajac commented Feb 16, 2021

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@dajac dajac marked this pull request as ready for review February 17, 2021 17:55
@dajac dajac requested a review from hachikuji February 17, 2021 17:55
Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall. I had a couple small comments. One additional question is whether the schema should explicitly specify the type of clusterId as "uuid." Sadly, we have already released the version with this field in 2.7:

    { "name": "ClusterId", "type": "string", "versions": "12+", "nullableVersions": "12+", "default": "null", "taggedVersions": "12+", "tag": 0,
      "about": "The clusterId if known. This is used to validate metadata fetches prior to broker registration." },

It is not used, though. We could consider tag 0 to be burned, and introduce a new clusterId field with the "uuid" type. What do you think?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am considering if we should have a more explicit name here. Specifically we are checking that the clusterId matches. Maybe INCONSISTENT_CLUSTER_ID would be clearer?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point. INVALID is mainly used by all the other errors so it also makes sense to follow the "naming convention". I don't feel strong either ways.

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji Feb 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find any existing INVALID* error code that seems to fit this case. Usually "invalid" is reserved for cases where the field is structurally invalid. For example, INVALID_GROUP_ID is used when the groupid is empty in APIs where we require it to be non-empty. The closest similar case is INVALID_PRODUCER_ID_MAPPING.

We are going to add an INCONSISTENT_TOPIC_ID in #10143. Perhaps that is enough cover here? The usage is similar: the request indicates an id which does not match the local state.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth logging something here?

@hachikuji
Copy link
Copy Markdown
Contributor

I had some offline discussion with @cmccabe and @ijuma about the clusterId type. I think it makes sense for now to keep it as a string since this is consistent with most uses in the current codebase and it is what is documented in the KIP. I do wonder if it might be more efficient to do the comparison on the string rather than converting to Uuid though.

@ijuma
Copy link
Copy Markdown
Member

ijuma commented Feb 18, 2021

I agree that comparing the string is going to be more efficient than converting to uuid and then comparing.

@dajac
Copy link
Copy Markdown
Member Author

dajac commented Feb 18, 2021

@hachikuji @ijuma Yeah, that makes sense. I have changed this.

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji Feb 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find any existing INVALID* error code that seems to fit this case. Usually "invalid" is reserved for cases where the field is structurally invalid. For example, INVALID_GROUP_ID is used when the groupid is empty in APIs where we require it to be non-empty. The closest similar case is INVALID_PRODUCER_ID_MAPPING.

We are going to add an INCONSISTENT_TOPIC_ID in #10143. Perhaps that is enough cover here? The usage is similar: the request indicates an id which does not match the local state.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small detail which is probably ok. The clusterId field in the fetch schema is not currently marked as ignorable. That should be ok since it is only used in the raft implementation which can guarantee that we will have version 12 and above. On the other hand, I don't see any harm making the field ignorable since we are accepting a null value anyway. Is it worth changing that?

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hachikuji
Copy link
Copy Markdown
Contributor

I verified tests locally. I am going to merge to trunk and 2.8 since we seem blocked by jenkins at the moment.

@hachikuji hachikuji merged commit bbf145b into apache:trunk Feb 19, 2021
hachikuji pushed a commit that referenced this pull request Feb 19, 2021
This patch adds clusterId validation in the `Fetch` API as documented in KIP-595. A new error code `INCONSISTENT_CLUSTER_ID` is returned if the request clusterId does not match the value on the server. If no clusterId is provided, the request is treated as valid.

Reviewers: Jason Gustafson <jason@confluent.io>
) {
FetchRequestData request = (FetchRequestData) requestMetadata.data;

if (!hasValidClusterId(request)) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only validate FetchRequest here? should we also add validation to the other 4 rpcs?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants