@bge-kernel-panic

We needed to decode the consumer offset data exactly as stored, rather than Kafka's view of it through its built-in deserializer, which glosses over version differences.

This PR adds:

  1. Support for various binary formats used internally by Kafka, such as varints and variable-length strings.
  2. Support for offset data decoding. This should be current up to Kafka 2.6. Note that group data is only partially decoded because it contains an array of records. We were not particularly interested in that data; we only wanted to parse offsets without kcat stopping when it encounters data it cannot decode in __consumer_offsets.
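
To illustrate the varint format mentioned in item 1, here is a minimal sketch of a base-128 varint decoder. It is not the code from this PR: the function names `decode_uvarint` and `zigzag_decode` are hypothetical, and whether a given field uses the plain unsigned form or the zigzag-encoded signed form is an assumption to check against the Kafka schema being decoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of kcat.
 * Decode an unsigned base-128 varint from buf[0..len): each byte carries
 * 7 payload bits, least significant group first, and the high bit marks
 * "more bytes follow".
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or longer than the 10 bytes a 64-bit value can need. */
size_t decode_uvarint(const uint8_t *buf, size_t len, uint64_t *out) {
    assert(buf != NULL && out != NULL);
    uint64_t value = 0;
    unsigned shift = 0;
    for (size_t i = 0; i < len && i < 10; i++) {
        value |= (uint64_t)(buf[i] & 0x7f) << shift;
        if (!(buf[i] & 0x80)) {
            *out = value;
            return i + 1;
        }
        shift += 7;
    }
    return 0;
}

/* Map a zigzag-encoded value back to a signed integer:
 * 0 -> 0, 1 -> -1, 2 -> 1, 3 -> -2, ... */
int64_t zigzag_decode(uint64_t v) {
    return (int64_t)(v >> 1) ^ -(int64_t)(v & 1);
}
```

For example, the two bytes `0xAC 0x02` decode to 300 with two bytes consumed. A varint-prefixed string would then read that many bytes after the count, subject to whatever length convention the schema in question uses.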

Benoit Goudreault-Emond added 4 commits July 13, 2022 18:25
- add : and , as literal characters
- hopefully fix endian swap for 16-bit integers
- Add "S" - count + string, used by Kafka for consumer offsets
- Add "U" - UUID
- Add "v" - varint (used by Kafka for some values)
- Add "C" - varint count + string
- Add "t" - 64-bit millisecond timestamp - note that milliseconds are not displayed;
            this uses ctime(3), which isn't ideal (especially since the newline
            is removed by munging the buffer directly), but should work well
            enough for this use case
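
As a rough illustration of two of the format characters above, the sketch below decodes a count-prefixed string and formats a millisecond timestamp with ctime(3). It is not the code from these commits. The function names are hypothetical, and the width of the count is an assumption: it is taken to be a big-endian 16-bit length, as in Kafka's protocol string type, with 0xFFFF treated as a null string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: decode a big-endian 16-bit count followed by that
 * many bytes of string data (assumed layout for "S").
 * Returns the number of bytes consumed, or 0 on malformed input or if
 * dst is too small. A count of 0xFFFF (-1) is treated as a null string
 * and yields an empty result. */
size_t decode_short_string(const uint8_t *buf, size_t len,
                           char *dst, size_t dst_size) {
    assert(buf != NULL && dst != NULL);
    if (len < 2 || dst_size < 1)
        return 0;
    /* Assemble the count byte by byte so host endianness does not matter. */
    uint16_t count = (uint16_t)(((unsigned)buf[0] << 8) | buf[1]);
    if (count == 0xFFFF) {
        dst[0] = '\0';
        return 2;
    }
    if ((count & 0x8000) || count > len - 2 || (size_t)count + 1 > dst_size)
        return 0;
    memcpy(dst, buf + 2, count);
    dst[count] = '\0';
    return 2 + (size_t)count;
}

/* Hypothetical helper: format a millisecond epoch timestamp as local time
 * using ctime(3). The sub-second part is dropped, and the trailing newline
 * is cut by copying only up to it instead of editing ctime's buffer.
 * Returns 0 on success, -1 on failure. Note ctime() is not thread-safe. */
int format_ms_timestamp(int64_t ms, char *dst, size_t dst_size) {
    assert(dst != NULL);
    time_t secs = (time_t)(ms / 1000);
    const char *text = ctime(&secs);
    if (text == NULL)
        return -1;
    size_t n = strcspn(text, "\n");
    if (n + 1 > dst_size)
        return -1;
    memcpy(dst, text, n);
    dst[n] = '\0';
    return 0;
}
```

Copying up to the newline avoids writing into the static buffer that ctime(3) returns, at the cost of one extra copy.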
The problem with consumer offset is that it's versioned, so we need
to have some special logic for it.

This is a bit of a hack, reusing macros from the regular unpack and
thus creating a bunch of fake variables so the macros work. But it
does work properly.
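
The version-dependent logic described above can be sketched as follows. This is a standalone illustration, not the macro-based code in this PR. The struct, the helper names, and the per-version field layout (v0 to v3 of the offset commit value) are assumptions based on my reading of Kafka's schema and should be verified against the Kafka source before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded form of an offset commit value. */
struct offset_value {
    int16_t version;
    int64_t offset;
    int32_t leader_epoch;     /* -1 when the version has no such field */
    char    metadata[256];
    int64_t commit_timestamp;
    int64_t expire_timestamp; /* -1 when the version has no such field */
};

struct cursor { const uint8_t *p; size_t left; };

/* Read a big-endian unsigned integer of `width` bytes (1..8). */
static int take_be(struct cursor *c, size_t width, uint64_t *out) {
    if (c->left < width) return -1;
    uint64_t v = 0;
    for (size_t i = 0; i < width; i++) v = (v << 8) | c->p[i];
    c->p += width;
    c->left -= width;
    *out = v;
    return 0;
}

/* Read a 16-bit count followed by string bytes; 0xFFFF means null. */
static int take_string(struct cursor *c, char *dst, size_t dst_size) {
    uint64_t n;
    if (take_be(c, 2, &n) != 0) return -1;
    if (n == 0xFFFF) n = 0;
    if ((n & 0x8000) || n > c->left || n + 1 > dst_size) return -1;
    memcpy(dst, c->p, (size_t)n);
    dst[n] = '\0';
    c->p += n;
    c->left -= (size_t)n;
    return 0;
}

/* Assumed layouts (to be verified against Kafka):
 *   v0, v2: offset, metadata, commit timestamp
 *   v1:     offset, metadata, commit timestamp, expire timestamp
 *   v3:     offset, leader epoch, metadata, commit timestamp
 * Returns 0 on success, -1 on truncated input or unknown version. */
int decode_offset_value(const uint8_t *buf, size_t len,
                        struct offset_value *out) {
    assert(buf != NULL && out != NULL);
    struct cursor c = { buf, len };
    uint64_t raw;
    if (take_be(&c, 2, &raw) != 0 || raw > 3) return -1;
    out->version = (int16_t)raw;
    out->leader_epoch = -1;
    out->expire_timestamp = -1;
    if (take_be(&c, 8, &raw) != 0) return -1;
    out->offset = (int64_t)raw;
    if (out->version >= 3) {
        if (take_be(&c, 4, &raw) != 0) return -1;
        out->leader_epoch = (int32_t)(uint32_t)raw;
    }
    if (take_string(&c, out->metadata, sizeof out->metadata) != 0) return -1;
    if (take_be(&c, 8, &raw) != 0) return -1;
    out->commit_timestamp = (int64_t)raw;
    if (out->version == 1) {
        if (take_be(&c, 8, &raw) != 0) return -1;
        out->expire_timestamp = (int64_t)raw;
    }
    return 0;
}
```

Reading the version first and branching on it keeps each field read explicit, which is the behaviour a generic format string cannot express on its own.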
Handle v3 group metadata.