
KAFKA-10439: Connect's Values to parse BigInteger as Decimal with zero scale.#9320

Merged
kkonstantine merged 1 commit into apache:trunk from avocader:KAFKA-10439
Oct 6, 2020

Conversation

@avocader
Contributor

The `org.apache.kafka.connect.data.Values#parse` method parses integers larger than `Long.MAX_VALUE` as `double` with `Schema.FLOAT64_SCHEMA`.

That means we are losing precision for these larger integers.

For example:

SchemaAndValue schemaAndValue = Values.parseString("9223372036854775808");

returns:

SchemaAndValue{schema=Schema{FLOAT64}, value=9.223372036854776E18}
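The precision loss can be demonstrated with the plain JDK, without any Kafka dependency. This is an illustrative sketch (the class and method names here are hypothetical, not Kafka code): a `double` has a 53-bit mantissa, so near 2^63 consecutive integers collapse onto the same representable value.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class PrecisionLoss {
    // True if the integer survives a round trip through double unchanged.
    static boolean survivesDoubleRoundTrip(String integer) {
        BigInteger exact = new BigInteger(integer);
        return new BigDecimal(exact.doubleValue()).toBigInteger().equals(exact);
    }

    public static void main(String[] args) {
        // Long.MAX_VALUE + 1 from the example above, parsed as a double:
        System.out.println(Double.parseDouble("9223372036854775808")); // 9.223372036854776E18

        // 2^63 + 1 rounds to 2^63 when converted to double, so the round
        // trip does not return the original integer:
        System.out.println(survivesDoubleRoundTrip("9223372036854775809")); // false
        System.out.println(survivesDoubleRoundTrip("1024"));               // true
    }
}
```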

Also, this method parses values that fit in `FLOAT32` as `FLOAT64`.

This PR changes the parsing logic to use `FLOAT32`/`FLOAT64` only for numbers that have a fractional part (`decimal.scale() != 0`), and to use the arbitrary-precision `org.apache.kafka.connect.data.Decimal` otherwise.
It also updates the method to parse numbers that can be represented as `float` as `FLOAT32`.

Added unit tests covering parsing of `BigInteger`, `Byte`, `Short`, `Integer`, `Long`, `Float`, and `Double` types.
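The decision described above can be sketched with the JDK's `BigDecimal`. This is a minimal illustration of the schema-selection idea, not Kafka's actual implementation; the class name, the `classify` helper, and the returned labels are hypothetical:

```java
import java.math.BigDecimal;

public class ParseSchemaSketch {
    // Hypothetical sketch: pick a schema label for a numeric literal.
    static String classify(String token) {
        BigDecimal decimal = new BigDecimal(token);
        if (decimal.scale() != 0) {
            // Fractional part present: floating point. Prefer FLOAT32 when
            // the value round-trips through float without loss.
            double d = decimal.doubleValue();
            float f = decimal.floatValue();
            return ((double) f == d) ? "FLOAT32" : "FLOAT64";
        }
        // Integral values that fit a long keep an integer schema.
        if (decimal.compareTo(BigDecimal.valueOf(Long.MAX_VALUE)) <= 0
                && decimal.compareTo(BigDecimal.valueOf(Long.MIN_VALUE)) >= 0) {
            return "INT64";
        }
        // Larger integers become an arbitrary-precision Decimal with zero scale.
        return "Decimal(scale=0)";
    }

    public static void main(String[] args) {
        System.out.println(classify("9223372036854775808")); // Decimal(scale=0)
        System.out.println(classify("1.5"));                 // FLOAT32
        System.out.println(classify("42"));                  // INT64
    }
}
```

With this ordering, `"9223372036854775808"` (the example from above) no longer falls through to a lossy `double`.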

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

Contributor

@kkonstantine kkonstantine left a comment


Thanks @avocader
The fix LGTM!

The test failures were only in the known broken tests `org.apache.kafka.clients.ClientUtilsTest` and `org.apache.kafka.clients.ClusterConnectionStatesTest`, which are unrelated to the changes here and have since been fixed.

@kkonstantine kkonstantine merged commit 06a5a68 into apache:trunk Oct 6, 2020
kkonstantine pushed a commit that referenced this pull request Oct 6, 2020
KAFKA-10439: Connect's Values to parse BigInteger as Decimal with zero scale. (#9320)

The `org.apache.kafka.connect.data.Values#parse` method parses integers larger than `Long.MAX_VALUE` as `double` with `Schema.FLOAT64_SCHEMA`.

That means we are losing precision for these larger integers.

For example:
`SchemaAndValue schemaAndValue = Values.parseString("9223372036854775808");`
returns:
`SchemaAndValue{schema=Schema{FLOAT64}, value=9.223372036854776E18}`

Also, this method parses values that fit in `FLOAT32` as `FLOAT64`.

This PR changes the parsing logic to use `FLOAT32`/`FLOAT64` only for numbers that have a fractional part (`decimal.scale() != 0`), and to use the arbitrary-precision `org.apache.kafka.connect.data.Decimal` otherwise.
It also updates the method to parse numbers that can be represented as `float` as `FLOAT32`.

Added unit tests covering parsing of `BigInteger`, `Byte`, `Short`, `Integer`, `Long`, `Float`, and `Double` types.

Reviewers: Konstantine Karantasis <k.karantasis@gmail.com>
javierfreire pushed a commit to javierfreire/kafka that referenced this pull request Oct 8, 2020
KAFKA-10439: Connect's Values to parse BigInteger as Decimal with zero scale. (apache#9320)
ijuma added a commit to confluentinc/kafka that referenced this pull request Oct 8, 2020
* commit '2804257fe221f37e5098bd': (67 commits)
  KAFKA-10562: Properly invoke new StateStoreContext init (apache#9388)
  MINOR: trivial cleanups, javadoc errors, omitted StateStore tests, etc. (apache#8130)
  KAFKA-10564: only process non-empty task directories when internally cleaning obsolete state stores (apache#9373)
  KAFKA-9274: fix incorrect default value for `task.timeout.ms` config (apache#9385)
  KAFKA-10362: When resuming Streams active task with EOS, the checkpoint file is deleted (apache#9247)
  KAFKA-10028: Implement write path for feature versioning system (KIP-584) (apache#9001)
  KAFKA-10402: Upgrade system tests to python3 (apache#9196)
  KAFKA-10186; Abort transaction with pending data with TransactionAbortedException (apache#9280)
  MINOR: Remove `TargetVoters` from `DescribeQuorum` (apache#9376)
  Revert "KAFKA-10469: Resolve logger levels hierarchically (apache#9266)"
  MINOR: Don't publish javadocs for raft module (apache#9336)
  KAFKA-9929: fix: add missing default implementations (apache#9321)
  KAFKA-10188: Prevent SinkTask::preCommit from being called after SinkTask::stop (apache#8910)
  KAFKA-10338; Support PEM format for SSL key and trust stores (KIP-651) (apache#9345)
  KAFKA-10527; Voters should not reinitialize as leader in same epoch (apache#9348)
  MINOR: Refactor unit tests around RocksDBConfigSetter (apache#9358)
  KAFKA-6733: Printing additional ConsumerRecord fields in DefaultMessageFormatter (apache#9099)
  MINOR: Annotate test BlockingConnectorTest as integration test (apache#9379)
  MINOR: Fix failing test due to KAFKA-10556 PR (apache#9372)
  KAFKA-10439: Connect's Values to parse BigInteger as Decimal with zero scale. (apache#9320)
  ...
rgo pushed a commit to rgo/kafka that referenced this pull request Oct 20, 2020
KAFKA-10439: Connect's Values to parse BigInteger as Decimal with zero scale. (apache#9320)