Skip to content

KAFKA-9667: Connect JSON serde strip trailing zeros#8230

Merged
rhauch merged 2 commits intoapache:trunkfrom
big-andy-coates:exact_decimals
May 7, 2020
Merged

KAFKA-9667: Connect JSON serde strip trailing zeros#8230
rhauch merged 2 commits intoapache:trunkfrom
big-andy-coates:exact_decimals

Conversation

@big-andy-coates
Copy link
Copy Markdown
Contributor

@big-andy-coates big-andy-coates commented Mar 5, 2020

Fixes: KAFKA-9667

The Connect Json serde was recently enhanced to support serializing decimals as standard JSON numbers, (Original work done under KAFKA-8595), e.g. 1.23.  However, there is a bug in the implementation: it's stripping trailing zeros!  1.23 is not the same as 1.230.  The first is a number accurate to 2 decimal places, where as the later is accurate to 3 dp.

It is important that trailing zeros are not dropped when de(serializing) decimals.  For some use-cases it may be acceptable to drop the trailing zeros, but for others it definitely is not.

Current Functionality

If a JSON object was to contain the number 1.230 then the Java JsonDeserializer would correctly deserialize this into a BigDecimal. The BigDecimal would have a scale of 3, which is correct.

However, if that same BigDecimal was then serialized back to JSON using the Java JsonSerializer it would incorrectly strip the zeros, serializing to 1.23

Expected Functionality

When serializing, trailing zeros should be maintained.  For example, a BigDecimal such as 1.230, (3 dp), should be serialized as 1.230

Compatibility

Both the old serialized number, e.g. 1.23, and the proposed corrected serialized number, e.g. 1.230, are valid JSON numbers. Downstream consumers should have no issue deserializing either.

This is not super urgent, but would be good to get into the next point releases of 2.4 and 2.5.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

This change turns on exact decimal processing in Jackson for decimals, meaning trailing zeros are maintained. This means a value of `1.2300` can be deserialized and re-serialized to JSON without any loss of information.
@big-andy-coates
Copy link
Copy Markdown
Contributor Author

Related to confluentinc/ksql#4710

Copy link
Copy Markdown
Contributor

@agavra agavra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! (Not approving since I'm not a committer)

Comment thread connect/json/src/main/java/org/apache/kafka/connect/json/JsonDeserializer.java Outdated
Copy link
Copy Markdown
Contributor

@rhauch rhauch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @big-andy-coates. The changes look great, but I have a few comments in-line.

Comment thread connect/json/src/main/java/org/apache/kafka/connect/json/JsonDeserializer.java Outdated
Comment thread connect/json/src/main/java/org/apache/kafka/connect/json/JsonConverter.java Outdated
@big-andy-coates
Copy link
Copy Markdown
Contributor Author

Waiting on code freeze to lift before merging.

rayokota added a commit to rayokota/schema-registry that referenced this pull request Mar 12, 2020
This is the same fix as KAFKA-9667.
See apache/kafka#8230
rayokota added a commit to confluentinc/schema-registry that referenced this pull request Mar 12, 2020
* DG-295 Fix trailing zeroes being stripped

This is the same fix as KAFKA-9667.
See apache/kafka#8230

* Minor cleanup

* Fix decimal handling in tests
@big-andy-coates
Copy link
Copy Markdown
Contributor Author

@rhauch can this be merged now?

@rhauch
Copy link
Copy Markdown
Contributor

rhauch commented May 6, 2020

okay to test

Copy link
Copy Markdown
Contributor

@rhauch rhauch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks, @big-andy-coates.

Will merge pending a green build.

@rhauch
Copy link
Copy Markdown
Contributor

rhauch commented May 7, 2020

retest this please

@rhauch rhauch merged commit 4f61b00 into apache:trunk May 7, 2020
rhauch pushed a commit that referenced this pull request May 7, 2020
This change turns on exact decimal processing in JSON Converter for deserializing decimals, meaning trailing zeros are maintained. Serialization was already using the decimal scale to output the right value, so this change means a value of `1.2300` can now be serialized to JSON and deserialized back to Connect without any loss of information.

Author: Andy Coates <big-andy-coates@users.noreply.github.com>
Reviewers: Randall Hauch <rhauch@gmail.com>, Almog Gavra <almog@confluent.io>
rhauch pushed a commit that referenced this pull request May 7, 2020
This change turns on exact decimal processing in JSON Converter for deserializing decimals, meaning trailing zeros are maintained. Serialization was already using the decimal scale to output the right value, so this change means a value of `1.2300` can now be serialized to JSON and deserialized back to Connect without any loss of information.

Author: Andy Coates <big-andy-coates@users.noreply.github.com>
Reviewers: Randall Hauch <rhauch@gmail.com>, Almog Gavra <almog@confluent.io>
Kvicii pushed a commit to Kvicii/kafka that referenced this pull request May 8, 2020
* 'trunk' of github.com:apache/kafka:
  KAFKA-9290: Update IQ related JavaDocs (apache#8114)
  KAFKA-9928: Fix flaky GlobalKTableEOSIntegrationTest (apache#8600)
  KAFKA-6145: Set HighAvailabilityTaskAssignor as default in streams_upgrade_test.py (apache#8613)
  KAFKA-9667: Connect JSON serde strip trailing zeros (apache#8230)
  MINOR: Log4j Improvements on Fetcher (apache#8629)
qq619618919 pushed a commit to qq619618919/kafka that referenced this pull request May 12, 2020
This change turns on exact decimal processing in JSON Converter for deserializing decimals, meaning trailing zeros are maintained. Serialization was already using the decimal scale to output the right value, so this change means a value of `1.2300` can now be serialized to JSON and deserialized back to Connect without any loss of information.

Author: Andy Coates <big-andy-coates@users.noreply.github.com>
Reviewers: Randall Hauch <rhauch@gmail.com>, Almog Gavra <almog@confluent.io>
jwijgerd pushed a commit to buxapp/kafka that referenced this pull request May 14, 2020
This change turns on exact decimal processing in JSON Converter for deserializing decimals, meaning trailing zeros are maintained. Serialization was already using the decimal scale to output the right value, so this change means a value of `1.2300` can now be serialized to JSON and deserialized back to Connect without any loss of information.

Author: Andy Coates <big-andy-coates@users.noreply.github.com>
Reviewers: Randall Hauch <rhauch@gmail.com>, Almog Gavra <almog@confluent.io>
@big-andy-coates big-andy-coates deleted the exact_decimals branch July 3, 2020 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants