Skip to content

KAFKA-9074: Correct Connect’s Values.parseString to properly parse a time and timestamp literal#7568

Merged
rhauch merged 5 commits intoapache:trunkfrom
rhauch:kafka-9074
Feb 4, 2020
Merged

KAFKA-9074: Correct Connect’s Values.parseString to properly parse a time and timestamp literal#7568
rhauch merged 5 commits intoapache:trunkfrom
rhauch:kafka-9074

Conversation

@rhauch
Copy link
Copy Markdown
Contributor

@rhauch rhauch commented Oct 21, 2019

Time and timestamp literal strings contain a : character, but the internal parser used in the Values.parseString(String) method tokenizes on the colon character to tokenize and parse map entries. The colon could be escaped, but then the backslash character used to escape the colon is not removed and the parser fails to match the literal as a time or timestamp value.

This fix corrects the parsing logic to properly parse timestamp and time literal strings whose colon characters are either escaped or unescaped. Additional unit tests were added to first verify the incorrect behavior and then to validate the correction.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

Copy link
Copy Markdown
Contributor

@ncliang ncliang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general, with good test coverage! Just a few comments I have.

Comment thread connect/api/src/main/java/org/apache/kafka/connect/data/Values.java Outdated
Comment thread connect/api/src/main/java/org/apache/kafka/connect/data/Values.java Outdated
Comment thread connect/api/src/main/java/org/apache/kafka/connect/data/Values.java Outdated
Comment thread connect/api/src/main/java/org/apache/kafka/connect/data/Values.java Outdated
Comment thread connect/api/src/main/java/org/apache/kafka/connect/data/Values.java Outdated
Copy link
Copy Markdown
Contributor

@ncliang ncliang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@C0urante
Copy link
Copy Markdown
Contributor

C0urante commented Oct 22, 2019

@rhauch the changes here look good but I think they're insufficient for nested values. I've written a couple of failing tests that fail on the code in this PR to demonstrate my point:

    @Test
    public void shouldParseDateStringAsDateInArray() throws Exception {
        String dateStr = "2019-08-23";
        String arrayStr = "[" + dateStr + "]";
        SchemaAndValue result = Values.parseString(arrayStr);
        assertEquals(Type.ARRAY, result.schema().type());
        Schema elementSchema = result.schema().valueSchema();
        assertEquals(Type.INT32, elementSchema.type());
        assertEquals(Date.LOGICAL_NAME, elementSchema.name());
        java.util.Date expected = new SimpleDateFormat(Values.ISO_8601_DATE_FORMAT_PATTERN).parse(dateStr);
        assertEquals(Collections.singletonList(expected), result.value());
    }

    @Test
    public void shouldParseTimeStringAsDateInArray() throws Exception {
        String dateStr = "14:34:54.346Z";
        String arrayStr = "[" + dateStr + "]";
        SchemaAndValue result = Values.parseString(arrayStr);
        assertEquals(Type.ARRAY, result.schema().type());
        Schema elementSchema = result.schema().valueSchema();
        assertEquals(Type.INT32, elementSchema.type());
        assertEquals(Date.LOGICAL_NAME, elementSchema.name());
        java.util.Date expected = new SimpleDateFormat(Values.ISO_8601_TIME_FORMAT_PATTERN).parse(dateStr);
        assertEquals(Collections.singletonList(expected), result.value());
    }

    @Test
    public void shouldParseTimeStringAsDateInMap() throws Exception {
        String dateStr = "14:34:54.346Z";
        String arrayStr = "{3:" + dateStr + "}";
        SchemaAndValue result = Values.parseString(arrayStr);
        assertEquals(Type.MAP, result.schema().type());
        Schema keySchema = result.schema().keySchema();
        assertEquals(Type.INT8, keySchema.type());
        Schema valueSchema = result.schema().valueSchema();
        assertEquals(Type.INT32, valueSchema.type());
        assertEquals(Date.LOGICAL_NAME, valueSchema.name());
        java.util.Date expected = new SimpleDateFormat(Values.ISO_8601_TIME_FORMAT_PATTERN).parse(dateStr);
        assertEquals(Collections.singletonMap((byte) 3, expected), result.value());
    }

If these cases (e.g., use as elements in arrays/maps) should be handled for times and timestamp literals, can we address them in this PR as well?

@rhauch
Copy link
Copy Markdown
Contributor Author

rhauch commented Oct 23, 2019

Thanks, @C0urante. I've added your suggested tests plus a few more.

Copy link
Copy Markdown
Contributor

@C0urante C0urante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the added coverage @rhauch

@rhauch
Copy link
Copy Markdown
Contributor Author

rhauch commented Oct 28, 2019

Retest this, please.

@rhauch
Copy link
Copy Markdown
Contributor Author

rhauch commented Oct 28, 2019

The previous build passed on JDK 8, and failures on other two were unrelated. But retesting anyway.

@hachikuji, would you mind reviewing this? #7284 is blocked by this. Thanks!

@rhauch
Copy link
Copy Markdown
Contributor Author

rhauch commented Oct 28, 2019

The JDK 11 / Scala 2.13 had this unrelated failure:

Test Result (1 failure / +1)
  kafka.api.SslAdminClientIntegrationTest.testSynchronousAuthorizerAclUpdatesBlockRequestThreads

@alozano3
Copy link
Copy Markdown
Contributor

What's the status of this PR? Is this ready to merge once the flaky tests pass? I am waiting for this in order to finish the PR #7284
Regards.

@rhauch
Copy link
Copy Markdown
Contributor Author

rhauch commented Nov 15, 2019

@alozano3, we missed AK 2.4 code freeze for this PR, and because it needs to be backported to older branches we're waiting to merge this to master until we can backport to the 2.4 branch.

@rhauch
Copy link
Copy Markdown
Contributor Author

rhauch commented Jan 21, 2020

Retest this please

…a time and timestamp literal

Time and timestamp literal strings contain a `:` character, but the internal parser used in the `Values.parseString(String)` method tokenizes on the colon character to tokenize and parse map entries. The colon could be escaped, but then the backslash character used to escape the colon is not removed and the parser fails to match the literal as a time or timestamp value.

This fix corrects the parsing logic to properly parse timestamp and time literal strings whose colon characters are either escaped or unescaped. Additional unit tests were added to first verify the incorrect behavior and then to validate the correction.
@rhauch
Copy link
Copy Markdown
Contributor Author

rhauch commented Jan 21, 2020

Rebased on trunk to fix conflicts.

The original logic was introduced in AK 1.1.0, but since we typically backport only 2-3 releases my plan is to backport this as far back as the 2.2 branch.

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rhauch rhauch merged commit dd7a314 into apache:trunk Feb 4, 2020
rhauch added a commit that referenced this pull request Feb 4, 2020
…a time and timestamp literal (#7568)

* KAFKA-9074: Correct Connect’s `Values.parseString` to properly parse a time and timestamp literal

Time and timestamp literal strings contain a `:` character, but the internal parser used in the `Values.parseString(String)` method tokenizes on the colon character to tokenize and parse map entries. The colon could be escaped, but then the backslash character used to escape the colon is not removed and the parser fails to match the literal as a time or timestamp value.

This fix corrects the parsing logic to properly parse timestamp and time literal strings whose colon characters are either escaped or unescaped. Additional unit tests were added to first verify the incorrect behavior and then to validate the correction.

Author: Randall Hauch <rhauch@gmail.com>
Reviewers: Chris Egerton <chrise@confluent.io>, Nigel Liang <nigel@nigelliang.com>, Jason Gustafson <jason@confluent.io>
rhauch added a commit that referenced this pull request Feb 4, 2020
…a time and timestamp literal (#7568)

* KAFKA-9074: Correct Connect’s `Values.parseString` to properly parse a time and timestamp literal

Time and timestamp literal strings contain a `:` character, but the internal parser used in the `Values.parseString(String)` method tokenizes on the colon character to tokenize and parse map entries. The colon could be escaped, but then the backslash character used to escape the colon is not removed and the parser fails to match the literal as a time or timestamp value.

This fix corrects the parsing logic to properly parse timestamp and time literal strings whose colon characters are either escaped or unescaped. Additional unit tests were added to first verify the incorrect behavior and then to validate the correction.

Author: Randall Hauch <rhauch@gmail.com>
Reviewers: Chris Egerton <chrise@confluent.io>, Nigel Liang <nigel@nigelliang.com>, Jason Gustafson <jason@confluent.io>
rhauch added a commit that referenced this pull request Feb 4, 2020
…a time and timestamp literal (#7568)

* KAFKA-9074: Correct Connect’s `Values.parseString` to properly parse a time and timestamp literal

Time and timestamp literal strings contain a `:` character, but the internal parser used in the `Values.parseString(String)` method tokenizes on the colon character to tokenize and parse map entries. The colon could be escaped, but then the backslash character used to escape the colon is not removed and the parser fails to match the literal as a time or timestamp value.

This fix corrects the parsing logic to properly parse timestamp and time literal strings whose colon characters are either escaped or unescaped. Additional unit tests were added to first verify the incorrect behavior and then to validate the correction.

Author: Randall Hauch <rhauch@gmail.com>
Reviewers: Chris Egerton <chrise@confluent.io>, Nigel Liang <nigel@nigelliang.com>, Jason Gustafson <jason@confluent.io>
ijuma added a commit to confluentinc/kafka that referenced this pull request Feb 7, 2020
Conflicts:
* build.gradle: moved avro plugin definition below
newly added test retry plugin.

* apache-github/trunk:
  MINOR: further InternalTopologyBuilder cleanup  (apache#8046)
  MINOR: Add timer for update limit offsets (apache#8047)
  HOTFIX: Fix spotsbug failure in Kafka examples (apache#8051)
  KAFKA-9447: Add new customized EOS model example (apache#8031)
  KAFKA-8164: Add support for retrying failed (apache#8019)
  HOTFIX: checkstyle for newly added unit test
  KAFKA-9261; Client should handle unavailable leader metadata (apache#7770)
  MINOR: Fix typos introduced in KIP-559 (apache#8042)
  MINOR: Fixing null handilg in ValueAndTimestampSerializer (apache#7679)
  KAFKA-9113: Clean up task management and state management (apache#7997)
  MINOR: fix checkstyle issue in ConsumerConfig.java (apache#8038)
  KAFKA-9491; Increment high watermark after full log truncation (apache#8037)
  KAFKA-9477 Document RoundRobinAssignor as an option for partition.assignment.strategy (apache#8007)
  KAFKA-9074: Correct Connect’s `Values.parseString` to properly parse a time and timestamp literal (apache#7568)
  KAFKA-9492; Ignore record errors in ProduceResponse for older versions (apache#8030)
stanislavkozlovski pushed a commit to stanislavkozlovski/kafka that referenced this pull request Feb 18, 2020
…a time and timestamp literal (apache#7568)

* KAFKA-9074: Correct Connect’s `Values.parseString` to properly parse a time and timestamp literal

Time and timestamp literal strings contain a `:` character, but the internal parser used in the `Values.parseString(String)` method tokenizes on the colon character to tokenize and parse map entries. The colon could be escaped, but then the backslash character used to escape the colon is not removed and the parser fails to match the literal as a time or timestamp value.

This fix corrects the parsing logic to properly parse timestamp and time literal strings whose colon characters are either escaped or unescaped. Additional unit tests were added to first verify the incorrect behavior and then to validate the correction.

Author: Randall Hauch <rhauch@gmail.com>
Reviewers: Chris Egerton <chrise@confluent.io>, Nigel Liang <nigel@nigelliang.com>, Jason Gustafson <jason@confluent.io>
qq619618919 pushed a commit to qq619618919/kafka that referenced this pull request May 12, 2020
…a time and timestamp literal (apache#7568)

* KAFKA-9074: Correct Connect’s `Values.parseString` to properly parse a time and timestamp literal

Time and timestamp literal strings contain a `:` character, but the internal parser used in the `Values.parseString(String)` method tokenizes on the colon character to tokenize and parse map entries. The colon could be escaped, but then the backslash character used to escape the colon is not removed and the parser fails to match the literal as a time or timestamp value.

This fix corrects the parsing logic to properly parse timestamp and time literal strings whose colon characters are either escaped or unescaped. Additional unit tests were added to first verify the incorrect behavior and then to validate the correction.

Author: Randall Hauch <rhauch@gmail.com>
Reviewers: Chris Egerton <chrise@confluent.io>, Nigel Liang <nigel@nigelliang.com>, Jason Gustafson <jason@confluent.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants