KAFKA-9074: Correct Connect’s Values.parseString to properly parse a time and timestamp literal #7568
ncliang
left a comment
Looks good in general, with good test coverage! I just have a few comments.
|
@rhauch the changes here look good, but I think they're insufficient for nested values. I've written a few tests that fail against the code in this PR to demonstrate my point:

    @Test
    public void shouldParseDateStringAsDateInArray() throws Exception {
        String dateStr = "2019-08-23";
        String arrayStr = "[" + dateStr + "]";
        SchemaAndValue result = Values.parseString(arrayStr);
        assertEquals(Type.ARRAY, result.schema().type());
        Schema elementSchema = result.schema().valueSchema();
        assertEquals(Type.INT32, elementSchema.type());
        assertEquals(Date.LOGICAL_NAME, elementSchema.name());
        java.util.Date expected = new SimpleDateFormat(Values.ISO_8601_DATE_FORMAT_PATTERN).parse(dateStr);
        assertEquals(Collections.singletonList(expected), result.value());
    }

    @Test
    public void shouldParseTimeStringAsTimeInArray() throws Exception {
        String timeStr = "14:34:54.346Z";
        String arrayStr = "[" + timeStr + "]";
        SchemaAndValue result = Values.parseString(arrayStr);
        assertEquals(Type.ARRAY, result.schema().type());
        Schema elementSchema = result.schema().valueSchema();
        assertEquals(Type.INT32, elementSchema.type());
        // A time literal should carry the Time logical name, not Date.
        assertEquals(Time.LOGICAL_NAME, elementSchema.name());
        java.util.Date expected = new SimpleDateFormat(Values.ISO_8601_TIME_FORMAT_PATTERN).parse(timeStr);
        assertEquals(Collections.singletonList(expected), result.value());
    }

    @Test
    public void shouldParseTimeStringAsTimeInMap() throws Exception {
        String timeStr = "14:34:54.346Z";
        String mapStr = "{3:" + timeStr + "}";
        SchemaAndValue result = Values.parseString(mapStr);
        assertEquals(Type.MAP, result.schema().type());
        Schema keySchema = result.schema().keySchema();
        assertEquals(Type.INT8, keySchema.type());
        Schema valueSchema = result.schema().valueSchema();
        assertEquals(Type.INT32, valueSchema.type());
        // A time literal should carry the Time logical name, not Date.
        assertEquals(Time.LOGICAL_NAME, valueSchema.name());
        java.util.Date expected = new SimpleDateFormat(Values.ISO_8601_TIME_FORMAT_PATTERN).parse(timeStr);
        assertEquals(Collections.singletonMap((byte) 3, expected), result.value());
    }

If these cases (e.g., use as elements in arrays/maps) should be handled for time and timestamp literals, can we address them in this PR as well? |
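For readers following the expected values in these tests, here is a minimal sketch of how a date or time literal turns into a `java.util.Date`. The two pattern strings are assumptions about what `Values.ISO_8601_DATE_FORMAT_PATTERN` and `Values.ISO_8601_TIME_FORMAT_PATTERN` contain (they are not quoted in this thread), and `ExpectedLiteralSketch` and `parseUtc` are hypothetical names. Unlike the tests above, the sketch pins the time zone to UTC so the result does not depend on the machine it runs on.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

final class ExpectedLiteralSketch {
    // Assumed pattern values; not confirmed anywhere in this conversation.
    static final String DATE_PATTERN = "yyyy-MM-dd";
    static final String TIME_PATTERN = "HH:mm:ss.SSS'Z'";

    /** Parses a literal with the given pattern, interpreting it in UTC. */
    static Date parseUtc(String pattern, String literal) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return format.parse(literal);
        } catch (ParseException e) {
            // Wrap the checked exception so callers stay simple.
            throw new IllegalArgumentException("Not a valid literal: " + literal, e);
        }
    }

    public static void main(String[] args) {
        Date date = parseUtc(DATE_PATTERN, "2019-08-23");
        Date time = parseUtc(TIME_PATTERN, "14:34:54.346Z");
        // Connect's Date logical type is stored as INT32 days since the epoch.
        System.out.println("days since epoch: " + date.getTime() / 86_400_000L);
        // Connect's Time logical type is stored as INT32 milliseconds since midnight.
        System.out.println("millis since midnight: " + time.getTime());
    }
}
```

Both logical types are backed by `INT32`, which is why the tests assert `Type.INT32` and then rely on the schema name to tell a date from a time.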
|
Thanks, @C0urante. I've added your suggested tests plus a few more. |
|
Retest this, please. |
|
The previous build passed on JDK 8, and the failures on the other two were unrelated, but I'm retesting anyway. @hachikuji, would you mind reviewing this? #7284 is blocked by this. Thanks! |
|
The JDK 11 / Scala 2.13 build had this unrelated failure: |
|
What's the status of this PR? Is this ready to merge once the flaky tests pass? I am waiting for this in order to finish the PR #7284 |
|
@alozano3, we missed AK 2.4 code freeze for this PR, and because it needs to be backported to older branches we're waiting to merge this to |
|
Retest this please |
…a time and timestamp literal

Time and timestamp literal strings contain a `:` character, but the internal parser used in the `Values.parseString(String)` method tokenizes on the colon character to tokenize and parse map entries. The colon could be escaped, but then the backslash character used to escape the colon is not removed and the parser fails to match the literal as a time or timestamp value.

This fix corrects the parsing logic to properly parse timestamp and time literal strings whose colon characters are either escaped or unescaped. Additional unit tests were added to first verify the incorrect behavior and then to validate the correction.
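The failure mode described in this commit message can be illustrated with a small, self-contained sketch. This is not the code from the patch: `ColonTokenizerSketch`, its simplified tokenizer, the `TIME` regular expression, and the rejoin-and-unescape step are all hypothetical stand-ins chosen to show why both the escaped and the unescaped form fail to match before a fix, and how one possible correction handles both.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class ColonTokenizerSketch {
    // Hypothetical pattern for a time literal such as 14:34:54.346Z.
    private static final Pattern TIME = Pattern.compile("\\d{2}:\\d{2}:\\d{2}\\.\\d{3}Z");

    /**
     * Splits on unescaped ':' and emits the delimiter as its own token.
     * A backslash protects the next character but is kept in the token,
     * mirroring the behavior the commit message describes.
     */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\\' && i + 1 < input.length()) {
                current.append(c).append(input.charAt(++i));
            } else if (c == ':') {
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
                tokens.add(":");
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    /** Naive check: a single token must match the time pattern as-is. */
    static boolean looksLikeTime(String token) {
        return TIME.matcher(token).matches();
    }

    /** One possible fix: rejoin the tokens and drop escaping backslashes first. */
    static boolean looksLikeTimeAfterFix(List<String> tokens) {
        String joined = String.join("", tokens).replace("\\:", ":");
        return looksLikeTime(joined);
    }

    public static void main(String[] args) {
        List<String> unescaped = tokenize("14:34:54.346Z");
        List<String> escaped = tokenize("14\\:34\\:54.346Z");
        System.out.println(unescaped); // split into several tokens, none a full time literal
        System.out.println(escaped);   // one token, but it still contains the backslashes
        System.out.println(looksLikeTimeAfterFix(unescaped));
        System.out.println(looksLikeTimeAfterFix(escaped));
    }
}
```

The sketch rejoins every token, which only makes sense for a lone literal. A real parser also has to decide where a literal ends inside a map or array, which is the nested case raised earlier in this conversation.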
|
Rebased on The original logic was introduced in AK 1.1.0, but since we typically backport only 2-3 releases my plan is to backport this as far back as the |
…a time and timestamp literal (#7568)

* KAFKA-9074: Correct Connect’s `Values.parseString` to properly parse a time and timestamp literal

Time and timestamp literal strings contain a `:` character, but the internal parser used in the `Values.parseString(String)` method tokenizes on the colon character to tokenize and parse map entries. The colon could be escaped, but then the backslash character used to escape the colon is not removed and the parser fails to match the literal as a time or timestamp value.

This fix corrects the parsing logic to properly parse timestamp and time literal strings whose colon characters are either escaped or unescaped. Additional unit tests were added to first verify the incorrect behavior and then to validate the correction.

Author: Randall Hauch <rhauch@gmail.com>
Reviewers: Chris Egerton <chrise@confluent.io>, Nigel Liang <nigel@nigelliang.com>, Jason Gustafson <jason@confluent.io>
Conflicts:

* build.gradle: moved avro plugin definition below newly added test retry plugin.

* apache-github/trunk:
  MINOR: further InternalTopologyBuilder cleanup (apache#8046)
  MINOR: Add timer for update limit offsets (apache#8047)
  HOTFIX: Fix spotsbug failure in Kafka examples (apache#8051)
  KAFKA-9447: Add new customized EOS model example (apache#8031)
  KAFKA-8164: Add support for retrying failed (apache#8019)
  HOTFIX: checkstyle for newly added unit test
  KAFKA-9261; Client should handle unavailable leader metadata (apache#7770)
  MINOR: Fix typos introduced in KIP-559 (apache#8042)
  MINOR: Fixing null handilg in ValueAndTimestampSerializer (apache#7679)
  KAFKA-9113: Clean up task management and state management (apache#7997)
  MINOR: fix checkstyle issue in ConsumerConfig.java (apache#8038)
  KAFKA-9491; Increment high watermark after full log truncation (apache#8037)
  KAFKA-9477 Document RoundRobinAssignor as an option for partition.assignment.strategy (apache#8007)
  KAFKA-9074: Correct Connect’s `Values.parseString` to properly parse a time and timestamp literal (apache#7568)
  KAFKA-9492; Ignore record errors in ProduceResponse for older versions (apache#8030)
Time and timestamp literal strings contain a `:` character, but the internal parser used in the `Values.parseString(String)` method tokenizes on the colon character to tokenize and parse map entries. The colon could be escaped, but then the backslash character used to escape the colon is not removed and the parser fails to match the literal as a time or timestamp value.

This fix corrects the parsing logic to properly parse timestamp and time literal strings whose colon characters are either escaped or unescaped. Additional unit tests were added to first verify the incorrect behavior and then to validate the correction.
Committer Checklist (excluded from commit message)