Skip to content

MINOR: improve JavaDocs about auto-repartitioning in Streams DSL#6269

Merged
bbejeck merged 1 commit intoapache:trunkfrom
mjsax:minor-improve-repartition-javadocs
Feb 18, 2019
Merged

MINOR: improve JavaDocs about auto-repartitioning in Streams DSL#6269
bbejeck merged 1 commit intoapache:trunkfrom
mjsax:minor-improve-repartition-javadocs

Conversation

@mjsax
Copy link
Copy Markdown
Member

@mjsax mjsax commented Feb 14, 2019

More detailed description of your change,
if necessary. The PR title and PR message become
the squashed commit message, so use a separate
comment to ping reviewers.

Summary of testing strategy (including rationale)
for the feature or bug fix. Unit and/or integration
tests are expected for any behaviour change and
system tests should be considered for larger changes.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@mjsax mjsax added the streams label Feb 14, 2019
@mjsax
Copy link
Copy Markdown
Member Author

mjsax commented Feb 14, 2019

Call for review @guozhangwang @bbejeck @vvcephei @ableegoldman

The question about this comes up on a regular basis -- this should help to clarify and reduce number of questions.

* }
* });
* }</pre>
* <p>
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side cleanup: not necessary after a <pre></pre> block

* Transforming records might result in an internal data redistribution if a key based operator (like an aggregation
* or join) is applied to the result {@code KStream}.
* (cf. {@link #transformValues(ValueTransformerSupplier, String...)})
* </p>
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JavaDocs is not HTML. No closing </p> tag required.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, I actually just looked it up, because I was surprised all those other <p> elements were not closed... Apparently, even in HTML, you don't have to close <p> elements:

Paragraphs are block-level elements, and notably will automatically close if another block-level element is parsed before the closing </p> tag.
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/p

Sometimes, you just have to stop and marvel at HTML...

* {@link StreamsConfig#APPLICATION_ID_CONFIG APPLICATION_ID_CONFIG}, "storeName" is an
* internally generated name, and "-changelog" is a fixed suffix.
*
* <p>
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add some missing paragraph breaks

Copy link
Copy Markdown
Contributor

@vvcephei vvcephei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I asked a question, but it's independent of your proposed change.

* Transforming records might result in an internal data redistribution if a key based operator (like an aggregation
* or join) is applied to the result {@code KStream}.
* (cf. {@link #transformValues(ValueTransformerSupplier, String...)})
* </p>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, I actually just looked it up, because I was surprised all those other <p> elements were not closed... Apparently, even in HTML, you don't have to close <p> elements:

Paragraphs are block-level elements, and notably will automatically close if another block-level element is parsed before the closing </p> tag.
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/p

Sometimes, you just have to stop and marvel at HTML...

* }
* }</pre>
* Even if any upstream operation was key-changing, no auto-repartition is triggered.
* If repartitioning is required, a call to {@link #through(String)} should be performed before {@code transform()}.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bummer, since it forces you to come up with a name for the repartition topic, which also makes its lifecycle your responsibility. And also, Streams won't know that it's "just" a repartition topic, so it can't be optimized.

Can I create a Jira to add a way to mark the transform as "repartition required"? Or maybe we can automatically do this when the transform has a state store?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already a JIRA: https://issues.apache.org/jira/browse/KAFKA-7608 and also inactive KIP https://cwiki.apache.org/confluence/display/KAFKA/KIP-221%3A+Enhance+KStream+with+Connecting+Topic+Creation+and+Repartition+Hint

It's not fully backed out yet, but there is stuff we could do to improve the user experience.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Thanks!

Copy link
Copy Markdown
Member

@bbejeck bbejeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this @mjsax LGTM

@bbejeck bbejeck merged commit 9fe89f3 into apache:trunk Feb 18, 2019
@bbejeck
Copy link
Copy Markdown
Member

bbejeck commented Feb 18, 2019

cherry-picked to 2.2

@mjsax mjsax deleted the minor-improve-repartition-javadocs branch February 19, 2019 01:57
jarekr pushed a commit to confluentinc/kafka that referenced this pull request Apr 18, 2019
* ak/trunk: (45 commits)
  KAFKA-7487: DumpLogSegments misreports offset mismatches (apache#5756)
  MINOR: improve JavaDocs about auto-repartitioning in Streams DSL (apache#6269)
  KAFKA-7935: UNSUPPORTED_COMPRESSION_TYPE if ReplicaManager.getLogConfig returns None (apache#6274)
  KAFKA-7895: Fix stream-time reckoning for suppress (apache#6278)
  KAFKA-6569: Move OffsetIndex/TimeIndex logger to companion object  (apache#4586)
  MINOR: add log indicating the suppression time (apache#6260)
  MINOR: Make info logs for KafkaConsumer a bit more verbose (apache#6279)
  KAFKA-7758: Reuse KGroupedStream/KGroupedTable with named repartition topics (apache#6265)
  KAFKA-7884; Docs for message.format.version should display valid values (apache#6209)
  MINOR: Save failed test output to build output directory
  MINOR: add test for StreamsSmokeTestDriver (apache#6231)
  MINOR: Fix bugs identified by compiler warnings (apache#6258)
  KAFKA-6474: Rewrite tests to use new public TopologyTestDriver [part 4] (apache#5433)
  MINOR: fix bypasses in ChangeLogging stores (apache#6266)
  MINOR: Make MockClient#poll() more thread-safe (apache#5942)
  MINOR: drop dbAccessor reference on close (apache#6254)
  KAFKA-7811: Avoid unnecessary lock acquire when KafkaConsumer commits offsets (apache#6119)
  KAFKA-7916: Unify store wrapping code for clarity (apache#6255)
  MINOR: Add missing Alter Operation to Topic supported operations list in AclCommand
  KAFKA-7921: log at error level for missing source topic (apache#6262)
  ...
pengxiaolong pushed a commit to pengxiaolong/kafka that referenced this pull request Jun 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants