Skip to content

Conversation

@lostluck
Copy link
Contributor

Fixes #22900

This adds one time use stream decoders for use whenever we know reiterations won't occur (see issue).

Adds tests, and improves test coverage with a datasource tests for the various iterator handlers (on demand and fixed), unit tests for the on demand decode streams, a new test for pardo short reads, and a primitives test for GBK short reads.

Fixes previous benign issue where value iteration streams were not being closed after the processing DoFn finished, and test for same. Since we always fully read the byte stream for a value, we never ran into a short read problem before. Streaming decode short reads on the data side would cause problems that don't occur with short reads for Side Inputs.

There's a followup task for optimizing short reads when the coder being used is length prefixed, filed in #22901.


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Choose reviewer(s) and mention them in a comment (R: @username).
  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI.

@github-actions github-actions bot added the go label Aug 26, 2022
@codecov
Copy link

codecov bot commented Aug 26, 2022

Codecov Report

Merging #22904 (4b6ace8) into master (c866771) will increase coverage by 0.01%.
The diff coverage is 80.68%.

@@            Coverage Diff             @@
##           master   #22904      +/-   ##
==========================================
+ Coverage   73.88%   73.90%   +0.01%     
==========================================
  Files         713      713              
  Lines       94321    94464     +143     
==========================================
+ Hits        69691    69813     +122     
- Misses      23342    23354      +12     
- Partials     1288     1297       +9     
Flag Coverage Δ
go 51.36% <80.68%> (+0.17%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
sdks/go/pkg/beam/core/runtime/exec/datasource.go 66.30% <76.31%> (+0.74%) ⬆️
sdks/go/pkg/beam/core/runtime/exec/fullvalue.go 83.59% <81.73%> (-2.29%) ⬇️
sdks/go/pkg/beam/core/runtime/exec/fn.go 70.76% <100.00%> (+2.44%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@github-actions
Copy link
Contributor

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @damccorm for label go.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

Copy link
Contributor

@damccorm damccorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a couple nits, overall LGTM - this is a good approach

@lostluck
Copy link
Contributor Author

Thanks for the review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature Request][Go SDK]: Specialize post-GBK iteration to decode on demand, if not-reiterated.

2 participants