[BEAM-14489] Remove non-SDF version of TextIO. #17712

Merged: lostluck merged 4 commits into apache:master on May 24, 2022

lostluck
Contributor

Removes the non-SDF version of TextIO so that its old pattern can't be copied. All meaningful runners understand SDFs, and we're confident in their execution at this stage.

Per Go policy, the old *Sdf methods are marked deprecated but will not be removed until a major version change, because this is a user-facing package.
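
As a rough illustration of the deprecation approach described above, the sketch below keeps an old `*Sdf` entry point as a thin wrapper around the remaining function and marks it with Go's standard `// Deprecated:` doc-comment paragraph. The function bodies are placeholders and the signatures are simplified assumptions: the real `textio` functions take a `beam.Scope` and return a `beam.PCollection`, and scope naming is ignored here.

```go
package main

import "fmt"

// Read is a placeholder for the single remaining implementation.
// The real textio.Read takes a beam.Scope and returns a beam.PCollection;
// this stand-in just echoes the glob so the sketch runs without Beam.
func Read(glob string) string {
	return "lines from " + glob
}

// ReadSdf is kept so existing callers keep compiling.
//
// Deprecated: Use Read instead.
func ReadSdf(glob string) string {
	return Read(glob)
}

func main() {
	// Both entry points produce the same result in this sketch.
	fmt.Println(ReadSdf("/tmp/*.txt") == Read("/tmp/*.txt")) // true
}
```

Tools such as gopls and staticcheck recognize the `Deprecated:` paragraph and flag callers, which nudges users toward the new name without breaking their builds.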


@codecov

codecov bot commented May 19, 2022

Codecov Report

Merging #17712 (b95bafc) into master (6774b74) will increase coverage by 0.00%.
The diff coverage is 44.15%.

@@           Coverage Diff           @@
##           master   #17712   +/-   ##
=======================================
  Coverage   74.00%   74.01%           
=======================================
  Files         695      695           
  Lines       91798    91798           
=======================================
+ Hits        67938    67944    +6     
+ Misses      22612    22608    -4     
+ Partials     1248     1246    -2     
Flag Coverage Δ
go 50.44% <44.15%> (+0.02%) ⬆️

Impacted Files Coverage Δ
sdks/go/pkg/beam/io/textio/textio.go 55.15% <44.15%> (-10.42%) ⬇️
sdks/go/pkg/beam/pardo.go 47.41% <0.00%> (-3.03%) ⬇️
sdks/go/pkg/beam/core/runtime/graphx/translate.go 43.01% <0.00%> (ø)
...o/pkg/beam/io/rtrackers/offsetrange/offsetrange.go 75.70% <0.00%> (ø)
sdks/go/pkg/beam/core/sdf/wrappedbounded.go 0.00% <0.00%> (ø)
sdks/go/pkg/beam/core/runtime/exec/sdf.go 71.16% <0.00%> (+0.15%) ⬆️
sdks/go/pkg/beam/runners/dataflow/dataflow.go 53.64% <0.00%> (+0.62%) ⬆️
...ks/go/pkg/beam/runners/dataflow/dataflowlib/job.go 22.84% <0.00%> (+6.57%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6774b74...b95bafc. Read the comment docs.

@github-actions
Contributor

Assigning reviewers. If you would like to opt out of this review, comment "assign to next reviewer":

R: @jrmccluskey for label go.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

 func ReadAllSdf(s beam.Scope, col beam.PCollection) beam.PCollection {
-	s = s.Scope("textio.ReadAllSdf")
+	s = s.Scope("textio.ReadAll")
Contributor

Thoughts on moving this file's contents into the main textio.go file? If we're removing the distinction between read and readSdf, splitting doesn't make sense anymore IMO

Contributor

I agree, no reason to logically split it out if it's SDFs all the way down

Contributor Author

Agreed. Great call! Moved and ensured placement was reasonable, and dropped vestigial SDFs that were no longer necessary (though added documentation that they are SDFs and that they're useful for splitting within files.)

 func ReadSdf(s beam.Scope, glob string) beam.PCollection {
-	s = s.Scope("textio.ReadSdf")
+	s = s.Scope("textio.Read")
Contributor

Might be overthinking this, but is this (minorly) breaking if anyone is using this for a composite transform or checking it in a test? It might be worth pulling the rest of this function out into its own helper and then having each caller of ReadSdf set its scope before calling it.

Relatedly, does this overwrite the scope set by ReadAllSdf? (probably not worth changing at this point for the same breaking reason, I'm just curious)

Contributor

IIRC it becomes a sub-scope.

Contributor Author

Agreed it could be confusing, or breaking, so changed it to maintain scopes.
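
For readers following this thread, here is a minimal sketch of the pattern suggested above: a shared helper does the work, and each public entry point sets its own scope name before calling it, so the old name stays visible. The `Scope` type is a hypothetical stand-in that only tracks a slash-separated name; it models the sub-scope nesting recalled in the reply above and is not the Beam implementation. The lowercase `read` helper is likewise an assumed name.

```go
package main

import "fmt"

// Scope is a hypothetical stand-in for beam.Scope that only tracks a
// slash-separated name. It is not the Beam implementation.
type Scope struct{ path string }

// Scope returns a child scope nested under the receiver.
func (s Scope) Scope(name string) Scope {
	if s.path == "" {
		return Scope{path: name}
	}
	return Scope{path: s.path + "/" + name}
}

// read is the shared helper (assumed name): it does the work in whatever
// scope the caller has already set.
func read(s Scope, glob string) string {
	return s.path + " <- " + glob
}

// Read sets its own scope name before delegating to the helper.
func Read(s Scope, glob string) string {
	return read(s.Scope("textio.Read"), glob)
}

// ReadSdf keeps its original scope name, so pipelines or tests that look
// for "textio.ReadSdf" are unaffected.
func ReadSdf(s Scope, glob string) string {
	return read(s.Scope("textio.ReadSdf"), glob)
}

func main() {
	root := Scope{}.Scope("root")
	fmt.Println(Read(root, "*.txt"))    // root/textio.Read <- *.txt
	fmt.Println(ReadSdf(root, "*.txt")) // root/textio.ReadSdf <- *.txt
}
```

Because each wrapper names its scope itself, calling one wrapper from the other would nest the names (for example `textio.ReadSdf/textio.Read`), which is the case the helper avoids.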

Contributor Author

@lostluck lostluck left a comment

PTAL

Contributor

@damccorm damccorm left a comment

LGTM - thanks!

@lostluck lostluck merged commit acea402 into apache:master May 24, 2022
3 participants