-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BEAM-14546] Fix errant pass for empty collections in Count #17813
Conversation
Codecov Report
@@ Coverage Diff @@
## master #17813 +/- ##
==========================================
+ Coverage 74.09% 74.11% +0.02%
==========================================
Files 697 697
Lines 91980 92036 +56
==========================================
+ Hits 68148 68212 +64
+ Misses 22583 22574 -9
- Partials 1249 1250 +1
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
Can one of the admins verify this patch? |
1 similar comment
Can one of the admins verify this patch? |
Assigning reviewers. If you would like to opt out of this review, comment R: @damccorm for label go. Available commands:
The PR bot will only process comments in the main thread (not review comments). |
if err := ptest.Run(p); err != nil { | ||
t.Errorf("Pipeline failed: %v", err) | ||
} | ||
}) | ||
} | ||
} | ||
|
||
func TestCount_Bad(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we simplify and combine these tests into 1 by adding a expectErr
variable to the tests struct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, although I don't see too much value in bundling them into one suite. The setup isn't particularly long or complicated to test the function, so deduplicating it doesn't add much value IMO. Totally cool doing it if you feel strongly though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue its worth doing since more/duplicated code => more opportunities for bugs to slip in when updates are needed and more for a future developer (maybe us) to understand (code is a liability). I'm not going to block on it though, its not very important
@@ -30,6 +30,10 @@ func Count(s beam.Scope, col beam.PCollection, name string, count int) { | |||
if typex.IsKV(col.Type()) { | |||
col = beam.DropKey(s, col) | |||
} | |||
|
|||
if count > 0 { | |||
NonEmpty(s, col) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sum did need it, confirmed via unit test. Will add some short validation for Hash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a small "fail on empty" check for Hash.
The portable failure looks legit as well:
Not sure if the test is broken (and erroneously passing) or if something here is broken |
It's strange, from what I can tell the problem lies in the synthetic code since Count() gets used heavily in a number of places and we have both unit testing and the other integration tests passing. Let me see if that's reproducible on other runners |
Run Go Flink ValidatesRunner |
Run Go PostCommit |
Found the problem, the synthetic StepCfg struct was not exported so the number of elements to emit in the synthetic tests was getting set to 0, producing empty PCollections |
Run Go Flink ValidatesRunner |
if err := ptest.Run(p); err != nil { | ||
t.Errorf("Pipeline failed: %v", err) | ||
} | ||
}) | ||
} | ||
} | ||
|
||
func TestCount_Bad(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue its worth doing since more/duplicated code => more opportunities for bugs to slip in when updates are needed and more for a future developer (maybe us) to understand (code is a liability). I'm not going to block on it though, its not very important
R: @youngoli for final approval |
R: @lostluck since they have context |
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM Thanks!
Run GoPortable PreCommit |
Adds a NonEmpty() passert utility to check that a given PCollection has at least one element, then uses it to ensure that a collection passed to Count() with a non-zero expected number of elements has at least one element to avoid an erroneous passing test.
Discovered and fixed an instance of erroneous passing with the synthetic steps tests, and applies the same fix to
passert.Sum
.Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
R: @username
).[BEAM-XXX] Fixes bug in ApproximateQuantiles
, where you replaceBEAM-XXX
with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.CHANGES.md
with noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI.