[BEAM-14440] Add basic fuzz tests to the coders package #17587

jrmccluskey · 2022-05-09T15:08:36Z

Adds a few small fuzz test examples to the coders package, validating that a successful Encode() call will be followed by a successful Decode() call and then have an output matching the input for strings, ints, bytes, and doubles.

asf-ci · 2022-05-09T15:08:38Z

Can one of the admins verify this patch?

codecov · 2022-05-09T15:12:17Z

Codecov Report

Merging #17587 (5816bc8) into master (c9883b2) will decrease coverage by 3.74%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #17587      +/-   ##
==========================================
- Coverage   73.93%   70.18%   -3.75%     
==========================================
  Files         691      694       +3     
  Lines       91560    97194    +5634     
==========================================
+ Hits        67694    68220     +526     
- Misses      22633    27668    +5035     
- Partials     1233     1306      +73

Flag	Coverage Δ
go	`43.09% <ø> (-7.13%)`	⬇️


Impacted Files	Coverage Δ
sdks/go/pkg/beam/testing/ptest/ptest.go	`43.85% <0.00%> (-3.32%)`	⬇️
sdks/go/pkg/beam/register/emitter.go	`47.69% <0.00%> (ø)`
sdks/go/pkg/beam/register/iter.go	`67.21% <0.00%> (ø)`
sdks/go/pkg/beam/register/register.go	`7.60% <0.00%> (ø)`
sdks/go/pkg/beam/core/runtime/exec/datasource.go	`65.55% <0.00%> (+0.08%)`	⬆️
sdks/go/pkg/beam/core/runtime/exec/fn.go	`69.55% <0.00%> (+1.92%)`	⬆️
sdks/go/pkg/beam/core/runtime/exec/pardo.go	`59.10% <0.00%> (+11.15%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

github-actions · 2022-05-09T15:38:29Z

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @damccorm for label go.

Available commands:

stop reviewer notifications - opt out of the automated review tooling
remind me after tests pass - tag the comment author after tests pass
waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

jrmccluskey · 2022-05-09T16:23:41Z

Run GoPortable PreCommit

damccorm · 2022-05-09T16:35:02Z

sdks/go/pkg/beam/core/graph/coder/stringutf8_test.go

@@ -120,3 +120,26 @@ func TestEncodeDecodeStringUTF8LP(t *testing.T) {
 		})
 	}
 }
+
+func FuzzEncodeDecodeStringUTF8LP(f *testing.F) {


Is there a reason to keep the above test that is doing the same thing as this one? Same question for the tests in other files

We could replace all of the encode/decode tests with fuzz tests that use the testValues as a seed corpus if we wanted to.

I'd vote we do this rather than duplicating it - it will simplify any future test changes

Actually I've made a crucial error in that assumption: the fuzz tests ignore any input where the encoding fails, which wouldn't catch a regression in encoding behavior if we replaced the initial encoding tests

Hm, ok - I'd probably prefer that we split those out into their own test case so we're not double testing the same things, but I don't feel strongly about it - I'll approve and leave that up to you (and/or future reviewers)

There's an argument for moving the fuzz tests for all of the coders to their own file for clarity

What is the reasoning behind ignoring encoding failures in the fuzz tests instead of just failing them? If that small change was made there's no reason we couldn't replace the existing tests, right?

I'm actually curious on this one as well - I'd initially assumed there were inputs that should fail when encoding, but I can't think of any. Are there inputs like that? If so, are there few enough that we can special case them?

damccorm · 2022-05-09T16:43:32Z

sdks/go/pkg/beam/core/graph/coder/bytes_test.go

@@ -59,3 +59,23 @@ func TestEncodeDecodeBytes(t *testing.T) {
 		})
 	}
 }
+
+func FuzzEncodeDecodeBytes(f *testing.F) {


Looking at https://go.dev/doc/fuzz/, it looks like these tests won't be run automatically since we're not passing in the fuzz flag, right? We'll just run the seed corpus?

This is neat and probably helps us a bit, but investing heavily in tests that don't get run continuously has a pretty limited ceiling since it can't be used to prove correctness of future changes without taking non-obvious manual steps. We should definitely keep our eyes on this issue to enable continuous fuzzing - until we get there, I'd probably vote we don't make too big of a push towards adding a bunch of fuzz tests. Alternately, we could consider adding our fuzz tests to the set of validations that gets done before a release.

Again - this PR is great and provides a valuable map towards adding fuzz tests, I just want to make sure we're calling out the downsides before going too far down that road.

Continuous fuzzing is something we can't and don't do right now. The idea would be to eventually add Beam to OSS-Fuzz (https://google.github.io/oss-fuzz/) so we have that continuous coverage.

Right now these are mostly proof of concept and relatively simple. I don't anticipate a big push to add fuzz testing everywhere, just targeted coverage where appropriate over time.

What's the difference between having continuous fuzzing by adding Beam to OSS Fuzz, vs just having a script that runs go test ./... with the fuzz flag and running it as a Jenkins cron job? The latter seems like something we could do immediately as far as I can tell.

IIRC testing with the fuzz flag is not guaranteed to ever terminate. I tried this out, and it's been running FuzzEncodeDecodeBytes for several minutes.

It also seems that go test -fuzz ./... doesn't run the fuzz tests correctly; they may need to be run one at a time:

dannymccormick-macbookpro:coder dannymccormick$ go test -fuzz ./...
testing: will not fuzz, -fuzz matches more than one fuzz test: [FuzzEncodeDecodeBytes FuzzEncodeDecodeDouble FuzzEncodeDecodeUInt64 FuzzEncodeDecodeInt32 FuzzEncodeDecodeStringUTF8LP]

github-actions · 2022-05-11T21:07:24Z

R: @youngoli for final approval

youngoli

This PR looks good enough to merge in its current state. I do have some comments on the discussions above, but any actual changes based on those discussions can come in follow-up PRs.

youngoli · 2022-05-16T03:33:33Z

Run GoPortable PreCommit

youngoli · 2022-05-16T05:22:27Z

Run GoPortable PreCommit

jrmccluskey · 2022-05-16T17:36:41Z

Run GoPortable PreCommit

jrmccluskey added 4 commits May 9, 2022 10:38

UTF8 fuzz test

4af6d09

Ints fuzz tests

7a88b9c

Bytes fuzz tests

7f6670c

Double fuzz test

4a019a2

github-actions bot added the go label May 9, 2022

github-actions bot added the Next Action: Reviewers label May 9, 2022

damccorm reviewed May 9, 2022

damccorm approved these changes May 11, 2022

Move fuzz tests to standalone test file

5816bc8

youngoli approved these changes May 16, 2022

youngoli merged commit 341a836 into apache:master May 16, 2022

jrmccluskey deleted the fuzz branch May 16, 2022 17:52
