Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BEAM-14440] Add basic fuzz tests to the coders package #17587

Merged
merged 5 commits into from
May 16, 2022

Conversation

jrmccluskey
Copy link
Contributor

Adds a few small fuzz test examples to the coders package, validating that a successful Encode() call will be followed by a successful Decode() call and then have an output matching the input for strings, ints, bytes, and doubles.


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Choose reviewer(s) and mention them in a comment (R: @username).
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests

See CI.md for more information about GitHub Actions CI.

@asf-ci
Copy link

asf-ci commented May 9, 2022

Can one of the admins verify this patch?

@github-actions github-actions bot added the go label May 9, 2022
@codecov
Copy link

codecov bot commented May 9, 2022

Codecov Report

Merging #17587 (5816bc8) into master (c9883b2) will decrease coverage by 3.74%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #17587      +/-   ##
==========================================
- Coverage   73.93%   70.18%   -3.75%     
==========================================
  Files         691      694       +3     
  Lines       91560    97194    +5634     
==========================================
+ Hits        67694    68220     +526     
- Misses      22633    27668    +5035     
- Partials     1233     1306      +73     
Flag Coverage Δ
go 43.09% <ø> (-7.13%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
sdks/go/pkg/beam/testing/ptest/ptest.go 43.85% <0.00%> (-3.32%) ⬇️
sdks/go/pkg/beam/register/emitter.go 47.69% <0.00%> (ø)
sdks/go/pkg/beam/register/iter.go 67.21% <0.00%> (ø)
sdks/go/pkg/beam/register/register.go 7.60% <0.00%> (ø)
sdks/go/pkg/beam/core/runtime/exec/datasource.go 65.55% <0.00%> (+0.08%) ⬆️
sdks/go/pkg/beam/core/runtime/exec/fn.go 69.55% <0.00%> (+1.92%) ⬆️
sdks/go/pkg/beam/core/runtime/exec/pardo.go 59.10% <0.00%> (+11.15%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c9883b2...5816bc8. Read the comment docs.

@github-actions
Copy link
Contributor

github-actions bot commented May 9, 2022

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @damccorm for label go.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@jrmccluskey
Copy link
Contributor Author

Run GoPortable PreCommit

@@ -120,3 +120,26 @@ func TestEncodeDecodeStringUTF8LP(t *testing.T) {
})
}
}

func FuzzEncodeDecodeStringUTF8LP(f *testing.F) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to keep the above test that is doing the same thing as this one? Same question for the tests in other files

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could replace all of the encode/decode tests with fuzz tests that use the testValues as a seed corpus if we wanted to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd vote we do this rather than duplicating it - it will simplify any future test changes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I've made a crucial error in that assumption: the fuzz tests ignore any input where the encoding fails, which wouldn't catch a regression in encoding behavior if we replaced the initial encoding tests

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, ok - I'd probably prefer that we split those out into their own test case so we're not double testing the same things, but I don't feel strongly about it - I'll approve and leave that up to you (and/or future reviewers)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an argument for moving the fuzz tests for all of the coders to their own file for clarity

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reasoning behind ignoring encoding failures in the fuzz tests instead of just failing them? If that small change was made there's no reason we couldn't replace the existing tests, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually curious on this one as well - I'd initially assumed there were inputs that should fail when encoding, but I can't think of any. Are there inputs like that? If so, are there few enough that we can special case them?

@@ -59,3 +59,23 @@ func TestEncodeDecodeBytes(t *testing.T) {
})
}
}

func FuzzEncodeDecodeBytes(f *testing.F) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at https://go.dev/doc/fuzz/, it looks like these tests won't be run automatically since we're not passing in the fuzz flag, right? We'll just run the seed corpus?

This is neat and probably helps us a bit, but investing heavily in tests that don't get run continuously has a pretty limited ceiling since it can't be used to prove correctness of future changes without taking non-obvious manual steps. We should definitely keep our eyes on this issue to enable continuous fuzzing - until we get there, I'd probably vote we don't make too big of a push towards adding a bunch of fuzz tests. Alternately, we could consider adding our fuzz tests to the set of validations that gets done before a release.

Again - this PR is great and provides a valuable map towards adding fuzz tests, I just want to make sure we're calling out the downsides before going too far down that road.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The continuous running element is something we can't/don't do right now, the idea would be to eventually add Beam to OSS Fuzz (https://google.github.io/oss-fuzz/) so we have that continuous coverage.

Right now these are mostly proof of concept and relatively simple. I don't anticipate a big push to add fuzz testing everywhere, just targeted coverage where appropriate over time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, SGTM

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between having continuous fuzzing by adding Beam to OSS Fuzz, vs just having a script that runs go test ./... with the fuzz flag and running it as a Jenkins cron job? The latter seems like something we could do immediately as far as I can tell.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC testing with the fuzz flag is not guaranteed to ever terminate. I tried testing this out and its been running FuzzEncodeDecodeBytes for several minutes.

It also seemed like go test -fuzz ./... didn't actually correctly run the fuzz tests, it seems like they may need to be run one at a time:

dannymccormick-macbookpro:coder dannymccormick$ go test -fuzz ./...
testing: will not fuzz, -fuzz matches more than one fuzz test: [FuzzEncodeDecodeBytes FuzzEncodeDecodeDouble FuzzEncodeDecodeUInt64 FuzzEncodeDecodeInt32 FuzzEncodeDecodeStringUTF8LP]

@github-actions
Copy link
Contributor

R: @youngoli for final approval

Copy link
Contributor

@youngoli youngoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR looks good enough to merge in its current state. I do have some comments on the discussions above, but any actual changes based on those discussions can come in follow-up PRs.

@youngoli
Copy link
Contributor

Run GoPortable PreCommit

2 similar comments
@youngoli
Copy link
Contributor

Run GoPortable PreCommit

@jrmccluskey
Copy link
Contributor Author

Run GoPortable PreCommit

@youngoli youngoli merged commit 341a836 into apache:master May 16, 2022
@jrmccluskey jrmccluskey deleted the fuzz branch May 16, 2022 17:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants