libfuzzer ubsan build: BAD BUILD: UBSan build of <fuzzer> seems to be compiled with ASan. #5317

Open
asraa opened this issue Mar 8, 2021 · 21 comments

@asraa
Contributor

asraa commented Mar 8, 2021

Related: #4743

https://oss-fuzz-build-logs.storage.googleapis.com/log-6cb883f9-5026-4454-bc7c-a878a93962fb.txt

After fixing unrelated build issues, Envoy has been failing the bad build checks on 100% of its fuzzers.

Step #14: BAD BUILD: UBSan build of /tmp/not-out/symbol_table_fuzz_test seems to be compiled with ASan.

My hope was to reproduce this locally and figure out how many ASan calls are present so I could see what's happening, but I can't reproduce it locally. If the count is small, I'm hoping the threshold can be increased, but I don't know what it should be.

ASAN_CALLS_THRESHOLD_FOR_NON_ASAN_BUILD=0
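For context, my understanding is that the bad build check flags a non-ASan build when the binary contains more than that threshold of ASan instrumentation references. A rough sketch of that kind of check (the exact tooling in bad_build_check may differ; the binary path is just the one from the log):

# Sketch only: count ASan instrumentation call sites in a supposedly
# UBSan-only binary and compare against the threshold above.
ASAN_CALLS_THRESHOLD_FOR_NON_ASAN_BUILD=0
FUZZER=/tmp/not-out/symbol_table_fuzz_test
asan_calls=$(objdump -dC "$FUZZER" | grep -c "call.*__asan_" || true)
if [ "$asan_calls" -gt "$ASAN_CALLS_THRESHOLD_FOR_NON_ASAN_BUILD" ]; then
  echo "BAD BUILD: UBSan build of $FUZZER seems to be compiled with ASan."
fi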

@jonathanmetzman
Contributor

jonathanmetzman commented Mar 8, 2021

Just checking: you can't reproduce this by building the fuzzers with UBSan locally and then running check_build?

@asraa
Contributor Author

asraa commented Mar 8, 2021

Yeah, I tried that when I commented on the issue (built them and ran check_build), but I'll retry right now as well.
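
For the record, what I ran locally was roughly this (the usual infra/helper.py flow):

# Build the UBSan fuzzers locally, then run the same bad build check on them.
python infra/helper.py build_fuzzers --sanitizer undefined envoy
python infra/helper.py check_build --sanitizer undefined envoy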

@jonathanmetzman
Contributor

I just tried to repro this locally and couldn't either. Very weird.
I notice, though, that an ASan build happens first. Maybe some kind of state is getting saved between builds? Though I have no idea how that could happen.

@asraa
Contributor Author

asraa commented Mar 8, 2021

That would be bizarre, since I notice that most of the vars (e.g. ASAN_CALLS) are local. I'm not 100% sure how the fuzz target binaries could be accidentally shared without causing a lot of failures in other people's builds...

@jonathanmetzman
Contributor

Anyway, I'm trying to do a build on Google Cloud Build with only UBSAN. If this succeeds I think it will be evidence for my state theory.

@jonathanmetzman
Contributor

jonathanmetzman commented Mar 8, 2021

I'm also doing another build that will skip testing, so we'll be able to actually download this build.

@asraa
Contributor Author

asraa commented Mar 8, 2021

Thanks!

@jonathanmetzman
Contributor

I suspect there's some kind of state issue.
UBSan builds without ASan succeed:
https://console.cloud.google.com/cloud-build/builds;region=global/fa79ffd3-3c87-4625-aeed-cb8d297efa12;step=5?organizationId=433637338589&project=oss-fuzz

@oliverchang Any ideas what could be happening here?

@asraa
Contributor Author

asraa commented Mar 31, 2021

Looking at the latest failed build: https://oss-fuzz-build-logs.storage.googleapis.com/log-ea90f23b-921e-496b-9285-85faa267cb44.txt

It seems the address job and then the undefined job are expected to run sequentially.

Step #4: + bazel build --verbose_failures --dynamic_mode=off --spawn_strategy=standalone --genrule_strategy=standalone '--local_cpu_resources=HOST_CPUS*0.32' --//source/extensions/wasm_runtime/v8:enabled=false --build_tag_filters=-no_asan --config=oss-fuzz --copt=-D__SANITIZE_ADDRESS__ --copt=-DADDRESS_SANITIZER=1 --linkopt=-fsanitize=address //test/server/config_validation:xds_fuzz_test_oss_fuzz 

AND

Step #13: + bazel build --verbose_failures --dynamic_mode=off --spawn_strategy=standalone --genrule_strategy=standalone '--local_cpu_resources=HOST_CPUS*0.32' --//source/extensions/wasm_runtime/v8:enabled=false --build_tag_filters=-no_asan --config=oss-fuzz '--linkopt="/usr/local/lib/clang/12.0.0/lib/linux/libclang_rt.ubsan_standalone_cxx-x86_64.a"' //test/server/config_validation:xds_fuzz_test_oss_fuzz 

When I scroll to the second one, I see a bunch of failures:

Step #13: mkdir: cannot create directory '/builder/home/.cache/bazel/_bazel_root/4e9824db8e7d11820cfa25090ed4ed10/execroot/envoy/bazel-out/k8-fastbuild/bin/test/common/stats/stat_merger_fuzz_test_oss_fuzz.tar.staging': File exists

and I also see

Step #13: + tar -xvf bazel-bin/test/extensions/compression/gzip/compressor_fuzz_test_oss_fuzz.tar -C /workspace/out/undefined
[...]
Step #14: BAD BUILD: UBSan build of /tmp/not-out/least_request_load_balancer_fuzz_test seems to be compiled with ASan.
Step #14: 
Step #14: BAD BUILD: UBSan build of /tmp/not-out/hpack_fuzz_test seems to be compiled with ASan.

It looks like bad build checks are performed on targets in /tmp/not-out/, so maybe the address binaries are carried over.

Taking a shot in the dark, but maybe if they went to /tmp/not-out/undefined and the check looked there, that would fix things.

I'm skeptical that's the issue because everyone running multiple sanitizers with build checks would be seeing this issue...

Edit: I don't think so; the check already calls recreate_directory(TMP_FUZZER_DIR), so /tmp/not-out is recreated for each run.
I also ran some local experiments. If I build Envoy fuzzers locally without sanitizers (just as a regression test), I find 2 calls to __asan_poison_memory_region in each binary.
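
(Roughly counted like this; the binary path is just an example:)

# Count references to __asan_poison_memory_region in a locally built,
# sanitizer-free fuzzer binary.
objdump -dC bazel-bin/test/common/stats/symbol_table_fuzz_test | grep -c __asan_poison_memory_region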

@asraa
Contributor Author

asraa commented Apr 1, 2021

Sorry for the crazy amount of spam; I'm trying to track down where the state sharing is coming from.

I just want to make sure that

Step #13: INFO: From Action test/common/network/utility_fuzz_test_oss_fuzz.tar:
Step #13: mkdir: cannot create directory '/builder/home/.cache/bazel/_bazel_root/4e9824db8e7d11820cfa25090ed4ed10/execroot/envoy/bazel-out/k8-fastbuild/bin/test/common/network/utility_fuzz_test_oss_fuzz.tar.staging': File exists
Step #13: ln: failed to create symbolic link '/builder/home/.cache/bazel/_bazel_root/4e9824db8e7d11820cfa25090ed4ed10/execroot/envoy/bazel-out/k8-fastbuild/bin/test/common/network/utility_fuzz_test_oss_fuzz.tar.staging/utility_fuzz_test': File exists

isn't causing some issue. Do you think it's possible that the oss-fuzz packages aren't being reset in the second run and we're copying the old ones? (We only clear the bazel-* directories, not the Bazel cache, at the end of the build script.) If so, we could recreate $STAGING_DIR here:
https://github.com/bazelbuild/rules_fuzzing/blob/fc0e7d437fda6666b9f5f25d1a40ae919cfcecf2/fuzzing/private/oss_fuzz/package.bzl#L32-L38

Are other rules_fuzzing projects with multiple sanitizers running OK? Cel-cpp seems fine, but I don't see those lines in their log.
@stefanbucur

@stefanbucur
Contributor

Hmmm, I think I know what's going on: the --spawn_strategy=standalone flag means that build actions aren't sandboxed, so intermediate build artifacts used by the Bazel fuzzing rules aren't cleaned up automatically: https://docs.bazel.build/versions/master/user-manual.html#flag--spawn_strategy

I think the fix is easy: just remove the staging dir before creating it in the action script. Let me fix this now.

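For illustration, the change amounts to something like this in the packaging action's shell steps (a sketch; the variable names are approximate, not the exact package.bzl contents):

# Without sandboxing (--spawn_strategy=standalone), a stale staging directory
# from a previous sanitizer build can survive, so remove it before recreating it.
rm -rf "$STAGING_DIR"
mkdir -p "$STAGING_DIR"
ln -s "$(pwd)/$FUZZER_BINARY" "$STAGING_DIR/"
tar -chf "$OUTPUT_TAR" -C "$STAGING_DIR" .
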
I think the other projects would not be affected because they use the default (sandboxed) execution model, which does this cleanup automatically.

@asraa
Contributor Author

asraa commented Apr 1, 2021

AH! I totally didn't think of that flag. Thank you! I can bump the version in Envoy!

@stefanbucur
Contributor

Great! The fix is submitted, and you can now use the new release in Envoy: https://github.com/bazelbuild/rules_fuzzing/releases/tag/v0.1.3

@jonathanmetzman
Contributor

I'm confused about this. I thought the container we run the builds in was torn down and restarted fresh, so I can't understand how state is persisting.

@stefanbucur
Contributor

Is each sanitizer configuration built in a separate container? If yes, is there any external storage that is mounted and persisted across runs? (IIRC, this is the case for local runs using infra/helper.py.)

@jonathanmetzman
Contributor

Maybe things like /work are persisted, but from what I've been told by @oliverchang, nothing should be persisted.

@asraa
Contributor Author

asraa commented Apr 5, 2021

If the Bazel cache (/builder/home/.cache/bazel/_bazel_root) is somehow persisting across these containers, which seems to be the case, we should probably clear it between builds (since we can't reuse it anyway) to save space.
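
Something like this at the end of the build script would do it (the cache path is the one from the error above):

# Drop the on-disk Bazel cache after the build so it can't leak into the next
# sanitizer's build (it can't be reused across configurations anyway).
rm -rf "$HOME/.cache/bazel"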

@oliverchang
Collaborator

On our build infra (cloud build), only /workspace is persisted. For each sanitizer, we set e.g. OUT=/workspace/out/address. Does the bazel build write anything above the address dir?
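
(Roughly, each sanitizer step gets an environment like this sketch; the values are illustrative:)

# Only /workspace is persisted across build steps; each sanitizer gets its own
# output directory under it.
export SANITIZER=address
export OUT=/workspace/out/$SANITIZER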

@asraa
Contributor Author

asraa commented Apr 6, 2021

Huh! Only the binaries are written to out. The cache is definitely not in /workspace; the error seems to indicate the cache was persisted across the ASan/UBSan builds:

Step #13: mkdir: cannot create directory '/builder/home/.cache/bazel/_bazel_root/4e9824db8e7d11820cfa25090ed4ed10/execroot/envoy/bazel-out/k8-fastbuild/bin/test/extensions/tracers/xray/xray_fuzz_test_oss_fuzz.tar.staging': File exists

jonathanmetzman added a commit that referenced this issue Jul 19, 2021
GCB passes HOME as an env var to the docker container. It sets HOME to /builder/home, which is persisted across builds. This issue causes build breakages in #6035 and possibly #5317. Perhaps more insidiously, it can cause fuzzers to be built with the wrong instrumentation.
@jonathanmetzman
Contributor

jonathanmetzman commented Jul 19, 2021

I haven't confirmed this issue was fixed by #6069, but I suspect it was.
If we look again at the build logs, we can see the .cache directory being stored in /builder/home (EDIT: Asra helpfully pointed this out above).
As pointed out in #6035, /builder/home is actually persisted across builds (!!!)
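
Conceptually, the fix just pins HOME to a non-persisted path for the build step, i.e. something along these lines (a sketch of the container invocation, not the exact GCB step definition):

# Force HOME away from GCB's persisted /builder/home so Bazel's cache under
# $HOME/.cache can't survive into the next sanitizer's build.
docker run --rm \
  -e HOME=/root \
  -e SANITIZER=undefined \
  -v /workspace/out/undefined:/out \
  gcr.io/oss-fuzz/envoy \
  compile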

jonathanmetzman added a commit that referenced this issue Jul 20, 2021
[infra][build] Set HOME=/root on GCB when doing fuzzer builds.
GCB passes HOME as an env var to the docker container. It sets HOME to /builder/home, which is persisted across builds. This issue causes build breakages in #6035 and possibly #5317. Perhaps more insidiously, it can cause fuzzers to be built with the wrong instrumentation.
@maflcko
Contributor

maflcko commented Jun 13, 2023

Is this still an issue?
