-
Notifications
You must be signed in to change notification settings - Fork 685
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
errors: remove verifier truncation heuristic, rely on ENOSPC #756
Conversation
Have you actually managed to get ENOSPC from loading BTF? In my experience an EINVAL or similar will override ENOSPC when loading invalid BTF. Maybe you're talking about ENOSPC from valid BTF with a small log? Agreed re bringing the check for ENOSPC back, but I think we might still need the heuristic for BTF. We could fall back if the logErr is nil, or we could factor it out into a separate function and call that from the BTF loading code. |
I see, I misinterpreted your I'll think of something else. |
This feels clearer to me. BPF and BTF verifier logs have different styles and their APIs have different semantics. |
0076beb
to
6147394
Compare
Just having seen a 4,6MB log from a bpf_host complexity issue, I started work on automatically growing the log buffer instead of hardcoding a size here, but hit a small bug in the lib. A possible solution is out, but still needs review: cilium/ebpf#756. Grow the buffer to 10MiB to at least get access to the full contents. Fixes: cilium#20738 Signed-off-by: Timo Beckers <timo@isovalent.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to capture full verifier errors of large programs, the caller can
choose to repeatedly retry loading the program with progressively larger log
buffers, since the important info is always at the end of the log.
I think this is the first time I understand what you are trying to achieve, and why you need .Truncated to be accurate. I didn't consider that use case when introducing .Truncated, so I didn't mind that the heuristic isn't bomb proof.
You're trying to make .Truncated "reliable". Have you considered implementing resize-and-retry in the lib instead? Seems like a useful feature to have.
Just having seen a 4,6MB log from a bpf_host complexity issue, I started work on automatically growing the log buffer instead of hardcoding a size here, but hit a small bug in the lib. A possible solution is out, but still needs review: cilium/ebpf#756. Grow the buffer to 10MiB to at least get access to the full contents. Fixes: #20738 Signed-off-by: Timo Beckers <timo@isovalent.com>
Add a test to make sure a default Spec can be instantiated if so desired. This is convenient for exercising/experimenting with BTF verifier error handling. Signed-off-by: Timo Beckers <timo@isovalent.com>
6147394
to
47302a3
Compare
In order to capture full verifier errors of large programs, the caller can choose to repeatedly retry loading the program with progressively larger log buffers, since the important info is always at the end of the log. In doing so, it's incredibly likely for the last character of the buffer to be a newline, resulting in the truncated flag being set to false when it should instead be set to true, regardless of the kernel's ENOSPC return code. This commit removes the newline heuristic that currently aims to detect log truncation. Instead, we now require ENOSPC to be sent by the kernel in order for the Truncated flag to be set in the VerifierError. The heuristic was initially introduced for parsing BTF verifier errors, since there's a latent kernel bug that causes ENOSPC only being returned on a successful BTF load. Unfortunately, it turns out that not all BTF verifier errors end in a newline either (see calls to btf_verifier_log()), making the newline heuristic ineffective for BTF logs as well. ErrorWithLog now takes an ancillary error from a second load operation that is used to determine truncation status. This second error will not be wrapped into the VerifierError. Signed-off-by: Timo Beckers <timo@isovalent.com>
47302a3
to
3bad83a
Compare
In order to capture full verifier errors of large programs, the caller can
choose to repeatedly retry loading the program with progressively larger log
buffers, since the important info is always at the end of the log.
In doing so, it's incredibly likely for the last character of the buffer to
be a newline, resulting in the truncated flag being set to false when it should
instead be set to true, regardless of the kernel's ENOSPC return code.
This commit removes the newline heuristic that currently aims to detect log
truncation. Instead, we now require ENOSPC to be sent by the kernel in order
for the Truncated flag to be set in the VerifierError.
The heuristic was initially introduced for parsing BTF verifier errors,
since there's a latent kernel bug that causes ENOSPC only being returned
on a successful BTF load. Unfortunately, it turns out that not all BTF
verifier errors end in a newline either (see calls to btf_verifier_log()),
making the newline heuristic ineffective for BTF logs as well.
ErrorWithLog now takes an ancillary error from a second load operation that
is used to determine truncation status. This second error will not be wrapped
into the VerifierError.