Uncoordinated omission - validating HTTP load generators with bpf_validator #21

Open
wants to merge 1 commit into
base: dev
from

Conversation

johnaohara
Member

No description provided.

#[Buckets = 27, SubBuckets = 2048]
----------------------------------------------------------
92597 requests in 20.00s, 9.01MB read <6>
Socket errors: connect 5, read 0, write 0, timeout 40
Contributor

Socket errors? Some requests timed out... is this important info?

Member Author

good catch, and tbh I did not capture that info. It will mean we will likely be missing measurements for 5 requests... but I don't think that invalidates the message in the post.

Av Throughput: 5142.425405 req/sec <6>

----
<1> The average RTT was 0.240 ms
Contributor

The order and phrasing of the bullet points are slightly different from the ones they should be compared with, i.e. https://github.com/RedHatPerf/redhatperf.github.io/pull/21/files#diff-e4c84ea749a91a7a6497412e17ff5f4973aec06e593c39f88e6c5abab5211a6aR432

and my eyes are flipping to detect which parts are relevant/different for comparison

A tabular/summary/chart view would help IMO


https://github.com/johnaohara/bpf_validator/[bpf_validator] allows us to independently verify that the numbers produced by a Load Driver are not biased, and contain the full sample count.

He tested against https://hyperfoil.io/[Hyperfoil] and https://github.com/giltene/wrk2[wrk2] to confirm the results presented by Hyperfoil are an accurate representation of what happened during a load test.

We instead of He?


During our validation, we discovered;

* Hyperfoil to add 0.072ms on average to RTT and the summary statistics reports all requests sent.

Explain what RTT stands for?

A typical load test will be of the form shown above.

* **System Under Test (SUT)**: a dedicated system that contains an operating system, hosted application and related processes to support the application. The application typically runs as a service, listening for requests on a particular port. When a request is received, the network stack processes the request and passes it to the application, which handles the request and sends a response.
* **Load Driver**: The job of the load driver is to replicate virtual users and measure their experience. Typical measurements include samples of Round Trip Time (RTT) or Throughput (req/sec)

Move the explanation for RTT to the first time it's referenced?

The purpose of the load driver is to characterize the SUT, from the perspective of a virtual user.

The purpose of the load driver is to characterize the user/client, not the SUT?


By filtering out all packets that are sent to, and received from, a specific remote port a timestamp is recorded when a packet is sent and also when a full HTTP response has been received.

A map of timestamps is maintained within the kernel code, which calculates the RTT for each individual request. The timestamps are sent to a user space application, that records the timestamps in a http://www.hdrhistogram.org/[hdrHistogram]

Add . to the end
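
The timestamp map described in the hunk above can be sketched in user space. This is only an illustration, not the bpf_validator code: the class and method names (`RttTracker`, `on_send`, `on_response_complete`) are hypothetical, the real tool keeps its map in kernel code, and a plain list with a nearest-rank percentile stands in for the hdrHistogram. It also assumes at most one outstanding request per connection.

```python
import math
from dataclasses import dataclass, field


@dataclass
class RttTracker:
    """User-space stand-in for the in-kernel timestamp map (hypothetical names)."""
    remote_port: int
    pending: dict = field(default_factory=dict)   # connection id -> send timestamp (ns)
    rtts_ns: list = field(default_factory=list)   # one RTT sample per completed request

    def on_send(self, conn_id, dst_port, ts_ns):
        # Only track packets sent to the remote port under observation.
        if dst_port != self.remote_port:
            return
        self.pending[conn_id] = ts_ns

    def on_response_complete(self, conn_id, src_port, ts_ns):
        # Only track responses coming back from that same remote port.
        if src_port != self.remote_port:
            return
        sent = self.pending.pop(conn_id, None)
        if sent is not None:
            self.rtts_ns.append(ts_ns - sent)

    def unanswered(self):
        # Requests with a send timestamp but no completed response yet.
        return len(self.pending)


def percentile(samples, p):
    """Nearest-rank percentile; assumption: a real run would query the histogram."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


tracker = RttTracker(remote_port=8080)
tracker.on_send("c1", 8080, 1_000)
tracker.on_send("c2", 8080, 2_000)
tracker.on_send("c3", 9090, 2_500)                  # different port, ignored
tracker.on_response_complete("c1", 8080, 241_000)   # RTT of 240000 ns (0.240 ms)
```

Counting the entries left in `pending` is one way to check the "full sample count" claim: any request that never received a complete response stays visible instead of silently dropping out of the summary.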

franz1981 requested a review from Copilot on June 30, 2025 14:50
@franz1981
Contributor

Let's see if @copilot is smart enough (will do on my article too)

Copilot AI left a comment

Pull Request Overview

This PR introduces a comprehensive AsciiDoc post that explains the concept of uncoordinated omission in HTTP load testing and demonstrates how to validate load generators using the bpf_validator tool. Key changes include:

  • Creating a detailed guide with performance results and comparison tests between Hyperfoil and wrk2.
  • Providing step-by-step instructions and sample command outputs for running the bpf_validator tool.
  • Including comprehensive statistical analysis and comparisons of request rates and latencies.

What can we do about it? How can we determine if our load driver is

* a) accurate
* b) reporting summary results from all the sample
Copilot AI Jun 30, 2025

Consider correcting 'sample' to 'samples' for grammatical accuracy.

Suggested change
- * b) reporting summary results from all the sample
+ * b) reporting summary results from all the samples

99.9th Percentile: 0.841215
99.99th Percentile: 1.735679

99954 requests in 9995496s <3>
Copilot AI Jun 30, 2025

The test duration value '9995496s' appears incorrect; please verify and update it to the intended duration format (e.g., '20.00s').

Suggested change
- 99954 requests in 9995496s <3>
+ 99954 requests in 99954s <3>

Contributor

It was by accident but 9995496s is indeed suspicious (good catch AI?!)
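
A quick arithmetic check shows why the duration looks suspicious. The sketch below divides the request count by the duration under two readings of `9995496`. The thread does not say what the intended value was, so the microsecond interpretation is purely an assumption made here for illustration; the `throughput` helper is also hypothetical.

```python
def throughput(requests: int, duration_s: float) -> float:
    """Average request rate in req/sec for a run of the given length."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return requests / duration_s


requests = 99954
raw_duration = 9995496

# Reading the value literally as seconds gives an implausibly low rate.
as_seconds = throughput(requests, raw_duration)

# Assumption only: if the value were microseconds, the run would last about 10 s.
as_micros = throughput(requests, raw_duration / 1_000_000)

print(f"as seconds:      {as_seconds:.6f} req/sec")
print(f"as microseconds: {as_micros:.1f} req/sec")
```

The same helper applied to the other quoted output line, 92597 requests in 20.00s, gives 4629.85 req/sec.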
