Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add performance regression tests in CI #4701

Merged
merged 36 commits into from
Aug 19, 2024
Merged

Conversation

kaukabrizvi
Copy link
Contributor

@kaukabrizvi kaukabrizvi commented Aug 10, 2024

Description of changes:

This PR integrates the regression tests from #4667 into the s2n-tls CI pipeline. The regression_ci.yml workflow would now run on any pull request to main, comparing the performance of all harnesses in tests/regression against the mainline performance. The test will fail if the performance regression exceeds the predefined threshold (currently set to a constant, which will be updated after merging #4698). The GitHub Actions job (example flow) uploads the performance results for the PR, the mainline, and their differences as artifacts, accessible to the developer regardless of the test outcome. This aids in debugging performance issues that may cause test failures.

Call-outs:

  • Valgrind Installation: The tests require Valgrind, but the version available via apt install valgrind on the runner is outdated and lacks the necessary cachegrind functionality. To address this, Valgrind 3.23 is installed from source to ensure compatibility with the regression tests.
  • Mainline vs. PR Comparison: In tests/regression/src/lib.rs, I updated the comparison logic to correctly handle mainline vs. PR branches. Previously, the check was based on whether both commits were in the same log, which isn't applicable to CI integration. Now, the logic checks if one of the versions is the mainline, treating it as the older version. If both or neither are mainline, the check determines which commit is older. This ensures the diff profile accurately represents PR - Mainline performance.

This workflow was tested on my personal branch by creating pull requests to my personal branch and verifying the expected outcomes for pass/failure in the performance CI. It should also run on this PR, allowing reviewers to observe the logs in the checks below as 'Performance Regression Test / regression-test (pull_request)'

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@kaukabrizvi kaukabrizvi marked this pull request as ready for review August 11, 2024 19:25
git fetch origin main
git checkout main

# Regenrate bindings for main branch
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Regenrate bindings for main branch
# Regenerate bindings for main branch

tests/regression/src/scratch Outdated Show resolved Hide resolved
.github/workflows/regression_ci.yml Show resolved Hide resolved
.github/workflows/regression_ci.yml Outdated Show resolved Hide resolved
.github/workflows/regression_ci.yml Outdated Show resolved Hide resolved
tests/regression/src/lib.rs Outdated Show resolved Hide resolved
tests/regression/src/lib.rs Outdated Show resolved Hide resolved
@kaukabrizvi
Copy link
Contributor Author

kaukabrizvi commented Aug 12, 2024

Since I changed the file storage scheme to target/regression_artifacts and that change isn't currently in mainline. This PR fails the regression CI check because when we switch to mainline, files are stored in the old format which the PR branch doesn't recognize. Which is why you get the faliure in CI since the diff test only recognizes one folder stored in the expected location (the PR branch performance).

@@ -160,18 +186,28 @@ mod tests {
test_name: test_name.to_string(),
commit_hash: git::extract_commit_hash(&raw_files[0]),
};

println!("{}", profile1.test_name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think you forgot to remove this?

.github/workflows/regression_ci.yml Show resolved Hide resolved
Copy link
Contributor

@maddeleine maddeleine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the workflow that you're adding in this PR is failing?

tests/regression/src/lib.rs Outdated Show resolved Hide resolved
.github/workflows/regression_ci.yml Show resolved Hide resolved
(profile1, profile2)
} else if git::is_older_commit(&profile2.commit_hash, &profile1.commit_hash) {
(profile2, profile1)
if git::is_mainline(&profile1.commit_hash) ^ git::is_mainline(&profile2.commit_hash) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really following the logic to return the correct tuple here. What is the result you're trying to return in this function? Why do you need an xor?

Copy link
Contributor Author

@kaukabrizvi kaukabrizvi Aug 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the result you're trying to return in this function?

You can think of this as a sort on the commits:

  1. Whichever commit is mainline appears first
  2. If is_mainline() is equivalent for both commits, the older commit appears first

This is necessary for the diff functionality to know which version is the standard and which version we want to test for comparison. Between mainline and PR branch, mainline should be the standard for comparison, otherwise the older commit between two PR's which are on the same branch should be the standard for comparison to the new commit.

Why do you need an xor?

The xor returns true if exactly one of the commits is mainline, then we can return whichever commit is mainline first. If both or neither are mainline the condition evaluates to false, then we must check for which commit is older, so we move to the else condition in that case which checks for which commit is older.

I will add comments to this function to make this more clear in the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just seems like we don't need to be guessing at which commit is our "baseline". Doesn't the caller of these tests always know which commit is the baseline versus which one is the altered code? Like, you should always know I want to know the performance change that occurs from "this commit" to "this commit".

Copy link
Contributor Author

@kaukabrizvi kaukabrizvi Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially, the approach was to have the caller identify "baseline" and "altered" as an environment variable upon invocation, similar to your suggestion. However after some discussion, we decided to solely rely on the commit id to auto detect new/old and in CI, branch/mainline. This means that with solely the commit id's available to identify files we have to figure out which commit ID is "baseline" vs "altered", which this logic attempts to solve. We could add back an environment variable for the caller to set "baseline" and "altered" to make the tool more flexible in its usage but if we base profiles on solely their commit id, the current approach is needed.

Copy link
Contributor

@maddeleine maddeleine Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm that's a bummer, I kind of disagree with that approach because this logic looks pretty brittle as it is now. I think it would help for me to know what exactly is our goal here, do we expect this to automatically detect the right commit both when running this locally and running this in CI? Like, is the logic here only for running in CI or is it also trying to work for running the tests locally?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic here attempts to cover both, running locally and in CI. Locally, we would want the older of two commits as "baseline" and in CI, we would want main to be "baseline". I agree with you though that the logic is brittle, especially for local testing since I can imagine scenarios where the caller wants to compare two commits that are not on the same branch or set the newer commit as "baseline". To make the usage a bit more extensible, I think we should add an identifier to let the caller decide, so that the artifacts produced are baseline/commit_hash.test_name.raw and altered/commit_hash.test_name.raw. I would vote to do that in a separate PR though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah then I think you should just make this logic to work in CI. So it would just be like:

if (profile1.is_mainline()) {
    (profile1, profile2)
} else {
    (profile2, profile1)
}

I don't think there's much point in commiting complex logic if you're going to rip it out in a different PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the direction that integv2 went, I think it's important to keep things at least theoretically runnable locally. Also makes putting demos together much easier, etc.

Copy link
Contributor

@maddeleine maddeleine Aug 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, if I am running this locally, I would probably create a branch with my changes and want to compare that branch to mainline. So even then, I really would only care about the difference between mainline and my branch.

That being said, the consequences of getting the order wrong are very minor. But I do think this code should be cleaned up in the future.

@kaukabrizvi
Copy link
Contributor Author

@maddeleine

It seems like the workflow that you're adding in this PR is failing?

Since I changed the file storage scheme to target/regression_artifacts and that change isn't currently in mainline. This PR fails the regression CI check because when we switch to mainline, files are stored in the old format which the PR branch doesn't recognize when diffing.

I made a change to disable running this check when changes are made to the regression crate. However, the paths-ignore doesn't apply here because all paths that are changing must be present in the paths-ignore. We are also making changes to a file outside of tests/regression so this PR doesn't qualify. I don't think we should add .github/workflows/regression_ci.yml to paths-ignore because the only way to test changes to the .yml file are for it to run as a check in CI. So, it will fail for this PR which is alright because it is not currently a mandatory check, but for any future PR's which make a change solely to tests/regression, this test will not run to avoid issues like the one described above.

@maddeleine maddeleine merged commit 87f4a05 into aws:main Aug 19, 2024
35 of 36 checks passed
@kaukabrizvi kaukabrizvi deleted the regression-ci branch August 21, 2024 20:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants