Add performance regression tests in CI #4701

Merged 36 commits on Aug 19, 2024
Commits
969c31f
Added GA workflow for regression testing
kaukabrizvi Aug 8, 2024
ac4195d
Update branch to personal for testing
kaukabrizvi Aug 8, 2024
9f47298
Added comments
kaukabrizvi Aug 8, 2024
c23002e
Renamed GA file
kaukabrizvi Aug 8, 2024
6493239
Checkout personal for testing
kaukabrizvi Aug 8, 2024
7a55923
Mainline check in query function
kaukabrizvi Aug 8, 2024
46a691b
Added create directory for diff files
kaukabrizvi Aug 8, 2024
0d5ed11
Fixed annotated output file path
kaukabrizvi Aug 8, 2024
d8fb254
Added always to upload on every run
kaukabrizvi Aug 9, 2024
c927f87
Changed stat directory
kaukabrizvi Aug 9, 2024
7a906fe
Debug statements in workflow
kaukabrizvi Aug 9, 2024
8661fa1
Fixed directory location
kaukabrizvi Aug 9, 2024
1bce5ad
Deal with moving multiple files at once
kaukabrizvi Aug 9, 2024
b9a88a7
Get full commit id
kaukabrizvi Aug 9, 2024
0e9fb3e
Update output directory to perf_outputs
kaukabrizvi Aug 9, 2024
6e68c0b
Simplified output file paths to avoid redundancy
kaukabrizvi Aug 9, 2024
ddca12a
Split mainline search into two steps
kaukabrizvi Aug 9, 2024
47ab67c
Merge branch 'aws:main' into regression-ci
kaukabrizvi Aug 10, 2024
1be1700
Remove cargo build, detail comments
kaukabrizvi Aug 10, 2024
f427602
Increase regression threshold
kaukabrizvi Aug 10, 2024
cb336c5
Change 'personal' to 'main'
kaukabrizvi Aug 10, 2024
f23f4cf
Address PR feedback
kaukabrizvi Aug 12, 2024
d91952c
Fix get commit hash for new file storage scheme
kaukabrizvi Aug 12, 2024
149c300
Change checkout to switch in comments
kaukabrizvi Aug 12, 2024
bf45366
Change switch to checkout for PR checkout
kaukabrizvi Aug 12, 2024
650416c
Changed switch to checkout in mainline checkout
kaukabrizvi Aug 12, 2024
1f0272e
Change checkout back to switch for mainline
kaukabrizvi Aug 12, 2024
124d01c
Fix formatting
kaukabrizvi Aug 12, 2024
e852dcf
Ignore test when changes are made to regression crate
kaukabrizvi Aug 13, 2024
5f5344d
Fix file path for path-ignore
kaukabrizvi Aug 13, 2024
d15d6e0
Clarify is_mainline and valgrind installation
kaukabrizvi Aug 13, 2024
f596a6f
Simplify valgrind installation comment
kaukabrizvi Aug 13, 2024
fad287d
Merge branch 'main' into regression-ci
kaukabrizvi Aug 16, 2024
6c40862
Fix merge conflicts
kaukabrizvi Aug 16, 2024
79ae666
Update actions to latest version
kaukabrizvi Aug 16, 2024
222ad90
Merge branch 'main' into regression-ci
kaukabrizvi Aug 19, 2024

87 changes: 87 additions & 0 deletions .github/workflows/regression_ci.yml
@@ -0,0 +1,87 @@
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: MIT-0
name: Performance Regression Test

on:
  pull_request:
    branches:
      - main
    paths-ignore:
      - tests/regression/**

jobs:
  regression-test:
    runs-on: ubuntu-latest

    steps:
      # Checkout the code from the pull request branch
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}

      # Install the stable Rust toolchain
      - name: Install Rust toolchain
        id: toolchain
        run: |
          rustup toolchain install stable
          rustup override set stable

      # Update the package list on the runner
      - name: Update package list
        run: sudo apt-get update

      # Download and install Valgrind 3.23 from source
      - name: Download Valgrind 3.23 Source
        run: |
          wget https://sourceware.org/pub/valgrind/valgrind-3.23.0.tar.bz2
          tar -xjf valgrind-3.23.0.tar.bz2
          cd valgrind-3.23.0
          ./configure
          make
          sudo make install

      # Generate the necessary bindings
      - name: Generate
        run: ${{env.ROOT_PATH}}bindings/rust/generate.sh --skip-tests

      # Run performance tests using Valgrind for current branch
      - name: Run scalar performance test (curr)
        env:
          PERF_MODE: valgrind
        run: cargo test --release --manifest-path=tests/regression/Cargo.toml

      # Switch to the main branch
      - name: Switch to mainline
        run: |
          git fetch origin main
          git switch main

      # Regenerate bindings for main branch
      - name: Generate
        run: ${{env.ROOT_PATH}}bindings/rust/generate.sh --skip-tests

      # Run performance tests using Valgrind for main branch
      - name: Run scalar performance test (prev)
        env:
          PERF_MODE: valgrind
        run: cargo test --release --manifest-path=tests/regression/Cargo.toml

      # Checkout pull request branch again
      # This is required for cg_annotate diff to locate the changes in the PR to properly annotate the output diff file
      - name: Checkout pull request branch
        run: git checkout ${{ github.event.pull_request.head.sha }}

      # Run the differential performance test
      - name: Run diff test
        env:
          PERF_MODE: diff
        run: cargo test --release --manifest-path=tests/regression/Cargo.toml

      # Upload the performance output artifacts. This runs even if run diff test fails so debug files can be accessed
      - name: Upload artifacts
        if: ${{ always() }}
        uses: actions/upload-artifact@v4
        with:
          name: regression_artifacts
          path: tests/regression/target/regression_artifacts
2 changes: 1 addition & 1 deletion tests/regression/README.md
@@ -24,7 +24,7 @@ The performance benchmarking framework utilizes CPU Instruction count across API

 Ensure you have the following installed:
 - Rust (with Cargo)
-- Valgrind (for cachegrind instrumentation)
+- Valgrind (for cachegrind instrumentation): Valgrind 3.23 or newer is required to run the tests, since cachegrind annotation is not included in earlier versions. If this version is not automatically downloaded by running `apt install valgrind`, it can be installed manually by following https://valgrind.org/downloads/

 ## Running the Harnesses with Valgrind (scalar performance)
 To run the harnesses with Valgrind and store the annotated results, run:
68 changes: 54 additions & 14 deletions tests/regression/src/lib.rs
@@ -27,14 +27,37 @@ pub mod git {
     }

     pub fn extract_commit_hash(file: &str) -> String {
-        // input: "target/$commit_id/test_name.raw"
+        // input: "target/regression_artifacts/$commit_id/test_name.raw"
         // output: "$commit_id"
-        file.split("target/")
+        file.split("target/regression_artifacts/")
             .nth(1)
             .and_then(|s| s.split('/').next())
             .map(|s| s.to_string())
             .unwrap_or_default() // This will return an empty string if the Option is None
     }
+
+    pub fn is_mainline(commit_hash: &str) -> bool {
+        // Execute the git command to check which branches contain the given commit.
+        let output = Command::new("git")
+            .args(["branch", "--contains", commit_hash])
+            .output()
+            .expect("Failed to execute git branch");
+
+        // If the command fails, it indicates that the commit is either detached
+        // or does not exist in any branches. Meaning, it is not part of mainline.
+        if !output.status.success() {
+            return false;
+        }
+
+        // Convert the command output to a string and check each line.
+        let branches = String::from_utf8_lossy(&output.stdout);
+        branches.lines().any(|branch| {
+            // Trim the branch name to remove any leading or trailing whitespace.
+            // The branch name could be prefixed with '*', indicating the current branch.
+            // We check for both "main" and "* main" to account for this possibility.
+            branch.trim() == "main" || branch.trim() == "* main"
+        })
+    }
 }

 #[cfg(test)]
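
The branch-matching rule inside `is_mainline` can be looked at separately from the `git` invocation. The following is a minimal sketch, assuming the listing has the usual `git branch --contains` shape (one branch per line, the current branch prefixed with `* `). The helper name `listing_contains_main` is hypothetical and does not appear in the PR, which performs this check inline.

```rust
/// Returns true if the output of `git branch --contains <commit>` lists a
/// local branch named exactly `main`.
///
/// Hypothetical helper for illustration only; the PR performs this check
/// inline in `git::is_mainline`.
fn listing_contains_main(listing: &str) -> bool {
    listing.lines().any(|line| {
        let trimmed = line.trim();
        // git marks the currently checked-out branch with a leading "* ".
        let name = trimmed.strip_prefix("* ").unwrap_or(trimmed);
        name == "main"
    })
}
```

Because the match is exact, a branch such as `maintenance` or a detached-HEAD line does not count as mainline. Keeping the parsing in a pure function like this would also let it be unit-tested without a git repository.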
@@ -117,7 +140,7 @@ mod tests {
     impl RawProfile {
         fn new(test_name: &str) -> Self {
             let commit_hash = git::get_current_commit_hash();
-            create_dir_all(format!("target/{commit_hash}")).unwrap();
+            create_dir_all(format!("target/regression_artifacts/{commit_hash}")).unwrap();

             let raw_profile = Self {
                 test_name: test_name.to_owned(),
@@ -143,7 +166,10 @@ mod tests {
         }

         fn path(&self) -> String {
-            format!("target/{}/{}.raw", self.commit_hash, self.test_name)
+            format!(
+                "target/regression_artifacts/{}/{}.raw",
+                self.commit_hash, self.test_name
+            )
         }

         // Returns the annotated profile associated with a raw profile
@@ -154,8 +180,9 @@ mod tests {
         /// Return the raw profiles for `test_name` in "git" order. `tuple.0` is older than `tuple.1`
         ///
         /// This method will panic if there are not two profiles.
+        /// This method will also panic if both commits are on different logs (not mainline).
         fn query(test_name: &str) -> (RawProfile, RawProfile) {
-            let pattern = format!("target/**/*{}.raw", test_name);
+            let pattern = format!("target/regression_artifacts/**/*{}.raw", test_name);
             let raw_files: Vec<String> = glob::glob(&pattern)
                 .expect("Failed to read glob pattern")
                 .filter_map(Result::ok)
@@ -167,18 +194,28 @@ mod tests {
                 test_name: test_name.to_string(),
                 commit_hash: git::extract_commit_hash(&raw_files[0]),
             };

             let profile2 = RawProfile {
                 test_name: test_name.to_string(),
                 commit_hash: git::extract_commit_hash(&raw_files[1]),
             };

-            if git::is_older_commit(&profile1.commit_hash, &profile2.commit_hash) {
-                (profile1, profile2)
-            } else if git::is_older_commit(&profile2.commit_hash, &profile1.commit_hash) {
-                (profile2, profile1)
+            // xor returns true if exactly one commit is mainline
+            if git::is_mainline(&profile1.commit_hash) ^ git::is_mainline(&profile2.commit_hash) {

Contributor:

I'm not really following the logic to return the correct tuple here. What is the result you're trying to return in this function? Why do you need an xor?

kaukabrizvi (Contributor Author), Aug 13, 2024:

> What is the result you're trying to return in this function?

You can think of this as a sort on the commits:

  1. Whichever commit is mainline appears first
  2. If is_mainline() is equivalent for both commits, the older commit appears first

This is necessary for the diff functionality to know which version is the standard and which version we want to test for comparison. Between mainline and PR branch, mainline should be the standard for comparison, otherwise the older commit between two PR's which are on the same branch should be the standard for comparison to the new commit.

> Why do you need an xor?

The xor returns true if exactly one of the commits is mainline, then we can return whichever commit is mainline first. If both or neither are mainline the condition evaluates to false, then we must check for which commit is older, so we move to the else condition in that case which checks for which commit is older.

I will add comments to this function to make this more clear in the code.
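
The two-rule sort described in this comment can be sketched as a small pure function. This is an illustration under stated assumptions, not the PR's code: the `Candidate` struct, its `on_mainline` and `history_index` fields, and `order_pair` are hypothetical stand-ins for `RawProfile`, `git::is_mainline`, and `git::is_older_commit`, and the sketch returns `None` where the PR panics.

```rust
/// Hypothetical stand-in for a stored profile.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    commit_hash: String,
    /// Whether the commit is on `main` (assumed to be precomputed from git).
    on_mainline: bool,
    /// Position in history, smaller is older (assumed to be precomputed from git).
    history_index: u32,
}

/// Orders two candidates as (baseline, altered).
/// Returns None when neither rule can decide, where the PR's `query` panics.
fn order_pair(a: Candidate, b: Candidate) -> Option<(Candidate, Candidate)> {
    if a.on_mainline ^ b.on_mainline {
        // Rule 1: exactly one commit is on mainline, so it becomes the baseline.
        if a.on_mainline {
            Some((a, b))
        } else {
            Some((b, a))
        }
    } else if a.history_index < b.history_index {
        // Rule 2: both or neither are on mainline, so the older commit goes first.
        Some((a, b))
    } else if b.history_index < a.history_index {
        Some((b, a))
    } else {
        // Identical commits cannot be ordered.
        None
    }
}
```

With a mainline commit and a PR commit the mainline one is always returned first, regardless of age. With two PR commits the older one is returned first.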

Contributor:

It just seems like we don't need to be guessing at which commit is our "baseline". Doesn't the caller of these tests always know which commit is the baseline versus which one is the altered code? Like, you should always know I want to know the performance change that occurs from "this commit" to "this commit".

kaukabrizvi (Contributor Author), Aug 15, 2024:

Initially, the approach was to have the caller identify "baseline" and "altered" as an environment variable upon invocation, similar to your suggestion. However after some discussion, we decided to solely rely on the commit id to auto detect new/old and in CI, branch/mainline. This means that with solely the commit id's available to identify files we have to figure out which commit ID is "baseline" vs "altered", which this logic attempts to solve. We could add back an environment variable for the caller to set "baseline" and "altered" to make the tool more flexible in its usage but if we base profiles on solely their commit id, the current approach is needed.

maddeleine (Contributor), Aug 15, 2024:

Hmmm that's a bummer, I kind of disagree with that approach because this logic looks pretty brittle as it is now. I think it would help for me to know what exactly is our goal here, do we expect this to automatically detect the right commit both when running this locally and running this in CI? Like, is the logic here only for running in CI or is it also trying to work for running the tests locally?

Contributor Author:

The logic here attempts to cover both, running locally and in CI. Locally, we would want the older of two commits as "baseline" and in CI, we would want main to be "baseline". I agree with you though that the logic is brittle, especially for local testing since I can imagine scenarios where the caller wants to compare two commits that are not on the same branch or set the newer commit as "baseline". To make the usage a bit more extensible, I think we should add an identifier to let the caller decide, so that the artifacts produced are baseline/commit_hash.test_name.raw and altered/commit_hash.test_name.raw. I would vote to do that in a separate PR though.
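
The caller-chosen layout proposed here (`baseline/commit_hash.test_name.raw` and `altered/commit_hash.test_name.raw`) might look roughly like the sketch below. The `Role` enum and `artifact_path` function are hypothetical names, and placing the two directories under `target/regression_artifacts` is an assumption carried over from this PR's paths rather than something the comment specifies.

```rust
/// Which side of the comparison a profile belongs to, chosen by the caller.
/// Hypothetical type; it does not exist in the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role {
    Baseline,
    Altered,
}

impl Role {
    /// Directory name used for this role.
    fn dir_name(self) -> &'static str {
        match self {
            Role::Baseline => "baseline",
            Role::Altered => "altered",
        }
    }
}

/// Builds the raw profile path for a given role, commit, and test.
/// The `target/regression_artifacts` root is an assumption.
fn artifact_path(role: Role, commit_hash: &str, test_name: &str) -> String {
    format!(
        "target/regression_artifacts/{}/{}.{}.raw",
        role.dir_name(),
        commit_hash,
        test_name
    )
}
```

With the role encoded in the path, the diff step would no longer need to infer the baseline from git history.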

Contributor:

Yeah then I think you should just make this logic to work in CI. So it would just be like:

if (profile1.is_mainline()) {
    (profile1, profile2)
} else {
    (profile2, profile1)
}

I don't think there's much point in committing complex logic if you're going to rip it out in a different PR.

Contributor:

Given the direction that integv2 went, I think it's important to keep things at least theoretically runnable locally. Also makes putting demos together much easier, etc.

maddeleine (Contributor), Aug 16, 2024:

I mean, if I am running this locally, I would probably create a branch with my changes and want to compare that branch to mainline. So even then, I really would only care about the difference between mainline and my branch.

That being said, the consequences of getting the order wrong are very minor. But I do think this code should be cleaned up in the future.

+                // Return the mainline as first commit
+                if git::is_mainline(&profile1.commit_hash) {
+                    (profile1, profile2)
+                } else {
+                    (profile2, profile1)
+                }
             } else {
-                panic!("The commits are not in the same log");
+                // Neither or both profiles are on the mainline, so return the older one first
+                if git::is_older_commit(&profile1.commit_hash, &profile2.commit_hash) {
+                    (profile1, profile2)
+                } else if git::is_older_commit(&profile2.commit_hash, &profile1.commit_hash) {
+                    (profile2, profile1)
+                } else {
+                    panic!("The commits are not in the same log, are identical, or there are not two commits available");
+                }
             }
         }
     }
@@ -211,7 +248,10 @@ mod tests {
         }

         fn path(&self) -> String {
-            format!("target/{}/{}.annotated", self.commit_hash, self.test_name)
+            format!(
+                "target/regression_artifacts/{}/{}.annotated",
+                self.commit_hash, self.test_name
+            )
         }

         fn instruction_count(&self) -> i64 {
@@ -240,7 +280,7 @@ mod tests {
             assert_command_success(diff_output.clone());

             // write the diff to disk
-            create_dir_all(format!("target/diff")).unwrap();
+            create_dir_all("target/regression_artifacts/diff").unwrap();
             let diff_content = String::from_utf8(diff_output.stdout)
                 .expect("Invalid UTF-8 in cg_annotate --diff output");
             write(diff_profile.path(), diff_content).expect("Failed to write to file");
@@ -249,7 +289,7 @@ mod tests {
         }

         fn path(&self) -> String {
-            format!("target/diff/{}.diff", self.test_name)
+            format!("target/regression_artifacts/diff/{}.diff", self.test_name)
         }

         fn assert_performance(&self, max_diff: f64) {