Fix CI failures #938

inancgumus · 2023-06-16T05:39:43Z

The solution in this PR still needs to be better, but it's better than the current workflows. When we run it multiple times, tests eventually pass. Feel free to push commits to this PR, if needed.

PS: While testing this PR, GH has automatically closed the PR multiple times 😄

ka3de

Overall I think the changes make sense in any case, although I'm still a little bit confused as to why CI tests started to fail, which makes me doubt whether it is due to a change in GH or a change that we introduced 🤔

ka3de · 2023-06-21T07:22:59Z

tests/browser_test.go

@@ -217,6 +217,9 @@ func TestBrowserLogIterationID(t *testing.T) {
 }

 func TestMultiBrowserPanic(t *testing.T) {
+	// TODO: This test never works on CI, fix it.
+	t.Skip("skipping, fix this")


I wonder if we should use an ENV var, set by us in CI, as a flag in order to skip this test or not. In that way at least we will always run this test when testing locally. Do you think that is worth it?

Makes sense; I was thinking about the same, too. Especially for the disable-gpu one. We can do it whether we're running on Github by checking the GITHUB_ACTIONS environment variable.

chromium/browser_type.go

ankur22

LGTM. I think it would be good to get some clarification on the changes so that we're aligned 🙂

Thanks for tackling this issue!!

ankur22 · 2023-06-21T08:01:58Z

.github/workflows/test.yml

@@ -21,6 +21,19 @@ jobs:
        platform: [ubuntu-latest]
    runs-on: ${{ matrix.platform }}
    steps:
+      - name: Setup Chrome


This is interesting! So we were relying on Chrome already being present in the CI test VMs/Containers up until we started seeing these issues?

Yeah, it's interesting, indeed. It didn't work when I removed it.

ankur22 · 2023-06-21T08:06:22Z

.github/workflows/test.yml

-            export GOMAXPROCS=1
-          fi
+          # Run with less concurrency to reduce CI flakiness.
+          export GOMAXPROCS=1


I'm not sure about this change. I think it makes sense since when we see a flakey tests we don't have the resources to resolve them straight away, and they can be quite demanding tests to fix. At the same time, i think we've found some issues in CI that we don't see in our local test runs, which are generally race conditions or deadlocks. I'll go with the consensus on this one 🙂

That's a good point. I wonder if we should have one set of tests specifically run with higher concurrency, or maybe just leave the current setup and expect it to pass in any case.

This is the one that makes it consistently (eh, to some degree) pass. Otherwise, it constantly fails.

When I'm testing this PR with several commits (in and out), and adding/removing some/all of the tests, I noticed, the nature of the tests doesn't matter for the outcome of the test runs. When there is concurrency, there is a failure here.

ankur22 · 2023-06-21T08:06:56Z

.github/workflows/test.yml

@@ -55,7 +55,7 @@ jobs:
          export GOMAXPROCS=1
          args=("-p" "1" "-race")
          export K6_BROWSER_HEADLESS=true
-          go test "${args[@]}" -timeout 5m ./...
+          go test "${args[@]}" -timeout 10m ./...


Why do we need such a large timeout?

Since the concurrency is 1 now, and it was 2 before, I thought it would make sense to double the timeout. When I tested it, tests took about 2x longer.

ankur22 · 2023-06-21T08:10:01Z

which makes me doubt whether it is due to a change in GH or a change that we introduced

We're seeing this CI issue with our tagged releases too. I've tested v0.10.0 and v0.9.0 😞

inancgumus · 2023-06-21T08:25:34Z

Overall I think the changes make sense in any case, although I'm still a little bit confused as to why CI tests started to fail, which makes me doubt whether it is due to a change in GH or a change that we introduced 🤔
LGTM. I think it would be good to get some clarification on the changes so that we're aligned 🙂
Thanks for tackling this issue!!

Sure :)

TBH, I'm as confused as you are :( I believe this issue is caused by GH infrastructure changes and not because of our changes. As I mentioned in my previous edits in the PR description, I've tried with earlier merges. Then I even tried back to v0.8.0. Nothing has changed.

This PR is totally created by experimentation. So I don't have the best answers to your questions :( TLDR: I don't even know myself what/why I'm doing there 😄 I've tried to answer your questions in the comments above.

inancgumus · 2023-06-21T10:28:27Z

Although this one is a fix that demands no cost, #947 seems like a more elegant fix. Closing this one. We can reopen it if needed.

Update:
Out of curiosity, I've tried what happens here with this change. Inspired by @ankur22's ubuntu-pro platform idea in #947.

It worked brilliantly with the addition of ubuntu-pro ($). Otherwise, we need to retry one or two times ($=free).

Even with a concurrency of one, tests finish in ~5m.

Note that workers (CI tasks) are still concurrent. (edited)

I'm just curious to see what happens :-)

inancgumus closed this Jun 16, 2023

inancgumus force-pushed the fix/ci-flakiness-reduce-parallelism branch from 42ffe3e to 6b21ec7 Compare June 16, 2023 05:44

inancgumus reopened this Jun 16, 2023

inancgumus force-pushed the fix/ci-flakiness-reduce-parallelism branch from 3aeba0b to 1a54d3d Compare June 16, 2023 05:49

inancgumus changed the title ~~Reduce parallelism on CI to reduce flakiness~~ CI flakiness reduce experiments Jun 16, 2023

inancgumus force-pushed the fix/ci-flakiness-reduce-parallelism branch from 1a54d3d to 97dc7d8 Compare June 20, 2023 11:20

inancgumus closed this Jun 20, 2023

inancgumus force-pushed the fix/ci-flakiness-reduce-parallelism branch from 97dc7d8 to 92ac6b7 Compare June 20, 2023 13:01

inancgumus reopened this Jun 20, 2023

inancgumus force-pushed the fix/ci-flakiness-reduce-parallelism branch 3 times, most recently from 7598c0b to 9e28ccc Compare June 20, 2023 13:56

inancgumus changed the title ~~CI flakiness reduce experiments~~ Fix CI failures Jun 20, 2023

inancgumus added 5 commits June 20, 2023 18:32

Skip TestMultiBrowserPanic

af3885b

Disable GPU when launching chrome

a56b8c5

Install Chrome for CI tests

db699bb

Run tests less concurrent on CI

6b546b0

Increase test timeout on CI

ae6fbe7

inancgumus force-pushed the fix/ci-flakiness-reduce-parallelism branch 2 times, most recently from 94957b5 to ae6fbe7 Compare June 20, 2023 15:52

inancgumus requested review from ankur22 and ka3de June 20, 2023 16:11

inancgumus self-assigned this Jun 20, 2023

inancgumus added the ci label Jun 20, 2023

inancgumus marked this pull request as ready for review June 20, 2023 16:11

ka3de reviewed Jun 21, 2023

View reviewed changes

ankur22 approved these changes Jun 21, 2023

View reviewed changes

inancgumus closed this Jun 21, 2023

Update to ubuntu latest 8 cores

30c885e

I'm just curious to see what happens :-)

inancgumus reopened this Jun 23, 2023

inancgumus closed this Jun 23, 2023

ankur22 mentioned this pull request Jun 23, 2023

Fix CI integration issue by upgrading runner #951

Merged

4 tasks

inancgumus deleted the fix/ci-flakiness-reduce-parallelism branch May 7, 2024 13:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix CI failures #938

Fix CI failures #938

inancgumus commented Jun 16, 2023 •

edited

Loading

ka3de left a comment

ka3de Jun 21, 2023

inancgumus Jun 21, 2023

ankur22 left a comment

ankur22 Jun 21, 2023 •

edited

Loading

inancgumus Jun 21, 2023

ankur22 Jun 21, 2023 •

edited

Loading

ka3de Jun 21, 2023

inancgumus Jun 21, 2023

inancgumus Jun 21, 2023 •

edited

Loading

ankur22 Jun 21, 2023

inancgumus Jun 21, 2023

ankur22 commented Jun 21, 2023

inancgumus commented Jun 21, 2023 •

edited

Loading

inancgumus commented Jun 21, 2023 •

edited

Loading

Fix CI failures #938

Fix CI failures #938

Conversation

inancgumus commented Jun 16, 2023 • edited Loading

ka3de left a comment

Choose a reason for hiding this comment

ka3de Jun 21, 2023

Choose a reason for hiding this comment

inancgumus Jun 21, 2023

Choose a reason for hiding this comment

ankur22 left a comment

Choose a reason for hiding this comment

ankur22 Jun 21, 2023 • edited Loading

Choose a reason for hiding this comment

inancgumus Jun 21, 2023

Choose a reason for hiding this comment

ankur22 Jun 21, 2023 • edited Loading

Choose a reason for hiding this comment

ka3de Jun 21, 2023

Choose a reason for hiding this comment

inancgumus Jun 21, 2023

Choose a reason for hiding this comment

inancgumus Jun 21, 2023 • edited Loading

Choose a reason for hiding this comment

ankur22 Jun 21, 2023

Choose a reason for hiding this comment

inancgumus Jun 21, 2023

Choose a reason for hiding this comment

ankur22 commented Jun 21, 2023

inancgumus commented Jun 21, 2023 • edited Loading

inancgumus commented Jun 21, 2023 • edited Loading

inancgumus commented Jun 16, 2023 •

edited

Loading

ankur22 Jun 21, 2023 •

edited

Loading

ankur22 Jun 21, 2023 •

edited

Loading

inancgumus Jun 21, 2023 •

edited

Loading

inancgumus commented Jun 21, 2023 •

edited

Loading

inancgumus commented Jun 21, 2023 •

edited

Loading