Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make sure that the job required by branch protection doesnt result in skipped (treated as passing) when matrix jobs fail #19

Open
travi opened this issue Aug 29, 2023 · 7 comments

Comments

@travi
Copy link
Member

travi commented Aug 29, 2023

minimum pipeline changes

we've seen this problem in the past but havent noticed it for a while. the conventional-changelog preset breakages highlighted that it still existed because we had several dev-dependencies merged even though they very clearly failed the tests.

to stop the bleeding specifically related to those broken updates, i updated the pipelines of the impacted repos:

we need to apply similar changes to the remaining repos in the org to avoid similar issues elsewhere

maybe rethink the required job responsibilities?

currently, we have the test job set as the required job and do linting in that job so that it has actual steps to run. however, the primary reason that job existed was to enable the branch protection since requiring checks to pass with a matrix of node versions is difficult.

with the updated steps that are specifically about coordinating the results of the needed job dependencies, should we move the linting steps out of this job? i think i lean in this direction, but open to thoughts.

steps that are not supported by all node/npm versions in our supported matrix

since we had that job available, it was a convenient job to run tasks that we needed to limit more tightly than the full test matrix. one example that comes to mind is the fact that npm audit signatures is not supported in the version of npm bundled with node v18.0.0, so it falls back to npm audit instead, which wasnt our intended check for these pipelines. if we go back to just running verification in the matrix jobs, we dont have a natural way to handle this situation.

in my other projects, i have two verification jobs, one for the matrix of supported versions defined by engines.node and another to run specifically against the "development" node version for the project, which is defined in a .nvmrc (we've talked about adding .nvmrc files to our repos, but i've failed to follow through on it. i'd like to still do this). this enables me to execute steps like this in the development job and not in the matrix job. what would we think about something like this?

an example: https://github.com/form8ion/project/blob/master/.github/workflows/node-ci.yml

@travi
Copy link
Member Author

travi commented Sep 16, 2023

another example of this in the core repo, where i havent applied a similar fix yet: semantic-release/semantic-release#2958

i'll apply this fix there soon and work on improving the overall pipeline structure that is enabled by this change.

travi added a commit to semantic-release/semantic-release that referenced this issue Sep 17, 2023
travi added a commit to semantic-release/semantic-release that referenced this issue Sep 17, 2023
and used npm-run-all2 to define verification script groups that can be parallelized while also enabling scripts to be run independantly

for semantic-release/.github#19
travi added a commit to semantic-release/semantic-release that referenced this issue Sep 17, 2023
travi added a commit to semantic-release/semantic-release that referenced this issue Sep 18, 2023
and used npm-run-all2 to define verification script groups that can be parallelized while also enabling scripts to be run independantly

for semantic-release/.github#19
travi added a commit to semantic-release/semantic-release that referenced this issue Oct 8, 2023
and used npm-run-all2 to define verification script groups that can be parallelized while also enabling scripts to be run independantly

for semantic-release/.github#19
travi added a commit to semantic-release/github that referenced this issue Oct 20, 2023
@sheerlox
Copy link
Member

sheerlox commented Nov 8, 2023

I discovered that using if: always() on the required check with this setup (that I also use on my repositories) will make it pass if the test workflow is canceled since it only checks for failures, and always() prevents that step from being canceled. It's probably not a big deal but I think it's worth mentioning it.

@sheerlox
Copy link
Member

sheerlox commented Nov 8, 2023

Also, if the tests running twice on Renovate PRs (push & pull_request events) is an issue, I've crafted a solution to ignore push events if a PR is open for the head branch: sheerlox/import-from-esm@ce97a59 (example skipped push workflow run).
I'm okay preparing PRs if that's of any interest.

@travi
Copy link
Member Author

travi commented Nov 8, 2023

I discovered that using if: always() on the required check with this setup (that I also use on my repositories) will make it pass if the test workflow is canceled since it only checks for failures, and always() prevents that step from being canceled. It's probably not a big deal but I think it's worth mentioning it.

yep, definitely worth mentioning. the always() solved the problem i was after , but i havent seen the cancelled problem yet. maybe we should switch to the now recommended !cancelled() alternative

@travi
Copy link
Member Author

travi commented Nov 8, 2023

Also, if the tests running twice on Renovate PRs (push & pull_request events) is an issue, I've crafted a solution to ignore push events if a PR is open for the head branch: sheerlox/import-from-esm@ce97a59 (example skipped push workflow run).

there a couple reasons that i prefer to run both based on the push and pull_request events

  • the pull_request events always create a temporary merge commit with the target branch before running the workflow. this is beneficial when the target and base branches arent in sync, but it can be unclear if a failure is because of the merge with the target branch or specific to the feature branch in isolation. also running based on the push branch provides an isolated execution so that it is clear which situation is resulting in the failure
  • i'd like to get back to using branch merge for some renovate configurations, like lockfile maintenance. we backed away from this approach previously because of complications with branch protection settings. there is now more flexibility available for enabling bypasses for certain circumstances, especially with repository rules rather than traditional branch protection. with branch merge, no pr is created, so the push trigger is needed

@sheerlox
Copy link
Member

sheerlox commented Nov 8, 2023

this is beneficial when the target and base branches arent in sync

Since this solution only applies to Renovate branches and Renovate always rebases the branch when it gets outdated, I don't think this applies here

with branch merge, no pr is created, so the push trigger is needed

I've crafted the solution exactly because I needed the push trigger on renovate/** branches and the pull_request trigger at the same time, so it's goal is to skip the push event only if it detects an open PR for a branch starting with renovate/ 😉

Having duplicate checks running is less of an issue on semantic-release/semantic-release because it doesn't run on multiple OS, but when the test workflow triggers 9 jobs (3 OS * 3 Node versions) of 5 mins each, the solution becomes really useful for speeding up the feedback loop (since the number of concurrent job runners is limited to 20 on the free plan, I believe organization-wide, although that's not very clear).

Since it might look like I'm arguing, I'm not particularly pushing for the solution to be adopted, just letting you know it exists and works well if that's of any interest 😄

@travi
Copy link
Member Author

travi commented Nov 10, 2023

Since it might look like I'm arguing

not at all. i really appreciate the thoughts. i didnt dig deep enough to understand fully, so the additional explanation is definitely helpful.

Having duplicate checks running is less of an issue on semantic-release/semantic-release because it doesn't run on multiple OS, but when the test workflow triggers 9 jobs (3 OS * 3 Node versions) of 5 mins each, the solution becomes really useful for speeding up the feedback loop (since the number of concurrent job runners is limited to 20 on the free plan, I believe organization-wide, although that's not very clear).

i'm normally not too worried about limiting the number of jobs since it normally ends up just being a matter of queueing longer. when builds are stable, i dont find myself waiting for a pipeline for feedback too often, so the extra waiting isnt much of a problem.

however, the core project has had some instability that i havent found the time/energy to fix yet, so that situation has been a bit different there. plus, i'd like to get a windows runner in place there too, which would make the problem more of an issue (quite honestly, the instability has been a demotivator to invest in getting windows added to the matrix because of all of this).

in short, i like the idea. i havent fully processed everything yet, but thanks for getting it on my radar. happy to continue conversations through PRs if you wanted to start testing any of these things in our pipelines. i'd suggest keeping independent changes separated into independent PRs so that simpler changes can be merged quickly, but i like the ideas.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants