
Triggering a subset of the pre-submit CI (ci.yaml) #255

Open
terrykong opened this issue Sep 20, 2023 · 3 comments · May be fixed by #775
@terrykong
Contributor
The pre-submit CI ci.yaml is extremely helpful for testing a change E2E, but in many cases we don't need to run all of the CI to test every change. Addressing this will speed up our development velocity, since we won't have to wait for a very long CI pipeline to pass in order to merge in a change.

Here is a summary of some of our options:

Option 1: minimal pre-submit; reverts on failures

@yhtang initially proposed the idea of slimming down our pre-submit CI so that only minimal CI is run for PRs, and then the full ci.yaml is run once the commit is in main. To handle the case where commits in main cause failures, we can add logic that reverts the PR if ci.yaml fails.

Option 2: selectively running tests

We can add another "command" similar to /assistant where members of JAX-Toolbox can enumerate the tests needed. So, assuming there are four stages:

base -> jax -> paxml
            |
            -> t5x

We can comment on a PR something like:

/ci jax

which would run the jax build and test steps and any descendant jobs (t5x/paxml).

/ci t5x paxml

which would run only t5x and paxml.

/ci
/ci base

would run everything.
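The command resolution described above can be sketched as a small traversal over the stage graph. This is a minimal illustration, not JAX-Toolbox code; the function names and graph are hypothetical, mirroring the four stages in the diagram.

```python
# Hypothetical sketch of resolving a "/ci <stage ...>" comment into the
# set of stages to run, under the "run the stage and its descendants"
# semantics proposed in this issue.
from collections import deque

# Stage graph from the diagram, expressed as parent -> children.
STAGE_CHILDREN = {
    "base": ["jax"],
    "jax": ["paxml", "t5x"],
    "paxml": [],
    "t5x": [],
}

def descendants(stage):
    """Return the stage plus every stage downstream of it."""
    seen = {stage}
    queue = deque([stage])
    while queue:
        for child in STAGE_CHILDREN[queue.popleft()]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

def resolve_ci_command(comment):
    """Map a '/ci [stage ...]' comment to the set of stages to run."""
    args = comment.split()[1:]  # drop the leading '/ci'
    if not args:
        return set(STAGE_CHILDREN)  # bare '/ci' runs everything
    stages = set()
    for stage in args:
        stages |= descendants(stage)
    return stages
```

With this graph, `/ci jax` resolves to {jax, paxml, t5x}, while `/ci` and `/ci base` both resolve to all four stages.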

@ko3n1g
Contributor

ko3n1g commented Oct 10, 2023

I would be curious whether using the GitHub workflows paths filter, together with a Docker caching strategy, could help you in this regard?

Since you have quite clearly separated dependencies, I suppose you could quite easily (after re-arranging some files into folders) condition the execution of a workflow on changes to a specified set of files.
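As a concrete sketch of this idea, a workflow could use an `on.pull_request.paths` filter so it only runs when relevant files change. The paths below are illustrative, not the repository's actual layout.

```yaml
# Hypothetical paths filter: only trigger this workflow when t5x-related
# files (or the workflow itself) change. Paths are assumptions.
on:
  pull_request:
    paths:
      - 't5x/**'
      - '.github/workflows/ci.yaml'
```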

At that point Docker caching might not even be relevant anymore, but using the inline cache together with cache-from could still speed up your CI pipeline in some situations. To still pick up security updates, you could build the image nightly/weekly without using the cache.
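The inline-cache idea could look roughly like the following build step, where a previously pushed image is reused as a cache source via BuildKit's inline cache. The image name is illustrative; a nightly/weekly job would add `--no-cache` instead to pick up security updates.

```yaml
# Sketch of a PR build step reusing a pushed image as cache.
# The registry/image reference is a placeholder, not a real tag.
- name: Build image with registry cache
  run: |
    docker build \
      --build-arg BUILDKIT_INLINE_CACHE=1 \
      --cache-from ghcr.io/example/jax-toolbox/jax:latest \
      -t jax:pr-test .
```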

Having said that, those custom commands you have implemented are admittedly quite cool (plus of course provide the benefit of having explicit control over the execution of your test suite).

@ko3n1g
Contributor

ko3n1g commented Apr 23, 2024

I would like to share my perspective on this issue:

Option 1, i.e. a triage mode, initially sounds like the quicker solution, but I'm afraid of some edge cases where it could backfire. The simplest one I can think of: multiple PRs get merged between two nightly runs. The auto-triage would revert all of these PRs even though potentially only a single one was at fault. While we can definitely merge PRs into main more quickly, I am not sure the 24-hour delay induced by nightly workflows will allow us to increase our velocity.

Option 2, i.e. a way of triggering CI runs of selected stages, sounds very interesting. I have some remarks and comments about this approach, in which we - one way or another - trigger sub-selected workflow runs via /ci <stage>:

which would run the jax build and test steps and any descendant jobs (t5x/paxml).

If I understand you correctly, the argument <stage> expresses the lowest-level stage to start the tests from, which will automatically include all higher-level stages. Example: stage=jax will trigger the paxml, t5x, and maxtext tests too. Given that higher-level containers depend on lower-level containers, I think we want this mechanism to function the other way around: define <stage> as the cutoff point, i.e. the highest-level container, and only test everything below it. Example: stage=jax will build & test only jax:base and jax:jax. That way, we can debug from bottom to top in a Pareto-efficient way.
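The cutoff semantics described here invert the earlier proposal: instead of running a stage's descendants, run the stage plus everything it depends on. A minimal sketch, with the stage graph and function name being illustrative only:

```python
# Hypothetical "cutoff" resolution: run <stage> plus all of its
# (transitive) dependencies, per ko3n1g's proposed semantics.
STAGE_PARENTS = {
    "base": [],
    "jax": ["base"],
    "paxml": ["jax"],
    "t5x": ["jax"],
}

def ancestors(stage):
    """Return the stage plus every stage it transitively depends on."""
    seen = {stage}
    stack = [stage]
    while stack:
        for parent in STAGE_PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

Under these semantics, `stage=jax` resolves to {base, jax}, whereas the earlier descendant semantics would resolve it to {jax, paxml, t5x}.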

Last thought: we have two options for implementing either of these approaches. We can do it via a "PR CLI" as @terrykong suggested, or we can go via a GHA workflow_dispatch. Personally, I have no strong feeling one way or the other. However, I could imagine that a GHA workflow_dispatch is less complex and thus easier to understand & maintain.
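The workflow_dispatch alternative could be sketched as a manually triggered workflow taking the stage as an input. The choice values mirror the stages discussed above; the default and description are assumptions.

```yaml
# Hypothetical manual trigger: pick the stage from the Actions UI
# (or via the gh CLI) instead of a PR comment command.
on:
  workflow_dispatch:
    inputs:
      stage:
        description: 'Stage to build and test'
        required: false
        default: 'all'
        type: choice
        options: [all, base, jax, paxml, t5x]
```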

//cc @yhtang

@terrykong
Contributor Author

terrykong commented Apr 23, 2024

If I understand you correctly, the argument <stage> expresses the lowest-level stage to start the tests from, which will automatically include all higher-level stages. Example: stage=jax will trigger the paxml, t5x, and maxtext tests too. Given that higher-level containers depend on lower-level containers, I think we want this mechanism to function the other way around: define <stage> as the cutoff point, i.e. the highest-level container, and only test everything below it. Example: stage=jax will build & test only jax:base and jax:jax. That way, we can debug from bottom to top in a Pareto-efficient way.

So I guess both of these serve different purposes. Here are some scenarios that have happened in the past:

  1. Someone makes a change to stage=jax

In this case, we're interested in how that change affects all of the leaf frameworks that depend on jax.

  2. Someone makes a change to stage=paxml

In this case, we're interested in testing just paxml and its descendants, since we assume the same base, e.g., jax.

Example: stage=jax will build & test only jax:base and jax:jax.

I think this is helpful, as you said, for debugging, since it only tests the jax change so you can move quickly; but for the presubmit, I think we'd still want to know how an ancestor's change affects its descendants, since that simulates what will happen in the nightly when the jax stack gets bumped.

@ko3n1g ko3n1g linked a pull request Apr 30, 2024 that will close this issue
@ko3n1g ko3n1g self-assigned this Apr 30, 2024