Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip Mode #6039

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Skip Mode #6039

wants to merge 3 commits into from

Conversation

wxtim
Copy link
Member

@wxtim wxtim commented Mar 27, 2024

Closes #5641 (also fixes #5975, and fixes #5820 )

Skip Mode Proposal Doc

This branch includes (marked against skip mode proposal):

  1. A new task run mode called "skip" to sit alongside the existing "live" and "simulation" and "dummy" modes.
  2. Skip mode settings in `[runtime][<namespace>][skip].
  3. Task-level control of the run mode - allowing run mode of tasks to override that of the workflow using [runtime][<namespace>]run mode.
    a. Broadcast can change run mode for future task job submissions.
    b. Cylc Validate and lint will warn about the setting not being live.
  4. cylc set --out skip sets outputs from skip mode.
  5. Tests to ensure that run mode = skip respects is_held flag.
  6. Tests to ensure that force triggering a task will not override the run mode.

Extras
7. Run Mode is available as an task attribute in the UI
8. When tasks are run in skip mode, the prerequisites which correspond to the outputs they generate should be marked as satisfied by skip mode rather than satisfied naturally for provenance reasons. For the purpose of cylc remove logic, satisfied by skip mode should be treated the same as satisfied naturally.

There are two extensions, which I haven't dealt with yet, because I want to ensure that the basic functionality works, and move to the substantial documentation PR which need follow this.

Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
  • Tests are included (or explain why tests are not needed).
  • CHANGES.md entry included if this is a change that can affect users
  • Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
  • Feature: PR against master

@wxtim wxtim marked this pull request as draft March 27, 2024 12:57
@wxtim wxtim force-pushed the feature.skip_mode branch 2 times, most recently from b43d296 to d26c315 Compare March 28, 2024 09:11
@oliver-sanders oliver-sanders added this to the cylc-8.x milestone Apr 2, 2024
@wxtim wxtim self-assigned this Apr 22, 2024
@wxtim wxtim requested review from oliver-sanders, markgrahamdawson and MetRonnie and removed request for markgrahamdawson April 23, 2024 10:00
@wxtim wxtim marked this pull request as ready for review April 23, 2024 14:26
@MetRonnie MetRonnie modified the milestones: 8.x, 8.4.0 Apr 23, 2024
cylc/flow/cfgspec/workflow.py Outdated Show resolved Hide resolved
cylc/flow/cfgspec/workflow.py Outdated Show resolved Hide resolved
cylc/flow/cfgspec/workflow.py Outdated Show resolved Hide resolved
cylc/flow/cfgspec/workflow.py Outdated Show resolved Hide resolved
@MetRonnie MetRonnie added the config change Involves a change to global or workflow config label Apr 23, 2024
@wxtim wxtim requested a review from MetRonnie April 24, 2024 15:10
@wxtim wxtim force-pushed the feature.skip_mode branch 4 times, most recently from 3d002ee to dd0a9fc Compare April 25, 2024 12:56
cylc/flow/prerequisite.py Outdated Show resolved Hide resolved
cylc/flow/scheduler_cli.py Outdated Show resolved Hide resolved
cylc/flow/task_state.py Outdated Show resolved Hide resolved
cylc/flow/task_state.py Outdated Show resolved Hide resolved
cylc/flow/task_state.py Outdated Show resolved Hide resolved
cylc/flow/scheduler.py Outdated Show resolved Hide resolved
cylc/flow/scheduler.py Outdated Show resolved Hide resolved
cylc/flow/task_pool.py Outdated Show resolved Hide resolved
@wxtim wxtim force-pushed the feature.skip_mode branch 3 times, most recently from 0d42eab to c37ef72 Compare May 8, 2024 12:43
@wxtim wxtim requested a review from MetRonnie May 8, 2024 13:47
@wxtim wxtim force-pushed the feature.skip_mode branch 2 times, most recently from 3839a8a to d7ee653 Compare May 9, 2024 08:50
cylc/flow/network/schema.py Outdated Show resolved Hide resolved
cylc/flow/task_state.py Outdated Show resolved Hide resolved
cylc/flow/unicode_rules.py Outdated Show resolved Hide resolved
cylc/flow/task_proxy.py Outdated Show resolved Hide resolved
cylc/flow/task_pool.py Outdated Show resolved Hide resolved
cylc/flow/run_modes/skip.py Outdated Show resolved Hide resolved
cylc/flow/run_modes/skip.py Outdated Show resolved Hide resolved
cylc/flow/run_modes/nonlive.py Outdated Show resolved Hide resolved
cylc/flow/run_modes/nonlive.py Outdated Show resolved Hide resolved
cylc/flow/cfgspec/workflow.py Outdated Show resolved Hide resolved
Copy link
Member

@oliver-sanders oliver-sanders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code getting close, small things:

cylc/flow/cfgspec/workflow.py Outdated Show resolved Hide resolved
cylc/flow/cfgspec/workflow.py Outdated Show resolved Hide resolved
cylc/flow/cfgspec/workflow.py Outdated Show resolved Hide resolved
cylc/flow/network/schema.py Outdated Show resolved Hide resolved
cylc/flow/network/schema.py Outdated Show resolved Hide resolved
cylc/flow/task_state.py Outdated Show resolved Hide resolved
tests/functional/run_modes/06-run-mode-overrides.t Outdated Show resolved Hide resolved
tests/functional/run_modes/06-run-mode-overrides.t Outdated Show resolved Hide resolved
tests/integration/scripts/test_validate_integration.py Outdated Show resolved Hide resolved
cylc/flow/network/schema.py Outdated Show resolved Hide resolved
cylc/flow/network/schema.py Outdated Show resolved Hide resolved
@wxtim wxtim marked this pull request as draft August 20, 2024 12:22
@wxtim wxtim marked this pull request as ready for review August 21, 2024 07:20
@oliver-sanders
Copy link
Member

oliver-sanders commented Aug 30, 2024

Have started functional testing, looking good so far, a couple of small issues:

1) Cannot broadcast [skip] settings

E.G. This command should configure a task to run in skip mode AND fail when it does so:

$ cylc broadcast -s 'run mode = skip' -s '[skip]outputs = failed'

However, the [skip] part seems to be ignored. Not a biggie from a functionality perspective (why would anyone want to do this!), but if we're allowing users to do this it should work.

2) The started output isn't being produced?

This example stalls for me:

a? & a:started => b

[runtime]
  [[root]]
    run mode  = skip

3) Skip mode messes with the average task runtime

Skip mode tasks are not excluded from the average task runtime calculation which messes it up a fair bit.

Not a biggie, can be moved to a follow-on issue.

4) Skip mode messes with the analysis view

Skip mode tasks are not excluded from the analysis view statistics which is a tad confusing.

Not a biggie, can be punted into a small issue for the data workflows team. Just make sure that it is easy to filter out skip mode tasks in the task_jobs DB table (i.e. by filtering by platform name).

Opened: cylc/cylc-uiserver#626

@wxtim
Copy link
Member Author

wxtim commented Sep 2, 2024

  1. Cannot broadcast [skip] settings

Fixed by f2fb60e. It turned out to be a bit of a rabbit hole with disable task event handlers. I can see wanting to change this as being a reason to broadcast skip mode settings.

  1. The started output isn't being produced?

No, but I don't think it's my change at fault - Since it appears to exist as an entirely separate bug present in Cylc as far back as 8.0.0! I'll attempt to fix anyway!

Proposed fix up at #6351 - since it's a bug. Will rebase and fix after.

  1. Skip mode messes with the average task runtime

Fixed 660ca3a

I have noticed that dummy.py looks like

@wxtim wxtim marked this pull request as draft September 2, 2024 12:32
@wxtim wxtim force-pushed the feature.skip_mode branch 2 times, most recently from 969267e to 2b395d0 Compare September 2, 2024 13:57
@MetRonnie MetRonnie removed their request for review September 2, 2024 14:56
@wxtim wxtim mentioned this pull request Sep 3, 2024
8 tasks
@wxtim wxtim force-pushed the feature.skip_mode branch 2 times, most recently from b0e4d94 to e208f54 Compare September 5, 2024 08:42
@wxtim wxtim marked this pull request as ready for review September 5, 2024 08:51
cylc/flow/network/schema.py Outdated Show resolved Hide resolved
cylc/flow/run_modes/skip.py Outdated Show resolved Hide resolved
cylc/flow/run_modes/skip.py Outdated Show resolved Hide resolved
cylc/flow/run_modes/simulation.py Outdated Show resolved Hide resolved
cylc/flow/run_modes/simulation.py Outdated Show resolved Hide resolved
cylc/flow/task_state.py Outdated Show resolved Hide resolved
cylc/flow/task_state.py Outdated Show resolved Hide resolved
cylc/flow/task_state.py Outdated Show resolved Hide resolved
cylc/flow/task_state.py Outdated Show resolved Hide resolved
}
}
id_ = flow(cfg)
schd = scheduler(id_, run_mode=workflow_run_mode, paused_start=False)
Copy link
Member

@oliver-sanders oliver-sanders Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This integration test is actually running real jobs!

However, I don't think it's actually testing that the run mode is correctly overridden because it doesn't check how the task is submitted.

Also, it's testing live and skip modes, however, the workflow run modes are live and simulation?!

Here's my suggestion, it goes with the small task_job_mgr change I posted above (makes it easier to suppress live jobs from running whilst still testing non-live submission):

from cylc.flow.task_state import TASK_STATUS_SUBMITTED, TASK_STATUS_SUCCEEDED

@pytest.fixture
def capture_live_submissions(capcall):
    """Capture live submission attempts.

    This prevents real jobs from being submitted to the system.

    If you call this fixture from a test, it will return a set of tasks that
    would have been submitted had this fixture not been used.
    """
    def fake_submit(self, _workflow, itasks, *_):
        for itask in itasks:
            for status in (TASK_STATUS_SUBMITTED, TASK_STATUS_SUCCEEDED):
                self.task_events_mgr.process_message(
                    itask,
                    'INFO',
                    status,
                    '2000-01-01T00:00:00Z',
                    '(received)',
                )
        return itasks

    # suppress and capture live submissions
    submit_live_calls = capcall(
        'cylc.flow.task_job_mgr.TaskJobManager._submit_live_task_jobs',
        fake_submit
    )

    def get_submissions():
        nonlocal submit_live_calls
        return {
            itask.identity
            for ((_self, _workflow, itasks, *_), _kwargs) in submit_live_calls
            for itask in itasks
        }

    return get_submissions


@pytest.mark.parametrize('workflow_run_mode', ['live', 'skip'])  # TODO: Should this be 'simulation'?
async def test_run_mode_override_from_config(
    capture_live_submissions, flow, scheduler, run, complete, workflow_run_mode
):
    """Test that `[runtime][<namespace>]run mode` overrides workflow modes."""
    id_ = flow({
        'scheduling': {
            'graph': {
                'R1': 'live & skip',
            },
        },
        'runtime': {
            'live': {'run mode': 'live'},
            'skip': {'run mode': 'skip'},
        }
    })
    schd = scheduler(id_, run_mode=workflow_run_mode, paused_start=False)
    async with run(schd):
        await complete(schd)

    if workflow_run_mode == 'live':
        assert capture_live_submissions() == {'1/live'}
    else:
        assert capture_live_submissions() == set()

Note: This test requires #6326 (because I was too lazy to add an itask.jobs entry, but could do that also)

Could go further by mocking away the get_submission_method method to test which of the dummy submission methods is called if desired.

Might be worth pulling in the workflow run modes in case they change too:

@pytest.mark.parametrize('workflow_run_mode', [mode.value for mode in WORKFLOW_MODES])

wxtim and others added 3 commits September 20, 2024 08:53
* Add `[runtime][<namespace>]run mode` and `[runtime][<namespace>][skip]`.
* Spin run mode functionality into separate modules.
* Run sim mode check with every main loop - we don't know if any tasks are
  in sim mode from the scheduler, but it doesn't cost much to check
  if none are.
* Implemented separate job "submission" pathway switching.
* Implemented skip mode, including output control logic.
* Add a linter and a validation check for tasks in nonlive modes,
  and for combinations of outputs
* Enabled setting outputs as if task ran in skip mode using
  `cylc set --out skip`.
* Testing for the above.

Schema: use `Enum` for task run mode instead of `String` (#61)

* Schema: use `Enum` for task run mode instead of `String`

* Tidy

fixup merge

fix broken functional test

Improve cylc set --out skip
* Improve documentation of feature in cylc set --help
* Allow cylc set --out=skip,optional_output
* Test the above

Remove test: We don't want users opting out of validating
[runtime][ns][simulation/skip] because we can now
changes these in a running workflow.

stop users opting out of validating workflows without validating ski/simulation taskdef sections

added tests for db entries in nonlive mode

ensure db entries for all four modes are correct.

move the change file toi the correct name

get localhost of platforms 'simulation' 'skip' or 'dummy' not defined. (They probably shouldn't be, but that's a site specific choice...)

fix tests with extra messages surfaces by using log_filter

make cylc clean and remote tidy not try to clean or tidy platforms

stop dummy mode appearing to submit twice

Prevent cleanup from attempting to remote clean platforms skip and simulation

Update cylc/flow/run_modes/skip.py

Co-authored-by: Oliver Sanders <oliver.sanders@metoffice.gov.uk>

fix small issues from OS review

response to review: Make satisfaction method correct according the proposal

Response to review
* Allow only run modes skip and live to be selectable in the config.
* Disallow workflow run modes sim and dummy from being overridden.
* Attach Run mode to task_proxy rather than the task def.

Response to review
* Allow only run modes skip and live to be selectable in the config.
* Disallow workflow run modes sim and dummy from being overridden.
* Attach Run mode to task_proxy rather than the task def.

don't run sim time check unless workflow in sim mode

test that skip mode is only applied to a single task.

remove now illegal items from test

Response to review:
- Remove Workflow Mode for tasks and make them default to live.
- Ensure that we are checking (assert) log_filters.
- Remove need for polling in functional test. :)

usin enums

Apply suggestions from code review

Co-authored-by: Oliver Sanders <oliver.sanders@metoffice.gov.uk>
Co-authored-by: Oliver Sanders <oliver.sanders@metoffice.gov.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
config change Involves a change to global or workflow config
Projects
None yet
Development

Successfully merging this pull request may close these issues.

allow some real tasks in sim mode? dummy mode: subtract err-script from jobs skip mode
3 participants