
feat(sdk): Allow disabling default caching via a CLI flag and env var #11222

Open · wants to merge 2 commits into master from disable-default-caching-flag-var
Conversation

@DharmitD DharmitD commented Sep 17, 2024

Description of your changes:
Allow setting a default of execution caching disabled via a compiler CLI flag and env var.
Resolves #11092

Key Changes:

CLI Update (kfp dsl compile):
A new flag --disable-execution-caching-by-default is added. When set, this flag disables caching for all pipeline tasks by default. Behavior specified in pipeline code still takes precedence.

A corresponding Environment Variable KFP_DISABLE_EXECUTION_CACHING_BY_DEFAULT is added as well.

Details:

Tasks default to caching enabled if the pipeline code doesn't specify a desired behavior via set_caching_options. For example:

@dsl.pipeline(name='my-pipeline')
def my_pipeline():
    task_1 = create_dataset()
    task_2 = create_dataset()
    task_1.set_caching_options(False)

In this example, task_1 won't try to use the cache, but task_2 will, even though the author didn't specify anything about task_2. The resulting pipeline spec will look like this:

root:
  dag:
    tasks:
      task-1:
        cachingOptions: {}
      task-2:
        cachingOptions:
          enableCache: true

This PR enhances the DSL compiler to allow the user to override that default behavior via a command-line flag. This change is made:

  • in an additive way, using an optional feature flag

  • in a way that doesn't alter the DSL or pipeline spec interfaces

  • in a way that doesn't change the default behavior for those who prefer the default of caching enabled

After this change, this command will continue to result in the default behavior:

kfp dsl compile --py my_pipeline.py --output my_pipeline.yaml

Whereas this command will cause tasks to default to caching disabled if no desired caching behavior is specified in code:

kfp dsl compile --py my_pipeline.py --output my_pipeline.yaml --disable-execution-caching-by-default

For example, this pipeline compiled with --disable-execution-caching-by-default

@dsl.pipeline(name='my-pipeline')
def my_pipeline():
    task_1 = create_dataset()
    task_2 = create_dataset()
    task_1.set_caching_options(False)

would then result in both tasks having caching disabled:

root:
  dag:
    tasks:
      task-1:
        cachingOptions: {}
      task-2:
        cachingOptions: {}

even though nothing was specified about task_2 in the code.
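As a quick sanity check, a compiled spec like the one above can be inspected programmatically. The snippet below uses an inline dict mirroring the YAML shown; the helper function is hypothetical and not part of the KFP SDK.

```python
# Inline dict mirroring the compiled spec shown above, where both tasks
# have caching disabled (empty cachingOptions).
compiled_spec = {
    'root': {
        'dag': {
            'tasks': {
                'task-1': {'cachingOptions': {}},
                'task-2': {'cachingOptions': {}},
            }
        }
    }
}

def tasks_with_caching_enabled(spec: dict) -> list:
    """Return the names of tasks whose cachingOptions sets enableCache."""
    tasks = spec['root']['dag']['tasks']
    return [name for name, task in tasks.items()
            if task.get('cachingOptions', {}).get('enableCache')]

print(tasks_with_caching_enabled(compiled_spec))  # → []
```

With the default behavior (no flag), the same check on the first spec above would report task-2, since its cachingOptions sets enableCache: true.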

Usage

CLI flag for dsl compile.

A new flag --disable-execution-caching-by-default is added. When set, this flag disables caching for all pipeline tasks by default.

Example:

kfp dsl compile --py my_pipeline.py --output my_pipeline.yaml --disable-execution-caching-by-default

Environment Variable KFP_DISABLE_EXECUTION_CACHING_BY_DEFAULT

You can also set the default caching behavior by using the environment variable. When set to a truthy value ("1", "true", "t", "yes", "y", or "on", matching click's boolean handling), it disables execution caching by default for all pipelines. When unset or set to a falsy value, the default of caching enabled remains.
Example:

KFP_DISABLE_EXECUTION_CACHING_BY_DEFAULT=true \
kfp dsl compile --py my_pipeline.py --output my_pipeline.yaml

This environment variable also works for Compiler().compile().
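The click-style truthy handling described above can be sketched as follows. The helper name and exact logic are illustrative, not the PR's actual implementation.

```python
import os

# Truthy strings recognized by click for boolean flags (per the click docs).
_TRUTHY = {"1", "true", "t", "yes", "y", "on"}

def execution_caching_default() -> bool:
    """Return True when execution caching should stay enabled by default."""
    raw = os.environ.get("KFP_DISABLE_EXECUTION_CACHING_BY_DEFAULT", "")
    # Caching stays enabled unless the env var is set to a truthy value.
    return raw.strip().lower() not in _TRUTHY
```

For example, `KFP_DISABLE_EXECUTION_CACHING_BY_DEFAULT=yes` would flip the default to caching disabled, while `KFP_DISABLE_EXECUTION_CACHING_BY_DEFAULT=off` or leaving the variable unset keeps caching enabled.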

Given the following pipeline file my_pipeline.py:

from kfp import dsl
from kfp.compiler import Compiler

@dsl.component
def create_dataset():
    pass

@dsl.pipeline(name='my-pipeline')
def my_pipeline():
    task_1 = create_dataset()
    task_2 = create_dataset()
    task_1.set_caching_options(False)

Compiler().compile(
    pipeline_func=my_pipeline,
    package_path='my_pipeline.yaml',
)

Executing this

KFP_DISABLE_EXECUTION_CACHING_BY_DEFAULT=true \
python my_pipeline.py

will result in task_2 having caching disabled.

This PR is made in collaboration with @gregsheremeta. Thank you, Greg, for your contributions!

Checklist:


[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign connor-mccarthy for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@DharmitD DharmitD marked this pull request as draft September 17, 2024 22:43
@DharmitD DharmitD changed the title WIP: feat(sdk): Allow setting a default of execution caching disabled via … WIP: feat(sdk): Allow disabling default caching via a CLI flag and env var Sep 17, 2024
@google-oss-prow google-oss-prow bot added size/L and removed size/S labels Sep 18, 2024
@DharmitD DharmitD force-pushed the disable-default-caching-flag-var branch from c305fa5 to b4b38ae Compare September 18, 2024 22:05
…a compiler CLI flag and env var

Co-authored-by: Greg Sheremeta <gshereme@redhat.com>
Signed-off-by: ddalvi <ddalvi@redhat.com>
@DharmitD DharmitD force-pushed the disable-default-caching-flag-var branch 8 times, most recently from eedb220 to fa7a432 Compare September 20, 2024 00:00
Signed-off-by: ddalvi <ddalvi@redhat.com>
@DharmitD DharmitD force-pushed the disable-default-caching-flag-var branch from fa7a432 to 626e1c1 Compare September 20, 2024 00:11
@DharmitD DharmitD marked this pull request as ready for review September 20, 2024 02:10
@DharmitD DharmitD changed the title WIP: feat(sdk): Allow disabling default caching via a CLI flag and env var feat(sdk): Allow disabling default caching via a CLI flag and env var Sep 20, 2024
@diegolovison
Contributor

/hold

@diegolovison diegolovison (Contributor) left a comment

Wrote my thoughts on this PR as a general feedback without explicit approval.

# or the env var KFP_DISABLE_EXECUTION_CACHING_BY_DEFAULT.
# align with click's treatment of env vars for boolean flags.
# per click doc, "1", "true", "t", "yes", "y", and "on" are all converted to True
_execution_caching_default = not str(
Contributor comment on this snippet:

By assigning _execution_caching_default at the class level, it gets executed upon import. This represented a significant problem in Pipeline version 1, as detailed in: kubeflow/kfp-tekton#1345 , kubeflow/kfp-tekton#1339 , https://github.com/kubeflow/kfp-tekton/pull/1336/files

The user will need to define KFP_DISABLE_EXECUTION_CACHING_BY_DEFAULT before importing the compiler. The developer code will end up like the following, which is not the natural way to develop an application in Python:

os.environ["KFP_DISABLE_EXECUTION_CACHING_BY_DEFAULT"] = "true"
from kfp.client import Client

I was really affected by the order of definition explained above, and if we are going to follow this approach, we need to make sure to document it everywhere. It took me a lot of time to find out that import order really matters while working with Pipelines v1.

Also, Pipeline._execution_caching_default is set globally and can be modified by either the flag or the environment variable.

If the community believes that we need to follow this approach, this PR should add documentation about the side effects.

Contributor reply:

> The user will need to define KFP_DISABLE_EXECUTION_CACHING_BY_DEFAULT before importing the compiler.

You are correct. I believe that the typical end user will set this env var in their terminal, and then run the python file containing the compiler. You're perhaps not that typical user.

A key requirement of this task is to make sure that no pipeline code needs to change. Since PipelineTask is loaded by the interpreter as soon as the @dsl.pipeline is interpreted, and that's before the Compiler.compile function is in play, we saw no other way to do this.

Do you have a suggestion for an alternative way, where @dsl.pipeline is not changed, and no pipeline code would ever need to be changed?

Contributor (Author) reply:

> A key requirement of this task is to make sure that no pipeline code needs to change. Since PipelineTask is loaded by the interpreter as soon as the @dsl.pipeline is interpreted, and that's before the Compiler.compile function is in play, we saw no other way to do this.

+1 to what Greg mentioned here, this seems to be the only way to have the var plumbed in without having to change the Pipeline code. Any alternate way would lead to a bunch of pipeline API call changes, and we want to avoid that.

My vote would be to keep this logic as is, and we'll add documentation on this approach to avoid any confusion. @diegolovison could you suggest where the documentation needs to go?

def test_compile_with_caching_flag_enabled(self):
    with tempfile.NamedTemporaryFile(suffix='.py') as temp_pipeline:
        # Write the pipeline function to a temporary file
        temp_pipeline.write(b"""
@gmfrasca gmfrasca (Member) commented Sep 20, 2024:

Why don't we just have a static file we can reference, rather than trying to dynamically write and use the file via temp files? It would be easier and cleaner to manage, especially since we are writing essentially the same file three separate times.

That said, if there is a technical reason for using tempfiles, since it appears the pipelines are the same across all three test_compile_* functions, we should extract this 'tiny-pipeline' into a variable and simply reference it here instead.
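The suggested refactor might look like the sketch below. Names are illustrative: the shared pipeline source is hoisted into one module-level constant so each test_compile_* function references it instead of rewriting the same bytes.

```python
import tempfile

# Shared pipeline source used by all three test_compile_* functions
# (hypothetical constant; the exact source is illustrative).
TINY_PIPELINE = b"""
from kfp import dsl

@dsl.component
def create_dataset():
    pass

@dsl.pipeline(name='tiny-pipeline')
def my_pipeline():
    task_1 = create_dataset()
"""

def write_tiny_pipeline() -> str:
    """Write the shared source to a temp .py file and return its path."""
    with tempfile.NamedTemporaryFile(
            suffix='.py', delete=False) as temp_pipeline:
        temp_pipeline.write(TINY_PIPELINE)
        return temp_pipeline.name
```

Each test would then call write_tiny_pipeline() (or reference a static fixture file directly), keeping one definition of the pipeline under test.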

result = self.invoke(
    ['--py', temp_pipeline.name, '--output', 'test_output.yaml'])
print(result.output)  # Print the command output
self.assertEqual(result.exit_code, 0)
Member comment:

It would probably be good to have some additional, more rigorous validation instead of just a return-code check here. Perhaps some checks on the contents of test_output.yaml?

Member follow-up:

Granted, this is for testing the CLI, not the compiler itself, so it likely could be something lightweight, even if it's just "check for existence of test_output.yaml and ensure it's not empty" to verify the command actually did something.
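That lightweight check could be sketched as a small helper (the function name is illustrative, not part of the test suite):

```python
import os

def assert_compile_output(path: str = 'test_output.yaml') -> None:
    """Lightweight validation: the output file exists and is non-empty,
    so the compile command actually produced something."""
    assert os.path.exists(path), f'{path} was not created'
    assert os.path.getsize(path) > 0, f'{path} is empty'
```

A test would call this right after self.invoke(...) returns, in addition to checking the exit code.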

Development

Successfully merging this pull request may close these issues.

[feature] allow setting a default of execution caching disabled via a compiler CLI flag and env var
4 participants