-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: script function called under workflow no longer returns a Step #710
Conversation
Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
* Keep the behaviour that the script template itself is added to the list of templates in the Workflow * Change to look-before-you-leap for returned object from the CallableTemplateMixin's __call__ to make it clearer what's happening Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
9314199
to
f605e06
Compare
Codecov Report
@@ Coverage Diff @@
## main #710 +/- ##
=======================================
- Coverage 77.5% 77.4% -0.1%
=======================================
Files 45 45
Lines 3341 3346 +5
Branches 634 639 +5
=======================================
+ Hits 2591 2592 +1
- Misses 575 576 +1
- Partials 175 178 +3
... and 3 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Signed-off-by: Elliot Gunton <egunton@bloomberg.net>
# Notes on callable templates under a Workflow: | ||
# * If the user calls a script directly under a Workflow (outside of a Steps/DAG) then we add the script | ||
# template to the workflow and return None. | ||
# * Containers, ContainerSets and Data objects (i.e. subclasses of CallableTemplateMixin) are already |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have examples in which containers are defined outside of a workflow. Practically, users can instantiate, say, a container (used for notifications on workflow completion) outside of a workflow, someplace in the codebase, and then import / use the callable directly. In that case we want to add the template. Rather than address this for scripts only, perhaps we can make it so that there's an attempt to add the template to the workflow unless it exists already? I think that might already be implemented in some places, if it not already in the context
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a question of whether you want to be able to add containers as a template through a callable (I would say no). The options are:
1. WAD: use a function to retrieve the container defined elsewhere (but returning a fresh instantiation)
def get_container():
return Container(
name="whalesay",
image="docker/whalesay:latest",
command=["cowsay"],
args=["{{inputs.parameters.message}}"],
inputs=Parameter(name="message"),
)
with Workflow(
generate_name="arguments-parameters-",
entrypoint="whalesay",
arguments=Parameter(name="message", value="hello world"),
) as w:
c = get_container() # Container is instantiated in the function call and added correctly, c is now callable in steps/dags
2. Enhancement: make Containers callable under Workflow
s
whalesay = Container(
name="whalesay",
image="docker/whalesay:latest",
command=["cowsay"],
args=["{{inputs.parameters.message}}"],
inputs=Parameter(name="message"),
)
with Workflow(
generate_name="arguments-parameters-",
entrypoint="whalesay",
arguments=Parameter(name="message", value="hello world"),
) as w:
whalesay() # Container is already instantiated outside so this doesn't currently work, should this return something or None?
I'm hesitant to add that functionality as we somewhat accidentally added it for script-decorated functions. It encourages bad practice (IMO) of global variables - e.g. what if the whalesay
container is modified in code elsewhere (changing the image etc) without me realising? We should try to stay close to being a definition language (as opposed to going full OOP) as a yaml alternative. The script-decorator works as a workflow-level callable because it's a fresh function call each time. Making the container callable under Workflows also confuses the callable container pattern under steps/dags - should a user be given a container from the call? e.g.:
with Workflow(
generate_name="arguments-parameters-",
entrypoint="whalesay",
arguments=Parameter(name="message", value="hello world"),
) as w:
container = whalesay()
with Steps(name="s"):
# Do I call `container()` or `whalesay()` here?
whalesay()
container()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps we can make it so that there's an attempt to add the template to the workflow unless it exists already
This would be a good bit of validation we can add! I don't believe we check anywhere for these scoped name clashes - e.g. steps/dags could also have step/task names validated for clashes
Edit: We do check for the template name clash when adding the template from a step/task sub node here! But no other validation on templates themselves being added (i.e. I can add many Script templates with the same name, but multiple steps using the same template will only add it once).
if isinstance(_context.pieces[-1], (Steps, Parallel)): | ||
return Step(*args, template=self, **kwargs) | ||
|
||
return Task(*args, template=self, **kwargs) | ||
except InvalidType: | ||
pass | ||
if isinstance(_context.pieces[-1], DAG): | ||
return Task(*args, template=self, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very very nice!
Pull Request Checklist
Description of PR
list of templates in the Workflow
Step
object is returned from the callof the script
CallableTemplateMixin's
__call__
to make it clearer what's happening