Split out and handle 'params' in mapped operator #26100
5824f87 to 7b2115b
Some notes explaining the implementation, hopefully this is easier to review with them.
airflow/models/mappedoperator.py
Outdated
@@ -162,6 +165,7 @@ class OperatorPartial:
     """

     operator_class: Type["BaseOperator"]
+    params: Union[ParamsDict, dict]
To be able to expand a task against `params`, we need to be able to “merge” default DAG- and task-level params with the user-mapped params. So `params` is split out of other partial kwargs and treated specially. Partial kwargs should never contain the `"params"` key.
@@ -256,7 +256,6 @@ def partial(
     partial_kwargs.setdefault("end_date", end_date)
     partial_kwargs.setdefault("owner", owner)
     partial_kwargs.setdefault("email", email)
-    partial_kwargs.setdefault("params", default_params)
So we don’t set the default params in `partial_kwargs` …
     return OperatorPartial(
         operator_class=operator_class,
         kwargs=partial_kwargs,
+        params=partial_params,
… but pass them to this separate attribute.
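A minimal sketch of the split described above, assuming a plain-dict stand-in for the real keyword arguments. The helper name `split_partial_kwargs` is hypothetical and is not part of Airflow; only the `pop("params", None) or default_params` pattern is taken from the diff.

```python
from typing import Any, Dict, Tuple


def split_partial_kwargs(
    kwargs: Dict[str, Any], default_params: Dict[str, Any]
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical helper: separate "params" from the other partial kwargs."""
    partial_kwargs = dict(kwargs)  # copy so the caller's dict is left untouched
    # Fall back to the defaults when params is missing or empty.
    partial_params = partial_kwargs.pop("params", None) or default_params
    return partial_kwargs, partial_params


partial_kwargs, partial_params = split_partial_kwargs(
    {"owner": "airflow", "params": {"threshold": 5}},
    default_params={"threshold": 1},
)
```

After the split, `partial_kwargs` no longer carries a `"params"` key, which is the invariant the comment above asks for.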
@@ -363,7 +363,6 @@ def _expand(self, expand_input: ExpandInput, *, strict: bool) -> XComArg:
         task_id = get_unique_task_id(partial_kwargs.pop("task_id"), dag, task_group)
         if task_group:
             task_id = task_group.child_id(task_id)
-        params = partial_kwargs.pop("params", None) or default_params
And this line is deleted since it’s now a no-op.
-        # Ordering is significant; mapped kwargs should override partial ones.
+
+        # If params appears in the mapped kwargs, we need to merge it into the
+        # partial params, overriding existing keys.
+        params = copy.copy(self.params)
+        with contextlib.suppress(KeyError):
+            params.update(mapped_kwargs["params"])
+
+        # Ordering is significant; mapped kwargs should override partial ones,
+        # and the specially handled params should be respected.
         return {
             "task_id": self.task_id,
             "dag": self.dag,
             "task_group": self.task_group,
-            "params": self.params,
             "start_date": self.start_date,
             "end_date": self.end_date,
             **self.partial_kwargs,
             **mapped_kwargs,
+            "params": params,
         }
Actual merging happens here. The merged `params` is put last so it overrides the incomplete `mapped_kwargs["params"]`.
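The merge and the key ordering can be tried in isolation. The sketch below uses plain dicts in place of `ParamsDict` and a hypothetical free function `build_unmap_kwargs` in place of the real method; the `copy.copy` / `contextlib.suppress(KeyError)` / `update` sequence mirrors the diff.

```python
import contextlib
import copy
from typing import Any, Dict


def build_unmap_kwargs(
    partial_params: Dict[str, Any],
    partial_kwargs: Dict[str, Any],
    mapped_kwargs: Dict[str, Any],
) -> Dict[str, Any]:
    """Hypothetical stand-in for the kwargs-building step shown in the diff."""
    # Shallow-copy so the partial params shared by all mapped tasks stay intact.
    params = copy.copy(partial_params)
    # mapped_kwargs may not have "params" at all; in that case keep the copy as-is.
    with contextlib.suppress(KeyError):
        params.update(mapped_kwargs["params"])
    # Later keys win: mapped kwargs override partial ones, and the merged
    # "params" goes last so it replaces the incomplete mapped_kwargs["params"].
    return {**partial_kwargs, **mapped_kwargs, "params": params}


defaults = {"env": "prod", "retries_hint": 1}
result = build_unmap_kwargs(
    partial_params=defaults,
    partial_kwargs={"owner": "airflow"},
    mapped_kwargs={"params": {"env": "staging"}, "owner": "data-team"},
)
```

Here `result["params"]` holds both the untouched default key and the overridden one, while `defaults` itself is not mutated.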
airflow/models/baseoperator.py
Outdated
-        """Template all attributes listed in template_fields.
+        """Template all attributes listed in *self.template_fields*.

         This mutates the attributes in-place and is irreversible.

-        :param context: Dict with values to apply on content
-        :param jinja_env: Jinja environment
+        :param context: Context dict with values to apply on content.
+        :param jinja_env: Jinja environment to use for rendering.
         """
         if not jinja_env:
             jinja_env = self.get_template_env()
         self._do_render_template_fields(self, self.template_fields, context, jinja_env, set())
-        return self
+        return None
This does not change any behaviour, I only did it so the behaviour is easier to explain…
airflow/models/taskinstance.py
Outdated
+        original_task = self.task
         rendered_task = self.task.render_template_fields(context)
-        if rendered_task is None:  # Compatibility -- custom renderer, assume unmapped.
-            return self.task
-        original_task, self.task = self.task, rendered_task
+        if rendered_task is not None:  # Mapped operator, assign unmapped task.
+            self.task = rendered_task
         return original_task
… here. Previously it was difficult to follow when `rendered_task` is None and when it’s not; now it’s simple: a mapped operator (a MappedOperator subclass) performs unmapping and returns the unmapped task, while a non-mapped operator (a BaseOperator subclass) returns None because no unmapping is needed.
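The return-value contract can be illustrated with toy classes. Everything below is a hypothetical stand-in (`PlainOperator`, `MappedStandIn`, `TaskInstanceStandIn` are not Airflow classes); only the control flow in `render_templates` follows the new lines in the diff.

```python
from typing import Any, Dict, Optional


class PlainOperator:
    """Stand-in for a non-mapped operator: renders in place, returns None."""

    def render_template_fields(self, context: Dict[str, Any]) -> None:
        return None


class MappedStandIn:
    """Stand-in for a mapped operator: unmaps and returns the concrete task."""

    def __init__(self, unmapped: PlainOperator) -> None:
        self._unmapped = unmapped

    def render_template_fields(self, context: Dict[str, Any]) -> Optional[PlainOperator]:
        return self._unmapped


class TaskInstanceStandIn:
    def __init__(self, task: Any) -> None:
        self.task = task

    def render_templates(self, context: Dict[str, Any]) -> Any:
        original_task = self.task
        rendered_task = self.task.render_template_fields(context)
        if rendered_task is not None:  # Mapped operator, assign unmapped task.
            self.task = rendered_task
        return original_task


concrete = PlainOperator()
mapped = MappedStandIn(concrete)
ti_mapped = TaskInstanceStandIn(mapped)
returned = ti_mapped.render_templates({})
```

In this sketch the caller always gets the original task back, and `ti.task` is swapped only when unmapping actually happened.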
7b2115b to 50c6989
50c6989 to d05e7c2
It turns out we need to update the template context a bit after unmapping. This also fixes a bug where `context["task"]` pointed to the mapped task while `context["ti"].task` was the unmapped task. They now both point to the unmapped one.
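A rough sketch of the invariant this establishes, assuming the context is a plain dict and using `SimpleNamespace` objects as placeholders for the task instance and tasks. The function name `sync_context_after_unmap` is hypothetical and is not claimed to be how the PR implements the update.

```python
from types import SimpleNamespace
from typing import Any, Dict


def sync_context_after_unmap(context: Dict[str, Any], unmapped_task: Any) -> None:
    """Hypothetical helper: point both context entries at the unmapped task."""
    context["ti"].task = unmapped_task
    context["task"] = unmapped_task


mapped_task = SimpleNamespace(name="mapped")
unmapped_task = SimpleNamespace(name="unmapped")
context = {"task": mapped_task, "ti": SimpleNamespace(task=mapped_task)}

sync_context_after_unmap(context, unmapped_task)
```

After the call, `context["task"]` and `context["ti"].task` refer to the same object, so templates see one consistent task.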
d05e7c2 to e9b5861
Allow using `params` in expanding kwargs. Fix #24014.

Syntax unlocked: