Document and test overriding batch type inference #21844
Conversation
R: @yeandy
Codecov Report
```diff
@@            Coverage Diff             @@
##           master   #21844      +/-   ##
==========================================
- Coverage   74.15%   74.13%   -0.03%
==========================================
  Files         698      698
  Lines       92411    92433      +22
==========================================
- Hits        68530    68524       -6
- Misses      22630    22658      +28
  Partials     1251     1251
```
LGTM
DoFn is being applied to.

Returns:
  ``None`` if this DoFn cannot accept batches, a Beam typehint or a native
Suggested change:
- ``None`` if this DoFn cannot accept batches, a Beam typehint or a native
+ ``None`` if this DoFn cannot accept batches, a Beam typehint, or a native
Same here
DoFn is being applied to.

Returns:
  ``None`` if this DoFn will never yield batches, a Beam typehint or
Suggested change:
- ``None`` if this DoFn will never yield batches, a Beam typehint or
+ ``None`` if this DoFn will never yield batches, a Beam typehint, or
I want "Beam typehint or native typehint" as a unit to be the "else" clause. I updated the language to make that explicit instead of applying this suggestion. Thanks for pointing it out.
sdks/python/apache_beam/runners/portability/fn_api_runner/fn_runner_test.py (outdated)
```python
def _get_input_batch_type_normalized(self, input_element_type):
  return typehints.native_type_compatibility.convert_to_beam_type(
      self.get_input_batch_type(input_element_type))

def _get_output_batch_type_normalized(self, input_element_type):
  return typehints.native_type_compatibility.convert_to_beam_type(
      self.get_output_batch_type(input_element_type))
```
Why are these private functions? Is it because normalizing to Beam types isn't going to be a common op?
These are convenience functions I provided for our internal use; users shouldn't call them. Users shouldn't call the others (`get_{input,output}_batch_type`) either, but those are part of the public API since users can override them if they need to.
Come to think of it, I should probably mark some other convenience functions we added as protected. I'll follow up with a PR for that.
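The split being discussed here — a public, overridable type hook paired with a private normalizing wrapper — can be sketched outside Beam. Everything below is a hypothetical, stdlib-only stand-in (including `convert_to_beam_type_stub`), not the real `apache_beam` API:

```python
import typing


def convert_to_beam_type_stub(hint):
    # Hypothetical stand-in for
    # typehints.native_type_compatibility.convert_to_beam_type: it only
    # tags the hint so the normalization step is visible in the output.
    return ("beam", hint)


class BatchDoFnSketch:
    def get_input_batch_type(self, input_element_type):
        # Public hook: subclasses may override this and return either a
        # Beam typehint or a native one (e.g. typing.List[int]).
        return None

    def _get_input_batch_type_normalized(self, input_element_type):
        # Private convenience wrapper: framework-internal callers always
        # see the normalized form, whatever the subclass returned.
        return convert_to_beam_type_stub(
            self.get_input_batch_type(input_element_type))


class ListBatchDoFn(BatchDoFnSketch):
    def get_input_batch_type(self, input_element_type):
        # A subclass returning a native typehint; normalization happens
        # in the private wrapper, not here.
        return typing.List[input_element_type]


print(ListBatchDoFn()._get_input_batch_type_normalized(int))
# → ('beam', typing.List[int])
```

The design point is that only `get_input_batch_type` is part of the overridable surface; callers inside the framework go through the `_normalized` wrapper and never have to care which flavor of typehint the subclass chose.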
Co-authored-by: Andy Ye <andyye333@gmail.com>
Thanks @yeandy!
* Document and test overriding batch type inference
* Address review comments
* Update sdks/python/apache_beam/transforms/core.py

Co-authored-by: Andy Ye <andyye333@gmail.com>
Fixes #21652
Some Batched DoFns (e.g. RunInference) will need to declare their input/output batch types dynamically based on some configuration. Technically a DoFn implementation should already be able to do this, but it's untested and undocumented. This PR simply documents the functions that need to be overridden (`get_input_batch_type`, `get_output_batch_type`) and adds tests verifying it's possible.

We also add new `_normalized` versions of these functions which are responsible for normalizing the typehints to Beam typehints. This allows users to return native typehints in their implementations if they prefer.