Initial version of profiler #269
I also think this should be renamed to profiler. Aggregation is not what it is doing.
transforms/universal/aggregator/ray/src/aggregator_transform_ray.py
Maybe there should be 2 test files so aggregation is done across multiple calls to transform()
1 is good enough
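For context, the multi-call scenario raised above could be exercised with a test along these lines. This is only a sketch: `WordCountProfiler`, its `flush()` method, and the word-counting behavior are hypothetical stand-ins chosen for illustration, not the transform's actual API.

```python
from collections import Counter


class WordCountProfiler:
    """Hypothetical stand-in for a transform that accumulates state."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def transform(self, rows: list) -> None:
        # Each call represents one input file; state persists between calls.
        for text in rows:
            self.counts.update(text.split())

    def flush(self) -> dict:
        # Return everything accumulated so far.
        return dict(self.counts)


def test_accumulates_across_calls() -> None:
    profiler = WordCountProfiler()
    profiler.transform(["a b", "a"])  # first "file"
    profiler.transform(["b c"])       # second "file"
    assert profiler.flush() == {"a": 2, "b": 2, "c": 1}
```

With a single input file, a bug that resets the state on every `transform()` call would go unnoticed; the second call is what makes the accumulation observable.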
@daw3rd one of the larger issues I am getting with this is testing. Because I am:
Maybe promoting the methods added in Spark (highlighted below) to the superclass and using them there would do it. Then, for this transform, you override _validate_metadata() or use some other mechanism to accept any metadata file. Could a similar override of the unhighlighted method be done for the CSV file?
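A minimal sketch of that suggestion, assuming a shared test base class whose validation steps can be overridden. Apart from `_validate_metadata()`, every name here (`BaseTransformTest`, `ProfilerTransformTest`, `check()`, `_validate_data_files()`, and the `metadata.json` file name) is hypothetical and not taken from the project.

```python
METADATA_NAME = "metadata.json"  # assumed file name, for illustration only


class BaseTransformTest:
    """Hypothetical shared test base holding the promoted validation hooks."""

    def check(self, produced: dict, expected: dict) -> None:
        # Template method: subclasses customize the individual steps.
        self._validate_data_files(produced, expected)
        self._validate_metadata(produced, expected)

    def _validate_data_files(self, produced: dict, expected: dict) -> None:
        got = {n: c for n, c in produced.items() if n != METADATA_NAME}
        want = {n: c for n, c in expected.items() if n != METADATA_NAME}
        assert got == want, "data files differ"

    def _validate_metadata(self, produced: dict, expected: dict) -> None:
        # Default behavior: metadata must match exactly.
        assert produced.get(METADATA_NAME) == expected.get(METADATA_NAME), "metadata differs"


class ProfilerTransformTest(BaseTransformTest):
    """Hypothetical test for this transform: accepts any metadata file."""

    def _validate_metadata(self, produced: dict, expected: dict) -> None:
        # Only require that a metadata file exists; its content may vary per run.
        assert METADATA_NAME in produced, "metadata file missing"
```

The same pattern would apply to the CSV check: a subclass could override `_validate_data_files()` in the same way if the output file cannot be compared byte for byte.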
Done.
Why are these changes needed?
New transform
Related issue number (if any).
https://github.ibm.com/ai-models-data/data-prep-kit-inner/issues/84