Presently, `Dataset` has methods to perform several actions with Acero: `sort_by`, `join`, and `join_asof`. It would be especially helpful to also provide a method for performing aggregations on datasets with Acero, for convenient out-of-core processing.
The implementation can be modeled on the existing `Dataset` Acero operations as well as the `aggregate` method of `TableGroupBy`.
Component(s)
Python
Note that the implementation proposed in the above PR ends up being fairly inefficient because it can't fully leverage Acero nodes for, e.g., projections and filtering. If there is interest, this functionality could be included as well (essentially a dataframe-like interface for constructing an Acero plan), but that is a bit larger in scope. I made a first effort at this for my own use: https://github.com/sidneymau/dataplan