Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move PartitionSearchMode into datafusion_physical_plan, rename to InputOrderMode #8364

Merged
merged 6 commits into from
Dec 5, 2023

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Nov 29, 2023

Which issue does this PR close?

N/A

Rationale for this change

It has bothered since it was introduced in #8006 that the AggregateExec uses something from the window exec module

What is really happening is that they both are being told how the input is ordered relative to important columns (e.g. partitioning or grouping).

For example, the ordering_mode in this output:

    |               AggregateExec: mode=Partial, gby=[region@0 as region, date_bin(600000000000, time@1) as date_bin_gapfill(IntervalMonthDayNano("600000000000"),cpu.time)], aggr=[AVG(cpu.user)], ordering_mode=Sorted    |
|

Is actually the result of the PartitionSearchMode

What changes are included in this PR?

  1. Move PartitionSearchMode into its own module
  2. Add some comments
  3. Rename to InputOrderMode

I would also like to propose renaming the structure to something that doesn't have the word Partitioning in it (Update: renamed to InputOrderMode)

Are these changes tested?

Are there any user-facing changes?

  1. Changed a use path for PartitionSearchMode
  2. Better documentation

@alamb alamb added the api change Changes the API exposed to users of the crate label Nov 29, 2023
@github-actions github-actions bot added the core Core DataFusion crate label Nov 29, 2023
/// (The vector stores the index of `a` in the respective PARTITION BY expression.)
/// - A `PARTITION BY a, b` or a `PARTITION BY b, a` can use `Sorted` mode.
///
/// ## Aggregations
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought making the GROUP BY correspondence made this example clearer, even though there is non trivial redundancy

/// Note these are the same examples as above, but with `GROUP BY` instead of
/// `PARTITION BY` to make the examples easier to read.
#[derive(Debug, Clone, PartialEq)]
pub enum PartitionSearchMode {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to rename this to something related to ordering rather than Partitioning if possible. Maybe InputOrder or InputOrderMode 🤔

Maybe @ozankabak has some thoughts

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like InputOrderMode 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed in 4f4120e

@alamb alamb changed the title Minor: Move PartitionSearchMode into datafusion_physical_plan Minor: Move PartitionSearchMode into datafusion_physical_plan, rename to InputOrderMode Nov 30, 2023
@alamb alamb changed the title Minor: Move PartitionSearchMode into datafusion_physical_plan, rename to InputOrderMode Move PartitionSearchMode into datafusion_physical_plan, rename to InputOrderMode Nov 30, 2023
@alamb alamb requested a review from ozankabak December 5, 2023 01:28
Copy link
Contributor

@ozankabak ozankabak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you

@alamb alamb merged commit e322839 into apache:main Dec 5, 2023
22 checks passed
appletreeisyellow pushed a commit to appletreeisyellow/datafusion that referenced this pull request Dec 15, 2023
…InputOrderMode` (apache#8364)

* Move PartitionSearchMode into datafusion_physical_plan

* Improve  comments

* Rename to InputOrderMode

* Update prost
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api change Changes the API exposed to users of the crate core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants