Enhance Enforce Dist capabilities to fix, sub optimal bad plans #7671

mustafasrepo · 2023-09-27T13:45:13Z

Which issue does this PR close?

Closes #.

Rationale for this change

The current EnforceDistribution rule produces optimal plans when the physical plan at its input is single-partitioned (i.e., it does not contain any distribution-changing operator). This assumption holds for the DataFusion physical optimizer.

However, others may want to use this rule with existing plans that are already distributed.
In these cases, EnforceDistribution still produced valid plans, but the generated plans were sometimes not optimal.

In this PR, we extend the capabilities of the EnforceDistribution rule so that even if the existing plan is already multi-partitioned, the physical plan produced by EnforceDistribution is still optimal.

What changes are included in this PR?

The approach is as follows:

Analyze the input physical plan to determine the requirements at the output of the query (such as whether it has a single-partition distribution requirement, or expects the output to have an ASC ordering, etc.).
Add an ancillary operator (GlobalRequirementExec) so that the requirements of the query are not lost across rules.
In the EnforceDistribution rule, ignore (remove) any distribution-changing operators at its input.
Then satisfy the distribution requirements of the operators using the existing algorithm.
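
As a rough illustration of the "remove distribution-changing operators" step, the sketch below strips such operators from a toy plan tree. The `Plan` enum, `strip_distribution_ops`, and `bad_plan` are hypothetical names invented for this example; they are not DataFusion's `ExecutionPlan` API or the code in this PR.

```rust
/// Toy stand-in for a physical plan node (hypothetical; not DataFusion's `ExecutionPlan`).
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Leaf scan over a named table.
    Source(String),
    /// Operators that only change how data is partitioned.
    Repartition(Box<Plan>),
    Coalesce(Box<Plan>),
    /// An operator that does not change the distribution.
    Filter(Box<Plan>),
}

/// Recursively drops every distribution-changing operator, keeping all other nodes.
fn strip_distribution_ops(plan: Plan) -> Plan {
    match plan {
        // Skip the operator itself and continue with its input.
        Plan::Repartition(child) | Plan::Coalesce(child) => strip_distribution_ops(*child),
        // Keep the operator, but clean its input.
        Plan::Filter(child) => Plan::Filter(Box::new(strip_distribution_ops(*child))),
        Plan::Source(name) => Plan::Source(name),
    }
}

/// Builds an intentionally bad plan:
/// Coalesce -> Filter -> Repartition -> Repartition -> Source.
fn bad_plan() -> Plan {
    let source = Plan::Source("t".to_string());
    let repartitioned = Plan::Repartition(Box::new(Plan::Repartition(Box::new(source))));
    Plan::Coalesce(Box::new(Plan::Filter(Box::new(repartitioned))))
}
```

Calling `strip_distribution_ops(bad_plan())` leaves only `Filter` over `Source`. A real rule would then re-insert whatever repartitioning the remaining operators actually require, which this toy does not model.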

Are these changes tested?

Yes, new tests are added to show that intentionally bad, multi-partitioned physical plans can be fixed by the EnforceDistribution rule.

Are there any user-facing changes?

metesynnada

This addition is quite powerful, and the code LGTM. Thanks @mustafasrepo

alamb

Thank you @mustafasrepo -- I think this is a great change. I had some small comments about the tests but they could be fixed as a follow on PR if needed

The name GlobalRequirements initially implied to me that it tracked something global to the entire plan, but upon reading more I realize it is modeling the requirements of the output / root of the plan. Maybe a name like OutputRequirements or RootRequirements might better capture the notion.

cc @devinjdangelo I think this might be a way to allow an insert / write plan to specify how it wants the input to be ordered and distributed.

This is somewhat related to a discussion we had on how a TableProvider could signal to the rest of the plan how it wanted its input partitioned / sorted: #6339 and #424

alamb · 2023-09-28T17:52:04Z

datafusion/core/src/physical_optimizer/global_requirements.rs

+/// Helper function that adds an ancillary `GlobalRequirementExec` to the given plan.
+/// First entry in the tuple is resulting plan, second entry indicates whether any
+/// `GlobalRequirementExec` is added to the plan.
+fn require_top_ordering_helper(


I see this is basically trying to reverse engineer the top level requirements based on the pattern of the plan.

Given GlobalRequirementExec can specify plan output requirements, I wonder if it would make more sense to directly create GlobalRequirementExec as part of the initial physical plan, for example in https://github.com/apache/arrow-datafusion/blob/7b12666ec87ea741c3f5b56ddf1647f6d794f9e3/datafusion/core/src/physical_planner.rs#L536

That would also offer a nice way to control how inputs to write plans might look 🤔

This is also an option. I think we can add it during initial plan creation and then remove the operator in a single pass at the end of the optimizer.

I thought about this design. I think there are a couple of reasons why the current design is better.

Currently, GlobalRequirementExec is not necessarily the top executor in the physical plan. It sits directly above the executor that defines the output ordering, as in the plan below:

ProjectionExec
  OutputRequirementExec
    SortExec (specifies output ordering)

The reason for this strategy is that putting OutputRequirementExec at the top has a couple of problems.
Consider the alternative strategy, where the plan above turns into the following:

OutputRequirementExec
  ProjectionExec
    SortExec (specifies output ordering)

In this new version, OutputRequirementExec may not have the columns necessary to satisfy the ordering desired at the output. The ordering should be satisfied before the projection (these columns may get lost during projection).
Consider another plan:

WindowExec(OVER() sum())
  Source (has an ordering)

If we were to require an output ordering for it, the plan above would turn into

OutputRequirementExec
  WindowExec(OVER() sum())
    Source (has an ordering)

However, the output ordering in the plan above is accidental. It is not a requirement of the query, and the planner should be free to change this output ordering if that is helpful.
In short, an output ordering requirement exists only when the query contains an ORDER BY clause at the end.
In this light, another strategy might be to insert OutputRequirementExec during plan creation when we see LogicalPlan::Sort.

However, in queries of the following kind,

SELECT a, b FROM ( SELECT a, b FROM table1 ORDER BY a ) ORDER BY b

this would introduce an unnecessary requirement for the subquery.
Hence, to require only the output ordering that is absolutely necessary, we need to traverse the physical plan from top to bottom for as long as the ordering is maintained, then put the OutputRequirementExec on top of the operator that defines it (this is the current approach). For this, we need a top-down pass on the physical plan. This can also be done in the create_initial_plan stage, after the initial physical plan is created. However, the current implementation generally inserts the corresponding operator(s) for each LogicalPlan node, so it is not easy to integrate top-down traversal code there.
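
To make the top-down placement concrete, here is a minimal sketch on a toy plan tree. The `Plan` enum and `require_top_ordering` are hypothetical and greatly simplified; they only assume that projections and windows maintain their input ordering and that a sort defines one. The `(plan, added)` return shape mirrors the doc comment on `require_top_ordering_helper`, but this is not the PR's implementation.

```rust
/// Toy plan node (hypothetical; not DataFusion's `ExecutionPlan`).
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Source,
    /// Defines an output ordering on the named column.
    Sort(&'static str, Box<Plan>),
    /// Assumed to maintain the ordering of their input.
    Projection(Box<Plan>),
    Window(Box<Plan>),
    /// Ancillary marker recording the ordering required at the output.
    OutputRequirement(Box<Plan>),
}

/// Walks down from the root while the ordering is maintained and wraps the
/// first ordering-defining operator. Returns the new plan and whether a
/// marker was added.
fn require_top_ordering(plan: Plan) -> (Plan, bool) {
    match plan {
        // The topmost sort defines the required output ordering: wrap it and stop.
        Plan::Sort(key, child) => {
            let sort = Plan::Sort(key, child);
            (Plan::OutputRequirement(Box::new(sort)), true)
        }
        // Order-maintaining operators: keep descending.
        Plan::Projection(child) => {
            let (child, added) = require_top_ordering(*child);
            (Plan::Projection(Box::new(child)), added)
        }
        Plan::Window(child) => {
            let (child, added) = require_top_ordering(*child);
            (Plan::Window(Box::new(child)), added)
        }
        // No explicit sort reached: any ordering here is accidental.
        other => (other, false),
    }
}
```

With this sketch, a projection over a sort gets the marker between the two, a window over an ordered source gets no marker at all, and for nested sorts only the outer one is wrapped, so the subquery's ORDER BY adds no requirement.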

However, I think we can move the OutputRequirementExec insertion code to here:
https://github.com/apache/arrow-datafusion/blob/7b12666ec87ea741c3f5b56ddf1647f6d794f9e3/datafusion/core/src/physical_planner.rs#L450

That is, after the initial plan is created. What do you think about this place?

Thank you for the writeup and explanation @mustafasrepo -- I agree with your examples. Maybe the core problem is trying to figure out the "required ordering" from the ExecutionPlan, where it could simply be an accident of the output of the plan, as in your example

WindowExec(OVER() sum())
  Source (has an ordering)

I think the required output order can probably be calculated from the logical plan, so maybe we should move it there. 🤔

However, I think we can move OutputRequirementExec insertion code on top of here.

That makes sense to me. Shall I try it?

I think the required output order can probably be calculated from the logical plan, so maybe we should move it there. 🤔

If we can reliably get the requirements at that stage, this should work too. Feel free to try and we can see if it helps simplify the code

alamb · 2023-09-28T18:00:13Z

datafusion/core/src/physical_optimizer/enforce_distribution.rs

@@ -3302,6 +3278,12 @@ mod tests {
            "ParquetExec: file_groups={2 groups: [[x], [y]]}, projection=[a, b, c, d, e], output_ordering=[a@0 ASC]",
        ];
        assert_optimized!(expected, exec, true);
+        let expected = &[


This seems like a regression -- the comments in the code say "the optimizer should not add a new sort exec" but then in this code the optimizer has added the new sort exec.

Maybe we need to update the test to add a GlobalRequirements node so that the sorts are not added?

the same comments apply to the other tests in this file

mustafasrepo · 2023-09-29T07:33:50Z

The name GlobalRequirements initially implied to me that it tracked something global to the entire plan, but upon reading more I realize it is modeling the requirements of the output / root of the plan. Maybe a name like OutputRequirements or RootRequirements might better capture the notion.

I agree that the current name is a bit vague. I changed the name to OutputRequirements.

alamb · 2023-09-29T16:45:58Z

datafusion/core/src/physical_optimizer/enforce_distribution.rs

@@ -3541,6 +3595,12 @@ mod tests {
        ];

        assert_optimized!(expected, plan.clone(), true);
+
+        let expected = &[


This behavior change is a regression from IOx's perspective, as the plan is now re-sorting data that is already sorted. I will work on creating a reproducer / fix next week.

This plan is generated when the bounded_order_preserving_variants configuration flag is false. When this flag is true, we get the sort-free result. The prior behavior was basically ignoring/overriding the flag.

For some detailed context: the flag basically lets the user choose whether they want SortPreservingMerges or Repartition/Coalesce+Sort cascades. We ran some benchmarks and there is no clearly dominating strategy; each alternative comes out ahead in certain cases. In non-streaming cases, the first alternative typically came out ahead, so we set the default flag value to false.

Since we are a stream-first platform, we set the flag to true at Synnada. Maybe IOx also wants to set this flag to true?

Ah, thank you for the hint. I will give it a try

Update: this suggestion worked well -- thank you. I find the naming of bounded_order_preserving_variants confusing, so I have proposed a new name here: #7723

…he#7671) * Extend capabilities of enforcedist * Simplifications * Fix test * Do not use hard coded partition number * Add comments, Fix with_new_children of CustomPlan * Use sub-rule as separate rule. * Add unbounded method * Final review * Move util code to exectree file * Update variables and comments * Apply suggestions from code review Update comments Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * Address reviews * Add new tests, do not satisfy requirement if not absolutely necessary enforce dist --------- Co-authored-by: Mehmet Ozan Kabak <ozankabak@gmail.com> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

mustafasrepo and others added 13 commits September 14, 2023 16:29

Extend capabilities of enforcedist

6b4b0b7

Simplifications

8781bb9

Merge branch 'apache_main' into enhance/enforce_dist

1edbed5

Fix test

dcadf41

Do not use hard coded partition number

4acbc07

Merge branch 'apache_main' into enhance/enforce_dist

da6bab7

Add comments, Fix with_new_children of CustomPlan

bac64fe

Use sub-rule as separate rule.

c639fd3

Add unbounded method

09a129c

Final review

9a5dce9

Move util code to exectree file

fc59bb8

Merge branch 'apache_main' into enhance/enforce_dist

c62ee7c

Update variables and comments

e7450fe

github-actions bot added the core (Core DataFusion crate) and sqllogictest (SQL Logic Tests (.slt)) labels Sep 27, 2023

metesynnada approved these changes Sep 28, 2023

Merge branch 'apache_main' into enhance/enforce_dist

c77f1f6

alamb approved these changes Sep 28, 2023

This was referenced Sep 28, 2023

Design how to respect output stream ordering #424

Closed

Improve cache usage in CI #7678

Merged

mustafasrepo and others added 2 commits September 29, 2023 09:37

Merge branch 'apache_main' into enhance/enforce_dist

4725e4a

Apply suggestions from code review

50615f6

Update comments Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

mustafasrepo added 3 commits September 29, 2023 10:44

Address reviews

7320727

Add new tests, do not satisfy requirement if not absolutely necessary enforce dist

3b53ec9

Merge branch 'apache_main' into enhance/enforce_dist

a7c2d92

mustafasrepo merged commit fcd94fb into apache:main Sep 29, 2023
22 checks passed

alamb reviewed Sep 29, 2023

This was referenced Oct 2, 2023

bounded_order_preserving_variants configuration setting is confusingly named #7722

Closed

Rename bounded_order_preserving_variants config to prefer_exising_sort and update docs #7723

Merged

alamb mentioned this pull request Oct 11, 2023

Invalid plans result in SortPreservingRepartitionExec with no SortExprs #7794

Closed

tustvold mentioned this pull request Dec 17, 2023

Default datafusion.optimizer.prefer_existing_sort to true #8572

Closed

matthewgapp mentioned this pull request Jan 11, 2024

matt/feat/recursive ctes/config flag matthewgapp/arrow-datafusion#3

Closed
