Dynamic filter pushdown to probe side #12781

Draft: wants to merge 1 commit into base: main

Lordworms
Contributor

Which issue does this PR close?

Closes #7955

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions (bot) added labels on Oct 7, 2024: logical-expr (Logical plan and expressions), physical-expr (Physical Expressions), optimizer (Optimizer rules), core (Core DataFusion crate), common (Related to common crate)
@Lordworms
Contributor Author

Still in draft. Similar to what DuckDB does, this supports pushing single-column less-than and greater-than filters to the probe side. I'm going to add a benchmark test and more test cases later this week.
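
For readers new to the idea, here is a minimal std-only sketch of a single-column range filter derived from build-side join keys. It is not the PR's code: `Predicate` and `range_predicate` are hypothetical names, the keys are plain `i64` values rather than Arrow arrays, and inclusive bounds are an assumption (an equi-join needs them so that rows equal to the min or max still match).

```rust
/// Hypothetical single-column comparison predicates for the probe side.
#[derive(Debug, Clone, PartialEq)]
pub enum Predicate {
    GtEq(i64),
    LtEq(i64),
    And(Box<Predicate>, Box<Predicate>),
}

impl Predicate {
    /// Evaluate the predicate against one probe-side key value.
    pub fn eval(&self, v: i64) -> bool {
        match self {
            Predicate::GtEq(bound) => v >= *bound,
            Predicate::LtEq(bound) => v <= *bound,
            Predicate::And(left, right) => left.eval(v) && right.eval(v),
        }
    }
}

/// Build `key >= min AND key <= max` from the build-side keys.
/// Returns `None` when the build side is empty (no bounds exist).
pub fn range_predicate(build_keys: &[i64]) -> Option<Predicate> {
    let min = *build_keys.iter().min()?;
    let max = *build_keys.iter().max()?;
    Some(Predicate::And(
        Box::new(Predicate::GtEq(min)),
        Box::new(Predicate::LtEq(max)),
    ))
}
```

The filter is conservative: a probe key inside the range may still have no match on the build side, so the join itself remains responsible for correctness.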

-    Some(Ok(batch)) => {
+    Some(Ok(mut batch)) => {
+        if let Some(dynamic_filters) = &mut self.dynamic_filter_info {
+            batch = dynamic_filters.filter_batch(&batch)?;
Contributor Author

I added the filter logic here. I'm not sure whether this is a good choice compared to adding a FilterExec down at the scan.
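
To make the "filter inside the stream" option concrete, here is a rough std-only sketch. All names (`Batch`, `DynamicFilter`, `FilteredStream`) are hypothetical stand-ins, a batch is modeled as a `Vec<i64>` of join keys, and a plain `Iterator` replaces the async stream that the real operator polls.

```rust
/// Hypothetical stand-in for a record batch: one column of join keys.
pub type Batch = Vec<i64>;

/// Hypothetical dynamic filter: an inclusive range on the join key.
pub struct DynamicFilter {
    pub min: i64,
    pub max: i64,
}

impl DynamicFilter {
    /// Return a new batch containing only the rows inside the range.
    pub fn filter_batch(&self, batch: &Batch) -> Batch {
        batch
            .iter()
            .copied()
            .filter(|k| (self.min..=self.max).contains(k))
            .collect()
    }
}

/// Wraps an upstream batch iterator and filters each batch as it is pulled.
pub struct FilteredStream<I: Iterator<Item = Batch>> {
    pub input: I,
    pub dynamic_filter: Option<DynamicFilter>,
}

impl<I: Iterator<Item = Batch>> Iterator for FilteredStream<I> {
    type Item = Batch;

    fn next(&mut self) -> Option<Batch> {
        let batch = self.input.next()?;
        Some(match &self.dynamic_filter {
            // A filter is present: drop rows that cannot match.
            Some(filter) => filter.filter_batch(&batch),
            // No filter yet: pass the batch through unchanged.
            None => batch,
        })
    }
}
```

One trade-off to weigh: filtering at this point only drops rows after the batch has already been produced, whereas a predicate that reaches the scan could in principle avoid reading some data at all.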

final_expr: Option<Arc<dyn PhysicalExpr>>,
}

impl PhysicalDynamicFiltersInfo {
Contributor

Suggested change
- impl PhysicalDynamicFiltersInfo {
+ impl PhysicalDynamicFilters {

?

let mut new_filter_expr: Option<Arc<dyn PhysicalExpr>> = None;

for (i, col) in self.probe_side_columns.iter().enumerate() {
let min_value = self.aggregates[i].evaluate(records)?;
Contributor

Hm... I think you want this to be evaluated once on the other side of the join (after loading the build side) and then passed on to the probe side.

Right now the filter seems to be both created from and evaluated on the probe side?
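
The flow suggested in this comment could look roughly like the std-only sketch below: the bounds are computed once from the fully loaded build side, published through a shared slot, and only read by the probe side. The names `Bounds`, `SharedBounds`, `publish_build_bounds`, and `filter_probe_batch` are hypothetical, and using `Arc<OnceLock<_>>` as the hand-off is an assumption for illustration, not what the PR implements.

```rust
use std::sync::{Arc, OnceLock};

/// Inclusive bounds of the join key observed on the build side.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Bounds {
    pub min: i64,
    pub max: i64,
}

/// Shared slot: written once by the build side, read by the probe side.
pub type SharedBounds = Arc<OnceLock<Bounds>>;

/// Called once, after the build side has been fully loaded.
pub fn publish_build_bounds(slot: &SharedBounds, build_keys: &[i64]) {
    if let (Some(&min), Some(&max)) = (build_keys.iter().min(), build_keys.iter().max()) {
        // `set` only fails if bounds were already published; ignore that case.
        let _ = slot.set(Bounds { min, max });
    }
}

/// Called for each probe batch. Until bounds are published, rows pass through.
pub fn filter_probe_batch(slot: &SharedBounds, batch: &[i64]) -> Vec<i64> {
    match slot.get() {
        Some(b) => batch
            .iter()
            .copied()
            .filter(|k| *k >= b.min && *k <= b.max)
            .collect(),
        None => batch.to_vec(),
    }
}
```

In this shape the probe side never computes the filter itself; it only consults whatever the build side has published.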

Development

Successfully merging this pull request may close these issues.

Push Dynamic Join Predicates into Scan ("Sideways Information Passing", etc)
2 participants