Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: substrait limit when fetch is None #7669

Merged
merged 4 commits into from
Sep 29, 2023
Merged

Conversation

waynexia
Copy link
Member

Which issue does this PR close?

related to GreptimeTeam/greptimedb#2495

Rationale for this change

I found the producer will replace fetch = None with fetch = 0, and consumer will always put Some in the position of fetch. This is incorrect for query without LIMIT, e.g.: SELECT * FROM t OFFSET 1

What changes are included in this PR?

Fix the bug. As substrait cannot express None, I use usize::MAX to represent None in producer, and convert it back to None in consumer.

This patch also has a minor change in LimitExec. I thought that condition was redundant so I removed it.

Are these changes tested?

I add a new round trip case

Are there any user-facing changes?

No

@github-actions github-actions bot added core Core DataFusion crate substrait and removed core Core DataFusion crate labels Sep 27, 2023
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @waynexia

This looks good to me.

I took the liberty of merging up from main to resolve a conflict and adding a comment about the use of USIZE::max

@@ -442,7 +442,7 @@ impl LimitStream {

match &poll {
Poll::Ready(Some(Ok(batch))) => {
if batch.num_rows() > 0 && self.skip == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the rationale that this condition is redundant because any batch with zero rows will also have self.skip equal to zero?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. To my understanding these two conditions are identical. So I removed one to make it simpler

@alamb alamb mentioned this pull request Sep 28, 2023
@alamb alamb merged commit 70cded6 into apache:main Sep 29, 2023
22 checks passed
@waynexia waynexia deleted the fix-limit branch September 29, 2023 17:14
Ted-Jiang pushed a commit to Ted-Jiang/arrow-datafusion that referenced this pull request Oct 7, 2023
* fix: substrait limit when fetch is None

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* Add comments

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants