Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Queries with date filters in the format 'yyyymmdd' give an error #7465

Closed
osawyerr opened this issue Sep 2, 2023 · 12 comments
Closed

Queries with date filters in the format 'yyyymmdd' give an error #7465

osawyerr opened this issue Sep 2, 2023 · 12 comments
Assignees
Labels
bug Something isn't working

Comments

@osawyerr
Copy link

osawyerr commented Sep 2, 2023

Describe the bug

Hi there. When running queries with date filters that are in the format 'yyyymmdd' , DataFusion throws an error. This is valid Postgres syntax. However dates in the format 'yyyy-mm-dd' succeed. Both are valid postgres syntax.

For both queries below, both are valid postgres syntax however:

--this query fails
select * from orders where o_orderdate = '19961006'

--but this succeeds
select * from orders where o_orderdate = '1996-10-06'

To Reproduce

For a date column run a query with the date in string format 'yyyymmdd' as shown below

select * from orders where o_orderdate = '19961006'

Error thrown is

Arrow error: Cast error: Cannot cast string '19961006' to value of Date32 type

Expected behavior

This should return the correct result and not fail

Additional context

No response

@osawyerr osawyerr added the bug Something isn't working label Sep 2, 2023
@tustvold
Copy link
Contributor

tustvold commented Sep 2, 2023

Without delimiters how can a parser distinguish between dd-mm-yyyy or yyyy-mm-dd? I'm very surprised postgres supports this and would be somewhat inclined not to copy this... Is there some particular use case for why you want this functionality?

@osawyerr
Copy link
Author

osawyerr commented Sep 2, 2023

Is there some particular use case for why you want this functionality?

Not specifically. I was querying a date column and realised it didn't work the same as postgres did. I assumed DataFusion was intended to support postgres-esque queries thats why I raised this as an issue.

@jackwener
Copy link
Member

jackwener commented Sep 13, 2023

When there isn't delimiters, '20200101' will be treated as format yyyymmdd.

Same above, '200101' will be treated as format yymmdd.

Both MySQL and PG support it. Just in my opinion, it quite is a common usage and these usage also is part of ISO standard.

Without delimiters how can a parser distinguish between dd-mm-yyyy or yyyy-mm-dd?

ISO standard does not permit MMDDYYYY, MMYYYYDD or any other unapproved combination to eliminate any confusion.

@alamb
Copy link
Contributor

alamb commented Sep 19, 2023

Another option here might be to implement the two argument for of to_timestamp / to_date / to_time where the second argument is a format string. I think this is tracked with #5398

@Tangruilin
Copy link
Contributor

@alamb . Maybe I can try to fix it.

@alamb
Copy link
Contributor

alamb commented Oct 16, 2023

Thank you @Tangruilin

@alamb
Copy link
Contributor

alamb commented Oct 16, 2023

Arrow has this function: https://docs.rs/arrow/latest/arrow/compute/kernels/cast_utils/fn.string_to_datetime.html

You could look at how it uses Chrono to handle the parsing the different timestamp formats, perhaps

@tustvold
Copy link
Contributor

It is actually https://docs.rs/arrow-cast/47.0.0/src/arrow_cast/parse.rs.html#571 that would need modification. It should be a relatively straightforward case of adding an additional match case

@Tangruilin
Copy link
Contributor

@alamb Maybe you can assign this issue to me.

@Tangruilin
Copy link
Contributor

Otherwise I can't find this issue well

@alamb Maybe you can assign this issue to me.

@jackwener
Copy link
Member

I have assigned this issue to you, thanks @Tangruilin

@Tangruilin
Copy link
Contributor

I have already fix this in arrow,this issue can be closed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants