Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support to unparse Time scalar value to String #11121

Merged
merged 3 commits into from
Jun 26, 2024

Conversation

goldmedal
Copy link
Contributor

Which issue does this PR close?

Part of #9494

Rationale for this change

What changes are included in this PR?

Are these changes tested?

yes

Are there any user-facing changes?

@github-actions github-actions bot added the sql SQL Planner label Jun 25, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @goldmedal -- looks good to me ❤️

@@ -1241,6 +1268,22 @@ mod tests {
)),
r#"CAST('1970-01-01 08:00:00.000010001 +08:00' AS TIMESTAMP)"#,
),
(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I double checked and the actual types are slightly different (so this code won't round trip) but I can't think of anything better for here

> select CAST('00:00:00.010001' AS TIME);
+-------------------------+
| Utf8("00:00:00.010001") |
+-------------------------+
| 00:00:00.010001         |
+-------------------------+
1 row(s) fetched.
Elapsed 0.001 seconds.

> select arrow_typeof(CAST('00:00:00.010001' AS TIME));
+---------------------------------------+
| arrow_typeof(Utf8("00:00:00.010001")) |
+---------------------------------------+
| Time64(Nanosecond)                    |
+---------------------------------------+
1 row(s) fetched.
Elapsed 0.005 seconds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the root cause is that we always use nanosecond precision to construct the TIME or TIMESTAMP value. This issue also occurs for TIMESTAMP:

> select arrow_typeof(CAST('1970-01-01 00:00:00.000010001' AS TIMESTAMP));
+-----------------------------------------------------+
| arrow_typeof(Utf8("1970-01-01 00:00:00.000010001")) |
+-----------------------------------------------------+
| Timestamp(Nanosecond, None)                         |
+-----------------------------------------------------+
1 row(s) fetched. 
Elapsed 0.001 seconds.

> select arrow_typeof(CAST('1970-01-01 00:00:10.001' AS TIMESTAMP));
+-----------------------------------------------+
| arrow_typeof(Utf8("1970-01-01 00:00:10.001")) |
+-----------------------------------------------+
| Timestamp(Nanosecond, None)                   |
+-----------------------------------------------+
1 row(s) fetched. 
Elapsed 0.001 seconds.

I tested some ways to construct a TIME value:

> select arrow_typeof(TIME '00:00:01.001');
+------------------------------------+
| arrow_typeof(Utf8("00:00:01.001")) |
+------------------------------------+
| Time64(Nanosecond)                 |
+------------------------------------+
1 row(s) fetched. 
Elapsed 0.001 seconds.

> select arrow_typeof(CAST('00:00:01' AS TIME));
+--------------------------------+
| arrow_typeof(Utf8("00:00:01")) |
+--------------------------------+
| Time64(Nanosecond)             |
+--------------------------------+
1 row(s) fetched. 
Elapsed 0.001 seconds.

I think if we want to round-trip the result, we should generate the time or timestamp node of AST with the specific precision. However, I found DataFusion doesn't support constructing a time or timestamp value with the specified precision:

> select arrow_typeof(TIME(3) '00:00:01.001');
This feature is not implemented: Unsupported SQL type Time(Some(3), None)
> select arrow_typeof(CAST('00:00:01.001' AS TIME(3)));
This feature is not implemented: Unsupported SQL type Time(Some(3), None)
> select arrow_typeof(CAST('00:00:01.001' AS TIME(6)));
This feature is not implemented: Unsupported SQL type Time(Some(6), None)
> select arrow_typeof(TIMESTAMP(3) '1970-01-01 00:00:10.001');
This feature is not implemented: Unsupported SQL type Timestamp(Some(3), None)

I think this could be another issue but is required for expression round-trip.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the root cause is that we always use nanosecond precision to construct the TIME or TIMESTAMP value. This issue also occurs for TIMESTAMP:

I agree with this analysis.

It is not clear to me that fully preserving the type of timestamps / times etc is necessary for Expr --> text. Thus I think we shouldn't make the code more complicated until we have some actual usecase for trying to preserve the time

Ok(ast::Expr::Cast {
kind: ast::CastKind::Cast,
expr: Box::new(ast::Expr::Value(SingleQuotedString(time))),
data_type: ast::DataType::Time(None, TimezoneInfo::None),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can assign the precision here like

Suggested change
data_type: ast::DataType::Time(None, TimezoneInfo::None),
data_type: ast::DataType::Time(Some(6), TimezoneInfo::None),

It can be converted to String well.

CAST('02:46:41' AS TIME(6))

@alamb alamb merged commit 82f7bf4 into apache:main Jun 26, 2024
23 checks passed
@alamb
Copy link
Contributor

alamb commented Jun 26, 2024

Thanks @goldmedal

@goldmedal
Copy link
Contributor Author

Thanks @alamb

@goldmedal goldmedal deleted the feature/unparse-time branch June 26, 2024 14:08
findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
* support to unparse Time scalar value to string

* fix typo

* cargo fmt
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sql SQL Planner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants