Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(sqllab/charts): casting from timestamp[us] to timestamp[ns] would result in out of bounds timestamp #18873

Merged
merged 4 commits into from
May 4, 2022

Conversation

yeachan153
Copy link
Contributor

@yeachan153 yeachan153 commented Feb 23, 2022

SUMMARY

Addresses #18871, #18596, #16487, #13661. This allows users to query date columns outside the maximum ranges of pandas timestamps: 1677-09-22 00:12:43.145225 and 2262-04-11 23:47:16.854775807.

The same fix as #14006, with a small additional fix for charts to hide the timestamps it cannot convert.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Sql Lab before:
image

Sql Lab after can query past the date ranges mentioned above:
image

Charts before:
image

Charts after now show data, excluding the dates that reside outside this period:
image

TESTING INSTRUCTIONS

  1. In Sql lab, try running select TIMESTAMP any_timestamp_outside_ranges_mentioned_above
  2. In charts, select a timeseries chart, and add a column with date ranges outside what's supported in pandas in the TIME COLUMN, it should still return a chart successfully and omit showing the date ranges that are unsupported.

ADDITIONAL INFORMATION

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! It would be nice if we could add a unit test to ensure these far in the past/future timestamps work with this fix (optimally those unit tests should fail on master branch).

superset/result_set.py Outdated Show resolved Hide resolved
@stale
Copy link

stale bot commented Apr 27, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

@stale stale bot added the inactive Inactive for >= 30 days label Apr 27, 2022
@simonvanderveldt
Copy link

Any chance this could be merged?

@stale stale bot removed the inactive Inactive for >= 30 days label Apr 27, 2022
@pull-request-size pull-request-size bot added size/M and removed size/S labels May 1, 2022
Copy link
Contributor Author

@yeachan153 yeachan153 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@villebro Sorry for the delay, I've added a couple of unittests.

@villebro
Copy link
Member

villebro commented May 3, 2022

Thanks @yeachan153 ! Let me review/test now

@codecov
Copy link

codecov bot commented May 3, 2022

Codecov Report

Merging #18873 (f8a2cfd) into master (700829b) will increase coverage by 0.16%.
The diff coverage is 77.86%.

❗ Current head f8a2cfd differs from pull request most recent head ab25680. Consider uploading reports for the commit ab25680 to get more accurate results

@@            Coverage Diff             @@
##           master   #18873      +/-   ##
==========================================
+ Coverage   66.36%   66.53%   +0.16%     
==========================================
  Files        1621     1714      +93     
  Lines       63057    65051    +1994     
  Branches     6382     6724     +342     
==========================================
+ Hits        41850    43280    +1430     
- Misses      19547    20060     +513     
- Partials     1660     1711      +51     
Flag Coverage Δ
hive 52.90% <ø> (+0.61%) ⬆️
mysql 81.95% <ø> (+0.39%) ⬆️
postgres 81.99% <ø> (+0.39%) ⬆️
presto 52.76% <ø> (+0.63%) ⬆️
python 82.41% <ø> (+0.37%) ⬆️
sqlite 81.76% <ø> (+0.47%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ntrols/src/components/CertifiedIconWithTooltip.tsx 80.00% <ø> (ø)
...erset-ui-chart-controls/src/components/Tooltip.tsx 80.00% <ø> (ø)
...d/packages/superset-ui-chart-controls/src/index.ts 100.00% <ø> (ø)
...-chart-controls/src/sections/advancedAnalytics.tsx 14.28% <0.00%> (-19.05%) ⬇️
...rset-ui-chart-controls/src/sections/chartTitle.tsx 100.00% <ø> (ø)
...perset-ui-chart-controls/src/sections/sections.tsx 100.00% <ø> (ø)
...omponents/ColumnConfigControl/ColumnConfigItem.tsx 0.00% <ø> (ø)
.../shared-controls/components/RadioButtonControl.tsx 0.00% <ø> (ø)
...chart-controls/src/shared-controls/dndControls.tsx 35.89% <ø> (ø)
...controls/src/shared-controls/emitFilterControl.tsx 50.00% <ø> (ø)
... and 686 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 700829b...ab25680. Read the comment docs.

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Very minor non-blocking comment (I'll leave the PR open for another day if you feel like fixing, otherwise merging as-is). Also, I love how the tests were placed at precisely the error threshold, very precise 😆

Comment on lines 63 to 98
@pytest.mark.parametrize(
"inpt, expected",
[
pytest.param(
[
(datetime.strptime("1677-09-22 00:12:43", "%Y-%m-%d %H:%M:%S"), 1),
(datetime.strptime("2262-04-11 23:47:17", "%Y-%m-%d %H:%M:%S"), 2),
],
[
{"a": datetime.strptime("1677-09-22 00:12:43", "%Y-%m-%d %H:%M:%S"), "b": 1},
{"a": datetime.strptime("2262-04-11 23:47:17", "%Y-%m-%d %H:%M:%S"), "b": 2},
],
id="timestamp conversion fail"
),
pytest.param(
[
(datetime.strptime("1677-09-22 00:12:44", "%Y-%m-%d %H:%M:%S"), 1),
(datetime.strptime("2262-04-11 23:47:16", "%Y-%m-%d %H:%M:%S"), 2)
],
[
{"a": Timestamp("1677-09-22 00:12:44"), "b": 1},
{"a": Timestamp("2262-04-11 23:47:16"), "b": 2}
],
id="timestamp conversion success"
)
]
)
def test_max_pandas_timestamp(inpt, expected) -> None:
from superset.db_engine_specs import BaseEngineSpec
from superset.result_set import SupersetResultSet

cursor_descr: DbapiDescription = [
("a", "datetime", None, None, None, None, False),
("b", "int", None, None, None, None, False),
]
results = SupersetResultSet(inpt, cursor_descr, BaseEngineSpec)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this feels more pythonic:

Suggested change
@pytest.mark.parametrize(
"inpt, expected",
[
pytest.param(
[
(datetime.strptime("1677-09-22 00:12:43", "%Y-%m-%d %H:%M:%S"), 1),
(datetime.strptime("2262-04-11 23:47:17", "%Y-%m-%d %H:%M:%S"), 2),
],
[
{"a": datetime.strptime("1677-09-22 00:12:43", "%Y-%m-%d %H:%M:%S"), "b": 1},
{"a": datetime.strptime("2262-04-11 23:47:17", "%Y-%m-%d %H:%M:%S"), "b": 2},
],
id="timestamp conversion fail"
),
pytest.param(
[
(datetime.strptime("1677-09-22 00:12:44", "%Y-%m-%d %H:%M:%S"), 1),
(datetime.strptime("2262-04-11 23:47:16", "%Y-%m-%d %H:%M:%S"), 2)
],
[
{"a": Timestamp("1677-09-22 00:12:44"), "b": 1},
{"a": Timestamp("2262-04-11 23:47:16"), "b": 2}
],
id="timestamp conversion success"
)
]
)
def test_max_pandas_timestamp(inpt, expected) -> None:
from superset.db_engine_specs import BaseEngineSpec
from superset.result_set import SupersetResultSet
cursor_descr: DbapiDescription = [
("a", "datetime", None, None, None, None, False),
("b", "int", None, None, None, None, False),
]
results = SupersetResultSet(inpt, cursor_descr, BaseEngineSpec)
@pytest.mark.parametrize(
"input_, expected",
[
pytest.param(
[
(datetime.strptime("1677-09-22 00:12:43", "%Y-%m-%d %H:%M:%S"), 1),
(datetime.strptime("2262-04-11 23:47:17", "%Y-%m-%d %H:%M:%S"), 2),
],
[
{"a": datetime.strptime("1677-09-22 00:12:43", "%Y-%m-%d %H:%M:%S"), "b": 1},
{"a": datetime.strptime("2262-04-11 23:47:17", "%Y-%m-%d %H:%M:%S"), "b": 2},
],
id="timestamp conversion fail"
),
pytest.param(
[
(datetime.strptime("1677-09-22 00:12:44", "%Y-%m-%d %H:%M:%S"), 1),
(datetime.strptime("2262-04-11 23:47:16", "%Y-%m-%d %H:%M:%S"), 2)
],
[
{"a": Timestamp("1677-09-22 00:12:44"), "b": 1},
{"a": Timestamp("2262-04-11 23:47:16"), "b": 2}
],
id="timestamp conversion success"
)
]
)
def test_max_pandas_timestamp(input_, expected) -> None:
from superset.db_engine_specs import BaseEngineSpec
from superset.result_set import SupersetResultSet
cursor_descr: DbapiDescription = [
("a", "datetime", None, None, None, None, False),
("b", "int", None, None, None, None, False),
]
results = SupersetResultSet(input_, cursor_descr, BaseEngineSpec)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing! Should be changed now

@villebro
Copy link
Member

villebro commented May 3, 2022

@yeachan153 there's also some linting issues - if you're ok with it, I can fix the remaining issues and push directly to your PR?

@yeachan153
Copy link
Contributor Author

yeachan153 commented May 3, 2022

@yeachan153 there's also some linting issues - if you're ok with it, I can fix the remaining issues and push directly to your PR?

Thanks, I think it should be fixed now? But feel free to push anything!

@villebro
Copy link
Member

villebro commented May 3, 2022

Thanks, I think it should be fixed now? But feel free to push anything!

I didn't see a pinned version of black, strangely though it tried to change the format in parts of the code I didn't touch, I didn't push those.

Thanks @yeachan153! let's see if CI passes - if not I'll push the remaining fixes so we can get it merged

@yeachan153
Copy link
Contributor Author

Thanks @yeachan153! let's see if CI passes - if not I'll push the remaining fixes so we can get it merged

Ah sorry, I didn't realise you can check which files black is complaining about in the workflow. There were two more files to lint. It should actually work this time!

@villebro
Copy link
Member

villebro commented May 4, 2022

Restarted CI - fingers crossed 🙂🤞

@villebro villebro merged commit 8b72354 into apache:master May 4, 2022
hughhhh pushed a commit to hve-labs/superset that referenced this pull request May 11, 2022
… result in out of bounds timestamp (apache#18873)

* fix casting from timestamp[us] to timestamp[ns] would result in out of bounds timestamp from sqllab and charts

* Add unittests

* Lint changes and parameter variable rename

* Fix linting
villebro pushed a commit that referenced this pull request May 26, 2022
… result in out of bounds timestamp (#18873)

* fix casting from timestamp[us] to timestamp[ns] would result in out of bounds timestamp from sqllab and charts

* Add unittests

* Lint changes and parameter variable rename

* Fix linting

(cherry picked from commit 8b72354)
michael-s-molina pushed a commit that referenced this pull request May 26, 2022
… result in out of bounds timestamp (#18873)

* fix casting from timestamp[us] to timestamp[ns] would result in out of bounds timestamp from sqllab and charts

* Add unittests

* Lint changes and parameter variable rename

* Fix linting

(cherry picked from commit 8b72354)
philipher29 pushed a commit to ValtechMobility/superset that referenced this pull request Jun 9, 2022
… result in out of bounds timestamp (apache#18873)

* fix casting from timestamp[us] to timestamp[ns] would result in out of bounds timestamp from sqllab and charts

* Add unittests

* Lint changes and parameter variable rename

* Fix linting
@mistercrunch mistercrunch added 🍒 1.5.1 🍒 1.5.2 🍒 1.5.3 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 2.0.0 labels Feb 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels lts-v1 size/M 🍒 1.5.1 🍒 1.5.2 🍒 1.5.3 🚢 2.0.0
Projects
None yet
5 participants