fix(migration): replace unquote with double percentages #30532

Merged
merged 1 commit into from
Oct 7, 2024

Conversation

villebro
Member

@villebro villebro commented Oct 6, 2024

SUMMARY

#23421 introduced a regression where passwords containing @ characters would break migrations. To avoid having to decode the URL, we can simply escape the connection string for variable interpolation by doubling % chars, after which the original encoded connection string works as expected.

Note that this change is functionally identical to the one proposed in the linked issue.

Here's one of many similar StackOverflow threads: https://stackoverflow.com/questions/39849641/in-flask-migrate-valueerror-invalid-interpolation-syntax-in-connection-string-a
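
To illustrate the summary above, here is a minimal sketch of the interpolation error and the doubling fix. It uses the standard library's `ConfigParser` directly as a stand-in for the Alembic config object; the section name, the helper `store_and_read`, and the example URI are hypothetical and are not taken from `superset/migrations/env.py`.

```python
from configparser import ConfigParser


def store_and_read(value: str) -> str:
    """Write value into a ConfigParser option and read it back.

    Stand-in for what happens to "sqlalchemy.url"; the section name is hypothetical.
    """
    parser = ConfigParser()  # BasicInterpolation is the default
    parser.add_section("alembic")
    parser.set("alembic", "sqlalchemy.url", value)
    return parser.get("alembic", "sqlalchemy.url")


# Hypothetical URI: the password "p@ss" is percent-encoded as "p%40ss".
DATABASE_URI = "postgresql://user:p%40ss@localhost:5432/superset"

try:
    store_and_read(DATABASE_URI)
    raw_error = None
except ValueError as exc:
    # "%40" is not a valid interpolation token, so set() rejects it.
    raw_error = exc

# Doubling every % turns it into a literal percent sign for the interpolator.
escaped_uri = DATABASE_URI.replace("%", "%%")
round_tripped = store_and_read(escaped_uri)

assert raw_error is not None
assert round_tripped == DATABASE_URI
```

The value read back is the original, still percent-encoded URI, so the URL itself never needs to be decoded.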

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

This has been tested with a Postgres connection whose password contains a large number of special characters, including the @ that previously broke the migration workflow.
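
A check along these lines could be sketched without a database, assuming the standard library's `ConfigParser` behaves like the Alembic config for this purpose. The password, host, and section name below are made up for illustration, and this is not the test that was actually run for the PR.

```python
from configparser import ConfigParser
from urllib.parse import quote, unquote, urlsplit

# Hypothetical password containing many URL-reserved characters, plus % and @.
password = "p@ss%w0rd:/?#[]!$&'()*+,;= "
encoded = quote(password, safe="")  # percent-encode everything reserved
uri = f"postgresql://user:{encoded}@localhost:5432/superset"

parser = ConfigParser()
parser.add_section("alembic")  # hypothetical section name
# Escape % for the interpolator, as the PR does before set_main_option.
parser.set("alembic", "sqlalchemy.url", uri.replace("%", "%%"))
stored = parser.get("alembic", "sqlalchemy.url")

# The stored value is the original encoded URI, and the password survives decoding.
recovered = unquote(urlsplit(stored).password)
assert stored == uri
assert recovered == password
```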

ADDITIONAL INFORMATION

  • Has associated issue: closes #23176 ("superset db upgrade mishandles DATABASE_URL with % in it")
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API


codecov bot commented Oct 6, 2024

Codecov Report

Attention: Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.

Project coverage is 83.93%. Comparing base (76d897e) to head (0242f31).
Report is 839 commits behind head on master.

Files with missing lines Patch % Lines
superset/migrations/env.py 0.00% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #30532       +/-   ##
===========================================
+ Coverage   60.48%   83.93%   +23.44%     
===========================================
  Files        1931      533     -1398     
  Lines       76236    38540    -37696     
  Branches     8568        0     -8568     
===========================================
- Hits        46114    32347    -13767     
+ Misses      28017     6193    -21824     
+ Partials     2105        0     -2105     
Flag Coverage Δ
hive 49.00% <0.00%> (-0.16%) ⬇️
javascript ?
mysql 76.76% <0.00%> (?)
postgres 76.89% <0.00%> (?)
presto 53.49% <0.00%> (-0.31%) ⬇️
python 83.93% <0.00%> (+20.44%) ⬆️
sqlite 76.34% <0.00%> (?)
unit 60.72% <0.00%> (+3.09%) ⬆️


Contributor

@giftig giftig left a comment

It's important to note that the original problem here is that the string is passed to a Python-style interpolator that accepts tokens like %s; it has nothing to do with URLs. The original fix (mostly) worked by eliminating % characters from the URL by decoding URL escape sequences like %20, which stopped the interpolator from complaining about invalid interpolation sequences. However, that leaves the URL improperly encoded, which causes other issues, and it does not fix the case where % characters still appear in the string, e.g. when the password contains a literal %.

See the documentation link from @villebro's linked Stack Overflow thread: https://alembic.sqlalchemy.org/en/latest/api/config.html#alembic.config.Config.set_main_option

Simply doubling the % characters is the correct solution here, so LGTM.
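
The two failure modes of the decode-based approach described above can be sketched as follows. This is an illustration under assumptions: `ConfigParser` from the standard library stands in for the Alembic config, and the helper `accepts`, the section name, and both example passwords are hypothetical.

```python
from configparser import ConfigParser
from urllib.parse import quote, unquote


def accepts(value: str) -> bool:
    """Return True if ConfigParser's default interpolation accepts the value."""
    parser = ConfigParser()
    parser.add_section("alembic")  # hypothetical section name
    try:
        parser.set("alembic", "sqlalchemy.url", value)
    except ValueError:
        return False
    return True


def make_uri(password: str) -> str:
    """Build a hypothetical URI with a properly percent-encoded password."""
    return f"postgresql://user:{quote(password, safe='')}@localhost:5432/superset"


# Case 1: decoding makes the interpolator happy but un-encodes the URL.
at_uri = make_uri("p@ss")
decoded_at = unquote(at_uri)
# The decoded URL now has two "@" characters, so the user-info/host
# boundary is ambiguous for whatever parses the URL afterwards.
assert accepts(decoded_at)
assert decoded_at.count("@") == 2

# Case 2: a literal % in the password survives decoding and still fails.
pct_uri = make_uri("100%")
decoded_pct = unquote(pct_uri)
assert not accepts(decoded_pct)

# Doubling % handles both cases without touching the URL encoding.
assert accepts(at_uri.replace("%", "%%"))
assert accepts(pct_uri.replace("%", "%%"))
```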

-decoded_uri = urllib.parse.unquote(DATABASE_URI)
-config.set_main_option("sqlalchemy.url", decoded_uri)
+# Escape % chars in the database URI to avoid interpolation errors in ConfigParser
+escaped_uri = DATABASE_URI.replace("%", "%%")
Member

Do we not need to urllib.parse.unquote anymore?

Contributor

See my comment above; this was never needed, it was actually a bit of a dodgy hack.

@villebro villebro merged commit 163b71e into apache:master Oct 7, 2024
45 of 46 checks passed
@villebro villebro deleted the villebro/interpolation-error branch October 7, 2024 03:05
@michael-s-molina michael-s-molina added v4.1 Label added by the release manager to track PRs to be included in the 4.1 branch review:checkpoint Last PR reviewed during the daily review standup and removed risk:db-migration PRs that require a DB migration labels Oct 7, 2024
sadpandajoe pushed a commit that referenced this pull request Oct 7, 2024
@michael-s-molina michael-s-molina removed the review:checkpoint Last PR reviewed during the daily review standup label Oct 8, 2024
Labels
size/XS v4.1 Label added by the release manager to track PRs to be included in the 4.1 branch
Projects
Status: Cherried
Development

Successfully merging this pull request may close these issues.

superset db upgrade mishandles DATABASE_URL with % in it
4 participants