Sync WHATWG URL parser with upstream standards #17540

Closed
2 tasks done
TimothyGu opened this issue Dec 8, 2017 · 5 comments
Labels
help wanted: Issues that need assistance from volunteers or PRs that need help to proceed.
whatwg-url: Issues and PRs related to the WHATWG URL implementation.

Comments

@TimothyGu
Member

TimothyGu commented Dec 8, 2017

There have been some recent changes in the standards governing our new URL parser API. We need to keep up with those changes in our implementation of the API.

  • Add space to class string of iterator objects (whatwg/webidl@4fcfaea) (lib: add space to class string of iterator objects and updated tests accordingly #17558)
    Change the 'URLSearchParamsIterator' in

    defineIDLClass(URLSearchParamsIteratorPrototype, 'URLSearchParamsIterator', {
    to 'URLSearchParams Iterator', and update tests if necessary.

  • Percent-encode additional characters in "fragment state" (whatwg/url@7a3c69f) (url: added url fragment lookup table #17627)

    • Add a new FRAGMENT_ENCODE_SET lookup table like

      node/src/node_url.cc

      Lines 215 to 280 in e55b7d6

      static const uint8_t C0_CONTROL_ENCODE_SET[32] = {
      // 00 01 02 03 04 05 06 07
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // 08 09 0A 0B 0C 0D 0E 0F
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // 10 11 12 13 14 15 16 17
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // 18 19 1A 1B 1C 1D 1E 1F
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // 20 21 22 23 24 25 26 27
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 28 29 2A 2B 2C 2D 2E 2F
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 30 31 32 33 34 35 36 37
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 38 39 3A 3B 3C 3D 3E 3F
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 40 41 42 43 44 45 46 47
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 48 49 4A 4B 4C 4D 4E 4F
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 50 51 52 53 54 55 56 57
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 58 59 5A 5B 5C 5D 5E 5F
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 60 61 62 63 64 65 66 67
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 68 69 6A 6B 6C 6D 6E 6F
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 70 71 72 73 74 75 76 77
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 78 79 7A 7B 7C 7D 7E 7F
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x80,
      // 80 81 82 83 84 85 86 87
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // 88 89 8A 8B 8C 8D 8E 8F
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // 90 91 92 93 94 95 96 97
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // 98 99 9A 9B 9C 9D 9E 9F
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // A0 A1 A2 A3 A4 A5 A6 A7
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // A8 A9 AA AB AC AD AE AF
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // B0 B1 B2 B3 B4 B5 B6 B7
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // B8 B9 BA BB BC BD BE BF
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // C0 C1 C2 C3 C4 C5 C6 C7
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // C8 C9 CA CB CC CD CE CF
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // D0 D1 D2 D3 D4 D5 D6 D7
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // D8 D9 DA DB DC DD DE DF
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // E0 E1 E2 E3 E4 E5 E6 E7
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // E8 E9 EA EB EC ED EE EF
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // F0 F1 F2 F3 F4 F5 F6 F7
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // F8 F9 FA FB FC FD FE FF
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80
      };
      but with the bits corresponding to 0x20, 0x22, 0x3C, 0x3E, and 0x60 set in addition to what's already set in C0_CONTROL_ENCODE_SET, per spec.
    • Replace C0_CONTROL_ENCODE_SET with the new lookup table under kFragment state in URL::Parse().
    • Port web-platform-tests/wpt@cb0662b to
      test/fixtures/url-setter-tests.js and test/fixtures/url-tests.js.
    • Make corresponding changes in the documentation in doc/api/url.md
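As a rough illustration of the lookup-table step above, here is a JavaScript model of the two 32-byte bitmaps. It is only a sketch: the real tables are C++ arrays in src/node_url.cc, the helper names (`buildC0ControlEncodeSet`, `buildFragmentEncodeSet`, `inEncodeSet`) are hypothetical, and the indexing scheme (byte `code >> 3`, bit `1 << (code & 7)`) is an assumption inferred from the row comments in the table quoted above.

```javascript
// Hypothetical JavaScript model of the 32-byte bitmaps in src/node_url.cc.
// Assumed layout: byte index = code >> 3, bit mask = 1 << (code & 7),
// i.e. eight code points per row, as in the table quoted above.

// C0 control percent-encode set as shown in the table: 0x00-0x1F and 0x7F-0xFF.
function buildC0ControlEncodeSet() {
  const table = new Uint8Array(32);
  for (let code = 0; code < 256; code++) {
    if (code <= 0x1f || code >= 0x7f) {
      table[code >> 3] |= 1 << (code & 7);
    }
  }
  return table;
}

// Fragment set = C0 control set plus 0x20, 0x22, 0x3C, 0x3E, and 0x60.
function buildFragmentEncodeSet() {
  const table = buildC0ControlEncodeSet();
  for (const code of [0x20, 0x22, 0x3c, 0x3e, 0x60]) {
    table[code >> 3] |= 1 << (code & 7);
  }
  return table;
}

// Returns true if the bit for `code` is set in `table`.
function inEncodeSet(table, code) {
  return (table[code >> 3] & (1 << (code & 7))) !== 0;
}

const C0_CONTROL_ENCODE_SET = buildC0ControlEncodeSet();
const FRAGMENT_ENCODE_SET = buildFragmentEncodeSet();
```

Under these assumptions, only three rows differ from the C0 control table: the `20..27` row becomes `0x01 | 0x04`, the `38..3F` row becomes `0x10 | 0x40`, and the `60..67` row becomes `0x01`.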
@TimothyGu TimothyGu added the help wanted, mentor-available, and whatwg-url labels Dec 8, 2017
@TimothyGu TimothyGu changed the title Sync WHATWG URL parser with upstream standard changes Sync WHATWG URL parser with upstream standards Dec 8, 2017
@haejinjo

haejinjo commented Dec 8, 2017

I'll handle the first one guys, don't you worry

@Kimeiga
Contributor

Kimeiga commented Dec 8, 2017

Shit I really wanted that first one hahaha

@Kimeiga
Contributor

Kimeiga commented Dec 8, 2017

Where would the new fragment encode set that we're building be used?

@TimothyGu
Member Author

@Kimeiga Glad you decided to take this up! It will be used under case kFragment in the URL::Parse function.
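To make the effect of that change concrete, here is a minimal JavaScript sketch of percent-encoding a fragment with the new set. It is not the code in `URL::Parse()`: the function names are hypothetical, the set is expressed as a predicate rather than a bitmap, and the other handling the parser performs in the fragment state is left out.

```javascript
// Hypothetical sketch; names are illustrative, not taken from src/node_url.cc.

// Code points the fragment set adds on top of the C0 control set:
// space, ", <, >, and `.
const EXTRA_FRAGMENT_CODES = new Set([0x20, 0x22, 0x3c, 0x3e, 0x60]);

// A byte is in the fragment percent-encode set if it is in the C0 control
// set as tabulated in the issue (0x00-0x1F, 0x7F-0xFF) or is an extra code.
function inFragmentEncodeSet(byte) {
  return byte <= 0x1f || byte >= 0x7f || EXTRA_FRAGMENT_CODES.has(byte);
}

// Percent-encode every UTF-8 byte of `fragment` that is in the set.
function percentEncodeFragment(fragment) {
  let out = '';
  for (const byte of new TextEncoder().encode(fragment)) {
    if (inFragmentEncodeSet(byte)) {
      out += '%' + byte.toString(16).toUpperCase().padStart(2, '0');
    } else {
      out += String.fromCharCode(byte);
    }
  }
  return out;
}

console.log(percentEncodeFragment('a b"<>`c')); // a%20b%22%3C%3E%60c
```

With only the C0 control set, the space, quote, angle brackets, and backtick in this input would have passed through unchanged.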

@Kimeiga
Contributor

Kimeiga commented Dec 8, 2017

I've completed the first two steps of the second task, but I'm not sure what to do for the third step. Can you lend a hand @TimothyGu ?

@TimothyGu TimothyGu reopened this Dec 13, 2017
@targos targos reopened this Dec 13, 2017
Kimeiga added a commit to Kimeiga/node that referenced this issue Dec 14, 2017
Percent-encoded additional characters in fragment state with new
FRAGMENT_ENCODE_SET lookup table. The fragment percent-encode set
includes the C0 control percent-encode set and code points U+0020,
U+0022, U+003C, U+003E, and U+0060.

Fixes: nodejs#17540
Trott pushed a commit to Trott/io.js that referenced this issue Dec 15, 2017
Percent-encoded additional characters in fragment state with new
FRAGMENT_ENCODE_SET lookup table. The fragment percent-encode set
includes the C0 control percent-encode set and code points U+0020,
U+0022, U+003C, U+003E, and U+0060.

PR-URL: nodejs#17627
Fixes: nodejs#17540
Reviewed-By: Timothy Gu <timothygu99@gmail.com>
Reviewed-By: Daijiro Wachi <daijiro.wachi@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this issue Jan 8, 2018
PR-URL: #17558
Fixes: #17540
Reviewed-By: Timothy Gu <timothygu99@gmail.com>
Reviewed-By: Anatoli Papirovski <apapirovski@mac.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed a commit that referenced this issue Jan 8, 2018
Percent-encoded additional characters in fragment state with new
FRAGMENT_ENCODE_SET lookup table. The fragment percent-encode set
includes the C0 control percent-encode set and code points U+0020,
U+0022, U+003C, U+003E, and U+0060.

PR-URL: #17627
Fixes: #17540
Reviewed-By: Timothy Gu <timothygu99@gmail.com>
Reviewed-By: Daijiro Wachi <daijiro.wachi@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: James M Snell <jasnell@gmail.com>
MylesBorins pushed commits with the same two messages (PR-URL: #17558 and PR-URL: #17627) that referenced this issue on May 22, 2018 and again on Jun 14, 2018.
rvagg pushed commits with the same two messages that referenced this issue on Aug 16, 2018.