Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client-only routes: catch-all matchPath has higher "specificity" than static paths #9705

Closed
juliansthl opened this issue Nov 5, 2018 · 17 comments
Labels
help wanted Issue with a clear description that the community can help with. type: bug An issue or pull request relating to a bug in Gatsby

Comments

@juliansthl
Copy link
Contributor

Description

If I specify a wildcard matchPath for the root path on my index page matchPath: '/*', the index page is always served, ignoring any "static" paths.

Steps to reproduce

Demo on CodeSandbox

Based on gatsby-starter-default, I'm adding this to gatsby-node.js:

const path = require('path');

exports.onCreatePage = async ({ page, actions }) => {
  const { createPage, deletePage } = actions;

  return new Promise(resolve => {
    const oldPage = { ...page };

    if (page.path === '/') {
      
      // Replace index
      deletePage(oldPage);

      createPage({
        ...page,
        matchPath: '/*',
      });

      // Create sub page
      createPage({
        path: `/sub/`,
        matchPath: `/sub/*`,
        component: path.resolve(`./src/templates/sub.js`),
      });
    }

    resolve();
  });
};

Expected result

  • When browsing to / or /foo I would expect to see the index page ./src/pages/index.js
  • When browsing to /sub/ or /sub/foo I would expect to see the sub page template ./src/templates/sub.js

I'm quite sure that this worked in the past and this comment from @pieh makes me believe this is the way the matching is supposed to work.

Actual result

gatsby always serves the index page ./src/pages/index.js

Environment

System:
    OS: macOS 10.14
    CPU: x64 Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    Shell: 5.5.1 - /usr/local/bin/zsh
  Binaries:
    Node: 10.6.0 - ~/.nvm/versions/node/v10.6.0/bin/node
    Yarn: 1.9.4 - /usr/local/bin/yarn
    npm: 6.4.1 - ~/.nvm/versions/node/v10.6.0/bin/npm
  Browsers:
    Chrome: 70.0.3538.77
    Firefox: 62.0
    Safari: 12.0
  npmPackages:
    gatsby: ^2.0.19 => 2.0.38
    gatsby-image: ^2.0.15 => 2.0.19
    gatsby-plugin-manifest: ^2.0.5 => 2.0.7
    gatsby-plugin-offline: ^2.0.5 => 2.0.11
    gatsby-plugin-react-helmet: ^3.0.0 => 3.0.1
    gatsby-plugin-sharp: ^2.0.7 => 2.0.11
    gatsby-source-filesystem: ^2.0.4 => 2.0.7
    gatsby-transformer-sharp: ^2.1.4 => 2.1.7
  npmGlobalPackages:
    gatsby-cli: 2.4.3

Thanks!!

@juliansthl juliansthl changed the title client-only routes: catch-all matchPath has higher "specificity" than static paths Client-only routes: catch-all matchPath has higher "specificity" than static paths Nov 5, 2018
@pieh pieh added type: bug An issue or pull request relating to a bug in Gatsby help wanted Issue with a clear description that the community can help with. labels Nov 5, 2018
@pieh
Copy link
Contributor

pieh commented Nov 5, 2018

Hey, this definitely is not working when there are multiple "nested" matchPaths (but static paths do work (i.e. if you navigate to /page-2 in your sandbox demo).

We would need to change this sorting

.sortBy(p => `${p.matchPath ? 1 : 0}${p.path}`)
so /sub/* would end up before /* (but still keep consistent order between builds).

Here's the input pages we need to sort and we want end result to have last 2 entries swapped so least specific matchPath is last in the array:

[
    {
        "path": "/404.html"
    },
    {
        "path": "/404/"
    },
    {
        "path": "/page-2/"
    },
    {
        "path": "/",
        "matchPath": "/*"
    },
    {
        "path": "/sub/",
        "matchPath": "/sub/*"
    }
]

@juliansthl
Copy link
Contributor Author

Hi @pieh ,

Ah, that makes sense.

I quickly played around with this and I think this could be a fix for pages-writer.js:

replace

.sortBy(p => `${p.matchPath ? 1 : 0}${p.path}`)

with

.orderBy(['matchPath', 'path'], ['desc', 'asc'])

In my tests this orders less specific matchPaths after more specific ones.

If this makes sense, I'd be happy to try create a PR for this.

Thanks so far!!

@pieh
Copy link
Contributor

pieh commented Nov 5, 2018

Potentially there is some scanario where this will fail with named URL parameter (not sure if we handle this in gatsby, but @reach/router support :param syntax):

[
    {
        "path": "/foo/",
        "matchPath": "/foo/:test"
    },
    {
        "path": "/foo-specific/",
        "matchPath": "/foo/34/*"
    }
]

will put first one before second one meaning we won't be able to access /foo-specific page

https://runkit.com/pieh/5be0406030d13600124d7071

@juliansthl
Copy link
Contributor Author

Ah, sure, I didn't think of this.

This should work (but might be overcomplicating…):
https://runkit.com/juliansthl/5be042e72513440012ffe9d1

const results = _(input)
    .map(p => ({
        ...p,
        hasMatchPath: p.matchPath ? 1 : 0,
        matchPathSegments: p.matchPath && p.matchPath.split('/').length,
    }))
    .orderBy(['hasMatchPath', 'matchPathSegments', 'path'], ['asc', 'desc', 'asc'])
    .map(p => _.pick(p, ['path', 'matchPath']))
    .value()

@pieh
Copy link
Contributor

pieh commented Nov 5, 2018

I don't think this is overcomplicating. And this looks good - maybe even we could combine hasMatchPath and matchPathSegments:

const results = _(input)
    // Ensure pages keep the same sorting through builds
    // and sort pages with matchPath to end so explicit routes
    // will match before general.
    //.sortBy(p => `${p.matchPath ? 1 : 0}${p.path}`)
    .map(p => ({
        ...p,
        //hasMatchPath: p.matchPath ? 1 : 0,
        matchPathSegments: p.matchPath ? -p.matchPath.split('/').length : 0,
    }))
    .orderBy(['matchPathSegments', 'path'], ['desc',  'asc'])
    .map(p => _.omit(p, ['matchPathSegments']))
    .value()

I wonder if creating 2 intermediate arrays (with 2 .map) has any performance implication vs using sortBy. In any case in the second .map we would want to _.omit added fields instead of picking (there are some extra fields in pages objects that I just didn't include in my example arrays as they don't affect sorting).

@juliansthl
Copy link
Contributor Author

juliansthl commented Nov 5, 2018

Awesome!

matchPathSegments: p.matchPath ? -p.matchPath.split('/').length : 0,

If I'm not mistaken, this would result in "/*" (matchPathSegments: -2) coming before "/sub/*" (matchPathSegments: -3) though, right?

We could do something like this, but I don't feel too good about hardcoding that number:
matchPathSegments: p.matchPath ? p.matchPath.split('/').length : 999999999999999,

@pieh
Copy link
Contributor

pieh commented Nov 5, 2018

We can use something like Number.MAX_SAFE_INTEGER

@juliansthl
Copy link
Contributor Author

I tried to do some tests in order to see what impact on performance the creation of the intermediate arrays has: https://runkit.com/juliansthl/5be05a512513440012fffa3c (scroll to bottom)

It looks like the sorting function with the proposed change takes around 2x as long as the original version (~16s vs. ~8s for 1000 random entries).
No idea what this means in relation to overall build time in a real world use-case.

Is there a reason the entries have to be sorted alphabetically or would it be sufficient to sort them in order of their matchPath specificity?

@juliansthl
Copy link
Contributor Author

juliansthl commented Nov 5, 2018

I think I found another solution that doesn't come with the performance penalty and still orders the entries alphabetically:

const results = _(input)
    // Ensure pages keep the same sorting through builds
    // and sort pages with matchPath to end so explicit routes
    // will match before general.
    //.sortBy(p => `${p.matchPath ? 1 : 0}${p.path}`)
    .sortBy(p => `${p.matchPath
            ? String.fromCharCode(122 - p.matchPath.split('/').length) // 122 = 'z'
            : 0
        }${p.path}`)
    .value()

I did some tests and the ordering doesn't seem to take significantly longer than originally:
https://runkit.com/juliansthl/5be06c826ca8210012bca53c

This would reliably work for up to 26 "matchPathSegments", which should be ok for real world scenarios.

What do you think?

@pieh
Copy link
Contributor

pieh commented Nov 5, 2018

Something like this would also work (this is your previous code just without extra mapping and added function to generate sorting values for entries):

const results = _(input)
    .orderBy(p => [
        p.matchPath ? 1 : 0, 
        p.matchPath && p.matchPath.split('/').length,
        p.path
        ]
    , ['asc', 'desc', 'asc'])
    .value()

@juliansthl
Copy link
Contributor Author

Ah, nice!

What is weird though, is that this operation now takes significantly longer (even longer than the one with the intermediate arrays): https://runkit.com/juliansthl/5be071af6ca8210012bcacb6

Thanks for all your help so far!!

@pieh
Copy link
Contributor

pieh commented Nov 5, 2018

So after adding some benchmarks for our relatively small input: https://runkit.com/pieh/5be04dba39eeb40015e74baf your most recent implementation definitely wins out

@juliansthl
Copy link
Contributor Author

So after adding some benchmarks for our relatively small input: https://runkit.com/pieh/5be04dba39eeb40015e74baf your most recent implementation definitely wins out

Awesome. Thanks for the benchmarking!
Do you see the limitation of max. 26 "matchPathSegments" as a deal breaker or anything else that should be improved?

@pieh
Copy link
Contributor

pieh commented Nov 5, 2018

Do we actually need to map to a-z range or can we use entire ascii range for sorting?

@pieh
Copy link
Contributor

pieh commented Nov 5, 2018

we could also format number to be 2 digits instead of mapping to a-z

@pieh
Copy link
Contributor

pieh commented Nov 5, 2018

I don't think it's likely to hit 26 path fragments limit, but would be great if we didn't have such limitation

@juliansthl
Copy link
Contributor Author

juliansthl commented Nov 5, 2018

we could also format number to be 2 digits instead of mapping to a-z

yes, you're absolutely right.

how about this?

const results = _(input)
    // Ensure pages keep the same sorting through builds
    // and sort pages with matchPath to end so explicit routes
    // will match before general.
    //.sortBy(p => `${p.matchPath ? 1 : 0}${p.path}`)
    .sortBy(p => `${p.matchPath
            ? 9999 - p.matchPath.split('/').length
            : '0000'
        }${p.path}`)
    .value()

pieh pushed a commit that referenced this issue Nov 6, 2018
As per [#9705](#9705), this changes the order of the pages in pages-writer.js in order to account for the matchPath specificity and therefore make "nested" matchPaths work.

The proposed fix `sortByNumeric` is the most performant solution I could come up with:
[Benchmarks](https://runkit.com/juliansthl/5be08b8f2513440012001807)

Thanks a lot to @pieh for all the help!

(this is my first PR, please let me know if I have to adjust anything)
gpetrioli pushed a commit to gpetrioli/gatsby that referenced this issue Jan 22, 2019
As per [gatsbyjs#9705](gatsbyjs#9705), this changes the order of the pages in pages-writer.js in order to account for the matchPath specificity and therefore make "nested" matchPaths work.

The proposed fix `sortByNumeric` is the most performant solution I could come up with:
[Benchmarks](https://runkit.com/juliansthl/5be08b8f2513440012001807)

Thanks a lot to @pieh for all the help!

(this is my first PR, please let me know if I have to adjust anything)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Issue with a clear description that the community can help with. type: bug An issue or pull request relating to a bug in Gatsby
Projects
None yet
Development

No branches or pull requests

2 participants