Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How we can remove sift.js #11184

Closed
Moocar opened this issue Jan 21, 2019 · 6 comments
Closed

How we can remove sift.js #11184

Moocar opened this issue Jan 21, 2019 · 6 comments
Assignees

Comments

@Moocar
Copy link
Contributor

Moocar commented Jan 21, 2019

Summary

Exploration on how we can use loki.js for 100% of queries. Resulted from issue #11008

Problem and background

Currently, we have two methods of filtering down the nodes in a graphql query. redux/run-sift (sift) and loki/nodes-query.js (loki). The underlying libraries (sift and loki) are both pure data query engines. They simply filter down an array of objects using the specified filter. Loki is much faster for large data sets because it can use indexes. Therefore we hope to eventually use loki for all queries.

But in Gatsby, not all nodes have pure data. Some contain fields that must be "resolved" to get their real value. These include GraphQL fields declared by plugins (using the setFieldsOnGraphQLNodeType), or linked fields (field names ending in ___NODE). This means that before running loki or sift, we must first iterate through all nodes and call resolve on all these fields.

When a query is run, we currently check if the filter includes these unresolved fields. If it doesn't, we execute using loki, but if it does, we use run-sift to perform the query. It will call resolve on all the fields, and pass the resulting "resolved nodes" into the sift query (see resolveNodes function). It also caches the resolved nodes (cache key is by type, nodes length, and filter args) so that we don't perform the same work if the query is performed again.

Proposed Solution

The ideal approach is to "resolve" all fields on all nodes in loki before running queries. This would mean we wouldn't need to store any "resolved nodes" caches, resulting in a smaller memory footprint and less complex code. Here's how it would work:

  • Create a var called resolvedFields in db/nodes-query that stores whether all nodes of a type have had their lazy field resolve function called and the return result set in the loki collection. This var mirrors the structure of lazy-fields (Map of typeName -> set of fields)
  • When performing a query, if one or more of the query's filter fields are NOT contained in resolvedFields,
    • then we need to resolve those fields. For each node in the collection
      • Call the resolve function for each query filter field (can probably reuse run-sift resolveRecursive)
      • Update the node's field in the loki collection with the returned value.
      • Add the field to the resolvedFields.type Set to flag that all fields of that type have been resolved.
  • If all the query filter's fields are contained in resolvedFields, then we know that all nodes in the loki collection have had their required fields resolved
  • Now run the loki query as normal

Considerations/Other Stuff:

  • In gatsby develop, if a node is added, it will also need resolving. So consider running the above logic on the node whenever the CREATE_NODE event is emitted (ugly, might need more thought).
  • Another case is if a node's content changes. In its current implementation run-sift doesn't actually cover this since it caches the resolved nodes based on the size of the nodes collection. We could improve on this.
  • The case of a node being deleted is automatically handled since it won't be in the loki type collection anymore.
  • I could be wrong, but I think run-sift is performing an unneeded resolveRecursive when handling the case where the filter is a single {$eq: id} (here).
  • Could storing the resolved field's return values against the node in loki have unintended consequences?
  • This does not address the use of sift for the elemMatch operator as raised in chore(gatsby): update Sift to version 7 #11008.
  • The above solution should mean we can delete run-sift.js
  • Should result in smaller memory footprint (less caches)
@gatsbot gatsbot bot added the stale? Issue that may be closed soon due to the original author not responding any more. label Feb 16, 2019
@gatsbot
Copy link

gatsbot bot commented Feb 16, 2019

Hiya!

This issue has gone quiet. Spooky quiet. 👻

We get a lot of issues, so we currently close issues after 30 days of inactivity. It’s been at least 20 days since the last update here.

If we missed this issue or if you want to keep it open, please reply here. You can also add the label "not stale" to keep this issue open!

Thanks for being a part of the Gatsby community! 💪💜

@KyleAMathews
Copy link
Contributor

Not stale

@stefanprobst
Copy link
Contributor

With this and this we should be able to solve the elemMatch issue. As for storing the resolved nodes directly in the Loki db (if I understand the proposed solution correctly?), I don't think this will work because we can't JSON-serialize links (and circular refs), i.e. when a foreign-key ___NODE field has been expanded to a full node.

@m-allanson m-allanson added not stale and removed stale? Issue that may be closed soon due to the original author not responding any more. labels Mar 1, 2019
@Moocar
Copy link
Contributor Author

Moocar commented Mar 3, 2019

@stefanprobst Nice work adding $elemMatch. And argh, you're right about linked fields. One possible fix would be to have the resolver return the full object, but at JSON serialization time, we save the linked ID instead of the full object. And then on deserialization, we do the opposite. But this sounds pretty messy.

@me4502
Copy link
Contributor

me4502 commented Oct 2, 2019

An issue I've hit recently IMO is a fair bit of a blocker for entirely moving to lokiJS as well - #17935

@pvdz
Copy link
Contributor

pvdz commented Jun 16, 2020

How we can remove sift.js
Exploration on how we can use loki.js for 100% of queries. Resulted from issue #11008

So after some time the answer turns out: we remove both :)

Sift was disabled in 2.23.0 (code concretely to be dropped later this week) and loki has been removed a while ago. I think we can close this issue.

@pvdz pvdz closed this as completed Jun 16, 2020
@LekoArts LekoArts removed the not stale label Jul 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants