Concurrency control in core-data #26325
Comments
Also, I'm acting as a reporter here, but I'm also happy to spin up a PR exploring a solution (or many PRs).
I considered different strategies for removing the interference between resolvers and did a pairwise analysis of how different types of API operations interact when run concurrently.
Concurrent API operations on two entity records: same type, different IDs; same type, same ID; record operations vs. list operations.
Partial results as in we need to re-request the data after the writes are finished: Some
That being said, I don't think the initial fix needs to include any of that.
Tough problem! Thanks for the clear write-up, though. Are there any other libraries that we can look to for inspiration? It sounds like we need to make it so that, instead of calling
How common is this? Maybe it's not such a bad trade-off.
@noisysocks my thoughts exactly! Only I wouldn't enqueue just network requests but everything asynchronous or affected by asynchronicity. If only the network requests themselves are stacked, there are fewer possible outcomes, but there are still many. For example: Note how the failed operation overwrote the result of the successful one. I can come up with more examples like that. The point is that to get rid of interference I would consider entire segments of code as "critical sections" or "atomic operations": This makes things conceptually simple by taking all the asynchronicity out of the equation. Interfering operations are always executed serially, and the outcomes are easily predictable because nothing else updates the parts of the state they depend on.
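A minimal sketch of the "critical section" idea from the comment above. It assumes a hypothetical `createOperationQueue` helper, which is not part of core-data: operations enqueued under the same key run strictly one after another, and a failed operation does not block the ones behind it.

```javascript
// Hypothetical helper, not part of @wordpress/core-data.
// Operations enqueued under the same key run one at a time, in order.
function createOperationQueue() {
	const tails = new Map(); // key -> promise that settles after the last enqueued operation

	return function enqueue( key, operation ) {
		const previous = tails.get( key ) || Promise.resolve();
		// `previous` never rejects, so the operation starts once the
		// earlier one has settled, whether it succeeded or failed.
		const result = previous.then( () => operation() );
		// Swallow the error on the internal chain only; callers still
		// see it through `result`.
		const tail = result.catch( () => {} );
		tails.set( key, tail );
		// Forget the key once the queue for it has drained.
		tail.then( () => {
			if ( tails.get( key ) === tail ) {
				tails.delete( key );
			}
		} );
		return result;
	};
}

// Usage: the "save" and "delete" bodies below are placeholders.
const enqueue = createOperationQueue();
enqueue( 'book:1', async () => console.log( 'save book 1' ) );
enqueue( 'book:1', async () => console.log( 'delete book 1' ) ); // runs after the save settles
```

Operations on different keys are not serialized against each other in this sketch, so a list request and a record request would still need a shared key (or a coarser lock) to be protected from one another.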
I'm exploring the idea of atomic operations and locks in #26389
Surfacing this comment here:
#26389 addressed the bulk of this problem. It would still be amazing to implement a lock-less solution or be more "optimistic" about different operations, but the bug part of this issue is now fixed 🎉
The problem
While investigating #22127, I discovered what appears to be a massive problem with API interactions in core-data.
There seems to be no concept of concurrency control.
A very simple example: if I call `saveEntityRecord` twice on the same record, it will spark two concurrent POST requests. One of them wins, but the client doesn't know which one. Similarly, I can trigger `saveEntityRecord` and `deleteEntityRecord` at the same time, and the result will vary depending on the exact timing. That's not very common at the moment, but it is obvious in #22127.
Let's talk about a common problem. Consider this minimal component that renders some data retrieved using `getEntityRecords` and saves changes using `saveEntityRecord`:
https://github.com/samueljseay/gutenberg/blob/c852bee7ce33498c3ee7faca743fac9e473bb03c/test-plugins/core-data/js/index.js#L1-L57
Fetching data
What does the resolution flow look like? To answer that question, I will use a chart, since reasoning about timing issues without one is just too hard. Time flows downwards:
Easy peasy: first, withSelect only gets an empty list because nothing is stored yet. Then the resolver kicks in asynchronously, talks to the API, and updates the store, and the store re-runs the withSelect handler once the data is available.
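The flow described above can be approximated with a toy store. This is only a sketch: `createToyStore`, `fetchBooks`, and the record shape are hypothetical stand-ins, and the real core-data resolver machinery is considerably more involved.

```javascript
// Toy model of select -> resolver -> store update -> re-run.
// All names are hypothetical; this is not how core-data is implemented.
function createToyStore( fetchBooks ) {
	let records = [];
	let resolutionStarted = false;
	const listeners = new Set();

	function getEntityRecords() {
		if ( ! resolutionStarted ) {
			resolutionStarted = true;
			// The "resolver" runs asynchronously; the selector returns
			// whatever is in the store right now.
			fetchBooks().then( ( books ) => {
				records = books;
				listeners.forEach( ( listener ) => listener() );
			} );
		}
		return records;
	}

	function subscribe( listener ) {
		listeners.add( listener );
		return () => listeners.delete( listener );
	}

	return { getEntityRecords, subscribe };
}

// Usage: the first call sees an empty list, the subscriber sees the data later.
const toyStore = createToyStore( () => Promise.resolve( [ { id: 1 } ] ) );
const render = () => console.log( 'records:', toyStore.getEntityRecords().length );
toyStore.subscribe( render );
render();
```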
Saving data
Each time the user clicks a checkbox, the component dispatches `saveEntityRecord()` and triggers the following chain of events:
There's already a problem here: `saveEntityRecord()` calls `getRawEntityRecord()`, which doesn't consider the record with a specific id to be resolved even though a list of records from `/wp/v2/books` was loaded earlier on. Instead, a resolver is triggered, and an explicit GET request is issued to `/wp/v2/books/1`. That happens around the same time as the `PUT` request to persist changes. Depending on the timing of both, the GET results could override the last `RECEIVE_ITEMS` triggered by `saveEntityRecord()`. In that case, the user would see some flickering on the screen, and the store would end up with stale data.
Fetching and saving data combined
What is really interesting, though, is what happens when we combine fetching AND saving. Let's take a look:
Woah, that's a lot of arrows and boxes! What happens on that chart is:
`withSelect` as well.
This is pretty fragile: there are multiple requests started around the same time, and they may be resolved in any order. If `GET /wp/v2/books` is resolved last, the store is stuck with stale data until the page is refreshed.
Batch processing is affected tenfold, as more entity records in the mix means more timing issues.
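To make the ordering problem concrete, here is a small simulation in which responses are delivered by hand. `createDeferredResponse`, `receiveItems`, and the `read` field are hypothetical; `receiveItems` only loosely mirrors the `RECEIVE_ITEMS` action mentioned earlier.

```javascript
// Simulation only: responses are applied to the store in whatever
// order they arrive, with no notion of which one is newer.
const itemsById = new Map(); // id -> record

function receiveItems( records ) {
	for ( const record of records ) {
		itemsById.set( record.id, record );
	}
}

// Each response captures the server state at the time the request was
// handled, but the client applies it only when `deliver()` is called.
function createDeferredResponse( snapshot ) {
	return { deliver: () => receiveItems( snapshot ) };
}

// The server handles the list request first...
const listResponse = createDeferredResponse( [ { id: 1, read: false } ] );
// ...and the save afterwards.
const saveResponse = createDeferredResponse( [ { id: 1, read: true } ] );

// The responses reach the client in the opposite order.
saveResponse.deliver();
listResponse.deliver();

console.log( itemsById.get( 1 ).read ); // false: the store now holds stale data
```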
Possible solutions
Atomic operations
I really don't like how `saveEntityRecord()` triggers a bunch of side effects while everything is still up in the air. And even if `saveEntityRecord` didn't, the user could. The point is that writes and reads mix together in unpredictable ways.
A quick, immediate solution
We could prevent concurrent conflicting operations:
`shared lock` and `exclusive lock` maybe?). Selectors would "just work", but API reads (resolvers) and API writes would never run clashing operations at the same time.
Also, to reduce the number of factors in the mix, we could apply optimistic updates a bit differently. Namely, instead of using `receiveEntityRecords`, we could leverage `editedEntityRecord` by just adding one more edit signifying a "checkpoint". It would work like this:
1. Edits before the checkpoint are frozen as long as the checkpoint is in place.
2. A successful save discards all the edits before the checkpoint and replaces the entity record with the server response.
3. A failed save simply removes the checkpoint; no rollback is needed.
One other solution is to assume `core/data` is a low-level API that just does what it's told and shift the burden of concurrency control onto the consumer. As in: assume it's the developer's responsibility to avoid conflicting `select()` and `dispatch()` calls. Even in this scenario, `core/data` gets in its own way by triggering resolvers when it shouldn't. Also, the pitfalls are very generic, so the logic implemented by each and every consumer would be almost the same. An alternative would be to build a higher-level API on top of `core/data` that would understand locking and concurrency.
Long term considerations
The above would solve the problem for now and may even suffice for a bit longer. It has some downsides though:
While 1 could potentially be addressed by 2, there is another simple solution: squash enqueued operations when possible. E.g. if there are two updates waiting to be processed, we could perform just one. Three updates and a delete? Perform only the delete.
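A sketch of the squashing idea, assuming pending operations are plain objects of the hypothetical shape `{ type: 'update' | 'delete', id, edits }`. The queue itself and the operation format are assumptions made for illustration; nothing like `squashOperations` exists in core-data.

```javascript
// Hypothetical operation shape: { type: 'update' | 'delete', id, edits? }.
// Collapses the pending operations for each record id into at most one.
function squashOperations( pending ) {
	const byId = new Map();
	for ( const op of pending ) {
		const previous = byId.get( op.id );
		if ( op.type === 'delete' ) {
			// A delete makes earlier pending updates irrelevant.
			byId.set( op.id, { type: 'delete', id: op.id } );
		} else if ( ! previous ) {
			byId.set( op.id, { type: 'update', id: op.id, edits: { ...op.edits } } );
		} else if ( previous.type === 'update' ) {
			// Merge consecutive updates; later edits win.
			byId.set( op.id, {
				...previous,
				edits: { ...previous.edits, ...op.edits },
			} );
		}
		// An update enqueued after a delete is dropped in this sketch.
	}
	return [ ...byId.values() ];
}

const squashed = squashOperations( [
	{ type: 'update', id: 1, edits: { title: 'A' } },
	{ type: 'update', id: 1, edits: { title: 'B' } },
	{ type: 'update', id: 1, edits: { status: 'draft' } },
	{ type: 'delete', id: 1 },
] );
console.log( squashed ); // [ { type: 'delete', id: 1 } ]
```

Merging updates this way is only safe if the server treats one combined update the same as the sequence of individual ones, which would need to be verified per endpoint.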
Re-use fetched data
`getEntityRecords( 'postType', 'book' );` followed by `getEntityRecords( 'postType', 'book', 4 );` could ideally trigger just a single request: the first one, to `/wp/v2/books`. Ideally that would be the case even if the list request is still in progress.
cc @youknowriad @mcsf @gziolo @draganescu @noisysocks @talldan @tellthemachines @kevin940726 @jorgefilipecosta @mtias @samueljseay @ellatrix @TimothyBJacobs
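The reuse described under "Re-use fetched data" could be sketched roughly as follows. `createRecordLoader`, `fetchList`, and `fetchOne` are hypothetical names standing in for the requests to `/wp/v2/books` and to a single book; this is an illustration of the idea, not a proposal for the actual resolver code.

```javascript
// Hypothetical sketch: a single-record lookup reuses the list request,
// even when that request is still in flight.
function createRecordLoader( fetchList, fetchOne ) {
	let listPromise = null; // in-flight or completed list request

	function getRecords() {
		if ( ! listPromise ) {
			listPromise = fetchList();
		}
		return listPromise;
	}

	async function getRecord( id ) {
		if ( listPromise ) {
			// Wait for the list and look the record up there. A failed
			// list request is treated as "nothing to reuse".
			const records = await listPromise.catch( () => [] );
			const found = records.find( ( record ) => record.id === id );
			if ( found ) {
				return found;
			}
		}
		// Fall back to a dedicated request for this record.
		return fetchOne( id );
	}

	return { getRecords, getRecord };
}
```

A real implementation would also have to check that the list response carries the same fields and context as a single-record response before reusing it.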