refactor: concurrently process lotus shed backfill events and recompute state #12330

Draft
wants to merge 1 commit into
base: master

akaladarshi
Contributor

Related Issues

Fixes: #11744

Proposed Changes

  • Concurrently backfill events in the lotus-shed backfill-events command.
  • Add StateRecomputeTipset to StateAPI, which recomputes the tipset in case events are not available.

@akaladarshi
Contributor Author

akaladarshi commented Jul 31, 2024

@aarshkshah1992 As I was going through SQLite, I noticed that it doesn't support concurrent writes. Even in WAL mode it only supports a single writer alongside multiple readers, which doesn't serve our purpose.

Could you provide some suggestions regarding this? One possible solution I can think of is storing data in memory and writing it later.

By "write later" I mean we could send the processed events over a channel and have a single reader of that channel write them to the DB (just a thought, not sure about the feasibility).
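
A minimal sketch of the channel idea above, to make it concrete. It is not code from this PR: the `event` and `store` types and the `backfill` function are hypothetical stand-ins, and the in-memory `store` takes the place of the SQLite events index. Several workers process heights concurrently, while exactly one goroutine drains the results channel and performs every write.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// event is a hypothetical stand-in for the events recovered from one tipset.
type event struct {
	Height int
	Data   string
}

// store is a hypothetical stand-in for the SQLite events index.
// Only the single writer goroutine touches it, so it needs no locking.
type store struct {
	rows []event
}

func (s *store) insert(e event) { s.rows = append(s.rows, e) }

// backfill processes the given heights with `workers` goroutines and funnels
// every result through one channel to a single writer goroutine.
func backfill(heights []int, workers int, db *store) {
	jobs := make(chan int)
	results := make(chan event)

	// Workers: the concurrent (read-only) processing stage.
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for h := range jobs {
				results <- event{Height: h, Data: fmt.Sprintf("events@%d", h)}
			}
		}()
	}

	// Single writer: the only goroutine that writes to the store.
	done := make(chan struct{})
	go func() {
		defer close(done)
		for e := range results {
			db.insert(e)
		}
	}()

	for _, h := range heights {
		jobs <- h
	}
	close(jobs)    // no more work for the workers
	wg.Wait()      // all workers have sent their results
	close(results) // lets the writer finish draining
	<-done         // all writes are complete
}

func main() {
	db := &store{}
	backfill([]int{5, 4, 3, 2, 1}, 3, db)
	// Results arrive in nondeterministic order, so sort before printing.
	sort.Slice(db.rows, func(i, j int) bool { return db.rows[i].Height < db.rows[j].Height })
	for _, e := range db.rows {
		fmt.Println(e.Height, e.Data)
	}
}
```

Note that this only serialises writes within one process. If a separate process is writing to the same database at the same time, that process is still a second writer.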


// StateRecomputeTipset recomputes the state of the given tipset, without trying to lookup a pre-computed result
// in the chainstore.
StateRecomputeTipset(ctx context.Context, tsk types.TipSetKey) (cid.Cid, error) //perm:read
Member

Is this the only way to achieve this via the RPC? Do we not expose enough of the pieces to recompute without adding a new API? We should have a bias against adding new RPC APIs if we can help it.

Contributor Author

I was also not sure about adding a new API since StateCompute is already there; that's why I raised a draft PR to gather everyone's thoughts. I will try to find another way to do it using the existing APIs.

@rvagg
Member

rvagg commented Aug 1, 2024

Yes, the multiple-writer problem is precisely why this needs to be removed from lotus-shed and integrated into lotus itself. Currently we have people using this command but wrapping it in a retry loop, so that if a write fails (because of a concurrent write) it just retries. But that risks the other side of this: the lotus daemon itself could fail a write, and do so relatively silently, so you end up with a hole in your events because you're backfilling older ones!

But the workaround is something like:

  1. turn on events but turn off historical events in your daemon config
  2. start the daemon
  3. run lotus-shed to backfill
  4. stop the daemon
  5. turn on historical events
  6. start the daemon

And as long as steps 3 and 4 don't run over an epoch boundary and you've backfilled right up to the current epoch, then it should be good.

But obviously this sucks as a method and is not something we can really recommend.

@rvagg
Member

rvagg commented Aug 1, 2024

#12116 has some of the background of this discussion btw

I'm wondering whether we should just pivot this PR to do the work inside the lotus daemon, or at least make a start on the backfill task. I wrote up some thoughts on what could be done here: #12116 (comment). Do you think you'd be interested in trying to make a start on that? We (you, @aarshkshah1992 and I) should chat more about this, I think. Maybe have a video call, but I'll let @aarshkshah1992 react here first.

Development

Successfully merging this pull request may close these issues.

Option to back-fill events by re-execution of messages
2 participants