
core/state: copy the snap when copying the state #22340

Merged
merged 2 commits into ethereum:master on Feb 18, 2021

Conversation

@holiman (Contributor) commented Feb 17, 2021

This PR attempts to fix an issue that was found on YoloV3, where the sealer does not deliver on the snap/1 protocol. The reason is that the state which the miner operates on does not have access to the snapshot, and is thus forced to use the trie backend for reading and cannot write updates to the snapshot tree.

On mainnet that would be bad: whenever the miner mines a block, it would cause a gap in the snapshot tree, making the miner fall out of sync with the snapshot and essentially nuking the snapshot functionality, perhaps also sending it into regenerate mode again and again.

This PR is a hacky first attempt to fix it. It does address the issue, but perhaps not optimally so, and there's an open question about how we should handle the case where the snapDestructs are non-empty (if that can ever happen).

To repro this case, I used a tiny private clique network and synced between two nodes locally. I can provide the files to repro it if anyone wants to give it a spin.
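
For reference, the shape of the fix inside (*StateDB).Copy is roughly the following. This is a sketch assembled from the diff fragments quoted in the review below, not necessarily the exact merged code; the struct{} value type for snapDestructs is assumed from the StateDB definition of that era.

if s.snaps != nil {
	// Share the snapshot tree itself so the copy can read from it and feed
	// updates into it, instead of falling back to the trie backend.
	state.snaps = s.snaps
	// Deep-copy the per-block accumulators so the copy and the original
	// never write into the same maps (see the pre-byzantium discussion below).
	state.snapDestructs = make(map[common.Hash]struct{})
	for k, v := range s.snapDestructs {
		state.snapDestructs[k] = v
	}
	state.snapAccounts = make(map[common.Hash][]byte)
	for k, v := range s.snapAccounts {
		state.snapAccounts[k] = v
	}
	state.snapStorage = make(map[common.Hash]map[common.Hash][]byte)
	for k, v := range s.snapStorage {
		temp := make(map[common.Hash][]byte)
		for kk, vv := range v {
			temp[kk] = vv
		}
		state.snapStorage[k] = temp
	}
}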

state.snapAccounts = make(map[common.Hash][]byte)
state.snapStorage = make(map[common.Hash]map[common.Hash][]byte)
if len(s.snapAccounts)+len(s.snapDestructs)+len(s.snapStorage) != 0 {
	panic("Oy vey!")
}
karalabe (Member):
I'd expect this to blow up on pre-byzantium with intermediate root calls?

holiman (author):
So... deep copy then?

// If we copy, we need to ensure concurrency safety.
// If we don't copy, we run the risk of consensus breaking.
// In theory, as the state is copied, it's still 'fresh', and these
// should be empty.
karalabe (Member):
The miner copies the state after running the transactions, but before claiming the block reward. It does this so it can pile more txs on top. In that case, the state is fresh only if no tx populated it (i.e. post-byzantium).

}
state.snapAccounts = make(map[common.Hash][]byte)
for k, v := range s.snapAccounts {
	state.snapAccounts[k] = v
holiman (author):
Q: do we need to also copy the v byteslice?

@karalabe (Member) commented Feb 17, 2021:
I don't think so. Important to know though that these values get inserted verbatim into a diff layer and when retrieving storage at least (or account RLP too) those get returned again verbatim. So snapshot.Storage(0x)[2] = 2 would modify it. That said, I don't know of any reason why you'd do such a thing :D

The snapshot explicitly warns:

// Note the returned slot is not a copy, please don't modify it.
func (dl *diffLayer) Storage(accountHash, storageHash common.Hash) ([]byte, error) {
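
A self-contained toy example (not geth code) of the aliasing described above: a shallow map copy shares its byte slices, so a caller that mutates a returned slot also mutates what the map, and hence the diff layer, holds.

package main

import "fmt"

func main() {
	// The map stands in for the snapshot's storage data; the slice is a slot value.
	store := map[string][]byte{"slot": {0xaa, 0xbb, 0xcc}}

	got := store["slot"] // handed back "verbatim", no copy made
	got[2] = 0x02        // a caller mutating the returned slot...

	fmt.Printf("%x\n", store["slot"]) // aabb02 -- ...changes the map's contents too
}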

for k, v := range s.snapStorage {
	temp := make(map[common.Hash][]byte)
	for kk, vv := range v {
		temp[kk] = vv
holiman (author):
And here, copy byteslice or not?

karalabe (Member):
Same as account

}
}

state.snaps = s.snaps
karalabe (Member):
Any reason for doing this a second time?

holiman (author):
... twice as secure ... (doh)

holiman (author):
fixed, squashpushed

@karalabe (Member) left a review:
SGTM

@karalabe (Member):
Perhaps to note for posterity: post-byzantium it should never happen that the maps contain something, because we only ever commit once at the end of the block. Pre-byzantium, however, we do an intermediate root after every tx, which essentially flushes into these sets, thus we need the deep copy.

It's debatable long term whether we should drop support for mining pre-byzantium (or pre-your-fav-fork), but that's for another day.
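
To make the note above concrete, here is a toy illustration (again not geth code) of why sharing a non-empty map between the original state and its copy is unsafe, while a deep copy isolates the two:

package main

import "fmt"

func main() {
	original := map[string][]byte{"acct1": {0x01}}

	shallow := original // both names point at the same map
	deep := make(map[string][]byte, len(original))
	for k, v := range original {
		deep[k] = v
	}

	// A later write by the original state (e.g. an intermediate root pre-byzantium)...
	original["acct2"] = []byte{0x02}

	fmt.Println(len(shallow), len(deep)) // 2 1 -- only the deep copy stayed isolated
}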

@karalabe karalabe added this to the 1.10.0 milestone Feb 18, 2021
@karalabe karalabe merged commit 52e5c38 into ethereum:master Feb 18, 2021
tony-ricciardi pushed a commit to tony-ricciardi/go-ethereum that referenced this pull request Jan 20, 2022
Cherry-picks bug fixes from upstream for snapshots, which will enable higher transaction throughput. It also enables snapshots by default (one of the commits pulled from upstream).

Upstream commits included:

68754f3 cmd/utils: grant snapshot cache to trie if disabled (ethereum#21416)
3ee91b9 core/state/snapshot: reduce disk layer depth during generation
a15d71a core/state/snapshot: stop generator if it hits missing trie nodes (ethereum#21649)
43c278c core/state: disable snapshot iteration if it's not fully constructed (ethereum#21682)
b63e3c3 core: improve snapshot journal recovery (ethereum#21594)
e640267 core/state/snapshot: fix journal recovery from generating old journal (ethereum#21775)
7b7b327 core/state/snapshot: update generator marker in sync with flushes
167ff56 core/state/snapshot: gethring -> gathering typo (ethereum#22104)
d2e1b17 snapshot, trie: fixed typos, mostly in snapshot pkg (ethereum#22133)
c4deebb core/state/snapshot: add generation logs to storage too
5e9f5ca core/state/snapshot: write snapshot generator in batch (ethereum#22163)
18145ad core/state: maintain one more diff layer (ethereum#21730)
04a7226 snapshot: merge loops for better performance (ethereum#22160)
994cdc6 cmd/utils: enable snapshots by default
9ec3329 core/state/snapshot: ensure Cap retains a min number of layers
52e5c38 core/state: copy the snap when copying the state (ethereum#22340)
a31f6d5 core/state/snapshot: fix panic on missing parent
61ff3e8 core/state/snapshot, ethdb: track deletions more accurately (ethereum#22582)
c79fc20 core/state/snapshot: fix data race in diff layer (ethereum#22540)

Other changes
Commit f9b5530 (not from upstream) fixes an incorrect default DatabaseCache value due to an earlier bad merge.

Tested:
Automated tests
Testing on a private testnet

Backwards compatibility:
Enabling snapshots by default is a breaking change in terms of CLI flags, but it will not make the node incompatible with other nodes.

Co-authored-by: Péter Szilágyi <peterke@gmail.com>
Co-authored-by: gary rong <garyrong0905@gmail.com>
Co-authored-by: Melvin Junhee Woo <melvin.woo@groundx.xyz>
Co-authored-by: Martin Holst Swende <martin@swende.se>
Co-authored-by: Edgar Aroutiounian <edgar.factorial@gmail.com>