
Modularize beacon node backend #4718

Open · wants to merge 98 commits into base: unstable

Conversation

eserilev (Collaborator) commented Sep 9, 2023

Issue Addressed

#4669

Proposed Changes

Modularize the beacon node backend to make it easier to add new database implementations

Additional Info

The existing KeyValueStore trait already provides some abstraction; however, the codebase assumes the KV store is always LevelDB.

I created a BeaconNodeBackend type that implements the KeyValueStore and ItemStore traits, then replaced all references to LevelDb with the new BeaconNodeBackend type. Within BeaconNodeBackend I used the cfg macro, which should allow us to switch between database implementations via config changes.
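For illustration, here is a minimal sketch of the cfg-based dispatch described above; the type layout, module paths, and method names are assumptions based on this description, not the PR's exact code:

```rust
// Sketch only: the backend is selected at compile time via feature flags.
// `leveldb_impl`, `redb_impl`, and `Error` are stand-in names.
pub enum BeaconNodeBackend {
    #[cfg(feature = "leveldb")]
    LevelDb(leveldb_impl::LevelDB),
    #[cfg(feature = "redb")]
    Redb(redb_impl::Redb),
}

impl BeaconNodeBackend {
    pub fn get_bytes(&self, column: &str, key: &[u8]) -> Result<Option<Vec<u8>>, Error> {
        match self {
            #[cfg(feature = "leveldb")]
            BeaconNodeBackend::LevelDb(db) => db.get_bytes(column, key),
            #[cfg(feature = "redb")]
            BeaconNodeBackend::Redb(db) => db.get_bytes(column, key),
        }
    }
}
```

Callers only ever see BeaconNodeBackend, so adding another database means adding one more variant and match arm rather than touching every call site.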

eserilev changed the title from [WIP] Modularize beacon node backend to Modularize beacon node backend on Sep 9, 2023
eserilev (Collaborator, Author) commented Jan 30, 2024

I'm currently working on adding Redb to the beacon node backend. Here are some notes I've taken so far on this work.

Redb

There are a few differences between LevelDB and Redb, which I'll describe below.

fsync

Info about redb transaction durability can be found here:
https://docs.rs/redb/latest/redb/enum.Durability.html#

For now I translate WriteOptions.sync = true to Durability::Immediate and WriteOptions.sync = false to Durability::Eventual.

We could decide to use Paranoid instead of Immediate, but I think the additional guarantees of Paranoid are probably unnecessary for our use case.
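Concretely, the mapping might look like this (redb's Durability enum is real; the helper function is just a sketch):

```rust
use redb::Durability;

// Translate a LevelDB-style WriteOptions.sync flag into a redb durability level.
fn durability_for(sync: bool) -> Durability {
    if sync {
        // sync = true: the commit is fsynced to disk before returning.
        Durability::Immediate
    } else {
        // sync = false: the commit may be lost on crash, but only until
        // the next durable commit lands.
        Durability::Eventual
    }
}
```

The write transaction would then call set_durability(durability_for(opts.sync)) before committing.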

Compaction

LevelDB allows for compacting over a range of values, while Redb only allows for compacting the full DB. It seems like our LevelDB implementation compacts across all keys in the DB; if that's the case, we should be all good here.

Tables

Redb introduces the concept of tables. We can either use one table for everything, or spin up tables by column name. I'm currently going down the path of spinning up tables by column name. This causes issues with do_atomically. We may need to make changes to this function to also accept a col: &str as an argument.
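A rough sketch of the per-column-table approach (the byte-slice key/value types and the helper's signature are assumptions):

```rust
use redb::{Database, TableDefinition};

// Each store column gets its own redb table, named after the column.
fn put_bytes(db: &Database, column: &str, key: &[u8], value: &[u8]) -> Result<(), redb::Error> {
    let table_def: TableDefinition<&[u8], &[u8]> = TableDefinition::new(column);
    let write_txn = db.begin_write()?;
    {
        let mut table = write_txn.open_table(table_def)?;
        table.insert(key, value)?;
    } // The table handle must drop before the transaction commits.
    write_txn.commit()?;
    Ok(())
}
```

This is why do_atomically would need the column alongside each key: the right table has to be opened inside the single write transaction.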

eserilev (Collaborator, Author) commented Aug 21, 2024

This PR is in a good state for review. We've looked at some stats for the redb node running on gown and compared it to another Holesky node running LevelDB. DB read I/O seems to be on par, and we actually saw lower DB write I/O on redb. One thing to pay attention to is compaction: we dug into some logs and didn't see any drastic degradation in performance during compaction for redb. Just note that compaction does lock the DB, so it's definitely something to keep an eye on.

@@ -0,0 +1,117 @@

```rust
use crate::beacon_chain::BeaconChainTypes;
```
Collaborator:

This file and beacon_node/beacon_chain/src/schema_change/migration_schema_v19.rs seem to be added by mistake?

eserilev (Author):

removed those files

```rust
}

/// Compact all values in the states and states flag columns.
pub fn compact(&self) -> Result<(), Error> {
```
Collaborator:

Could add timing metrics for compact, either here or in the caller.
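Something along these lines, with a hypothetical histogram metric (DISK_DB_COMPACT_TIMES and the compaction call are placeholders, not the PR's code):

```rust
/// Compact all values in the states and states flag columns.
pub fn compact(&self) -> Result<(), Error> {
    // The timer observes its elapsed duration when dropped at end of scope.
    let _timer = metrics::start_timer(&metrics::DISK_DB_COMPACT_TIMES);
    // ... existing compaction logic ...
    Ok(())
}
```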

eserilev (Author) replied Aug 26, 2024.

beacon_node/store/src/database/interface.rs (thread resolved)
```rust
let table = read_txn.open_table(table_definition)?;
table.range(from..)?.map(|res| {
    let (k, _) = res?;
    K::from_bytes(k.value())
```
Collaborator:

Read metrics should cover the iterator too.


```rust
column,
&start_period.to_le_bytes(),
move |sync_committee_bytes, _| {
    let Ok(sync_committee_period) = u64::from_ssz_bytes(sync_committee_bytes) else {
```
Collaborator:

Is it safe to silence this error?

eserilev (Author):

It's probably safe to silence this error, but I'm now logging it just in case: 8ed2ff4
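Presumably something along these lines (the logging call and the closure's fallback return are assumptions, not the exact diff in 8ed2ff4):

```rust
move |sync_committee_bytes, _| {
    let Ok(sync_committee_period) = u64::from_ssz_bytes(sync_committee_bytes) else {
        // Surface decode failures instead of skipping them silently.
        warn!(log, "Error decoding sync committee period from SSZ bytes");
        return false;
    };
    // ... existing predicate logic using sync_committee_period ...
}
```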

Comment on lines +104 to +106
```rust
pub fn sync(&self) -> Result<(), Error> {
    self.put_bytes_sync("sync", b"sync", b"sync")
}
```
Collaborator:

What's the purpose of this method?

eserilev (Author):

It gets called during the hot -> cold DB migration:

// Flush to disk all the states that have just been migrated to the cold store.

My guess is that we do a quick write to the cold DB just to ensure that there aren't any hanging state changes before starting the migration. sync is a dummy table that stores a single, non-useful key/value pair.

beacon_node/store/src/database/redb_impl.rs (outdated thread, resolved)
Comment on lines 182 to 183
```rust
PutKeyValue(String, Vec<u8>, Vec<u8>),
DeleteKey(String, Vec<u8>),
```
Collaborator:

Document the tuple
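For example (assuming the enclosing enum is KeyValueStoreOp; the doc wording is only a suggestion):

```rust
pub enum KeyValueStoreOp {
    /// Write `value` under `key` in the named column: (column, key, value).
    PutKeyValue(String, Vec<u8>, Vec<u8>),
    /// Delete the entry under `key` in the named column: (column, key).
    DeleteKey(String, Vec<u8>),
}
```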


```rust
key.matches_column(column) && predicate(trimmed_key, trimmed_start_key)
})
.map(move |(bytes_key, value)| {
    let key = bytes_key.remove_column_variable(column).ok_or_else(|| {
```
Collaborator:

You should add the metric here and increase DISK_DB_READ_BYTES with value to give it the same meaning as the usage in the per-item getter.
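That is, something like this sketch (assuming the usual lighthouse_metrics helpers):

```rust
metrics::inc_counter_vec(&metrics::DISK_DB_READ_COUNT, &[column.into()]);
metrics::inc_counter_vec_by(
    &metrics::DISK_DB_READ_BYTES,
    &[column.into()],
    value.len() as u64,
);
```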


```rust
Ok(Box::new(
    iter.take_while(move |key| key.matches_column(column))
        .map(move |bytes_key| {
            metrics::inc_counter_vec(&metrics::DISK_DB_READ_COUNT, &[column.into()]);
```
Collaborator:

I think it's not ideal to count reads of keys (this function) as reads of entries (other uses of DISK_DB_READ_COUNT). If this iterator just reads keys, maybe we should use a different metric, DISK_DB_KEY_READ_COUNT or similar.

eserilev (Author):

Yep, makes sense. I've added separate metrics for DB key reads.
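For reference, a sketch of how the new metric might be defined (the metric name and help text here are assumptions):

```rust
lazy_static! {
    pub static ref DISK_DB_KEY_READ_COUNT: Result<IntCounterVec> =
        try_create_int_counter_vec(
            "store_disk_db_key_read_count",
            "Total number of keys read from the on-disk database",
            &["col"]
        );
}
```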

eserilev (Collaborator, Author) commented:

I've opened a separate issue around temp state cleanup performance for Redb. I'm going to handle that in a separate PR, if that's OK with you guys:

#6332

eserilev (Collaborator, Author) commented Sep 1, 2024

Compaction in redb v2.1.2 is craaazy slow: cberner/redb#852

For now, let's just make sure we stay on 2.1.1 until the issue is resolved.

Labels: database, ready-for-review