Pot

A concise storage format, written for BonsaiDb.

Pot forbids unsafe code | crate version | Live Build Status | HTML Coverage Report for main branch | Documentation for main branch

Pot is the encoding format used within BonsaiDb. It aims to provide a serde format that:

  • Is self-describing.

  • Is safe to run in production.

  • Is compact. While remaining self-describing, Pot saves space primarily by never writing a symbol/identifier more than once during serialization. When serializing arrays of structures, this can make a major difference. The logs.rs example demonstrates this:

    $ cargo test --example logs -- average_sizes --nocapture
    Generating 1000 LogArchives with 100 entries.
    +-----------------+-----------+-----------------+
    | Format          | Bytes     | Self-Describing |
    +-----------------+-----------+-----------------+
    | pot             | 2,627,586 | yes             |
    +-----------------+-----------+-----------------+
    | cbor            | 3,072,369 | yes             |
    +-----------------+-----------+-----------------+
    | msgpack(named)  | 3,059,915 | yes             |
    +-----------------+-----------+-----------------+
    | msgpack         | 2,559,907 | no              |
    +-----------------+-----------+-----------------+
    | bincode(varint) | 2,506,844 | no              |
    +-----------------+-----------+-----------------+
    | bincode         | 2,755,137 | no              |
    +-----------------+-----------+-----------------+
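The space saving described in the "compact" bullet can be illustrated with a toy encoder. This is only a sketch of the general idea (write a field name once, then refer back to it); it is not Pot's actual wire format, and the function names, tag bytes, and record layout here are all hypothetical.

```rust
use std::collections::HashMap;

/// Toy encoder (hypothetical, NOT Pot's format): a field name is written in
/// full the first time it appears and as a 1-byte back-reference afterwards.
fn encode_interned(records: &[Vec<(&str, u8)>]) -> Vec<u8> {
    let mut symbols: HashMap<String, u8> = HashMap::new();
    let mut out = Vec::new();
    for record in records {
        for (name, value) in record {
            if let Some(id) = symbols.get(*name).copied() {
                out.push(0x01); // tag: previously seen symbol
                out.push(id);
            } else {
                let id = symbols.len() as u8;
                symbols.insert(name.to_string(), id);
                out.push(0x00); // tag: new symbol, name follows
                out.push(name.len() as u8);
                out.extend_from_slice(name.as_bytes());
            }
            out.push(*value);
        }
    }
    out
}

/// Baseline: every field name is written in full for every record.
fn encode_repeated(records: &[Vec<(&str, u8)>]) -> Vec<u8> {
    let mut out = Vec::new();
    for record in records {
        for (name, value) in record {
            out.push(name.len() as u8);
            out.extend_from_slice(name.as_bytes());
            out.push(*value);
        }
    }
    out
}

fn main() {
    // 100 records that all share the same two field names.
    let records: Vec<Vec<(&str, u8)>> = (0..100u8)
        .map(|i| vec![("level", i), ("message", i)])
        .collect();
    let interned = encode_interned(&records);
    let repeated = encode_repeated(&records);
    println!("interned: {} bytes", interned.len());
    println!("repeated: {} bytes", repeated.len());
    assert!(interned.len() < repeated.len());
}
```

In this toy, the first record pays slightly more than the baseline (for the tags), and every later record pays a fixed small cost per field regardless of how long the field names are. That is why the effect grows with the number of structures in an array.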

Example

use serde_derive::{Deserialize, Serialize};
#[derive(Serialize, Deserialize, Debug, Eq, PartialEq)]
pub struct User {
    id: u64,
    name: String,
}

fn main() -> Result<(), pot::Error> {
    let user = User {
        id: 42,
        name: String::from("ecton"),
    };
    let serialized = pot::to_vec(&user)?;
    println!("User serialized: {serialized:02x?}");
    let deserialized: User = pot::from_slice(&serialized)?;
    assert_eq!(deserialized, user);

    // Pot also provides a "Value" type for deserializing Pot-encoded payloads
    // without needing the original structure.
    let user: pot::Value<'_> = pot::from_slice(&serialized)?;
    println!("User decoded as value: {user}");

    Ok(())
}

Outputs:

User serialized: [50, 6f, 74, 00, a2, c4, 69, 64, 40, 2a, c8, 6e, 61, 6d, 65, e5, 65, 63, 74, 6f, 6e]
User decoded as value: {id: 42, name: ecton}
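To see why the payload can be decoded without the original structure, it helps to look for readable text in the bytes printed above. The following standalone sketch uses only the standard library; `ascii_runs` is a hypothetical helper written for this illustration, and it makes no claim about what the remaining non-text bytes mean in Pot's format.

```rust
/// Hypothetical helper: collects every run of two or more consecutive
/// ASCII letters found in `bytes`.
fn ascii_runs(bytes: &[u8]) -> Vec<String> {
    let mut runs = Vec::new();
    let mut current = String::new();
    for &b in bytes {
        if b.is_ascii_alphabetic() {
            current.push(b as char);
        } else {
            if current.len() >= 2 {
                runs.push(std::mem::take(&mut current));
            }
            current.clear();
        }
    }
    if current.len() >= 2 {
        runs.push(current);
    }
    runs
}

fn main() {
    // Bytes copied from the "User serialized" output above.
    let serialized: &[u8] = &[
        0x50, 0x6f, 0x74, 0x00, 0xa2, 0xc4, 0x69, 0x64, 0x40, 0x2a, 0xc8,
        0x6e, 0x61, 0x6d, 0x65, 0xe5, 0x65, 0x63, 0x74, 0x6f, 0x6e,
    ];
    // The field names and the string value are stored as plain text.
    println!("{:?}", ascii_runs(serialized));
    // The byte 0x2a is 42 in decimal, matching the `id` field.
    assert!(serialized.contains(&42));
}
```

The field names `id` and `name` appear in the payload itself, alongside the value `ecton`, which is consistent with the format being self-describing.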

Benchmarks

Because benchmarks can be subjective and often don't mirror real-world usage, this library's authors aren't making any specific performance claims. The way Pot achieves space savings requires some computational overhead. As such, it is expected that a hypothetically perfect CBOR implementation could outperform a hypothetically perfect Pot implementation.

The results from the current benchmark suite executed on GitHub Actions are viewable here. The current suite is only aimed at comparing the default performance for each library.

Serialize into new Vec<u8>

[Figure: Serialize Benchmark Violin Chart]

Serialize into reused Vec<u8>

[Figure: Serialize with Reused Buffer Benchmark Violin Chart]
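For readers unfamiliar with the distinction between the two serialize benchmarks, here is a minimal sketch of "new Vec<u8>" versus "reused Vec<u8>". It uses only the standard library; `encode_into` and `encode_new` are hypothetical stand-ins for an encoder and are not part of Pot's API or of the benchmark suite.

```rust
/// Hypothetical stand-in for an encoder that appends to a caller-supplied buffer.
fn encode_into(value: u64, buffer: &mut Vec<u8>) {
    buffer.extend_from_slice(&value.to_le_bytes());
}

/// "New Vec" style: allocates a fresh buffer on every call.
fn encode_new(value: u64) -> Vec<u8> {
    let mut buffer = Vec::new();
    encode_into(value, &mut buffer);
    buffer
}

fn main() {
    // New Vec per value: each call creates and returns its own allocation.
    let fresh = encode_new(42);
    assert_eq!(fresh.len(), 8);

    // Reused Vec: `clear` drops the contents but keeps the allocation,
    // so later iterations can write without allocating again.
    let mut buffer = Vec::with_capacity(64);
    for value in 0..1_000u64 {
        buffer.clear();
        encode_into(value, &mut buffer);
    }
    assert_eq!(buffer.len(), 8);
    assert!(buffer.capacity() >= 64);
    println!("last payload: {buffer:02x?}");
}
```

Measuring both styles separates the cost of encoding from the cost of allocating the output buffer.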

Deserialize

[Figure: Deserialize Benchmark Violin Chart]

Open-source Licenses

This project, like all projects from Khonsu Labs, is open-source. This repository is available under the MIT License or the Apache License 2.0.

To learn more about contributing, please see CONTRIBUTING.md.