
V2 + Deflate serialization support with flate2's stream wrappers #43

Merged (2 commits) into HdrHistogram:master on Apr 6, 2017

Conversation

marshallpierce (Collaborator) commented Apr 6, 2017:

Performance isn't awful but it isn't great either. Compression is slow and there's not a lot we can do about that.

test deserialize_large_dense_v2          ... bench:  18,970,418 ns/iter (+/- 944,849)
test deserialize_large_dense_v2_deflate  ... bench:  23,991,379 ns/iter (+/- 1,329,386)
test deserialize_large_sparse_v2         ... bench:  17,710,620 ns/iter (+/- 1,145,763)
test deserialize_large_sparse_v2_deflate ... bench:  20,297,392 ns/iter (+/- 1,568,093)

test serialize_large_dense_v2            ... bench:   7,154,361 ns/iter (+/- 307,599)
test serialize_large_dense_v2_deflate    ... bench:  94,926,041 ns/iter (+/- 3,333,186)
test serialize_large_sparse_v2           ... bench:   6,570,606 ns/iter (+/- 639,558)
test serialize_large_sparse_v2_deflate   ... bench:  63,914,342 ns/iter (+/- 4,191,686)

// least to the user, but for now we'll only take an easily reversible step towards that. There are
// still several ways the serializer interfaces could change to achieve better performance, so
// committing to anything right now would be premature.
trait TestOnlyHypotheticalSerializerInterface {
marshallpierce (author):

This, and its cousin in the private test module, are regrettable, but I still have a handful of TODOs to explore in serialization so I don't want to commit to anything just yet. Plus, I suspect it will not be common that users want to polymorphically choose which serializer they use (if for no other reason than that this feature is so new...).

jonhoo:

It'd make me happy if we could at least somehow share this trait and its impls between the benchmarks and the tests though.

marshallpierce (author):

Yeah, I would like that as well but I couldn't think of a way to do it without making it pub. Thoughts?

jonhoo:

I think you may be able to (ab)use the #[path] syntax to include a file with this trait + impls as a module in both...
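A rough sketch of that #[path] trick (file locations and names here are illustrative, not the crate's actual layout):

```rust
// tests/common/serializer_iface.rs  (shared file, hypothetical location)
// pub trait TestOnlyHypotheticalSerializerInterface { /* ... */ }

// In both the benchmark module and the private test module, include the
// same file as a module via #[path]:
#[path = "../tests/common/serializer_iface.rs"]
mod serializer_iface;

use self::serializer_iface::TestOnlyHypotheticalSerializerInterface;
```

Since #[path] is resolved relative to the file containing the attribute, both call sites can point at one copy of the trait without it ever becoming pub in the crate's API.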

.ok_or(DeserializeError::UsizeTypeTooSmall)?;

// TODO reuse deflate buf, or switch to lower-level flate2::Decompress
let mut deflate_reader = DeflateDecoder::new(reader.take(payload_len as u64));
marshallpierce (author):

allocation :'(

jonhoo (Apr 6, 2017):

Is it the allocation of the DeflateDecoder you're saddened about? It's a relatively minor allocation, and it's not data-dependent, so I don't know that it'll add significant overhead?

marshallpierce (author):

Yeah, but mostly for intellectual vanity reasons (and a little bit because I want to keep the way to no-heap open) I want serialization and deserialization to be allocation-free. :)

self.compressed_buf.write_u32::<BigEndian>(0)?;

// TODO pluggable compressors? configurable compression levels?
// TODO benchmark https://github.com/sile/libflate
marshallpierce (author):

So many TODOs! But none of them are errors, just things I want to explore.

@@ -86,6 +90,8 @@
//! fn serialize<S: Serializer>(&self, serializer: S) -> Result<(), ()> {
//! // not optimal to not re-use the vec and serializer, but it'll work
//! let mut vec = Vec::new();
//! // pick the format you want to use
jonhoo:

I'd like to see this expanded on a little.

@@ -176,13 +183,19 @@ mod benchmarks;
mod v2_serializer;
pub use self::v2_serializer::{V2Serializer, V2SerializeError};

#[path = "v2_deflate_serializer.rs"]
jonhoo:

You know that Rust, when confronted with mod foo, will look for both foo/mod.rs and ./foo.rs by default, right? I don't think this #[path] should be necessary.

marshallpierce (author):

You'd think that, and yet without this...

error: cannot declare a new module at this location
   --> src/serialization/serialization.rs:186:1
    |
186 | mod v2_deflate_serializer;
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
    = note: for more information, see issue #37872 <https://github.com/rust-lang/rust/issues/37872>

jonhoo:

Hmm... Reading issue rust-lang/rust#37872, it seems like this is caused by us using #[path] elsewhere, which sort of "taints" the path such that everything needs #[path] annotations. Weird, and unfortunate.


self.uncompressed_buf.clear();
self.compressed_buf.clear();
// TODO serialize directly into uncompressed_buf without the buffering inside v2_serializer
jonhoo:

Ahh, yeah, that's unfortunate :(


{
// TODO reuse deflate buf, or switch to lower-level flate2::Compress
let mut compressor = DeflateEncoder::new(&mut self.compressed_buf, Compression::Default);
jonhoo:

I wonder if we could speed up first-time serialization pretty substantially by resizing self.compressed_buf here to some fraction of self.uncompressed_buf.len(). This would avoid doing multiple Vec resizes during the first compression.

marshallpierce (author):

Sure, why not. That would make the pessimal case of single-use serializers less bad!

jonhoo commented Apr 6, 2017:

LGTM!

@jonhoo jonhoo merged commit c071d40 into HdrHistogram:master Apr 6, 2017
@marshallpierce marshallpierce deleted the serialize-v2-deflate branch April 11, 2017 23:31