Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Output json optimizations #2436

Merged
merged 5 commits into from
Mar 16, 2022
Merged

Output json optimizations #2436

merged 5 commits into from
Mar 16, 2022

Conversation

mstoykov
Copy link
Contributor

@mstoykov mstoykov commented Mar 11, 2022

Mostly provoked by this community forum post.

Moving to using easyjson gave some improvement, but then I noticed we are spending a bunch of time in compression so decided to try out a different gzip implementation.

Combined improvement is twice better on both a CPU and allocations

name            old time/op    new time/op    delta
FlushMetrics-8    37.2ms ± 6%    18.2ms ± 6%  -50.99%  (p=0.000 n=10+9)

name            old alloc/op   new alloc/op   delta
FlushMetrics-8    2.44MB ± 1%    2.52MB ± 0%   +3.28%  (p=0.000 n=10+10)

name            old allocs/op  new allocs/op  delta
FlushMetrics-8     47.3k ± 1%     20.1k ± 0%  -57.59%  (p=0.000 n=10+10)

Breaking change: the tag keys are now not sorted in the output. This is due to no longer using the stdlib marshalling which does this.

Changelog entry under "new features" (I guess):

The JSON output was optimized and now should around 2x as more performant at outputting metrics. 

This means that it either will be able to export twice as many metrics and/or to use half the resources to do the same amount of metrics.

As a side effect there is a slight breaking change in which the `tags` field is no longer sorted, while it was previously. 

No changes in benchmarks but still a good idea.

Likely the gzip package has enough of a buffer for this to not matter in
this case
name            old time/op    new time/op    delta
FlushMetrics-8    37.6ms ±11%    23.2ms ±12%  -38.41%  (p=0.000 n=10+10)

name            old alloc/op   new alloc/op   delta
FlushMetrics-8    2.44MB ± 1%    2.52MB ± 0%   +3.54%  (p=0.000 n=10+10)

name            old allocs/op  new allocs/op  delta
FlushMetrics-8     47.2k ± 1%     20.1k ± 0%  -57.47%  (p=0.000 n=10+10)
name            old time/op    new time/op    delta
FlushMetrics-8    23.2ms ±12%    18.2ms ± 6%  -21.31%  (p=0.000 n=10+9)

name            old alloc/op   new alloc/op   delta
FlushMetrics-8    2.52MB ± 0%    2.52MB ± 0%     ~     (p=0.324 n=10+10)

name            old allocs/op  new allocs/op  delta
FlushMetrics-8     20.1k ± 0%     20.1k ± 0%     ~     (all equal)
@mstoykov mstoykov added this to the v0.38.0 milestone Mar 11, 2022
@github-actions github-actions bot requested review from na-- and oleiade March 11, 2022 10:23
@mstoykov
Copy link
Contributor Author

mstoykov commented Mar 11, 2022

Some of those can likely be used in the output/cloud as well as they don't seem that complicated to apply

using jwriter is like 10-15% faster and kaluspost gzip (which is the majority of the additional lines, unfortunately) seems to give like 20% better performance so that seem also quite nice.

Copy link
Member

@na-- na-- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, though please add a PR description with a short description and a benchmark comparison for the cumulative speedup of all commits

@mstoykov
Copy link
Contributor Author

@na-- updated

@mstoykov
Copy link
Contributor Author

Note: if we change the time to be int64 instead of time.Time we can get even better performance
Compared to the optimized version here:

name            old time/op    new time/op    delta
FlushMetrics-8    18.2ms ± 6%    11.3ms ±12%  -37.96%  (p=0.000 n=9+10)

name            old alloc/op   new alloc/op   delta
FlushMetrics-8    2.52MB ± 0%    1.85MB ± 0%  -26.64%  (p=0.000 n=10+9)

name            old allocs/op  new allocs/op  delta
FlushMetrics-8     20.1k ± 0%      7.6k ± 0%  -62.31%  (p=0.000 n=10+10)

Compared to the original:

name            old time/op    new time/op    delta
FlushMetrics-8    37.2ms ± 6%    11.3ms ±12%  -69.59%  (p=0.000 n=10+10)

name            old alloc/op   new alloc/op   delta
FlushMetrics-8    2.44MB ± 1%    1.85MB ± 0%  -24.23%  (p=0.000 n=10+9)

name            old allocs/op  new allocs/op  delta
FlushMetrics-8     47.3k ± 1%      7.6k ± 0%  -84.01%  (p=0.000 n=10+10)

Unfortunately I can't figure out a not terrible way to support both so we can be backwards compatible :(

Also, of note is that I remembered there is a slight breaking change - the tag keys are now not sorted in the JSON output.

@mstoykov mstoykov merged commit 03e938d into master Mar 16, 2022
@mstoykov mstoykov deleted the outputJSONOptimizations branch March 16, 2022 16:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants