[extensions/file_storage] Change bbolt settings for better performance on large DB files #9003

Closed
swiatekm opened this issue Apr 1, 2022 · 2 comments · Fixed by #9004
Comments

@swiatekm
Contributor

swiatekm commented Apr 1, 2022

Is your feature request related to a problem? Please describe.
File storage performance degrades significantly as the maximum size the bbolt DB file has reached over its lifetime grows. Notably, this is true even if the DB is currently empty, as bbolt never frees disk space unless manually instructed to compact the file (which is tricky to do online). The reason for the degradation is that bbolt keeps a freelist data structure to track free storage space, and the default implementation of this structure is prone to fragmentation. In the default configuration, bbolt also syncs this freelist to disk on every transaction, which can be expensive if the freelist is large.

See the additional context section for benchmarks.

Describe the solution you'd like

  1. Disable freelist syncing for bbolt. This increases startup time by two orders of magnitude (~2 ms -> ~200 ms for a 2 GB DB file), but makes every operation close to 2x faster. See benchmarks below and the bbolt documentation for reference.
  2. Switch to a different freelist data structure (the hashmap-based one instead of the default array-based one). This makes small DBs somewhat slower (roughly 10-20% in the benchmarks below), but large ones dramatically faster (roughly 40x for BenchmarkClientSetLargeDB). See the bbolt documentation for freelist types for reference.

Describe alternatives you've considered
Compaction solves this problem, but requires manually enabling it and restarting the application. It also requires additional disk space and increases startup time in much the same way as not syncing the freelist does.

This is a problem we originally discovered in core's persistent queue. It can also be mitigated there by explicitly rotating DB files. This is fairly complex though, and the option changes look like a performance win for everyone with little downside.

Additional context
I've added two additional benchmarks to the existing suite.
BenchmarkClientSetLargeDB runs the same benchmark as BenchmarkClientSet, but first prepares the DB by inserting 2000 values of 1 MiB each and then deleting them. BenchmarkClientInitLargeDB does the same preparation, and then reopens the DB, forcing it to regenerate the freelist. I'm going to submit a PR with these shortly.

With current settings:

goos: darwin
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/extension/storage/filestorage
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
BenchmarkClientGet-16            	   83074	     13755 ns/op	    5432 B/op	      29 allocs/op
BenchmarkClientGet100-16         	   54278	     24637 ns/op	    7814 B/op	     128 allocs/op
BenchmarkClientSet-16            	   46999	     22104 ns/op	    6276 B/op	      42 allocs/op
BenchmarkClientSet100-16         	   14275	     71188 ns/op	   18338 B/op	     343 allocs/op
BenchmarkClientDelete-16         	   69243	     30410 ns/op	    5432 B/op	      29 allocs/op
BenchmarkClientSetLargeDB-16     	     445	   2397399 ns/op	 8525425 B/op	      63 allocs/op
BenchmarkClientInitLargeDB-16    	     768	   1507886 ns/op	  623847 B/op	      72 allocs/op

NoFreelistSync: true:

goos: darwin
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/extension/storage/filestorage
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
BenchmarkClientGet-16            	  174212	      6659 ns/op	    5112 B/op	      19 allocs/op
BenchmarkClientGet100-16         	   82245	     14644 ns/op	    7488 B/op	     118 allocs/op
BenchmarkClientSet-16            	   87408	     13125 ns/op	    6179 B/op	      38 allocs/op
BenchmarkClientSet100-16         	   22010	     54789 ns/op	   18185 B/op	     338 allocs/op
BenchmarkClientDelete-16         	  184400	      6587 ns/op	    5112 B/op	      19 allocs/op
BenchmarkClientSetLargeDB-16     	    1609	   1031052 ns/op	 4217779 B/op	      38 allocs/op
BenchmarkClientInitLargeDB-16    	       7	 144174763 ns/op	43807000 B/op	   25171 allocs/op

NoFreelistSync: true, FreelistType: bbolt.FreelistMapType:

goos: darwin
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/extension/storage/filestorage
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
BenchmarkClientGet-16            	  155553	      7506 ns/op	    5088 B/op	      18 allocs/op
BenchmarkClientGet100-16         	   71113	     16664 ns/op	    7464 B/op	     117 allocs/op
BenchmarkClientSet-16            	   79338	     15134 ns/op	    6354 B/op	      40 allocs/op
BenchmarkClientSet100-16         	   19621	     66046 ns/op	   18646 B/op	     347 allocs/op
BenchmarkClientDelete-16         	  156417	      8031 ns/op	    5088 B/op	      18 allocs/op
BenchmarkClientSetLargeDB-16     	   70166	     25884 ns/op	    6372 B/op	      41 allocs/op
BenchmarkClientInitLargeDB-16    	       6	 187204655 ns/op	44238689 B/op	   25171 allocs/op
@jpkrohling
Member

cc @djaglowski

@swiatekm
Contributor Author

swiatekm commented Apr 4, 2022

I didn't know this while submitting this issue, but apparently the settings I proposed are also what etcd uses by default.
