
Metadata caching for parquet blocks #1564

Merged: 11 commits into grafana:main on Jul 15, 2022

Conversation


@annanay25 (Contributor) commented Jul 12, 2022

Signed-off-by: Annanay <annanayagarwal@gmail.com>

What this PR does:

  • Updates the parquet-go library to pull in some caching-related changes and other optimisations
  • Adds caching for certain metadata objects in parquet blocks: Header, Footer, BloomFilters, ColumnIndex, OffsetIndex. This relaxes the current requirement to load these (small) metadata objects from the backend on every search request (a sketch of the mechanism follows this list).
  • Uses the same cache that is currently configured to store bloom filters.
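
For readers skimming the mechanism, here is a minimal sketch of the idea in Go. The type and method names are illustrative, not this PR's actual code: a ReaderAt wrapper records the offset and length of each metadata object, and only a read that exactly matches a recorded extent is flagged as cacheable.

```go
package blockcache

import "io"

// backendReaderAt stands in for Tempo's backend reader; shouldCache tells
// it whether this range should also be stored in the off-process cache.
type backendReaderAt interface {
	ReadRange(p []byte, off int64, shouldCache bool) (int, error)
}

type cachedReaderAt struct {
	r backendReaderAt
	// cachedObjects maps a metadata object's offset to its length
	// (header, footer, bloom filters, column/offset indexes).
	cachedObjects map[int64]int64
}

// MarkCacheable records a metadata extent so that a later read of exactly
// that range is sent to the cache-backed path.
func (c *cachedReaderAt) MarkCacheable(off, length int64) {
	c.cachedObjects[off] = length
}

// ReadAt routes a read: an exact match on a recorded extent sets
// shouldCache=true; everything else is a plain backend read.
func (c *cachedReaderAt) ReadAt(p []byte, off int64) (int, error) {
	return c.r.ReadRange(p, off, c.cachedObjects[off] == int64(len(p)))
}

var _ io.ReaderAt = (*cachedReaderAt)(nil)
```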

Which issue(s) this PR fixes:
Part of #1480

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@joe-elliott (Member) left a comment


I'll admit I don't see a ton of benefit to an in memory cache, particularly one that only exists for the scope of a single search. From what I've seen parquet is good about only requesting metadata elements like the footer one time.

I think just adding support for the off process cache in parquet would be a nice boost.

@mdisibio (Contributor) replied

> I'll admit I don't see a ton of benefit to an in memory cache, particularly one that only exists for the scope of a single search. From what I've seen parquet is good about only requesting metadata elements like the footer one time.
>
> I think just adding support for the off process cache in parquet would be a nice boost.

This is making use of the off-process cache via the new ReadRange(shouldCache) param: https://github.com/grafana/tempo/pull/1564/files#diff-e306c61bbe2576e7be962548144f5938df0cfbb1f2e5f8cfaa6c8211b7368cc2R36. The map is used to record the extents of the "metadata" sections: footer/index offsets and lengths.
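
As an illustration of how one of those extents can be derived (continuing the illustrative blockcache package sketched in the PR description; this is not the PR's actual code): a parquet file ends with a 4-byte little-endian footer length followed by the "PAR1" magic, so the footer's offset and length follow from the file size alone.

```go
package blockcache

import "encoding/binary"

// markFooterExtent reads the 8-byte trailer and records the footer's
// offset+length in cachedObjects, so the next read of exactly that range
// is routed to the off-process cache.
func markFooterExtent(c *cachedReaderAt, fileSize int64) error {
	var tail [8]byte // footer length (uint32, little-endian) + "PAR1"
	if _, err := c.ReadAt(tail[:], fileSize-8); err != nil {
		return err
	}
	footerLen := int64(binary.LittleEndian.Uint32(tail[:4]))
	c.MarkCacheable(fileSize-8-footerLen, footerLen)
	return nil
}
```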

@joe-elliott (Member) replied

> This is making use of the off-process cache via the new ReadRange(shouldCache) param. The map is used to record the extents of the "metadata" sections: footer/index offsets and lengths.

Wups, read it too quickly. Thank you 👍

```go
}

// check if the offset and length are stored as a special object
if r.cachedObjects[off] == int64(len(p)) {
```
@joe-elliott (Member) commented on this diff

We're kind of "forking" the read path here: we're sending special queries off to the backend reader and other queries to the in-memory buffered reader. Is there a cleaner implementation here?

It feels like all calls should go through the same layers:
Parquet File -> In memory cache -> Off process cache (for specific segments of the file) -> Object storage

Is there a clean way to achieve this?

@annanay25 (Contributor, Author) replied

All calls are going through the same layers, i.e. Tempo Querier -> Cached Reader -> Backend Reader -> Object storage. The same underlying BackendReaderAt is called each time. If you look at this piece of logic here - https://github.com/grafana/tempo/pull/1564/files#diff-6c42681fadfdec225eb9164d0d49a2fe2033bb8ff0e820ecbd5010bb0bb629d8R37-R51 - the difference is just that one of the readers is buffered.

We are just setting shouldCache to true in some special cases and leaving it as false for most other calls (sketched below).

Not sure if I'm missing something, we can also chat offline to discuss.
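
Reusing the illustrative types from the sketch in the PR description (a sketch of the routing being described, not the linked diff itself): one shared backend, with exact metadata extents bypassing the buffered reader.

```go
package blockcache

import "io"

// readAtRouted shows the "fork": exact metadata extents go straight to
// the backend with shouldCache=true; everything else goes through the
// buffered reader, which wraps the same backend with shouldCache=false.
func (c *cachedReaderAt) readAtRouted(buffered io.ReaderAt, p []byte, off int64) (int, error) {
	if c.cachedObjects[off] == int64(len(p)) {
		return c.r.ReadRange(p, off, true) // special object: cache it
	}
	return buffered.ReadAt(p, off) // ordinary read: buffered path
}
```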

@joe-elliott (Member) commented Jul 13, 2022

Yeah, I can't say I have great ideas here. I think what I expect in some perfect world is

  • only one "reader" being passed to this object which all calls go through.
  • the special ranges (footer, metadata, etc) also passing through the in memory buffer

However, I think any attempt to do this while also caching the special ranges would be difficult and would make other sections of code worse. My largest concern is that by adding this code we will be increasing calls to the backend, especially in cases where the off-process cache is not configured. Here is a read listing from searching the cluster column:

```
len=77631 off=664715997 hit=false dur=127.416656ms
len=4096 off=4645575 hit=false dur=324.308924ms
len=4096 off=4649671 hit=true dur=324.310427ms
len=4096 off=4653767 hit=true dur=324.312226ms
...
len=4096 off=47668776 hit=false dur=695.911273ms
len=4096 off=47672872 hit=true dur=695.912669ms
len=4096 off=47676968 hit=true dur=695.913873ms
...
len=4096 off=85484375 hit=false dur=1.965658619s
len=4096 off=85488471 hit=true dur=1.96566025s
len=4096 off=85492567 hit=true dur=1.965661536s
...
```

The first read is the footer. Then it attempts to start reading the column in one of the row groups. You can see there is a buffered-cache miss (hit=false) every time it jumps to a new row group. If that first call is grabbing some kind of section metadata, would it increase the number of calls to the backend because it wouldn't be hitting the buffered reader?
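
To make the hit/miss pattern above concrete, here is a rough sketch of the behaviour the listing shows (illustrative only, not Tempo's actual buffered reader): a miss fetches a larger span from the backend, and subsequent sequential 4 KiB reads within that span are hits until the next row-group jump.

```go
package blockcache

import "io"

// bufferedReaderAt serves reads out of the most recently fetched span.
type bufferedReaderAt struct {
	r      io.ReaderAt
	buf    []byte
	bufOff int64
	span   int // bytes to prefetch on a miss
}

func (b *bufferedReaderAt) ReadAt(p []byte, off int64) (int, error) {
	// hit: the requested range lies entirely within the buffered span
	if off >= b.bufOff && off+int64(len(p)) <= b.bufOff+int64(len(b.buf)) {
		return copy(p, b.buf[off-b.bufOff:]), nil
	}
	// miss: fetch a larger span starting at off, then serve from it
	span := make([]byte, b.span)
	n, err := b.r.ReadAt(span, off)
	if err != nil && err != io.EOF {
		return 0, err
	}
	b.buf, b.bufOff = span[:n], off
	if got := copy(p, b.buf); got < len(p) {
		return got, io.EOF
	}
	return len(p), nil
}
```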

Maybe all I'm wanting here is a way to configure this on/off to assess impact.

@annanay25 (Contributor, Author) replied

Discussed offline, and went ahead with making the cache settings configurable (a sketch of the shape below). Let me know if this works!
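
The shape of such per-object toggles might look roughly like the struct below. This is a sketch: the field names are assumptions, not necessarily the option names this PR shipped, so check the documentation added in the diff.

```go
package blockcache

// CacheControl is an illustrative shape for per-object cache toggles.
// Field names here are assumptions, not necessarily what the PR shipped.
type CacheControl struct {
	Footer      bool `yaml:"footer"`       // cache the parquet footer
	ColumnIndex bool `yaml:"column_index"` // cache column indexes
	OffsetIndex bool `yaml:"offset_index"` // cache offset indexes
}
```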

@mdisibio (Contributor) left a comment

LGTM. Nice work seeing this improvement through end-to-end.

@annanay25 merged commit 9be0ae5 into grafana:main on Jul 15, 2022