
Metadata caching for parquet blocks #1564

Merged: 11 commits into grafana:main on Jul 15, 2022

Conversation


@annanay25 (Contributor) commented Jul 12, 2022

Signed-off-by: Annanay <annanayagarwal@gmail.com>

What this PR does:

  • Updates the parquet-go library to pull in some caching-related changes and other optimisations
  • Adds caching for certain metadata objects in parquet blocks: Header, Footer, BloomFilters, ColumnIndex, OffsetIndex. This relaxes the current requirement to load these (small) metadata objects from the backend on every search request (a sketch of the mechanism follows this list).
  • Uses the same cache that is currently configured to store bloom filters.
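
For readers skimming the mechanism, here is a minimal sketch of the idea in Go. The type and method names are illustrative, not this PR's actual code: a ReaderAt wrapper records the offset and length of each metadata object, and only a read that exactly matches a recorded extent is flagged as cacheable.

```go
package blockcache

import "io"

// backendReaderAt stands in for Tempo's backend reader; shouldCache tells
// it whether this range should also be stored in the off-process cache.
type backendReaderAt interface {
	ReadRange(p []byte, off int64, shouldCache bool) (int, error)
}

type cachedReaderAt struct {
	r backendReaderAt
	// cachedObjects maps a metadata object's offset to its length
	// (header, footer, bloom filters, column/offset indexes).
	cachedObjects map[int64]int64
}

// MarkCacheable records a metadata extent so that a later read of exactly
// that range is sent to the cache-backed path.
func (c *cachedReaderAt) MarkCacheable(off, length int64) {
	c.cachedObjects[off] = length
}

// ReadAt routes a read: an exact match on a recorded extent sets
// shouldCache=true; everything else is a plain backend read.
func (c *cachedReaderAt) ReadAt(p []byte, off int64) (int, error) {
	return c.r.ReadRange(p, off, c.cachedObjects[off] == int64(len(p)))
}

var _ io.ReaderAt = (*cachedReaderAt)(nil)
```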

Which issue(s) this PR fixes:
Part of #1480

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@joe-elliott (Member) left a comment


I'll admit I don't see a ton of benefit to an in memory cache, particularly one that only exists for the scope of a single search. From what I've seen parquet is good about only requesting metadata elements like the footer one time.

I think just adding support for the off process cache in parquet would be a nice boost.

@mdisibio (Contributor) replied

> I'll admit I don't see a ton of benefit to an in memory cache, particularly one that only exists for the scope of a single search. From what I've seen parquet is good about only requesting metadata elements like the footer one time.
>
> I think just adding support for the off process cache in parquet would be a nice boost.

This is making use of the off-process cache via the new ReadRange(shouldCache) param: https://github.com/grafana/tempo/pull/1564/files#diff-e306c61bbe2576e7be962548144f5938df0cfbb1f2e5f8cfaa6c8211b7368cc2R36. The map is used to record the extents of the "metadata" sections: footer/index offsets and lengths.
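
As an illustration of how one of those extents can be derived (continuing the illustrative blockcache package sketched in the PR description; this is not the PR's actual code): a parquet file ends with a 4-byte little-endian footer length followed by the "PAR1" magic, so the footer's offset and length follow from the file size alone.

```go
package blockcache

import "encoding/binary"

// markFooterExtent reads the 8-byte trailer and records the footer's
// offset+length in cachedObjects, so the next read of exactly that range
// is routed to the off-process cache.
func markFooterExtent(c *cachedReaderAt, fileSize int64) error {
	var tail [8]byte // footer length (uint32, little-endian) + "PAR1"
	if _, err := c.ReadAt(tail[:], fileSize-8); err != nil {
		return err
	}
	footerLen := int64(binary.LittleEndian.Uint32(tail[:4]))
	c.MarkCacheable(fileSize-8-footerLen, footerLen)
	return nil
}
```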

@joe-elliott (Member) replied

> This is making use of the off-process cache via the new ReadRange(shouldCache) param. The map is used to record the extents of the "metadata" sections: footer/index offsets and lengths.

Wups, read it too quickly. Thank you 👍

```go
}

// check if the offset and length are stored as a special object
if r.cachedObjects[off] == int64(len(p)) {
```
@joe-elliott (Member) commented on this diff

We're kind of "forking" the read path here: we're sending special queries off to the backend reader and other queries to the in-memory buffered reader. Is there a cleaner implementation here?

It feels like all calls should go through the same layers:
Parquet File -> In memory cache -> Off process cache (for specific segments of the file) -> Object storage

Is there a clean way to achieve this?

@annanay25 (Contributor, Author) replied

All calls are going through the same layers, i.e. Tempo Querier -> Cached Reader -> Backend Reader -> Object storage. The same underlying BackendReaderAt is called each time. If you look at this piece of logic here - https://github.com/grafana/tempo/pull/1564/files#diff-6c42681fadfdec225eb9164d0d49a2fe2033bb8ff0e820ecbd5010bb0bb629d8R37-R51 - the difference is just that one of the readers is buffered.

We are just setting shouldCache to true in some special cases and leaving it as false for most other calls (sketched below).

Not sure if I'm missing something, we can also chat offline to discuss.
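
Reusing the illustrative types from the sketch in the PR description (a sketch of the routing being described, not the linked diff itself): one shared backend, with exact metadata extents bypassing the buffered reader.

```go
package blockcache

import "io"

// readAtRouted shows the "fork": exact metadata extents go straight to
// the backend with shouldCache=true; everything else goes through the
// buffered reader, which wraps the same backend with shouldCache=false.
func (c *cachedReaderAt) readAtRouted(buffered io.ReaderAt, p []byte, off int64) (int, error) {
	if c.cachedObjects[off] == int64(len(p)) {
		return c.r.ReadRange(p, off, true) // special object: cache it
	}
	return buffered.ReadAt(p, off) // ordinary read: buffered path
}
```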

@joe-elliott (Member) commented Jul 13, 2022

Yeah, I can't say I have great ideas here. I think what I expect in some perfect world is

  • only one "reader" being passed to this object which all calls go through.
  • the special ranges (footer, metadata, etc) also passing through the in memory buffer

However, I think any attempt to do this while also caching the special ranges would be difficult and would make other sections of code worse. My largest concern is that by adding this code we will be increasing calls to the backend, especially in cases where the off-process cache is not configured. Here is a read listing from searching the cluster column:

```
len=77631 off=664715997 hit=false dur=127.416656ms
len=4096 off=4645575 hit=false dur=324.308924ms
len=4096 off=4649671 hit=true dur=324.310427ms
len=4096 off=4653767 hit=true dur=324.312226ms
...
len=4096 off=47668776 hit=false dur=695.911273ms
len=4096 off=47672872 hit=true dur=695.912669ms
len=4096 off=47676968 hit=true dur=695.913873ms
...
len=4096 off=85484375 hit=false dur=1.965658619s
len=4096 off=85488471 hit=true dur=1.96566025s
len=4096 off=85492567 hit=true dur=1.965661536s
...
```

The first read is the footer. Then it attempts to start reading the column in one of the row groups. You can see there is a buffered-cache miss (hit=false) every time it jumps to a new row group. If that first call is grabbing some kind of section metadata, would it increase the number of calls to the backend because it wouldn't be hitting the buffered reader?
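
To make the hit/miss pattern above concrete, here is a rough sketch of the behaviour the listing shows (illustrative only, not Tempo's actual buffered reader): a miss fetches a larger span from the backend, and subsequent sequential 4 KiB reads within that span are hits until the next row-group jump.

```go
package blockcache

import "io"

// bufferedReaderAt serves reads out of the most recently fetched span.
type bufferedReaderAt struct {
	r      io.ReaderAt
	buf    []byte
	bufOff int64
	span   int // bytes to prefetch on a miss
}

func (b *bufferedReaderAt) ReadAt(p []byte, off int64) (int, error) {
	// hit: the requested range lies entirely within the buffered span
	if off >= b.bufOff && off+int64(len(p)) <= b.bufOff+int64(len(b.buf)) {
		return copy(p, b.buf[off-b.bufOff:]), nil
	}
	// miss: fetch a larger span starting at off, then serve from it
	span := make([]byte, b.span)
	n, err := b.r.ReadAt(span, off)
	if err != nil && err != io.EOF {
		return 0, err
	}
	b.buf, b.bufOff = span[:n], off
	if got := copy(p, b.buf); got < len(p) {
		return got, io.EOF
	}
	return len(p), nil
}
```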

Maybe all I'm wanting here is a way to configure this on/off to assess impact.

@annanay25 (Contributor, Author) replied

Discussed offline, and went ahead with making the cache settings configurable (a sketch of the shape below). Let me know if this works!
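
The shape of such per-object toggles might look roughly like the struct below. This is a sketch: the field names are assumptions, not necessarily the option names this PR shipped, so check the documentation added in the diff.

```go
package blockcache

// CacheControl is an illustrative shape for per-object cache toggles.
// Field names here are assumptions, not necessarily what the PR shipped.
type CacheControl struct {
	Footer      bool `yaml:"footer"`       // cache the parquet footer
	ColumnIndex bool `yaml:"column_index"` // cache column indexes
	OffsetIndex bool `yaml:"offset_index"` // cache offset indexes
}
```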

@mdisibio (Contributor) left a comment

LGTM. Nice work seeing this improvement through end-to-end.

@annanay25 merged commit 9be0ae5 into grafana:main on Jul 15, 2022