Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compactor: Partially drop traces that exceed max_bytes_per_trace #1317

Merged
merged 9 commits into from
Mar 7, 2022

Conversation

joe-elliott
Copy link
Member

@joe-elliott joe-elliott commented Mar 1, 2022

What this PR does:
Updates the compactor to not write an object back into a block that exceeds the per tenant max_bytes_per_trace. Moves the tempo_discarded_spans_total metric into a shared location for both the distributor and the compactor to use. The distributor now adds a spans discarded reason of trace_too_large_to_compact.

A lot of subtle choices were made when creating this PR and ultimately I erred on the side of submitting a minimal change, but please review with the following in mind:

  • The compactor will go through the work of creating the trace even when it will just choose to throw it away immediately afterward. I chose to do this because we are only seeing issues when traces grow into the 100s of MBs. This is occurring on tenants with significantly lower max_bytes_per_trace. This change will catch and discard huge traces before they become a problem.

  • After throwing a combined trace away the compactor will attempt to count the number of spans it threw away, but this number is almost certainly overcounted. The compactor simply counts all the spans in the discarded objects even though it's likely that most of those spans are in the returned object. The only way I think to make this value correct would be fairly invasive changes into the ObjectCombiner. I felt like that complexity would not be worth it, but feel free to advise otherwise.

Which issue(s) this PR fixes:
Fixes #976

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
@joe-elliott joe-elliott changed the title Compactor: Don't combine traces that exceed max_bytes_per_trace Compactor: Discard traces that exceed max_bytes_per_trace Mar 1, 2022
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
@joe-elliott joe-elliott changed the title Compactor: Discard traces that exceed max_bytes_per_trace Compactor: Partially drop traces that exceed max_bytes_per_trace Mar 2, 2022
Signed-off-by: Joe Elliott <number101010@gmail.com>
@joe-elliott joe-elliott merged commit 59ce870 into grafana:main Mar 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Compactors require high memory as traces combine and grow in the backend
2 participants