Delete TempoRequestErrors alert from mixin
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
zalegrala committed Nov 2, 2022
1 parent 3cd396c commit b639c20
Showing 3 changed files with 2 additions and 35 deletions.
13 changes: 0 additions & 13 deletions operations/tempo-mixin-compiled/alerts.yaml
@@ -1,19 +1,6 @@
 "groups":
 - "name": "tempo_alerts"
   "rules":
-  - "alert": "TempoRequestErrors"
-    "annotations":
-      "message": |
-        {{ $labels.job }} {{ $labels.route }} is experiencing {{ printf "%.2f" $value }}% errors.
-      "runbook_url": "https://github.com/grafana/tempo/tree/main/operations/tempo-mixin/runbook.md#TempoRequestErrors"
-    "expr": |
-      100 * sum(rate(tempo_request_duration_seconds_count{status_code=~"5.."}[1m])) by (cluster, namespace, job, route)
-        /
-      sum(rate(tempo_request_duration_seconds_count[1m])) by (cluster, namespace, job, route)
-        > 10
-    "for": "15m"
-    "labels":
-      "severity": "critical"
   - "alert": "TempoRequestLatency"
     "annotations":
       "message": |
19 changes: 0 additions & 19 deletions operations/tempo-mixin/alerts.libsonnet
@@ -4,25 +4,6 @@
       {
         name: 'tempo_alerts',
         rules: [
-          {
-            alert: 'TempoRequestErrors',
-            expr: |||
-              100 * sum(rate(tempo_request_duration_seconds_count{status_code=~"5.."}[1m])) by (%(group_by_job)s, route)
-                /
-              sum(rate(tempo_request_duration_seconds_count[1m])) by (%(group_by_job)s, route)
-                > 10
-            ||| % $._config,
-            'for': '15m',
-            labels: {
-              severity: 'critical',
-            },
-            annotations: {
-              message: |||
-                {{ $labels.job }} {{ $labels.route }} is experiencing {{ printf "%.2f" $value }}% errors.
-              |||,
-              runbook_url: 'https://github.com/grafana/tempo/tree/main/operations/tempo-mixin/runbook.md#TempoRequestErrors',
-            },
-          },
           {
             alert: 'TempoRequestLatency',
             expr: |||
5 changes: 2 additions & 3 deletions operations/tempo-mixin/runbook.md
@@ -1,8 +1,7 @@
 # Runbook
 
-This document should help with remediating operational issues in Tempo.
+This document should help with remediation of operational issues in Tempo.
 
-## TempoRequestErrors
 ## TempoRequestLatency
 
 Aside from obvious errors in the logs the only real lever you can pull here is scaling. Use the Reads or Writes dashboard
@@ -281,4 +280,4 @@ The error "Unexpected error reloading meta for local block. Ignoring and continu
 meta.json. Repair the meta.json and then restart the ingester to successfully recover the block. Or if
 it is not able to be repaired then the block files can be simply deleted as the ingester has already started
 without it. As long as the replication factor is 2 or higher, then there will be no data loss as the
-same data was also written to another ingester.
+same data was also written to another ingester.
