[Actionable Observability] [SPIKE] Investigate SLO definition #139213
Comments
Pinging @elastic/actionable-observability (Team: Actionable Observability)
Defining and Registering a Saved Object in Kibana
Here is the Kibana Developer Guide for Saved Objects: https://docs.elastic.dev/kibana-dev-docs/key-concepts/saved-objects-intro
Here is a complete tutorial on defining a Saved Object and registering it: https://docs.elastic.dev/kibana-dev-docs/tutorials/saved-objects
Here is an example of a Saved Object type from the Infrastructure Monitoring UI
Here is an example of registering the type with the Saved Objects service kibana/x-pack/plugins/infra/server/plugin.ts Lines 143 to 147 in 908a01b
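As a rough illustration of what defining and registering an SLO saved object type could look like, here is a minimal sketch. It uses a small local stand-in for the Saved Objects service so that it runs on its own. The type name `slo`, the attribute fields, and the `FakeSavedObjectsSetup` class are all assumptions made for this example; in a real plugin the definition would be passed to `core.savedObjects.registerType` during `setup`, as described in the tutorial linked above.

```typescript
// Minimal local stand-ins so the sketch runs without Kibana.
// They mirror only the parts of a saved object type used here.
interface SloSavedObjectTypeSketch {
  name: string;
  hidden: boolean;
  namespaceType: 'single' | 'multiple' | 'agnostic';
  mappings: {
    dynamic: false;
    properties: Record<string, { type: string }>;
  };
}

// Hypothetical registry standing in for `core.savedObjects`.
class FakeSavedObjectsSetup {
  private readonly types = new Map<string, SloSavedObjectTypeSketch>();

  registerType(type: SloSavedObjectTypeSketch): void {
    // Reject duplicate registrations of the same type name.
    if (this.types.has(type.name)) {
      throw new Error(`Type "${type.name}" is already registered`);
    }
    this.types.set(type.name, type);
  }

  getRegisteredNames(): string[] {
    return Array.from(this.types.keys());
  }
}

// Hypothetical SLO definition type; the attribute names are assumptions.
const sloTypeSketch: SloSavedObjectTypeSketch = {
  name: 'slo',
  hidden: false,
  namespaceType: 'single',
  mappings: {
    dynamic: false,
    properties: {
      name: { type: 'keyword' },
      description: { type: 'text' },
      indicatorType: { type: 'keyword' },
      objectiveTarget: { type: 'float' },
    },
  },
};

const fakeSavedObjects = new FakeSavedObjectsSetup();
fakeSavedObjects.registerType(sloTypeSketch);
```

The explicit mapping with `dynamic: false` keeps the stored definition limited to known fields, which is one possible choice and not necessarily the one the team will make.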
Here is where the routes for Observability are defined: https://github.com/elastic/kibana/tree/main/x-pack/plugins/observability/server/routes
After our discussion with the transform team, I think we should also use this pipeline to create monthly indices. We will need to modify the
We will also need to add |
This spike has been completed and implementation has started.
Epic: #137323
RFC: https://docs.google.com/document/d/1-9w1WW9HoOCG7I4WAtTFi1Hfnh7BT11dctLVOQs7iwc/edit?usp=sharing
📝 Summary
We want to define how the SLO definition will be stored as a Kibana Saved Object. This SLO definition will later be used to generate a Transformer that aggregates the data.
As part of this epic, we want to focus on two types of SLOs:
🧪 Experimentation
Run Kibana and ES locally, and then follow the instructions in this repository to start generating APM data: https://github.com/fkanout/elastic-apm-api-alerts-generator
After a while, you'll notice some data under the o11y-app:
Now you need to create the following index mappings and settings that the rollup index will use.
Index mappings & settings
We can now start experimenting with aggregation and creating some transformers for the two SLOs:
Availability SLO
💡 This SLO uses APM metrics
This creates buckets of transaction.name (request endpoint) with good defined as the number of requests with an HTTP status code in [2xx, 3xx, 4xx], and total defined as the total number of requests.
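To make the good/total definition concrete, here is a minimal sketch of the same bucketing done in memory. It is not the aggregation or Transformer used in the experiment; the `RequestSample` shape and its field names are assumptions for illustration only.

```typescript
// Hypothetical shape of one request record; field names are assumptions.
interface RequestSample {
  transactionName: string;
  httpStatusCode: number;
}

interface GoodTotalCounts {
  good: number;
  total: number;
}

// A request counts as "good" when its status code is 2xx, 3xx, or 4xx.
function isGoodStatusCode(statusCode: number): boolean {
  return statusCode >= 200 && statusCode < 500;
}

// Group samples by transaction name and count good vs. total requests.
function countAvailabilityByTransaction(
  samples: RequestSample[]
): Map<string, GoodTotalCounts> {
  const buckets = new Map<string, GoodTotalCounts>();
  for (const sample of samples) {
    const bucket = buckets.get(sample.transactionName) ?? { good: 0, total: 0 };
    bucket.total += 1;
    if (isGoodStatusCode(sample.httpStatusCode)) {
      bucket.good += 1;
    }
    buckets.set(sample.transactionName, bucket);
  }
  return buckets;
}

// Made-up sample data for the two endpoints mentioned in this issue.
const availabilitySamples: RequestSample[] = [
  { transactionName: 'GET /flaky', httpStatusCode: 200 },
  { transactionName: 'GET /flaky', httpStatusCode: 500 },
  { transactionName: 'GET /flaky', httpStatusCode: 404 },
  { transactionName: 'GET /slow', httpStatusCode: 200 },
];
const availabilityBuckets = countAvailabilityByTransaction(availabilitySamples);
```

With this sample data, the 500 response is the only request that does not count as good; the 404 still counts, since 4xx is treated as a client-side outcome.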
Search apm-metrics with aggregation
Transformer
Latency SLO
💡 This SLO uses APM metrics
This creates buckets of transaction.name (request endpoint) with good defined as the number of requests with a latency < 3000ms, and total defined as the total number of requests.
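The latency definition can be sketched the same way, this time with a configurable threshold and the resulting good/total ratio per endpoint. This is an in-memory illustration under assumed names (`LatencySample`, `durationMs`, `computeLatencySli`), not the query or Transformer from the experiment.

```typescript
// Hypothetical shape of one request record; field names are assumptions.
interface LatencySample {
  transactionName: string;
  durationMs: number;
}

interface LatencySliBucket {
  good: number;
  total: number;
  ratio: number; // good / total
}

// Count requests strictly faster than the threshold as "good",
// grouped by transaction name.
function computeLatencySli(
  samples: LatencySample[],
  thresholdMs: number
): Map<string, LatencySliBucket> {
  const buckets = new Map<string, LatencySliBucket>();
  for (const sample of samples) {
    const bucket =
      buckets.get(sample.transactionName) ?? { good: 0, total: 0, ratio: 0 };
    bucket.total += 1;
    if (sample.durationMs < thresholdMs) {
      bucket.good += 1;
    }
    bucket.ratio = bucket.good / bucket.total;
    buckets.set(sample.transactionName, bucket);
  }
  return buckets;
}

// Made-up sample data; 3000 ms is the threshold used in this issue.
const latencySamples: LatencySample[] = [
  { transactionName: 'GET /slow', durationMs: 1200 },
  { transactionName: 'GET /slow', durationMs: 3500 },
  { transactionName: 'GET /slow', durationMs: 3000 },
  { transactionName: 'GET /slow', durationMs: 800 },
];
const latencyBuckets = computeLatencySli(latencySamples, 3000);
```

Note that a request taking exactly 3000 ms is not counted as good here, because the definition uses a strict `<` comparison.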
Search apm metrics with aggregation
Transformer
Latency SLO for "o11y-app" service and "GET /slow" transaction
Transformer
Visualization
We can then visualize the SLOs with a Lens. This Lens aggregates the metrics per hour; in a real-life example we might use 1d, 7d, or 30d instead.
We could also visualize the SLO per transaction.name, e.g. latency SLO > GET /slow or availability SLO > GET /flaky.
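For reference, the per-hour aggregation behind such a visualization can be sketched as summing good and total per hourly bucket and dividing. The `RollupDocSketch` shape below is an assumption about what a rolled-up document might contain, not the actual index mapping.

```typescript
// Hypothetical rolled-up document; field names are assumptions.
interface RollupDocSketch {
  timestampMs: number; // epoch milliseconds
  good: number;
  total: number;
}

interface HourlySliPoint {
  hourStartMs: number;
  sli: number; // sum(good) / sum(total) within the hour
}

const HOUR_MS = 60 * 60 * 1000;

// Sum good and total per UTC hour, then compute one SLI value per hour.
function sliPerHour(docs: RollupDocSketch[]): HourlySliPoint[] {
  const sums = new Map<number, { good: number; total: number }>();
  for (const doc of docs) {
    const hourStartMs = Math.floor(doc.timestampMs / HOUR_MS) * HOUR_MS;
    const sum = sums.get(hourStartMs) ?? { good: 0, total: 0 };
    sum.good += doc.good;
    sum.total += doc.total;
    sums.set(hourStartMs, sum);
  }
  return Array.from(sums.entries())
    .filter(([, sum]) => sum.total > 0) // skip hours without traffic
    .sort(([a], [b]) => a - b)
    .map(([hourStartMs, sum]) => ({ hourStartMs, sli: sum.good / sum.total }));
}

// Made-up documents spanning two hours.
const rollupDocsSketch: RollupDocSketch[] = [
  { timestampMs: Date.UTC(2022, 7, 22, 10, 15), good: 9, total: 10 },
  { timestampMs: Date.UTC(2022, 7, 22, 10, 45), good: 6, total: 10 },
  { timestampMs: Date.UTC(2022, 7, 22, 11, 5), good: 5, total: 5 },
];
const hourlySliPoints = sliPerHour(rollupDocsSketch);
```

Summing good and total before dividing weights each hour by its traffic, which avoids averaging ratios from buckets of very different sizes. Switching to 1d, 7d, or 30d would only change the bucket width.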
❓ Questions