Wrong percentile values for metrics shown in Datadog #2819

imiric · 2022-12-13T18:33:29Z

Brief summary

As reported several times on the forum (1, 2, 3), the percentile values users can visualize in Datadog are often wrong when compared to the end-of-test summary shown by k6.

As explained here, this is likely caused by additional aggregation in the DogStatsD agent: by default it only generates the 95percentile metric, and the query used in the Datadog graph then aggregates that value again.
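To illustrate why a re-aggregated percentile can drift from the end-of-test summary, here is a minimal Go sketch (not k6 or Datadog code). It assumes made-up samples split across two flush windows, a nearest-rank percentile, and a graph query that averages the per-window p95 values; the real Agent and query behaviour may differ.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the nearest-rank p-th percentile (0 < p <= 100)
// of a non-empty slice, without modifying the input.
func percentile(values []float64, p float64) float64 {
	sorted := append([]float64(nil), values...)
	sort.Float64s(sorted)
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	return sorted[rank-1]
}

// sampleWindows returns hypothetical request durations (ms) grouped by
// flush window: 100 fast requests, then 4 slow ones.
func sampleWindows() [][]float64 {
	fast := make([]float64, 100)
	for i := range fast {
		fast[i] = 10
	}
	slow := []float64{1000, 1000, 1000, 1000}
	return [][]float64{fast, slow}
}

// flatten merges all windows back into one slice of raw samples.
func flatten(windows [][]float64) []float64 {
	var all []float64
	for _, w := range windows {
		all = append(all, w...)
	}
	return all
}

// meanOfWindowPercentiles computes the percentile per window and then
// averages those values, i.e. it aggregates an already aggregated metric.
func meanOfWindowPercentiles(windows [][]float64, p float64) float64 {
	sum := 0.0
	for _, w := range windows {
		sum += percentile(w, p)
	}
	return sum / float64(len(windows))
}

func main() {
	windows := sampleWindows()
	fmt.Printf("p95 over raw samples:   %.0f ms\n", percentile(flatten(windows), 95))
	fmt.Printf("mean of per-window p95: %.0f ms\n", meanOfWindowPercentiles(windows, 95))
}
```

With these samples the p95 over the raw data is 10 ms, while averaging the two per-window p95 values gives 505 ms. Neither number is "wrong" on its own, but only the first corresponds to what the k6 end-of-test summary computes.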

Besides resulting in wrong values, users are unable to use raw metric data to generate any percentile value they need.

k6 version

0.41.0

OS

any

Steps to reproduce the problem

Follow the steps to set up the Datadog Agent, and run a test as explained here.
Go to the Datadog web UI and visualize the k6.http_req_duration.max and k6.http_req_duration.95percentile metrics, and compare them to the end-of-test summary shown by k6. Notice that they don't match.

Expected behaviour

The percentiles shown in Datadog should match the end-of-test summary shown by k6.
The user should be able to generate any percentile over raw metric data sent to Datadog. Ideally, the Datadog Agent (DogStatsD) shouldn't do any aggregation at all.

Actual behaviour

The percentiles are different, and the Datadog 95percentile is confusing.
The user can only use a limited number of metrics, and most pre-aggregated ones show wrong values.

Suggested solution

After going through the Datadog documentation, it seems it's possible to send raw data that won't be aggregated by the Datadog Agent using the distribution metric type (1, 2). This would not only avoid the aggregation, but allow users to generate any percentile value they need over the raw data.

The drawback would possibly be overloading the ingest pipeline (either of the Datadog Agent or of Datadog itself, hitting some API limits, etc.), so this needs to be thoroughly tested.

Currently we send Count, Gauge and other metric types that will be aggregated, but the datadog-go/statsd client we use also supports the Distribution metric.
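As a rough sketch of what the change amounts to on the wire: a DogStatsD datagram has the form `<name>:<value>|<type>|#<tags>`, and the distribution type is `d`. The example below only formats datagrams with the standard library; `formatDatagram` is a hypothetical helper, not part of k6 or datadog-go, and it assumes Trend samples are currently sent as a timing (`ms`) type. In the datadog-go/statsd client the equivalent would be its `Distribution` method.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// formatDatagram builds a DogStatsD datagram of the form
// <name>:<value>|<type>|#<tag1>,<tag2>. The tag section is omitted
// when there are no tags. Hypothetical helper for illustration only.
func formatDatagram(name string, value float64, metricType string, tags []string) string {
	var b strings.Builder
	b.WriteString(name)
	b.WriteByte(':')
	b.WriteString(strconv.FormatFloat(value, 'f', -1, 64))
	b.WriteByte('|')
	b.WriteString(metricType)
	if len(tags) > 0 {
		b.WriteString("|#")
		b.WriteString(strings.Join(tags, ","))
	}
	return b.String()
}

func main() {
	tags := []string{"scenario:default"} // example tag, made up

	// Timing: aggregated by the Agent before it reaches Datadog.
	fmt.Println(formatDatagram("k6.http_req_duration", 123.45, "ms", tags))

	// Distribution: forwarded so percentiles can be computed server-side.
	fmt.Println(formatDatagram("k6.http_req_duration", 123.45, "d", tags))
}
```

Only the type field differs between the two datagrams, which is why the risk called out above is mostly about ingest volume and compatibility with other StatsD backends, which may not understand the `d` type.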

In addition to evaluating whether this change works for some of our metric types, we should also ensure that we don't break support for any other StatsD backends.


imiric · 2022-12-19T16:35:11Z

I think sending our Trend metric as a Distribution metric would also fix querying percentiles in New Relic. See the New Relic data types documentation, specifically for the distribution type. Notice that it supports the percentile and histogram functions, which are not supported by the other types.

Currently this is not supported, and we removed the documentation about the previously wrong query in grafana/k6-docs#895.

vieiraes · 2023-04-14T23:43:10Z

Waiting for an update about it... all % metrics are different in Datadog...

imiric · 2023-04-17T08:47:30Z

Sorry about that @vieiraes. 😞 You can see some updates in #2982.

What we decided to do is to split up the Datadog and New Relic outputs into separate k6 extensions, since it's clear that they can't be well supported by the generic statsd output.

This work is high priority, as we know it has a major impact on k6 users of these backends. It's likely we'll start working on it in the next few weeks, but we can't promise any ETAs yet.

LeonAdato · 2024-08-28T15:31:15Z

Per @olegbespalov and @javaducky, this issue should probably be part of the StatsD project. Feel free to transfer it here:

https://github.com/LeonAdato/xk6-output-statsd

olegbespalov · 2024-08-30T08:05:56Z

Closing in favor of LeonAdato/xk6-output-statsd#24

imiric added the bug and evaluation needed labels Dec 13, 2022

imiric changed the title ~~Wrong percentile values for metrics sent to Datadog~~ Wrong percentile values for metrics shown in Datadog Dec 13, 2022

imiric mentioned this issue Mar 21, 2023

Future of StatsD outputs #2982

Open

olegbespalov added the metrics-output and statsd labels Jul 12, 2024

olegbespalov mentioned this issue Jul 15, 2024

Porting a know issues from the k6 repository LeonAdato/xk6-output-statsd#18

Closed

github-actions bot added the triage label Aug 28, 2024

github-actions bot assigned joanlopez Aug 28, 2024

olegbespalov assigned olegbespalov and unassigned joanlopez Aug 28, 2024

olegbespalov mentioned this issue Aug 30, 2024

Wrong percentile values for metrics shown in Datadog LeonAdato/xk6-output-statsd#24

Open

olegbespalov closed this as not planned Aug 30, 2024

olegbespalov removed their assignment Aug 30, 2024

olegbespalov removed the triage label Aug 30, 2024
