kube-state-metrics API scraping timeout #995

Closed
zhengl7 opened this issue Dec 9, 2019 · 10 comments · May be fixed by #2510
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@zhengl7

zhengl7 commented Dec 9, 2019

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug

/kind feature

What happened:
We are using kube-state-metrics (1.6.0) from prometheus-operator. Our cluster periodically creates hundreds of pods and deletes them within a 15-minute window to absorb bursting traffic. When this happens, the kube-state-metrics target in Prometheus often hits the scrape timeout (default 10 seconds). We suspect this is due to the mutex locking mechanism used for MetricsStore: when new entries are being written into the map heavily, reads are blocked for an extended time.
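
For illustration, here is a minimal sketch of the pattern I suspect; the type and field names are mine, not the actual kube-state-metrics code. Every object add takes the write lock and every scrape takes the read lock, so the two contend under heavy churn:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// store is a stand-in for the suspected MetricsStore pattern: a map guarded
// by a sync.RWMutex. All names here are illustrative, not the real types.
type store struct {
	mu      sync.RWMutex
	metrics map[string]string
}

// Add simulates the informer writing metrics for a new object (write lock).
func (s *store) Add(key, series string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	time.Sleep(time.Millisecond) // pretend generating the metric takes a moment
	s.metrics[key] = series
}

// Scrape simulates the /metrics handler reading the whole map (read lock).
// While a writer holds the lock or is queued for it, this read has to wait.
func (s *store) Scrape() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.metrics)
}

func main() {
	s := &store{metrics: map[string]string{}}

	// Burst of writes, as when a few hundred pods are created at once.
	go func() {
		for i := 0; i < 500; i++ {
			s.Add(fmt.Sprintf("pod-%d", i), "kube_pod_info{...} 1")
		}
	}()

	time.Sleep(50 * time.Millisecond) // let the burst get going
	start := time.Now()
	n := s.Scrape()
	fmt.Printf("scraped %d series after waiting %v\n", n, time.Since(start))
}
```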

What you expected to happen:
More consistent API response times, so that Prometheus scrapes do not time out.

How to reproduce it (as minimally and precisely as possible):
Our cluster normally has about 30k metrics in total, with roughly 25k coming from the pod collectors (there appear to be 27 pod_info_ metrics, and we have close to 1000 pods). When the bursting traffic comes, the cluster creates a few hundred pods, pushing the total to around 60k metrics, most of the new ones being pod_info metrics.

Response size doesn't seem to be the issue: with 60k metrics the total response is roughly 10MB, and when the scrape call doesn't time out it responds in 100-200ms.

Single kube-state-metrics pod.

Anything else we need to know?:
We tried modifying the kube-state-metrics code to use a sync.Map for MetricsStore in the hope of better concurrency. It relieves the problem a little, but when the flood comes in it can still take seconds for kube-state-metrics to respond: sync.Map works best for map entries that change infrequently, whereas in our case a ton of new entries are constantly being created.

I think padding the gap in Prometheus with the previous sample should be fine. I'm not sure whether it makes sense for kube-state-metrics to provide this behavior itself, e.g. keeping a copy of the previous response and simply serving that copy if reading from the collectors takes too long.
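
A rough sketch of what I mean, purely illustrative; buildMetrics, the deadline, and the handler shape are all hypothetical and not existing kube-state-metrics code:

```go
package main

import (
	"log"
	"net/http"
	"sync"
	"time"
)

// cachedHandler remembers the last fully rendered /metrics payload and falls
// back to it when a fresh render does not finish before a deadline.
type cachedHandler struct {
	mu           sync.Mutex
	lastResponse []byte
	buildMetrics func() []byte // hypothetical stand-in for reading all collectors
	deadline     time.Duration
}

func (h *cachedHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	fresh := make(chan []byte, 1) // buffered so the render goroutine never leaks
	go func() { fresh <- h.buildMetrics() }()

	select {
	case body := <-fresh:
		h.mu.Lock()
		h.lastResponse = body // remember the latest good payload
		h.mu.Unlock()
		w.Write(body)
	case <-time.After(h.deadline):
		// Rendering is taking too long; serve the previous payload instead
		// of letting Prometheus hit its scrape timeout.
		h.mu.Lock()
		stale := h.lastResponse
		h.mu.Unlock()
		w.Write(stale)
	}
}

func main() {
	h := &cachedHandler{
		buildMetrics: func() []byte { return []byte("kube_pod_info{...} 1\n") },
		deadline:     5 * time.Second,
	}
	http.Handle("/metrics", h)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

The trade-off is that a slow scrape returns slightly stale samples instead of failing outright.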

Environment:

  • Kubernetes version (use kubectl version):
    1.13
  • Kube-state-metrics image version
    1.6.0
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Dec 9, 2019
@lilic
Member

lilic commented Dec 11, 2019

Thanks for the issue.

Our cluster periodically creates hundreds of pods and deletes them within a 15 minutes window for bursting traffic.

Seems like your use case is very specific. Have you tried disabling the pod metrics resource in kube-state-metrics? Does it make a difference?

@brancz
Member

brancz commented Dec 11, 2019

Or potentially configure kube-state-metrics to not watch the namespaces with the problematic workload.

@zhengl7
Author

zhengl7 commented Dec 11, 2019

@lilic Thanks for the response. Yeah, I agree our use case is special. I stood up a local kube-state-metrics with some timing instrumentation, and it's the pod collector that takes the majority of the time, so I'm pretty sure disabling it would fix things, but I don't think that's an option for us.

@brancz Thanks for the info; that might be something we can look at if there are no other solutions. The namespaces can also be dynamic, so it wouldn't be ideal.

I'm now leaning toward exploring completely lock-free maps to enhance kube-state-metrics, as sync.Map still locks on new entry creation. If things work well I can open a pull request for that.
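
For reference, a sync.Map variant of the illustrative store from the earlier sketch looks roughly like this (again, not the real MetricsStore); the comment on Add notes where the remaining locking comes from:

```go
package main

import (
	"fmt"
	"sync"
)

// syncMapStore is the same illustrative store as before, backed by a
// sync.Map instead of a mutex-guarded map.
type syncMapStore struct {
	metrics sync.Map // key string -> series string
}

// Add stores metrics for an object. sync.Map avoids locking for loads and
// for updates of existing keys, but storing a previously unseen key still
// goes through its internal mutex, which is why a constant stream of
// brand-new entries (new pods) still contends.
func (s *syncMapStore) Add(key, series string) {
	s.metrics.Store(key, series)
}

// Scrape walks the whole map without taking a store-wide read lock.
func (s *syncMapStore) Scrape() int {
	n := 0
	s.metrics.Range(func(_, _ any) bool {
		n++
		return true
	})
	return n
}

func main() {
	s := &syncMapStore{}
	for i := 0; i < 5; i++ {
		s.Add(fmt.Sprintf("pod-%d", i), "kube_pod_info{...} 1")
	}
	fmt.Println("series:", s.Scrape())
}
```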

@lilic
Member

lilic commented Dec 12, 2019

@zhengl7 note that we have benchmark tests in place and some of those things were done for optimisation purposes, so if the performance decreases we won't accept the PR. But we are happy to see any PRs that solve bugs :)

@zhengl7
Author

zhengl7 commented Dec 12, 2019

@lilic Absolutely. I just got a version that solves our problem: on the same cluster it reduces the scrape call duration from tens of seconds to sub-second during those peak-traffic minutes. I'll try running the test scripts locally.

@zhengl7
Author

zhengl7 commented Jan 18, 2020

@lilic Just submitted the PR for the proposed change. The fix ends up being just using sync.Map, and the timeouts are completely gone.

I ran make test-benchmark-compare and it seems there is some degradation; I'm not sure whether it's acceptable or not.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 17, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 17, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Pokom added a commit to grafana/kube-state-metrics that referenced this issue Jun 5, 2024
There are a few documented scenarios where `kube-state-metrics` will
lock up (kubernetes#995, kubernetes#1028). I believe a much simpler solution to ensure
`kube-state-metrics` doesn't lock up and require a restart to serve
`/metrics` requests is to add default read and write timeouts and to
allow them to be configurable. At Grafana, we've experienced a few
scenarios where `kube-state-metrics` running in larger clusters falls
behind and starts getting scraped multiple times. When this occurs,
`kube-state-metrics` becomes completely unresponsive and requires a
reboot. This is somewhat easily reproducible (I'll provide a script in
an issue) and causes other critical workloads (KEDA, VPA) to fail in
weird ways.

Adds two flags:
- `server-read-timeout`
- `server-write-timeout`

Updates the metrics http server to set the `ReadTimeout` and
`WriteTimeout` to the configured values.
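
Roughly, the wiring looks like the sketch below. The flag names are the ones described in the commit; the address, defaults, and handler are illustrative placeholders:

```go
package main

import (
	"flag"
	"log"
	"net/http"
	"time"
)

func main() {
	// Flag names mirror those in the commit message; defaults are assumed.
	readTimeout := flag.Duration("server-read-timeout", 60*time.Second,
		"maximum duration for reading the entire request")
	writeTimeout := flag.Duration("server-write-timeout", 60*time.Second,
		"maximum duration before timing out writes of the response")
	flag.Parse()

	mux := http.NewServeMux()
	mux.Handle("/metrics", metricsHandler()) // placeholder for the real handler

	srv := &http.Server{
		Addr:         ":8080",
		Handler:      mux,
		ReadTimeout:  *readTimeout,  // bounds how long a slow client can hold a request open
		WriteTimeout: *writeTimeout, // bounds how long serving /metrics may take
	}
	log.Fatal(srv.ListenAndServe())
}

// metricsHandler stands in for the real /metrics handler.
func metricsHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("kube_pod_info{...} 1\n"))
	})
}
```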