
Prometheus scraping of cAdvisor values does not work with Kubernetes 1.7 #1655

Closed
JoergM opened this issue Aug 9, 2017 · 8 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments


JoergM commented Aug 9, 2017

Is this a request for help?: No


Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Version of Helm and Kubernetes:
Helm 2.5.0
Kubernetes: 1.7.1

Which chart:
stable/prometheus

What happened:
All metrics provided by cAdvisor, such as "container_network_receive_bytes_total", are not scraped by Prometheus. This breaks, for example, the very popular Grafana dashboard for Kubernetes (https://grafana.com/dashboards/315).

This is due to a change in how the kubelet exposes cAdvisor metrics in 1.7. It is described very well in the following Prometheus issue: prometheus/prometheus#2916
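
A rough way to see the change on a 1.7 node is to compare the kubelet's metrics endpoint with cAdvisor's own port, both reached through the API server proxy (the proxy paths and jsonpath below are just one way to do this and may vary by cluster setup):

        # Pick any node; on 1.7.1 the kubelet's /metrics no longer carries
        # container_* series, while cAdvisor's port 4194 still serves them.
        NODE=$(kubectl get nodes -o jsonpath='{.items[0].metadata.name}')
        kubectl get --raw "/api/v1/proxy/nodes/${NODE}/metrics" | grep -c '^container_' || true
        kubectl get --raw "/api/v1/proxy/nodes/${NODE}:4194/metrics" | grep -c '^container_'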

What you expected to happen:
The cAdvisor metrics should still be scraped. To keep the chart compatible with Kubernetes 1.6 it might be necessary to have two versions of the Prometheus config.

How to reproduce it (as minimally and precisely as possible):
Install the Prometheus chart on a Kubernetes 1.7 cluster and try to find one of the cAdvisor metrics such as "container_network_receive_bytes_total".
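
For instance, with the Prometheus server pod port-forwarded (the pod name below is a placeholder for whatever the release created):

        # Forward the server pod, then query its HTTP API in another shell;
        # an empty result set means the cAdvisor series are not being scraped.
        kubectl port-forward <prometheus-server-pod> 9090:9090 &
        curl -s 'http://localhost:9090/api/v1/query?query=container_network_receive_bytes_total'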

Anything else we need to know:

There is a proposed scrape configuration in the Prometheus issue that did work for me. I had to supply the fix through serverfiles -> prometheus.yaml.

The additional scrape config is:

        # Scrape cAdvisor through the API server proxy, since the kubelet's own
        # /metrics endpoint no longer carries the container_* series in 1.7.
        - job_name: 'kubernetes-cadvisor'
          scheme: https
          tls_config:
            ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
          kubernetes_sd_configs:
          - role: node
          relabel_configs:
          # Copy the node labels onto the scraped series.
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          # Send every scrape through the API server...
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          # ...and proxy it to cAdvisor's port 4194 on the discovered node.
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/proxy/nodes/${1}:4194/metrics
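
Roughly, applying it with Helm 2 looks like this (the serverFiles layout named in the comments is an assumption about the chart version in use, so check the chart's own values.yaml first):

        # Fetch the chart and append the kubernetes-cadvisor job above under
        # serverfiles -> prometheus.yaml -> scrape_configs in values.yaml.
        # Overriding that key replaces the chart's default config, so keep the
        # existing jobs alongside the new one.
        helm fetch stable/prometheus --untar
        helm install ./prometheus --name prometheus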

sylr commented Aug 10, 2017

Be aware that even with this config the data you get is not entirely consistent; see google/cadvisor#1704.


sitnik commented Aug 23, 2017

I can confirm that the solution above works on k8s 1.7.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Jan 3, 2018

heckj commented Jan 15, 2018

/remove-lifecycle stale

k8s-ci-robot removed the lifecycle/stale label on Jan 15, 2018
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Apr 15, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on May 15, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
