
kubelet cAdvisor eventually losing kubernetes metadata labels #2183

Closed
k0nstantinv opened this issue Feb 28, 2019 · 4 comments

@k0nstantinv

k0nstantinv commented Feb 28, 2019

I use:

kubernetes 1.8.11
prometheus 2.6.1
docker 17.03.1-ce

I have many Kubernetes nodes where the kubelet's cAdvisor endpoint (http://NODE_IP:4194/metrics) eventually stops exporting a huge part of the metrics, such as container_*. To be more precise, it still exports them, but it starts losing all the Kubernetes metadata labels. For example:

"good" node

container_cpu_usage_seconds_total{container_name="POD",cpu="cpu00",id="/kubepods/burstable/pod2823c623-3b4c-11e9-9b8a-d09466092ef4/a50b87d8e7483e6c717490b43a19aa1acd254a15ffad4b1d5ff8eac3ed70f566",image="gcr.io/google_containers/pause-amd64:3.0",instance="srv0005",job="kubernetes-cadvisors",name="k8s_POD_frontend-86cc98b5fc-kqjfc_frontend_2823c623-3b4c-11e9-9b8a-d09466092ef4_0",namespace="frontend",pod_name="frontend-86cc98b5fc-kqjfc"} 0.000023848

"bad" node

container_cpu_usage_seconds_total{cpu="cpu00",id="/kubepods/burstable/pod32a657f3-1b2e-11e9-9b8a-d09466092ef4/113d6bf59bebc9042ac5eeac1641577cf97bcc9d23fff726b07796ec19fb6b09",instance="srv0014",job="kubernetes-cadvisors"}

As you can see, there are no labels like container_name, namespace, pod_name, etc. The problem is that queries filtering on those labels return no metrics at all for the affected nodes.
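
For anyone hitting the same thing, here is a minimal sketch (not from this issue) that asks the Prometheus HTTP API which nodes are exporting cAdvisor container series without a namespace label. The Prometheus address is hypothetical; the job name and label shapes are taken from the samples above.

# Sketch: list nodes whose cAdvisor container series are missing the
# "namespace" label. PROM_URL is a hypothetical address; the job name
# "kubernetes-cadvisors" matches the metric samples above.
import json
import urllib.parse
import urllib.request

PROM_URL = "http://prometheus.example:9090"  # hypothetical

# Full container cgroups under /kubepods should always carry a namespace
# label, so an empty namespace match points at an affected node.
QUERY = (
    'count by (instance) ('
    'container_cpu_usage_seconds_total{'
    'job="kubernetes-cadvisors", id=~".*/pod[^/]+/.+", namespace=""})'
)

url = PROM_URL + "/api/v1/query?" + urllib.parse.urlencode({"query": QUERY})
with urllib.request.urlopen(url) as resp:
    result = json.load(resp)["data"]["result"]

for sample in result:
    # Any instance listed here is exporting container metrics without
    # Kubernetes metadata labels.
    print(sample["metric"].get("instance"), sample["value"][1])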

The nodes are configured exactly the same way. The only interesting difference I found is in the /validate endpoint output. For the "good" node it returns:

...
Managed containers:
	/kubepods/burstable/pod5f1000a0-3068-11e9-9b8a-d09466092ef4/9d350c41379f1073bf86f5362dc03d128118289c1bd6bd7deed437da9d6fa377
		Namespace: docker
		Aliases:
			k8s_some-name-b79bf94ff-4ng6m_some-name_5f1000a0-3068-11e9-9b8a-d09466092ef4_0
			9d350c41379f1073bf86f5362dc03d128118289c1bd6bd7deed437da9d6fa377
...

while for the "bad" node:

...
Managed containers:
	/kubepods/burstable/pod816cebb1-3a83-11e9-9b8a-d09466092ef4/c43f480f1049c95b262d9ac4cca6b135521720f09177b7206c2ce1dfff864952
	/kubepods/burstable/podcdc3239f-14e1-11e9-9b8a-d09466092ef4/7392898c92d8aa6dc809dcf0beea59061cf8c374091045056b9c171147072f7d
	/kubepods/burstable/podf7033cd1-3a93-11e9-9b8a-d09466092ef4/7b64674481641210db5e4413c3a863390c02af7dae580f1e54bcb1e39c838333
....

and so on...
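
A rough sketch (my own, with a hypothetical node list) that polls the same /validate endpoint on port 4194 and counts managed containers that are not followed by a Namespace/Aliases block, which is exactly the difference between the two outputs above:

# Sketch: count managed containers reported by cAdvisor's /validate endpoint
# that carry no metadata (no "Namespace:"/"Aliases:" block after the cgroup
# path). NODES is a hypothetical list; port 4194 is the one from this issue.
import urllib.request

NODES = ["srv0005", "srv0014"]  # node names taken from the samples above

for node in NODES:
    with urllib.request.urlopen(f"http://{node}:4194/validate") as resp:
        text = resp.read().decode()

    lines = text.splitlines()
    bare = 0
    for i, line in enumerate(lines):
        # Managed container entries start with a cgroup path under /kubepods;
        # on a healthy node the next indented line is "Namespace: docker".
        if line.strip().startswith("/kubepods/"):
            nxt = lines[i + 1].strip() if i + 1 < len(lines) else ""
            if not nxt.startswith("Namespace:"):
                bare += 1
    print(f"{node}: {bare} managed containers without metadata")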

I think it might be somehow related to these issues:
#1704
#1958

I'm ready to share logs, configs, etc. (though there doesn't seem to be any helpful information in them).

@dashpole
Collaborator

Yeah, it does seem like you hit #1704. It looks like the fix was released in v0.28.3, which corresponds to Kubernetes 1.9.

@k0nstantinv
Author

@dashpole, so is there a way to fix this in 1.8.11? I can't upgrade one of my clusters right now due to legacy reasons.

@dashpole
Collaborator

I'm afraid patches to 1.8 stopped being accepted a little while ago, as Kubernetes only patches the last three minor releases (currently 1.10 through 1.13, with 1.14 about to be released).

@k0nstantinv
Author

Seems like there is no way to fix it without upgrading the clusters.
