Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[receiver/k8scluster] Check resource support before setting up watchers #9523

Merged
merged 1 commit into from
Apr 29, 2022

Conversation

dmitryax
Copy link
Member

@dmitryax dmitryax commented Apr 26, 2022

This change adds additional validation using discovery API before setting up watchers on particular k8s resources. Otherwise, if k8s API doesn't support a k8s resource, setting up watchers breaks the receiver.

This change also adds support of batch/v1beta1 CronJob resources.

This makes the receiver able to work with k8s server versions 1.19 and 1.20 which are still available on some cloud providers.

Fixes: #9414

@dmitryax dmitryax requested review from a team and mx-psi April 26, 2022 02:10
@dmitryax dmitryax force-pushed the k8s-cluster-receiver-fix branch 3 times, most recently from 22126d6 to 4d75b74 Compare April 26, 2022 06:04
@mx-psi
Copy link
Member

mx-psi commented Apr 26, 2022

@open-telemetry/collector-contrib-approvers is there anyone with more knowledge of Kubernetes internals than me that can review this? I don't feel confident enough to be able to give a thorough review here

@bogdandrutu
Copy link
Member

cc @dashpole

Copy link
Member

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, but it would be good to decide when to remove the deprecated batch API from this code.

Copy link
Contributor

@dashpole dashpole left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a different idea: Can we feature-gate the use of these new k8s APIs? Right now, since most users are on versions without support for the new APIs, they need to stay on the old ones, and we should use the old ones by default. In ~6 months, most users should be on the new ones (but many lag behind), and we should change the default to use the new ones.

Edit: alternatively, we can just stick with the older APIs.

I'm not sure using discovery APIs is a good idea. It isn't a common pattern in k8s to check something only once. APIs can also be updated without restarting the collector.

receiver/k8sclusterreceiver/watcher.go Show resolved Hide resolved
@jpkrohling
Copy link
Member

Can we feature-gate the use of these new k8s APIs?

The batch v1 API has been released a couple of versions ago, hasn't it?

@dashpole
Copy link
Contributor

The batch v1 API has been released a couple of versions ago, hasn't it?

It was, but the k8s community has a fairly wide support window for older versions, since most managed k8s offerings lag ~3 versions behind what is just released.

@dmitryax
Copy link
Member Author

@dashpole thanks for your feedback. We can use feature gates instead of discovery API. Ideally I wanted to make sure that if the informer API is not available for some resource, the receiver would keep working providing metrics for other resources. Do you have any idea how that can be achieved without using discovery API?

@dashpole
Copy link
Contributor

Do you have any idea how that can be achieved without using discovery API?

The discovery API is definitely the right way of doing it. It would probably be acceptable to use the discovery API if the receiver fails if the discovery API isn't available (i.e. don't just log the error--return it)

@dmitryax
Copy link
Member Author

The discovery API is definitely the right way of doing it. It would probably be acceptable to use the discovery API if the receiver fails if the discovery API isn't available (i.e. don't just log the error--return it)

I like this approach. I'll change it accordingly

Copy link
Contributor

@dashpole dashpole left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks reasonable to me.

@dmitryax dmitryax force-pushed the k8s-cluster-receiver-fix branch 3 times, most recently from 77c5ed7 to 2e363ea Compare April 28, 2022 16:01
This change adds additional validation using discovery API before setting up watchers on particular k8s resources. Otherwise, if k8s API doesn't support a k8s resource, setting up watchers breaks the receiver.

This change also adds support of batch/v1beta1 CronJob resources. This makes the receiver able to work with k8s server versions 1.19 and 1.20 which are still available on some cloud providers.
@bogdandrutu bogdandrutu merged commit 121ba63 into open-telemetry:main Apr 29, 2022
djaglowski pushed a commit to djaglowski/opentelemetry-collector-contrib that referenced this pull request May 10, 2022
…rs (open-telemetry#9523)

This change adds additional validation using discovery API before setting up watchers on particular k8s resources. Otherwise, if k8s API doesn't support a k8s resource, setting up watchers breaks the receiver.

This change also adds support of batch/v1beta1 CronJob resources. This makes the receiver able to work with k8s server versions 1.19 and 1.20 which are still available on some cloud providers.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[receiver/k8scluster] Dropped support of k8s cluster version earlier than 1.21
6 participants