KEDA Operator reaching OOM #1565

Closed
avivgold098 opened this issue Feb 3, 2021 · 8 comments · Fixed by #1572
Labels
bug Something isn't working

Comments

@avivgold098

A KEDA operator deployment is suffering from a memory leak, leading to the operator being restarted.

Expected Behavior

The KEDA operator runs with stable memory usage. Memory usage and connection counts should remain stable over time.

Actual Behavior

The KEDA operator pod's memory usage increases over time until the pod reaches its memory limit and is OOM-killed.

Steps to Reproduce the Problem

  1. Deploy KEDA with chart version 2.1.1
  2. Define scaledobjects.keda.sh Kubernetes objects

Logs from the KEDA operator (object names removed by me)

2021-02-03T15:19:41.838Z	INFO	controller-runtime.metrics	metrics server is starting to listen	{"addr": ":8080"}
2021-02-03T15:19:41.840Z	INFO	controllers.ScaledObject	Running on Kubernetes 1.18	{"version": "v1.18.15"}
2021-02-03T15:19:41.840Z	INFO	setup	Starting manager
2021-02-03T15:19:41.840Z	INFO	setup	KEDA Version: 2.1.0
2021-02-03T15:19:41.840Z	INFO	setup	Git Commit: 4866ce69c4897df532b43390bafe4477275bf65a
2021-02-03T15:19:41.840Z	INFO	setup	Go Version: go1.15.6
2021-02-03T15:19:41.840Z	INFO	setup	Go OS/Arch: linux/amd64
I0203 15:19:41.840358       1 leaderelection.go:243] attempting to acquire leader lease keda/operator.keda.sh...
2021-02-03T15:19:41.840Z	INFO	controller-runtime.manager	starting metrics server	{"path": "/metrics"}
I0203 15:19:59.239226       1 leaderelection.go:253] successfully acquired lease keda/operator.keda.sh
2021-02-03T15:19:59.239Z	INFO	controller	Starting EventSource	{"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledJob", "controller": "scaledjob", "source": "kind source: /, Kind="}
2021-02-03T15:19:59.239Z	INFO	controller	Starting EventSource	{"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledObject", "controller": "scaledobject", "source": "kind source: /, Kind="}
2021-02-03T15:19:59.339Z	INFO	controller	Starting Controller	{"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledJob", "controller": "scaledjob"}
2021-02-03T15:19:59.339Z	INFO	controller	Starting workers	{"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledJob", "controller": "scaledjob", "worker count": 1}
2021-02-03T15:19:59.339Z	INFO	controller	Starting EventSource	{"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledObject", "controller": "scaledobject", "source": "kind source: /, Kind="}
2021-02-03T15:19:59.440Z	INFO	controller	Starting Controller	{"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledObject", "controller": "scaledobject"}
2021-02-03T15:19:59.440Z	INFO	controller	Starting workers	{"reconcilerGroup": "keda.sh", "reconcilerKind": "ScaledObject", "controller": "scaledobject", "worker count": 1}
2021-02-03T15:19:59.440Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:19:59.956Z	INFO	controllers.ScaledObject	Initializing Scaling logic according to ScaledObject Specification	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:19:59.963Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": "enrichment-ingest"}
2021-02-03T15:20:01.478Z	INFO	controllers.ScaledObject	Initializing Scaling logic according to ScaledObject Specification	{"ScaledObject.Namespace": "default", "ScaledObject.Name": "enrichment-ingest"}
2021-02-03T15:20:01.485Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:01.506Z	INFO	controllers.ScaledObject	Initializing Scaling logic according to ScaledObject Specification	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:01.512Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:01.552Z	INFO	controllers.ScaledObject	Initializing Scaling logic according to ScaledObject Specification	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:12.157Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:12.258Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:12.314Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:27.205Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:27.280Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:27.306Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:42.251Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:42.315Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:42.340Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:57.298Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:57.346Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:20:57.371Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:12.347Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:12.440Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:12.460Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:27.811Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:27.839Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:27.865Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:42.857Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:42.889Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:42.916Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:58.667Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:58.697Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:21:58.778Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:13.718Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:13.747Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:13.805Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:29.539Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:29.571Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:29.596Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:44.586Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:44.615Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}
2021-02-03T15:22:44.639Z	INFO	controllers.ScaledObject	Reconciling ScaledObject		

Specifications

  • KEDA Version: 2.1.0
  • KEDA Chart Version: 2.1.1
  • Platform & Version: OS: Ubuntu 20.04.1 LTS, Kernel: 5.4.0-1035-aws, Container runtime: docker://19.3.11
  • Kubernetes Version: Kubernetes 1.18.15 - KOPS (1.18.3)
  • Scaler(s): Kafka Scaler

Additional Metrics

Pod performance

[Image: Screen Shot 2021-02-03 at 18 08 42]

@avivgold098 avivgold098 added the bug Something isn't working label Feb 3, 2021
@zroubalik
Member

From a quick look at the logs, something is quite strange: on many lines ScaledObject.Name is empty:

2021-02-03T15:20:01.485Z	INFO	controllers.ScaledObject	Reconciling ScaledObject	{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}

How many ScaledObjects do you have deployed? And what does this return:

kubectl get so

@avivgold098
Author

avivgold098 commented Feb 4, 2021

@zroubalik Hey!
As I said in the description, I removed the ScaledObject.Name values from the logs myself.
There are 4 active scaledobjects.keda.sh objects in our cluster.

@zroubalik
Member

@avivgold098 sorry, I missed that note 🤦

Could you please enable debug log level?
https://github.com/kedacore/keda/blob/main/BUILD.md#setting-log-levels

@avivgold098
Author

Yes :)
keda-operator-debug-logs.txt

@zroubalik
Member

@avivgold098 by chance, is there something that could be modifying the scaledobjects? We shouldn't see that much reconciliation happening on the scaledobjects.
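
To put a number on "that much reconciliation", here is a minimal sketch that counts `Reconciling ScaledObject` lines per minute in operator output. It assumes the tab-separated log format shown in the issue description; the function name `reconciles_per_minute` and the `LINE_RE` pattern are illustrative and are not part of KEDA. The sample lines are copied from the logs above, which show three reconciles roughly every 15 seconds.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed format, taken from the log excerpt in this issue:
# <timestamp>Z <tab> INFO <tab> controllers.ScaledObject <tab> Reconciling ScaledObject <tab> {...}
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})Z\s+INFO\s+"
    r"controllers\.ScaledObject\s+Reconciling ScaledObject"
)


def reconciles_per_minute(lines):
    """Count 'Reconciling ScaledObject' log lines per minute (HH:MM)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip leader-election lines and other messages
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S.%f")
        counts[ts.strftime("%H:%M")] += 1
    return dict(counts)


# Lines copied from the log excerpt in the issue description.
payload = '{"ScaledObject.Namespace": "default", "ScaledObject.Name": ""}'
sample = [
    "I0203 15:19:59.239226       1 leaderelection.go:253] successfully acquired lease keda/operator.keda.sh",
    "2021-02-03T15:20:01.506Z\tINFO\tcontrollers.ScaledObject\tInitializing Scaling logic according to ScaledObject Specification\t" + payload,
    "2021-02-03T15:20:12.157Z\tINFO\tcontrollers.ScaledObject\tReconciling ScaledObject\t" + payload,
    "2021-02-03T15:20:12.258Z\tINFO\tcontrollers.ScaledObject\tReconciling ScaledObject\t" + payload,
    "2021-02-03T15:20:12.314Z\tINFO\tcontrollers.ScaledObject\tReconciling ScaledObject\t" + payload,
    "2021-02-03T15:21:12.347Z\tINFO\tcontrollers.ScaledObject\tReconciling ScaledObject\t" + payload,
]

print(reconciles_per_minute(sample))  # {'15:20': 3, '15:21': 1}
```

Running the same function over a full log file (for example `reconciles_per_minute(open(path))`, where `path` is wherever you saved the operator output) gives a per-minute count that can be lined up against the pod's memory graph.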

@zroubalik
Member

Have you run that setup with KEDA 2.0?

@avivgold098
Author

@zroubalik Nothing is modifying the scaledobjects.

We experienced the same behavior with the Chart: v2.0.1 | App: v2.0.0 setup, and thought the upgrade would fix the issue.

@ahmelsayed
Contributor

I have a repro for the issue and I'm looking at it. The frequent reconciliation is odd, but shouldn't cause a memory leak. I had to artificially cause it and can see the memory building up.
