add scaler for temporal #6191

Prajithp · 2024-09-26T08:17:40Z

Implement a Temporal scaler

Checklist

When introducing a new scaler, I agree with the scaling governance policy
I have verified that my change is according to the deprecations & breaking changes policy
Tests have been added
Changelog has been updated and is aligned with our changelog requirements
A PR is opened to update our Helm chart (repo) (if applicable, i.e. when deployment manifests are modified)
A PR is opened to update the documentation on (repo) (if applicable)
Commits are signed with Developer Certificate of Origin (DCO - learn more)

Signed-off-by: Prajithp <prajithpalakkuda@gmail.com>

cretz · 2024-09-26T13:52:43Z

See comment at temporalio/temporal#33 (comment). Temporal's KEDA approach may slightly differ. Will have the engineers review, but we may suggest slight differences.


febinct · 2024-09-30T11:34:36Z

Any suggestions or comments, @cretz?

robholland · 2024-09-30T16:18:10Z

We're currently discussing which use cases we would like the scaler to support. We'll be in a position to give some feedback/direction on Friday the 4th.


jhecking · 2024-10-04T12:38:57Z

We rolled out this new scaler to one of our dev clusters (rebased on top of the v2.15 release branch). Activation/deactivation is working as expected. However, the keda-operator pod uses all of its allocated CPU whenever the Temporal trigger is active. When I pause the ScaledObject with the Temporal trigger, CPU utilisation drops back to near zero. Several other ScaledObjects with Prometheus triggers don't cause this problem.

I don't see anything relevant in the keda-operator logs, even at DEBUG log level. I enabled profiling, and this is the flame graph I see when this is happening:

[go tool pprof -http=:8081 "http://localhost:8082/debug/pprof/profile?seconds=60"]

For reference, here is a "normal" flame graph when all the Temporal triggers are paused:

One detail that might be relevant: KEDA connects to the Temporal server via our Consul service mesh, i.e. a Consul proxy is injected into the keda-operator pod and the Temporal scaler is configured to connect to localhost:7233. KEDA is able to connect to the Temporal server, so there are no connection errors. We use this same configuration for all the Temporal worker services in the same cluster that KEDA is supposed to scale, and none of them show this behaviour.

I'm a bit at a loss as to how to debug this further. Any suggestions?

add scaler for temporal

f5d7f78

Signed-off-by: Prajithp <prajithpalakkuda@gmail.com>

Prajithp requested a review from a team as a code owner September 26, 2024 08:17

febinct mentioned this pull request Sep 26, 2024

feat: add scaler for temporal #4863

Closed

7 tasks

Prajithp added 2 commits September 27, 2024 11:42

add option to filter based on build ids

2cfaa31

Signed-off-by: Prajithp <prajithpalakkuda@gmail.com>

use typed config

6018463

Signed-off-by: Prajithp <prajithpalakkuda@gmail.com>

Prajithp added 3 commits September 30, 2024 22:35

support apiKey authentication

984d1de

Signed-off-by: Prajithp <prajithpalakkuda@gmail.com>

use context

58d8990

Signed-off-by: Prajithp <prajithpalakkuda@gmail.com>

Merge branch 'main' into temporal

6766405
