
Self Healing: Allow AWS Controller to Detect and Fix AWS Resource Changes on Interval #2800

Closed
lucas-howard-macmillan opened this issue Sep 14, 2022 · 12 comments
Labels: lifecycle/rotten · triage/needs-investigation


lucas-howard-macmillan commented Sep 14, 2022

Is your feature request related to a problem?

If a load balancer is deleted through the AWS Console, the AWS Load Balancer Controller does not notice or re-create it.

The AWS load balancer controller must be restarted, and then the missing load balancer is recreated.

Describe the solution you'd like

An argument that can be passed to the controller telling it to do a full scan of AWS on a configurable interval, in order to detect and fix drift between the actual AWS state and the expected state.

This would essentially emulate the reconciliation the AWS Load Balancer Controller already performs when it starts up.

For large deployments, a segment-size (batch) argument might also be needed, e.g.:

Every 5 minutes scan AWS for 100 ingresses, then over the next 5 minutes the next 100 ingresses, and so on (a rough sketch of this follows below).
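
Purely to illustrate the interval-plus-batch mechanics being requested, here is a rough, hypothetical sketch written against controller-runtime's client: on each tick it lists the cluster's Ingresses and "touches" the next batch with a timestamp annotation, so that the controller's watch fires and those Ingresses are reconciled again. The interval, batch size, and annotation key below are made up for illustration, they are not existing controller flags, and this is not a verified workaround for the deleted-ALB case above.

package main

import (
	"context"
	"log"
	"time"

	networkingv1 "k8s.io/api/networking/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/client/config"
)

const (
	rescanInterval = 5 * time.Minute               // hypothetical "--drift-rescan-interval"
	batchSize      = 100                           // hypothetical "--drift-rescan-batch-size"
	touchKey       = "example.com/drift-rescan-at" // hypothetical annotation key
)

func main() {
	c, err := client.New(config.GetConfigOrDie(), client.Options{})
	if err != nil {
		log.Fatal(err)
	}

	offset := 0
	for range time.Tick(rescanInterval) {
		var ingresses networkingv1.IngressList
		if err := c.List(context.Background(), &ingresses); err != nil {
			log.Printf("list ingresses: %v", err)
			continue
		}

		// Walk the Ingress list in fixed-size batches, one batch per tick,
		// wrapping around once the end of the list is reached.
		items := ingresses.Items
		if offset >= len(items) {
			offset = 0
		}
		end := offset + batchSize
		if end > len(items) {
			end = len(items)
		}

		for i := offset; i < end; i++ {
			ing := items[i].DeepCopy()
			base := ing.DeepCopy()
			if ing.Annotations == nil {
				ing.Annotations = map[string]string{}
			}
			// Updating an annotation generates a watch event, which re-queues
			// the Ingress for reconciliation by whatever controller owns it.
			ing.Annotations[touchKey] = time.Now().UTC().Format(time.RFC3339)
			if err := c.Patch(context.Background(), ing, client.MergeFrom(base)); err != nil {
				log.Printf("patch ingress %s/%s: %v", ing.Namespace, ing.Name, err)
			}
		}
		offset = end
	}
}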

Describe alternatives you've considered

I have tried all of the existing arguments, such as the sync period, but none of them cause the load balancer to be re-created.
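
For reference, a minimal sketch of the knob that "sync period" style flags usually map to in controller-runtime based controllers. The exact flag name and the location of the option vary by controller and controller-runtime version, and, as reported above, this resync alone did not cause a deleted load balancer to be re-created.

package main

import (
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
	// Minimal sketch: the periodic resync that "sync period" flags typically
	// configure. It re-queues objects from the informer cache on a timer; as
	// reported in this issue, that by itself did not re-create a load
	// balancer deleted out-of-band in the AWS console.
	syncPeriod := 10 * time.Hour
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		// Note: newer controller-runtime releases moved this field into the
		// cache options; older releases expose it directly as shown here.
		SyncPeriod: &syncPeriod,
	})
	if err != nil {
		panic(err)
	}
	_ = mgr.Start(ctrl.SetupSignalHandler())
}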

kishorj (Collaborator) commented Sep 14, 2022

/assign @M00nF1sh
Investigate the periodic sync issue further.
This is similar to #2515


lukonjun commented Oct 4, 2022

Experienced the same behaviour. I assumed that when I deleted the LB via the AWS Console the ALB controller would automatically recreate it; however, it did not.


dongho-jung commented Oct 25, 2022

I'm experiencing the same issue. It only gets recovered when the number of replicas behind the service is changed.

I expected it would get recovered periodically, according to the constants below:

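// Poll intervals and timeouts used while waiting for a TargetGroupBinding
// (TGB) to be observed or deleted; note the poll intervals are 200 milliseconds.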
const (
	defaultWaitTGBObservedPollInterval = 200 * time.Millisecond
	defaultWaitTGBObservedTimeout      = 60 * time.Second
	defaultWaitTGBDeletionPollInterval = 200 * time.Millisecond
	defaultWaitTGBDeletionTimeout      = 60 * time.Second
)


ChrisV78 commented Nov 18, 2022

Experienced the same by accident: I removed the wrong ALB from the AWS console, and the lb-controller only recreated the ALB the moment I restarted the lb-controller deployment.


mxkmp commented Nov 23, 2022

Yes, unfortunately that's the only fix possible at the moment. I think in an older version (before the renaming) it was recreated automatically. Can this please be fixed?

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 21, 2023
lucas-howard-macmillan (Author) commented Feb 21, 2023 via email

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 21, 2023
@lucas-howard-macmillan (Author)

> Yes, unfortunately that's the only fix possible at the moment. I think in an older version (before the renaming) it was recreated automatically. Can this please be fixed?

In previous versions, it did automatically modify or recreate resources when it detected that the AWS resources were not correct.

While running a previous version, we had an incident where multiple load balancers were accidentally deleted, and by the time we were notified there was an issue, the controller had already re-created them.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 13, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 13, 2023
@oliviassss (Collaborator)

Hi, we have shipped the fix in v2.5.4; please check the details in our release notes: https://github.com/kubernetes-sigs/aws-load-balancer-controller/releases/tag/v2.5.4.
I'm closing this ticket for now; please feel free to reach out or reopen if you have any issues. Thanks!

@lucas-howard-macmillan (Author)

@oliviassss Thank You!
