
pkg/destroy/bootstrap: Separate load-balancer target teardown #1148

Closed

Commits on Jan 29, 2019

  1. data/bootstrap/files/usr/local/bin/bootkube.sh.template: Set --tear-down-event
    
    In master there's a bit of a window during the bootstrap-teardown
    dance:
    
    1. cluster-bootstrap sees the requested pods.
    2. cluster-bootstrap shuts itself down.
    3. openshift.sh pushes the OpenShift-specific manifests.
    4. report-progress.sh pushes bootstrap-complete.
    5. The installer sees bootstrap-complete and removes the bootstrap
       resources, including the bootstrap load-balancer targets.
    6. Subsequent Kubernetes API traffic hits the production control
       plane.
    
    That leaves a fairly large window from 3 through 5 where Kubernetes
    API requests could be routed to the bootstrap machine and dropped
    because it no longer has anything listening on 6443.
    
    With this commit, we take advantage of
    openshift/cluster-bootstrap@d07548e3 (Add --tear-down-event flag to
    delay tear down, 2019-01-24, openshift/cluster-bootstrap#9) to drop
    step 2 (waiting for an event we never send).  That leaves the
    bootstrap control-plane running until we destroy that machine.  We
    take advantage of openshift/cluster-bootstrap@180599bc
    (pkg/start/asset: Add support for post-pod-manifests, 2019-01-29,
    openshift/cluster-bootstrap#13) to replace our previous openshift.sh
    (with a minor change to the manifest directory).  And we take
    advantage of openshift/cluster-bootstrap@e5095848 (Create
    bootstrap-success event before tear down, 2019-01-24,
    openshift/cluster-bootstrap#9) to replace our previous
    report-progress.sh (with a minor change to the event name).
    
    Also set --strict, because we want to fail fast for these resources.
    The user is unlikely to scrape them out of the installer state and
    push them by hand if we fail to push them from the bootstrap node.
    
    With these changes, the new transition is:
    
    1. cluster-bootstrap sees the requested pods.
    2. cluster-bootstrap pushes the OpenShift-specific manifests.
    3. cluster-bootstrap pushes bootstrap-success.
    4. The installer sees bootstrap-success and removes the bootstrap
       resources, including the bootstrap load-balancer targets.
    5. Subsequent Kubernetes API traffic hits the production control
       plane.
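
    For step 4, the installer only needs to notice that a
    bootstrap-success event exists.  A rough client-go sketch of that
    wait, polling until the event shows up (the helper name, namespace,
    poll interval, and the recent client-go signatures are illustrative
    assumptions, not the installer's actual code):

        import (
            "context"
            "time"

            apierrors "k8s.io/apimachinery/pkg/api/errors"
            metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
            "k8s.io/client-go/kubernetes"
        )

        // waitForBootstrapSuccess polls until an event named
        // bootstrap-success exists in kube-system (namespace assumed).
        func waitForBootstrapSuccess(ctx context.Context, client kubernetes.Interface) error {
            for {
                _, err := client.CoreV1().Events("kube-system").Get(ctx, "bootstrap-success", metav1.GetOptions{})
                if err == nil {
                    return nil
                }
                if !apierrors.IsNotFound(err) {
                    return err
                }
                select {
                case <-ctx.Done():
                    return ctx.Err()
                case <-time.After(2 * time.Second):
                }
            }
        }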
    
    There's still a small window for lost Kubernetes API traffic:
    
    * The Terraform tear-down could remove the bootstrap machine before it
      removes the bootstrap load-balancer target, leaving the target
      pointing into empty space.
    * Bootstrap teardown removes the bootstrap machine immediately after
      removing the load-balancer target, without allowing existing client
      connections to drain.
    
    Both of these could be addressed by:
    
    1. Remove the bootstrap load-balancer targets.
    2. Wait 30 seconds (healthy_threshold * interval for our
       aws_lb_target_group [1]) for the load balancer to notice that the
       production control-plane targets are live.  This assumes the
       post-pod manifests are all pushed in zero seconds, so it's overly
       conservative, but waiting an extra 30 seconds isn't a large cost.
    3. Remove the remaining bootstrap resources, including the bootstrap
       machine.
    
    But even without that delay, this commit reduces the window compared
    to what we have in master.  I'll land the delay in follow-up work.
    
    [1]: https://docs.aws.amazon.com/elasticloadbalancing/latest/network/target-group-health-checks.html
    wking committed Jan 29, 2019 (commit 4f68502)

Commits on Jan 30, 2019

  1. pkg/destroy/bootstrap: Separate load-balancer target teardown

    Close a small window for lost Kubernetes API traffic:
    
    * The Terraform tear-down could remove the bootstrap machine before it
      removes the bootstrap load-balancer target, leaving the target
      pointing into empty space.
    * Bootstrap teardown removes the bootstrap machine immediately after
      removing the load-balancer target, without allowing existing client
      connections to drain.
    
    With this commit, we:
    
    1. Wait 30 seconds for the production control plane to come up.
    2. Remove the bootstrap load-balancer targets.
    3. Wait 10 seconds for requests to the bootstrap machine to drain out.
    4. Remove the remaining bootstrap resources, including the bootstrap
       machine.
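
    A minimal Go sketch of that ordering, assuming the two Terraform
    steps are passed in as callbacks (these are placeholders, not the
    installer's real API):

        import (
            "context"
            "time"
        )

        // destroyBootstrap sketches the teardown ordering described above.
        // The two callbacks stand in for the Terraform steps that remove
        // the load-balancer targets and the remaining bootstrap resources.
        func destroyBootstrap(ctx context.Context, removeLBTargets, destroyRest func(context.Context) error) error {
            // 1. Give the load balancer healthy_threshold * interval to mark
            //    the production control-plane targets healthy.
            time.Sleep(30 * time.Second)

            // 2. Remove only the bootstrap load-balancer targets.
            if err := removeLBTargets(ctx); err != nil {
                return err
            }

            // 3. Let in-flight requests to the bootstrap machine drain out.
            time.Sleep(10 * time.Second)

            // 4. Remove the remaining bootstrap resources, including the
            //    bootstrap machine.
            return destroyRest(ctx)
        }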
    
    The 30-second calculation is provider-specific.  On AWS, it takes
    30 seconds for AWS to notice that our production control-plane
    targets are live (healthy_threshold * interval for our
    aws_lb_target_group [1]).  This assumes the post-pod manifests are
    all pushed in zero seconds, so it's overly conservative, but waiting
    an extra 30 seconds isn't a large cost.
    
    The 30-second delay doesn't really matter for libvirt, because clients
    will have been banging away at the production control plane the whole
    time, with those requests failing until the control plane came up to
    listen.  But an extra 30-second delay is not a big deal either.
    
    The 10-second delay for draining works around a Terraform plugin
    limitation on AWS.  From the AWS network load-balancer docs [2]:
    
    > Connection draining ensures that in-flight requests complete before
    > existing connections are closed.  The initial state of a
    > deregistering target is draining.  By default, the state of a
    > deregistering target changes to unused after 300 seconds.  To change
    > the amount of time that Elastic Load Balancing waits before changing
    > the state to unused, update the deregistration delay value.  We
    > recommend that you specify a value of at least 120 seconds to ensure
    > that requests are completed.
    
    And from [3]:
    
    > Deregistering a target removes it from your target group, but does
    > not affect the target otherwise.  The load balancer stops routing
    > requests to a target as soon as it is deregistered.  The target
    > enters the draining state until in-flight requests have completed.
    
    The Terraform attachment-deletion logic is in [4], and while it fires
    a deregister request, it does not wait around for draining to
    complete.  I don't see any issues in the provider repository about
    waiting for the unused state, but we could push something like that
    [5] if we wanted more finesse here than a 10-second cross-platform
    sleep.  For the moment, I'm just saying "we know who our consumers are
    at this point, and none of them will keep an open request going for
    more than 10 seconds".
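
    For reference, a minimal sketch of that finer-grained approach using
    the aws-sdk-go waiter from [5]; the helper name is an assumption, and
    the target-group ARN and instance ID would come from the installer's
    Terraform state:

        import (
            "github.com/aws/aws-sdk-go/aws"
            "github.com/aws/aws-sdk-go/aws/session"
            "github.com/aws/aws-sdk-go/service/elbv2"
        )

        // waitForDeregistration polls until the given target has left the
        // draining state, instead of sleeping a fixed 10 seconds.
        func waitForDeregistration(targetGroupARN, instanceID string) error {
            client := elbv2.New(session.Must(session.NewSession()))
            return client.WaitUntilTargetDeregistered(&elbv2.DescribeTargetHealthInput{
                TargetGroupArn: aws.String(targetGroupARN),
                Targets:        []*elbv2.TargetDescription{{Id: aws.String(instanceID)}},
            })
        }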
    
    The 10-second drain delay also seems sufficient for libvirt's
    round-robin DNS, since clients should be able to fall back to
    alternative IPs on their own.  We may be able to set shorter TTLs on
    libvirt DNS entries to firm that up, but clean transitions are less
    important for dev-only libvirt clusters anyway.  And, as for the
    30-second delay for the production control plane to come up, clients
    have been banging away on all of these IPs throughout the whole
    bootstrap process.
    
    I'm not sure how OpenStack handles this teardown; naively grepping
    through data/data/openstack didn't turn up anything that looked much
    like a bootstrap load-balancer target resource.
    
    [1]: https://docs.aws.amazon.com/elasticloadbalancing/latest/network/target-group-health-checks.html
    [2]: https://docs.aws.amazon.com/elasticloadbalancing/latest/network/target-group-health-checks.html
    [3]: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-target-groups.html#registered-targets
    [4]: pkg/terraform/exec/plugins/vendor/github.com/terraform-providers/terraform-provider-aws/aws/resource_aws_lb_target_group_attachment.go#L80-L106
    [5]: https://docs.aws.amazon.com/sdk-for-go/api/service/elbv2/#ELBV2.WaitUntilTargetDeregistered
    wking committed Jan 30, 2019 (commit 1c772ac)