pkg/start/asset: Add support for post-pod-manifests #13
Conversation
So the installer can inject objects that require OpenShift custom resources (e.g. machine(set)s).
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: wking. If they are not already assigned, you can assign the PR to them by writing
The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing
…own-event

In master there's a bit of a window during the bootstrap-teardown dance:

1. cluster-bootstrap sees the requested pods.
2. cluster-bootstrap shuts itself down.
3. openshift.sh pushes the OpenShift-specific manifests.
4. report-progress.sh pushes bootstrap-complete.
5. The installer sees bootstrap-complete and removes the bootstrap resources, including the bootstrap load-balancer targets.
6. Subsequent Kubernetes API traffic hits the production control plane.

That leaves a fairly large window from 3 through 5 where Kubernetes API requests could be routed to the bootstrap machine and dropped, because it no longer has anything listening on 6443.

With this commit, we take advantage of openshift/cluster-bootstrap@d07548e3 (Add --tear-down-event flag to delay tear down, 2019-01-24, openshift/cluster-bootstrap#9) to drop step 2 (waiting for an event we never send). That leaves the bootstrap control plane running until we destroy that machine. We take advantage of openshift/cluster-bootstrap@180599bc (pkg/start/asset: Add support for post-pod-manifests, 2019-01-29, openshift/cluster-bootstrap#13) to replace our previous openshift.sh (with a minor change to the manifest directory). And we take advantage of openshift/cluster-bootstrap@e5095848 (Create bootstrap-success event before tear down, 2019-01-24, openshift/cluster-bootstrap#9) to replace our previous report-progress.sh (with a minor change to the event name).

Also set --strict, because we want to fail fast for these resources. The user is unlikely to scrape them out of the installer state and push them by hand if we fail to push them from the bootstrap node.

With these changes, the new transition is:

1. cluster-bootstrap sees the requested pods.
2. cluster-bootstrap pushes the OpenShift-specific manifests.
3. cluster-bootstrap pushes bootstrap-success.
4. The installer sees bootstrap-success and removes the bootstrap resources, including the bootstrap load-balancer targets.
5. Subsequent Kubernetes API traffic hits the production control plane.

There's still a small window for lost Kubernetes API traffic:

* The Terraform tear-down could remove the bootstrap machine before it removes the bootstrap load-balancer target, leaving the target pointing into empty space.
* Bootstrap teardown does not allow existing client connections to drain after removing the load-balancer target before removing the bootstrap machine.

Both of these could be addressed by:

1. Remove the bootstrap load-balancer targets.
2. Wait 30 seconds (healthy_threshold * interval for our aws_lb_target_group [1]) for the load balancer to notice the production control-plane targets are live. This assumes the post-pod manifests are all pushed in zero seconds, so it's overly conservative, but waiting an extra 30 seconds isn't a large cost.
3. Remove the remaining bootstrap resources, including the bootstrap machine.

But even without that delay, this commit reduces the window compared to what we have in master. I'll land the delay in follow-up work.

[1]: https://docs.aws.amazon.com/elasticloadbalancing/latest/network/target-group-health-checks.html
There are no retries in
Ah, I'd expected these were CRDs, which would be live before the pods we're waiting for come up. But if not, I'll add optional retry logic to
/hold until the discussion around openshift/installer#1147 (comment) is resolved.
I am not sure we want to move cluster boot-up orchestration (above the control plane itself) here. Machine objects need CRDs, don't they? Those CRDs are only available once the CVO has bootstrapped far enough. Instead of making cluster-bootstrap aware of CVO logic (please, let's not do that; it's not its purpose, but the CVO's), we should have a way to pass those manifests to the CVO somehow.
That was @abhinavdahiya's point above. I still think that retries (in addition to our current pod waits) will allow us to handle this issue without needing to bake in a CVO-watcher.
I don't think we want the CVO involved in manifests that aren't part of the update payload.
To make sure I understand your point: you want cluster-bootstrap to keep trying to create those custom resources until the corresponding CRD is installed by the CVO and the create call succeeds? I think I could live with that.
If the installer waits for the bootstrap-success event, why can't it create the necessary manifests itself? I don't see a reason to mix non-control-plane bootstrapping logic in here.
For post-pod manifests that depend on CRDs pushed by operators, we may get through waitUntilPodsRunning before the CRDs have been submitted and registered. This commit adds an optional retry delay: we still error immediately, without retries, for pre-pod manifests, but the new post-pod manifests get the more relaxed retry approach. There's no cap on the number of retries; for stuck clusters, the installer will eventually give up on waiting and bail out, so we don't need a timeout in cluster-bootstrap itself.
@wking: The following test failed, say /retest to rerun them all:

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
I've pushed 180599b -> 1af1035 adding retries to the post-pod manifests. I'd personally prefer to have those retries for pre-pod manifests as well (there could be network hiccups, etc.), so let me know if you want me to drop the
It can. But this PR is an 11-line addition here to replace a 131-line removal in openshift/installer#1147. Collecting both pre- and post-pod manifest pushing here means we only need one pod-waiting implementation (vs. one in each repo), only one list of pods to wait for, and only one manifest-pushing implementation. This PR also allows those post-pod manifests to get pushed at the right time as a result of their placement in a Go function, vs. doing an event push/watch dance to facilitate a transition to a separate command living on the same bootstrap machine. And the new functionality is opt-in: if callers don't want to use it, they just keep their stuff out of
This is easy to solve by making manifest creation shared via library-go. I am not a fan of diluting the clear responsibilities of components in order to save 100 lines of code. What exactly are we talking about here? What are the resources that should be created between bootstrap-success and bootstrap-complete?
Putting manifest-pushing in library-go does not address triggering. We'd have to also add pod-waiting and copy our same list of pods over, add event listening and wait for
Currently master/worker Machine(Set)s, cloud secrets, the kubeadmin password, and whatever
This concerns me most. Creating manifests is one thing (with some dumb logic). But waiting for pods sounds like non-trivial orchestration logic. None of "master/worker Machine(Set)s, cloud secrets, the kubeadmin password" is a pod.
I think you mean "requires a pod"; these mostly require custom-resource registration (e.g. via CRDs). But those registrations are certainly correlated with pods, because the CVO is pushing out both the CRDs and the Deployments for the waited-on pods. So while we still need retried pushes for these post-pod manifests (and, as I pointed out here, I think we probably want retries for pre-pod manifests too), this PR's Go triggering is saving us from retries (or event waiting, or whatever) for the several minutes
This I get, especially for CRDs. Pods, though, seem to be irrelevant. They are technical details of how the operators behind the CRDs are created. We shouldn't care about them in the post-pod (better called "post-tear-down") phase.
I don't care about retries in the post-pod phase. What I'm trying to avoid is polling retries for these day-2-ish manifests before the waited-on pods are up. |
You have a
Right, but as I pointed out earlier, both the current
What is the reason that this happens on the bootstrap machine at all and not via the installer directly? |
@wking: PR needs rebase.
Obsoleted by #14, with which I can just dump everything into one bucket of manifests. |
So the installer can inject objects that require OpenShift custom resources (e.g. machine(set)s). We currently do this in a separate script that happens before bootstrap-complete, but it's a good fit for cluster-bootstrap, which already has object-injection logic for the pre-pod assets.

CC @sttts