clarify control plane downgrade and/or rollback during HA upgrade #12327

Open
Tracked by #12329
liggitt opened this issue Jan 22, 2019 · 30 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. language/en Issues or PRs related to English language lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. priority/backlog Higher priority than priority/awaiting-more-evidence. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. wg/lts Categorizes an issue or PR as relevant to WG LTS.

Comments

@liggitt
Member

liggitt commented Jan 22, 2019

Follow up from #11060, tracked in #12329

What is tested/supported for control plane component downgrade, and for safe rollback during an HA control plane upgrade, is not clear in user-facing documentation.

Relevant comments are copied here:

#11060 (comment)

@yastij:
Do we support downgrades? We should clarify this, as @tpepper said.

@bgrant0607:
The open-source project currently doesn't support control-plane downgrades, but we are working on it. Replacing kubelets with older versions within permitted skew should be fine. I don't see any documentation on kubernetes.io about downgrades, either. So far, it's been provider-specific. Issues include storage version downgrades, resource orphaning / leaking, component and add-on downgrade order, and extension management. There's some discussion here: kubernetes/kubernetes#4855 (comment)
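To make "within permitted skew" concrete, here is a rough sketch of a check an operator might run before swapping in an older kubelet. It assumes the kubelet must not be newer than kube-apiserver and may lag it by a limited number of minor versions. The function names and the `max_minor_skew` parameter are hypothetical; the actual limit should be taken from the version skew policy for the release in question.

```python
import re
from typing import Tuple


def parse_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as 'v1.13.2'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def kubelet_within_skew(apiserver: str, kubelet: str, max_minor_skew: int) -> bool:
    """True if the kubelet is no newer than the API server and at most
    max_minor_skew minor versions older (max_minor_skew is an assumed input)."""
    api_major, api_minor = parse_minor(apiserver)
    kubelet_major, kubelet_minor = parse_minor(kubelet)
    if api_major != kubelet_major:
        return False
    return 0 <= api_minor - kubelet_minor <= max_minor_skew
```

For example, `kubelet_within_skew("v1.13.2", "v1.12.5", 2)` accepts the older kubelet, while a kubelet newer than the API server is rejected regardless of the limit. This only covers the kubelet case mentioned above; it says nothing about control plane downgrades.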

#11060 (comment):

@tpepper:
I'm curious about the comments around downgrade. My impression today is that nobody is actually giving meaningful support for downgrade. There's a very narrow use case where we have some test coverage. It breaks regularly and there isn't an owner for the test. I can't find people who actually use or genuinely want it. At best we keep saying "that's Google" in SIG Release, the release teams, and SIG Cluster Lifecycle when we bump into downgrade issues, but at KubeCon last week, in discussion with Chao Xu (@caesarxuchao), I got the distinct impression that this is not actually something Google does today, and that he was talking about looking in 2019 at making it functional, i.e., adding meaningful support for downgrade.

As it stands, the PR mentions upgrade; the skew document starts out somewhat generic in terms of skew direction and then focuses on upgrade.

Is downgrade supported today? If so, the document should cover it explicitly.

That said, I really prefer the engineering simplifications that come with saying "no" to allowing downgrade. But even if we say we only support forward moves, we don't have sufficient tooling to make it easy for operators to validate forward moves before attempting them, or to make it safe and easy to discard such attempts (i.e., with no manual editing of etcd content to amend the mistakes). In my experience it's dramatically easier to implement those improvements in a forward-only scheme than while also trying to support downgrades.

@kubernetes/sig-testing @kubernetes/sig-release @kubernetes/sig-cluster-lifecycle @kubernetes/sig-architecture-feature-requests

Page to Update:
https://kubernetes.io/docs/setup/version-skew-policy/ (or update to link to relevant docs)

@k8s-ci-robot k8s-ci-robot added sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. kind/feature Categorizes issue or PR as related to a new feature. labels Jan 22, 2019
@imkin

imkin commented Mar 15, 2019

/wg lts

@k8s-ci-robot k8s-ci-robot added the wg/lts Categorizes an issue or PR as relevant to WG LTS. label Mar 15, 2019
@sftim
Contributor

sftim commented Jun 4, 2019

/language en

@k8s-ci-robot k8s-ci-robot added the language/en Issues or PRs related to English language label Jun 4, 2019
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 2, 2019
@BenTheElder
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 3, 2019
@sftim
Contributor

sftim commented Sep 10, 2019

/kind feature
/priority backlog

@k8s-ci-robot k8s-ci-robot added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Sep 10, 2019

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 9, 2019
@jaredledvina

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 10, 2019

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 9, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 8, 2020
@jaredledvina

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Apr 8, 2020
@sftim
Contributor

sftim commented Jun 3, 2020

/sig cluster-lifecycle
?

@k8s-ci-robot k8s-ci-robot added the sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. label Jun 3, 2020
@neolit123
Member

etcd has one big gotcha WRT downgrades:
https://github.com/etcd-io/etcd/blob/master/Documentation/upgrades/upgrade_3_4.md#downgrade
Also, AFAIK, 4->3 might not be supported. There was a Google doc by the etcd team discussing the future of etcd downgrades, but I cannot find it.

SIG Cluster Lifecycle, in the form of kubeadm, officially treats downgrades as unsupported, although they mostly work if one tries. That said, kubeadm does roll back static pod manifests if a per-node kubeadm upgrade command fails, and this is already mentioned in the upgrade docs.

AFAIK and after doing a quick check, other SIG CL tools claim downgrade as unsupported or tentative.
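For readers unfamiliar with the manifest rollback mentioned above, the general pattern is "back up, apply, restore on failure". The following is only a minimal sketch of that pattern, not kubeadm's actual implementation; `upgrade_with_rollback` and the `health_check` callback are hypothetical names introduced for illustration.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def upgrade_with_rollback(manifest: Path, new_content: str,
                          health_check: Callable[[], bool]) -> bool:
    """Write new_content to manifest; restore the previous file if the
    health check fails or raises. Returns True when the upgrade is kept."""
    backup_dir = Path(tempfile.mkdtemp(prefix="manifest-backup-"))
    backup = backup_dir / manifest.name
    shutil.copy2(manifest, backup)  # keep a copy of the known-good manifest
    try:
        manifest.write_text(new_content)
        healthy = health_check()
    except Exception:
        healthy = False
    if not healthy:
        shutil.copy2(backup, manifest)  # roll back to the old manifest
    shutil.rmtree(backup_dir, ignore_errors=True)
    return healthy
```

Note that this restores only the manifest file. It does not address the harder problems raised earlier in this thread, such as storage version changes or etcd data written by the newer version.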

@BenTheElder
Member

BenTheElder commented Jun 11, 2020 via email

@liggitt
Member Author

liggitt commented Sep 1, 2022

/reopen
/remove-lifecycle rotten

@k8s-ci-robot
Contributor

@liggitt: Reopened this issue.

In response to this:

/reopen
/remove-lifecycle rotten

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Sep 1, 2022
@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Sep 1, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 30, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 30, 2022
@tengqm
Contributor

tengqm commented Dec 30, 2022

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Dec 30, 2022
@k8s-triage-robot

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. and removed triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Jan 20, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 19, 2024
@divya-mohan0209
Contributor

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 10, 2024
@divya-mohan0209
Contributor

/triage accepted


@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 28, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 27, 2024