[Release-1.27] - klipper-helm: Reinstalling job of a failed chart fails. (PR linked) #9623

Closed
brandond opened this issue Mar 1, 2024 · 1 comment
@brandond
Member

brandond commented Mar 1, 2024

Backport fix for klipper-helm: Reinstalling job of a failed chart fails. (PR linked)

@brandond brandond self-assigned this Mar 1, 2024
@brandond brandond added this to the v1.27.12+k3s1 milestone Mar 6, 2024
@endawkins endawkins self-assigned this Mar 8, 2024
@endawkins

Validated on the release-1.27 branch with commit 78ad575 (version v1.27.12-rc1+k3s1)

Environment Details

Infrastructure

  • Cloud
  • Hosted

Node(s) CPU architecture, OS, and Version:

Linux i-0f477e039ffb149b9 6.5.0-1014-aws #14~22.04.1-Ubuntu SMP Thu Feb 15 15:27:06 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

Cluster Configuration:

1 IPv6 Only Server

Config.yaml:

write-kubeconfig-mode: 644
token: test
node-ip: [redacted]
cluster-cidr: 2001:cafe:42:0::/56
service-cidr: 2001:cafe:42:1::/112
disable-network-policy: true
flannel-ipv6-masq: true
node_external_ip: [redacted]
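
As a quick sanity check on the config above, the two CIDRs can be parsed with Python's standard `ipaddress` module to confirm they are well-formed IPv6 networks. This is a minimal sketch and was not part of the validation run; the helper name `in_cluster_cidr` is hypothetical, and the sample address is one of the pod IPs from the kubectl output further down in this report.

```python
import ipaddress

# Values copied from config.yaml above.
CLUSTER_CIDR = ipaddress.ip_network("2001:cafe:42:0::/56")
SERVICE_CIDR = ipaddress.ip_network("2001:cafe:42:1::/112")


def in_cluster_cidr(pod_ip: str) -> bool:
    """Return True if pod_ip falls inside the configured cluster CIDR."""
    return ipaddress.ip_address(pod_ip) in CLUSTER_CIDR


# Both networks should be IPv6 on an IPv6-only server.
print(CLUSTER_CIDR.version, SERVICE_CIDR.version)
# Check a pod IP taken from the kubectl output in this report.
print(in_cluster_cidr("2001:cafe:42::4"))
```

`ip_network` raises `ValueError` if a CIDR has host bits set or is malformed, so simply constructing the two objects already validates the syntax.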

Additional files

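helmchartconfig.yaml (the bad config, with an intentionally invalid image tag; applied as helmchartconfig_bad.yaml in the testing steps)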
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    image:
      name: traefik
      tag: 2.9876.10
    ports:
      web:
        forwardedHeaders:
          trustedIPs:
            - 10.0.0.0/8
helmchartconfig1.yaml (the corrected config; applied as helmchartconfig_new.yaml in the testing steps)
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    ports:
      web:
        forwardedHeaders:
          trustedIPs:
            - 10.0.0.0/8
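
To make the difference between the two HelmChartConfig files explicit, the `valuesContent` blocks can be compared with Python's standard `difflib`. This is only an illustrative sketch: the constant names and the `removed_lines` helper are hypothetical, and the strings are copied by hand from the two files above.

```python
import difflib

# valuesContent from helmchartconfig.yaml (the bad config).
BAD_VALUES = """\
image:
  name: traefik
  tag: 2.9876.10
ports:
  web:
    forwardedHeaders:
      trustedIPs:
        - 10.0.0.0/8
"""

# valuesContent from helmchartconfig1.yaml (the corrected config).
GOOD_VALUES = """\
ports:
  web:
    forwardedHeaders:
      trustedIPs:
        - 10.0.0.0/8
"""


def removed_lines(old: str, new: str) -> list:
    """Return the lines present in old but absent from new."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return [line[2:] for line in diff if line.startswith("- ")]


removed = removed_lines(BAD_VALUES, GOOD_VALUES)
print(removed)
```

The only lines dropped by the second file are the `image` override, which is what causes the ImagePullBackOff seen in the logs.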

Testing Steps

  1. Launch Dualstack Instance from AWS
  2. Launch IPv6 Only Instance from AWS
  3. Copy .pem from local to dualstack instance
  4. ssh -i to IPv6 only instance
$ ssh -i "<.pem_file>" user@<IPv6_ADDRESS>
  5. Configure IPv6 Instance
  • configure /etc/netplan/50-cloud-init.yaml (you may need to update your nameservers using nat64)
network:
    ethernets:
        ens5:
            dhcp4: true
            dhcp6: true
            match:
                macaddress: [redacted]
            set-name: ens5
            nameservers:
                addresses: ["[redacted]", "[redacted]", "[redacted]"]
    version: 2
  6. sudo netplan apply
  7. Update /etc/hosts file:
127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback i-<hostname_of_IPv6>
  8. sudo systemctl stop systemd-resolved.service
  9. Update /etc/resolv.conf
nameserver <IPv6_ADDRESS>
options edns0 trust-ad
search [redacted]
  10. Copy config.yaml
$ sudo mkdir -p /etc/rancher/k3s/ && sudo cp config.yaml /etc/rancher/k3s/ && cat /etc/rancher/k3s/config.yaml
  11. Install k3s
  12. Apply the bad helmchartconfig
$ kubectl apply -f <helmchartconfig_bad.yaml>
  13. Mark the traefik chart as failed:
$ kubectl run helm-test --rm --stdin --tty --command --namespace kube-system --overrides='{"spec":{"serviceAccount":"helm-traefik"}}' --image=docker.io/rancher/klipper-helm:v0.8.2-build20230815 sh -
If you don't see a command prompt, try pressing enter.
~ $ helm_v3 set-status traefik failed
~ $ helm_v3 ls --all
~ $ exit
Session ended, resume using 'kubectl attach helm-test -c helm-test -i -t' command when the pod is running
pod "helm-test" deleted
  14. Apply another HelmChartConfig for reinstallation:
$ kubectl apply -f helmchartconfig_new.yaml
  15. Verify the reinstallation was successful

Replication Results:

  • k3s version used for replication:
N/A

This issue is hard to reproduce, so a reproduction could not be captured.

Validation Results:

  • k3s version used for validation:
$ k3s -v
k3s version v1.27.12-rc1+k3s1 (78ad5756)
go version go1.21.8
~ $ helm_v3 set-status traefik failed
2024/03/25 22:27:48 release traefik status updated

~ $ helm_v3 ls --all
NAME       	NAMESPACE  	REVISION	UPDATED                                	STATUS  	CHART                      	APP VERSION
traefik    	kube-system	2       	2024-03-25 22:27:48.03672866 +0000 UTC 	failed  	traefik-25.0.2+up25.0.0    	v2.10.5
traefik-crd	kube-system	1       	2024-03-25 22:12:09.184066937 +0000 UTC	deployed	traefik-crd-25.0.2+up25.0.0	v2.10.5

~ $ exit
Session ended, resume using 'kubectl attach helm-test -c helm-test -i -t' command when the pod is running
pod "helm-test" deleted
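
The status check above was done by eye. A minimal sketch of automating it follows, assuming the tab-separated table layout shown in the `helm_v3 ls --all` output; the function name `release_statuses` is hypothetical and the sample text is copied from this report with tabs written as `\t`.

```python
def release_statuses(helm_ls_output: str) -> dict:
    """Map each release name to its status from tab-separated `helm ls` output."""
    lines = [ln for ln in helm_ls_output.splitlines() if ln.strip()]
    header = [col.strip() for col in lines[0].split("\t")]
    name_idx = header.index("NAME")
    status_idx = header.index("STATUS")
    statuses = {}
    for line in lines[1:]:
        cols = [col.strip() for col in line.split("\t")]
        statuses[cols[name_idx]] = cols[status_idx]
    return statuses


# Sample copied from the validation output above.
SAMPLE = (
    "NAME       \tNAMESPACE  \tREVISION\tUPDATED                                \tSTATUS  \tCHART                      \tAPP VERSION\n"
    "traefik    \tkube-system\t2       \t2024-03-25 22:27:48.03672866 +0000 UTC \tfailed  \ttraefik-25.0.2+up25.0.0    \tv2.10.5\n"
    "traefik-crd\tkube-system\t1       \t2024-03-25 22:12:09.184066937 +0000 UTC\tdeployed\ttraefik-crd-25.0.2+up25.0.0\tv2.10.5\n"
)

statuses = release_statuses(SAMPLE)
print(statuses)
```

Looking up columns by header name rather than by fixed position keeps the parser working if the column order differs in another helm version.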

Additional context / logs:

$ kubectl get nodes,pods -A -o wide
NAME                       STATUS   ROLES                  AGE   VERSION             INTERNAL-IP                              EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION   CONTAINER-RUNTIME
node/i-0f477e039ffb149b9   Ready    control-plane,master   11m   v1.27.12-rc1+k3s1   [redacted]                               <none>        Ubuntu 22.04.4 LTS   6.5.0-1014-aws   containerd://1.7.11-k3s2.27

NAMESPACE     NAME                                          READY   STATUS             RESTARTS   AGE    IP                NODE                  NOMINATED NODE   READINESS GATES
kube-system   pod/local-path-provisioner-79ffd768b5-rm8v2   1/1     Running            0          11m    2001:cafe:42::4   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/coredns-77ccd57875-sgscj                  1/1     Running            0          11m    2001:cafe:42::2   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/helm-install-traefik-crd-zfdwk            0/1     Completed          0          11m    2001:cafe:42::3   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/svclb-traefik-ac7dafdc-w4hmg              2/2     Running            0          10m    2001:cafe:42::7   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/traefik-768bdcdcdd-6pxhk                  1/1     Running            0          10m    2001:cafe:42::8   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/metrics-server-c44988498-clzvf            1/1     Running            0          11m    2001:cafe:42::6   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/helm-install-traefik-xpj85                0/1     Completed          0          2m8s   2001:cafe:42::9   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/traefik-7cd9458f4d-hf79l                  0/1     ImagePullBackOff   0          2m5s   2001:cafe:42::a   i-0f477e039ffb149b9   <none>           <none>

NAME                       STATUS   ROLES                  AGE   VERSION             INTERNAL-IP                              EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION   CONTAINER-RUNTIME
node/i-0f477e039ffb149b9   Ready    control-plane,master   19m   v1.27.12-rc1+k3s1   [redacted]                               <none>        Ubuntu 22.04.4 LTS   6.5.0-1014-aws   containerd://1.7.11-k3s2.27

NAMESPACE     NAME                                          READY   STATUS        RESTARTS   AGE   IP                NODE                  NOMINATED NODE   READINESS GATES
kube-system   pod/local-path-provisioner-79ffd768b5-rm8v2   1/1     Running       0          19m   2001:cafe:42::4   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/coredns-77ccd57875-sgscj                  1/1     Running       0          19m   2001:cafe:42::2   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/helm-install-traefik-crd-zfdwk            0/1     Completed     0          19m   2001:cafe:42::3   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/metrics-server-c44988498-clzvf            1/1     Running       0          19m   2001:cafe:42::6   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/helm-install-traefik-95kgr                1/1     Running       0          3s    2001:cafe:42::c   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/traefik-768bdcdcdd-6pxhk                  1/1     Terminating   0          18m   2001:cafe:42::8   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/svclb-traefik-ac7dafdc-w4hmg              0/2     Terminating   0          18m   <none>            i-0f477e039ffb149b9   <none>           <none>

NAME                       STATUS   ROLES                  AGE   VERSION             INTERNAL-IP                              EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION   CONTAINER-RUNTIME
node/i-0f477e039ffb149b9   Ready    control-plane,master   21m   v1.27.12-rc1+k3s1   [redacted]                               <none>        Ubuntu 22.04.4 LTS   6.5.0-1014-aws   containerd://1.7.11-k3s2.27

NAMESPACE     NAME                                          READY   STATUS      RESTARTS   AGE    IP                NODE                  NOMINATED NODE   READINESS GATES
kube-system   pod/local-path-provisioner-79ffd768b5-rm8v2   1/1     Running     0          21m    2001:cafe:42::4   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/coredns-77ccd57875-sgscj                  1/1     Running     0          21m    2001:cafe:42::2   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/helm-install-traefik-crd-zfdwk            0/1     Completed   0          21m    2001:cafe:42::3   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/metrics-server-c44988498-clzvf            1/1     Running     0          21m    2001:cafe:42::6   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/svclb-traefik-7d49c1f2-ktz88              2/2     Running     0          2m3s   2001:cafe:42::d   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/helm-install-traefik-95kgr                0/1     Completed   0          2m6s   2001:cafe:42::c   i-0f477e039ffb149b9   <none>           <none>
kube-system   pod/traefik-54dfd465df-wpbbm                  1/1     Running     0          2m3s   2001:cafe:42::e   i-0f477e039ffb149b9   <none>           <none>
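
The three listings above show the cluster after the bad config was applied, during reinstallation, and after it settled. As a rough sketch of how the final "verify" step could be checked mechanically, the snippet below flags any pod whose STATUS is not Running or Completed. The helper name `unhealthy_pods` is hypothetical, the sample rows are abbreviated copies of rows from the first and last listings, and the parsing assumes the whitespace-separated `kubectl get pods -A -o wide` layout shown here.

```python
HEALTHY = {"Running", "Completed"}


def unhealthy_pods(listing: str) -> list:
    """Return names of pods whose STATUS column is not Running or Completed."""
    bad = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 4 or fields[0] == "NAMESPACE":
            continue
        name, status = fields[1], fields[3]
        if status not in HEALTHY:
            bad.append(name)
    return bad


# Rows from the first listing (bad HelmChartConfig applied).
BEFORE = """\
NAMESPACE     NAME                             READY   STATUS             RESTARTS   AGE    IP                NODE
kube-system   pod/helm-install-traefik-xpj85   0/1     Completed          0          2m8s   2001:cafe:42::9   i-0f477e039ffb149b9
kube-system   pod/traefik-7cd9458f4d-hf79l     0/1     ImagePullBackOff   0          2m5s   2001:cafe:42::a   i-0f477e039ffb149b9
"""

# Rows from the last listing (after reinstallation).
AFTER = """\
NAMESPACE     NAME                             READY   STATUS      RESTARTS   AGE    IP                NODE
kube-system   pod/helm-install-traefik-95kgr   0/1     Completed   0          2m6s   2001:cafe:42::c   i-0f477e039ffb149b9
kube-system   pod/traefik-54dfd465df-wpbbm     1/1     Running     0          2m3s   2001:cafe:42::e   i-0f477e039ffb149b9
"""

print(unhealthy_pods(BEFORE))
print(unhealthy_pods(AFTER))
```

Note that this treats transient states such as Terminating as unhealthy, so it is only meaningful once the rollout has settled.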
