[Release-1.30] - Improve performance on K3s secrets-encrypt reencrypt #10637

Closed
brandond opened this issue Aug 1, 2024 · 1 comment

brandond commented Aug 1, 2024

Backport fix for Improve performance on K3s secrets-encrypt reencrypt

@aganesh-suse

Validated on release-1.30 branch with a125b7f

Environment Details

Infrastructure

  • Cloud
  • Hosted

Node(s) CPU architecture, OS, and Version:

$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.2 LTS"

$ uname -m
x86_64

Cluster Configuration:

HA: 3 servers / 1 agent
1 etcd node, 2 control-plane (cp) nodes and 1 agent

Config.yaml:

Etcd Node/Server1:

$ cat /etc/rancher/k3s/config.yaml
token: xxxx
disable-apiserver: true
disable-controller-manager: true
disable-scheduler: true
node-taint:
- node-role.kubernetes.io/etcd:NoExecute
cluster-init: true
write-kubeconfig-mode: "0644"
secrets-encryption: true
node-external-ip: 1.1.1.1
node-label:
- k3s-upgrade=server
debug: true

CP Nodes:

$ cat /etc/rancher/k3s/config.yaml 
token: secret
server: https://1.1.1.1:6443
disable-etcd: true
node-taint:
- node-role.kubernetes.io/control-plane:NoSchedule
write-kubeconfig-mode: "0644"
secrets-encryption: true
node-external-ip: 2.2.2.2
node-label:
- k3s-upgrade=server
debug: true

Agent node:

$ cat /etc/rancher/k3s/config.yaml 
token: secret
server: https://1.1.1.1:6443
node-external-ip: 4.4.4.4
node-label:
- k3s-upgrade=agent
debug: true

Testing Steps

  1. Copy config.yaml
$ sudo mkdir -p /etc/rancher/k3s && sudo cp config.yaml /etc/rancher/k3s
  2. Install k3s
curl -sfL https://get.k3s.io | sudo INSTALL_K3S_COMMIT='a125b7f6238512237b3ca3bb05231c57f6935064' sh -s - server
  3. Verify Cluster Status:
kubectl get nodes -o wide
kubectl get pods -A
  4. Refer: Use higher QPS for secrets reencryption #10571
    Test reencryption via:
    a) Traditional method: prepare/reboot, rotate/reboot, reencrypt/reboot.
    b) New method: the rotate-keys option for reencryption.
    Test 1: With 1001 basic secrets.
    Test 2: With 150 large secrets of 1000k each (plus 1 basic secret).
    Note: The large-secrets test is highly memory intensive. Use at least 8G of memory on each node while testing it.
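
The test data and the two reencryption sequences above could be scripted roughly as follows. This is a minimal sketch, not the scripts used for this validation: the function names, the secret name prefixes, the key name, and the KUBECTL, K3S and RESTART override variables are all hypothetical, "1000k" is assumed to mean 1000000 bytes, and "reboot" is assumed to mean restarting the k3s service. It also omits waiting for the server to come back after each restart, and on a multi-server cluster each server would need to be restarted.

```shell
#!/usr/bin/env bash
# Hypothetical overrides so the helpers can be dry-run (e.g. KUBECTL=echo).
KUBECTL="${KUBECTL:-kubectl}"
K3S="${K3S:-k3s}"
RESTART="${RESTART:-systemctl restart k3s}"   # assumed meaning of "reboot"

# Create COUNT small secrets named PREFIX-1 .. PREFIX-COUNT.
create_basic_secrets() {
  local count="$1" prefix="${2:-basic-secret}" i
  for i in $(seq 1 "$count"); do
    $KUBECTL create secret generic "${prefix}-${i}" --from-literal=key="value-${i}"
  done
}

# Create COUNT large secrets, each holding BYTES of random data.
# The 1000000-byte default is an assumption about what "1000k" means.
create_large_secrets() {
  local count="$1" bytes="${2:-1000000}" prefix="${3:-large-secret}" i tmp
  tmp="$(mktemp)"
  for i in $(seq 1 "$count"); do
    head -c "$bytes" /dev/urandom > "$tmp"
    $KUBECTL create secret generic "${prefix}-${i}" --from-file=data="$tmp"
  done
  rm -f "$tmp"
}

# a) Traditional method: prepare, rotate, reencrypt, with a restart after each.
# $K3S and $RESTART are left unquoted on purpose so they can hold arguments.
reencrypt_traditional() {
  local stage
  for stage in prepare rotate reencrypt; do
    $K3S secrets-encrypt "$stage"
    $RESTART
  done
}

# b) New method: a single rotate-keys call followed by a restart.
reencrypt_rotate_keys() {
  $K3S secrets-encrypt rotate-keys
  $RESTART
}
```

For example, `create_basic_secrets 1001` would cover Test 1, and `create_large_secrets 150` followed by `create_basic_secrets 1` would cover Test 2. Setting `KUBECTL=echo` first prints the commands instead of running them.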

Compare the time taken for reencryption by checking the timestamps of the SecretsProgress events in the journal logs.
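
One way to turn the journal output into a duration is to subtract the timestamp of the first SecretsProgress event from that of the last. The helper below is a sketch under assumptions: `elapsed_seconds` is a hypothetical name, it expects journalctl's default short format where the third field is HH:MM:SS, and it handles at most one midnight rollover. It measures the span between the first and last progress events only, so it may slightly understate the full run.

```shell
#!/usr/bin/env bash
# Print the seconds between the first and last SecretsProgress line on stdin.
# Assumes field 3 of each line is the time of day as HH:MM:SS.
elapsed_seconds() {
  awk '/SecretsProgress/ {
         split($3, t, ":")
         s = t[1] * 3600 + t[2] * 60 + t[3]
         if (!seen) { first = s; seen = 1 }
         last = s
       }
       END {
         if (!seen) exit 1          # no progress events found
         d = last - first
         if (d < 0) d += 86400      # the run crossed midnight once
         print d
       }'
}
```

A possible invocation is `journalctl -u k3s --no-pager | elapsed_seconds`.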

Replication Results:

  • k3s version used for replication:
$ k3s -v
k3s version v1.30.3+k3s1 (f6466040)
go version go1.22.5

Basic secrets time taken:
Traditional method: 3 min 15 sec
Rotate_keys method: 3 min 10 sec
Example logs:

journalctl -xeu k3s | grep 'SecretsProgress' 
Aug 14 18:48:56 ip-172-31-20-31 k3s[9880]: I0814 18:48:56.082438    9880 event.go:389] "Event occurred" object="ip-172-31-20-31" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="SecretsProgress" message="reencrypted 60 secrets"
.
.
Aug 14 18:52:06 ip-172-31-20-31 k3s[9880]: I0814 18:52:06.571597    9880 event.go:389] "Event occurred" object="ip-172-31-20-31" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="SecretsProgress" message="reencrypted 1010 secrets"

Large secrets time taken:
Traditional method: 27 seconds
Rotate_keys method: 29 seconds

Aug 15 00:32:05 ip-172-31-30-123 k3s[10420]: I0815 00:32:05.883603   10420 event.go:389] "Event occurred" object="ip-172-31-30-123" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="SecretsProgress" message="reencrypted 10 secrets"
.
.
Aug 15 00:32:34 ip-172-31-30-123 k3s[10420]: I0815 00:32:34.930387   10420 event.go:389] "Event occurred" object="ip-172-31-30-123" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="SecretsProgress" message="reencrypted 160 secrets"

Validation Results:

  • k3s version used for validation:
$ k3s -v
k3s version v1.30.3+k3s-a125b7f6 (a125b7f6)
go version go1.22.5

Basic secrets time taken for reencryption:
Traditional method: 9 secs
Rotate_keys method: 7 secs
Example logs:

Aug 14 18:40:24 ip-172-31-19-225 k3s[7998]: I0814 18:40:24.809397    7998 event.go:389] "Event occurred" object="ip-172-31-19-225" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="SecretsProgress" message="reencrypted 50 secrets"
.
.
Aug 14 18:40:33 ip-172-31-19-225 k3s[7998]: I0814 18:40:33.131066    7998 event.go:389] "Event occurred" object="ip-172-31-19-225" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="SecretsProgress" message="reencrypted 1000 secrets"

Large secrets time taken for reencryption:
Traditional method: 9 secs
Rotate_keys method: 10 secs

Aug 15 00:29:51 ip-172-31-22-87 k3s[10149]: I0815 00:29:51.590131   10149 event.go:389] "Event occurred" object="ip-172-31-22-87" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="SecretsProgress" message="reencrypted 50 secrets"
.
.
Aug 15 00:30:01 ip-172-31-22-87 k3s[10149]: I0815 00:30:01.584397   10149 event.go:389] "Event occurred" object="ip-172-31-22-87" fieldPath="" kind="Node" apiVersion="" type="Normal" reason="SecretsProgress" message="reencrypted 150 secrets"
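
For a quick comparison, the replication and validation timings reported above can be expressed as speedup factors (old time divided by new time). The `speedup` helper is a hypothetical name used only for this arithmetic; the inputs are the durations listed in this comment, converted to seconds.

```shell
#!/usr/bin/env bash
# Print old/new as a ratio with one decimal place.
speedup() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", old / new }'
}

speedup 195 9    # basic secrets, traditional: 3 min 15 sec -> 9 secs, prints 21.7
speedup 190 7    # basic secrets, rotate-keys: 3 min 10 sec -> 7 secs, prints 27.1
speedup 27 9     # large secrets, traditional: 27 seconds -> 9 secs,  prints 3.0
speedup 29 10    # large secrets, rotate-keys: 29 seconds -> 10 secs, prints 2.9
```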

Additional context / logs:
Memory consumption with large secrets is high and varies between runs; some results are noted here for reference (nodes with 8G RAM):

               |    Old version    |    New version
Stage          | %Mem   Max   Avg  | %Mem   Max   Avg
---------------+-------------------+------------------
Post_Install   |  1.6   502   478  |  1.5   484   484
Create_Secrets |  5.5  1761   732  |  4.8  1531   703
Prepare        |  5.8  1857  1386  |  5.1  1630  1256
Reboot         |  8.1  2590  2393  |  7.1  2622  2324
Rotate         |   10  2667  2533  |  9.7  3111  2932
Reboot         |  7.9  2534  2415  |  8.5  2726  2717
               |  9.1  2665  2574  |  8.3  2831  2715
Reencrypt      |  8.4  2901  2604  |  7.5  2831  2639
Wait 300secs   |  5.1  2477  2123  |  5.6  2455  2063
Reboot         |  7.6  2436  2167  |  8.7  2803  2760
Rotate_Keys    |  7.7  3195  1850  |  8.2  2638  1922
Wait 300secs   |  5.7  2547  2189  |  6.7  2638  2304
Reboot         |    9  2890  2587  | 11.9  3851  3017
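
This comment does not say how the Max and Avg figures were collected or what unit they are in. As one possible approach, a series of periodic memory samples for a stage could be reduced to a maximum and a mean with a small helper like the one below. `summarize_samples` is a hypothetical name, and the input is assumed to be one numeric sample per line.

```shell
#!/usr/bin/env bash
# Read one numeric sample per line on stdin; print the max and the rounded mean.
summarize_samples() {
  awk 'NF {
         v = $1 + 0
         if (!n || v > max) max = v
         sum += v
         n++
       }
       END {
         if (!n) exit 1             # no samples given
         printf "max=%d avg=%.0f\n", max, sum / n
       }'
}
```

As an illustration only, samples could be gathered with something like `for i in $(seq 1 30); do ps -o rss= -p "$(pgrep -o k3s)"; sleep 10; done | summarize_samples`, which reports resident set size in KiB and may not match the unit used in the table.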
