The mirror and rewrite configured in Rancher are missing from the /var/lib/rancher/rke2/agent/etc/containerd/config.toml file. #6066

Closed
xzxiaoshan opened this issue May 31, 2024 · 15 comments

Comments

@xzxiaoshan

I have created a cluster in Rancher and configured mirror and rewrite. The generated registries.yaml file appears to be fine, but the config.toml file that is supposed to be generated does not exist. This failure to generate the config.toml prevents the rewrite process from functioning properly, hence the cluster cannot be established. Is this indicative of a bug?

@brandond
Member

Storing these items in config.toml has long been deprecated by containerd. Recent releases use files under the path set by the config_path for mirrors, rewrites, and tls config. Ref: k3s-io/k3s#8973

@xzxiaoshan
Author

> Storing these items in config.toml has long been deprecated by containerd. Recent releases use files under the path set by the config_path for mirrors, rewrites, and tls config. Ref: k3s-io/k3s#8973

Thank you, may I know from which release version this has been supported?

@brandond
Member

brandond commented Jun 3, 2024

All of them starting in January.

@xzxiaoshan
Author

> All of them starting in January.

I created the RKE2 cluster with Rancher 2.8.2, but I don't see a config_path entry anywhere in the generated configuration.

@brandond
Member

brandond commented Jun 3, 2024

The Rancher version doesn't matter. The RKE2 version is what determines how the containerd config file is generated.

There is no specific option for configuring config_path. The registries.yaml content is used identically; the only difference is whether it goes into a single config file or is spread across multiple files in the config_path directory.
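To check what was actually generated on a node, one option is to walk the config_path directory and print each hosts.toml found there. The sketch below is only an illustration: the directory `/var/lib/rancher/rke2/agent/etc/containerd/certs.d` is an assumed default (confirm it against the config_path value in your own generated config.toml), and `list_host_configs` is a hypothetical helper name, not part of RKE2.

```python
from pathlib import Path

# Assumed default config_path on an RKE2 node; verify it against the
# config_path value in the generated config.toml before relying on it.
DEFAULT_CONFIG_PATH = Path("/var/lib/rancher/rke2/agent/etc/containerd/certs.d")


def list_host_configs(config_path: Path = DEFAULT_CONFIG_PATH) -> dict[str, str]:
    """Return {registry directory name: hosts.toml contents}."""
    configs: dict[str, str] = {}
    if not config_path.is_dir():
        # Nothing generated (or wrong path): report an empty result.
        return configs
    for hosts_file in sorted(config_path.glob("*/hosts.toml")):
        configs[hosts_file.parent.name] = hosts_file.read_text()
    return configs


if __name__ == "__main__":
    for registry, contents in list_host_configs().items():
        print(f"--- {registry} ---")
        print(contents)
```

If the mirrors and rewrites from registries.yaml were picked up, each configured registry would be expected to show up as its own directory in this listing.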

Is there an actual problem you are trying to solve?

@xzxiaoshan
Author

My specific problem is that I don't know how to rewrite the URL of sandbox_image. Although RKE2 is independent of Rancher, when the two are used together the RKE2 configuration file is generated automatically from the Rancher configuration. I still don't understand how to rewrite the URL of the sandbox_image image.

Is there a specific example about rewriting sandbox_image? Thank you.

@xzxiaoshan
Author

> The Rancher version doesn't matter. The RKE2 version is what determines how the containerd config file is generated.
>
> There is no specific option for configuring config_path. The registries.yaml content is used identically, the only difference is whether it goes into a single config file, or is spread across multiple files in the config_path directory.
>
> Is there an actual problem you are trying to solve?

Is there a specific configuration method?

@brandond
Member

brandond commented Jun 3, 2024

The sandbox image is just rancher/mirrored-pause:3.6, it is pulled like any other - there's nothing special about it. You can find instructions on rewrites at https://docs.rke2.io/install/containerd_registry_configuration#rewrites

@xzxiaoshan
Author

> The sandbox image is just rancher/mirrored-pause:3.6, it is pulled like any other - there's nothing special about it. You can find instructions on rewrites at https://docs.rke2.io/install/containerd_registry_configuration#rewrites

I configured it as the document describes. Pulls of most other images are rewritten correctly; only the pause image used by RKE2 is not. That is why I originally asked about the rewrite rule missing from the RKE2 configuration file: only sandbox_image in the RKE2 config.toml is not rewritten correctly.

I suspect it's a bug, or do you need any further logs from my end?

@brandond
Member

brandond commented Jun 3, 2024

Can you provide your registries.yaml, and containerd logs showing the rewrite not being applied for the pause image?

@xzxiaoshan
Author

xzxiaoshan commented Jun 4, 2024

> Can you provide your registries.yaml, and containerd logs showing the rewrite not being applied for the pause image?

  1. On the new node, run the registration script provided by Rancher:
curl --insecure -fL https://rancher.test.com/system-agent-install.sh | sudo  sh -s - --server https://rancher.test.com --label 'cattle.io/os=linux' --token jtqx6bs5gtf7hkzf794jxdfsfshcl4v7ds2m789dsfg9h2zktf4zfvb --ca-checksum bdf5c68ba789sdf45bfd773bd57374742ce108dfc56734a092343bf0409d15c --etcd --controlplane --worker
  2. The following RKE2 configuration file, registries.yaml, is generated automatically. It contains an automatically generated rewrite configuration, which looks normal.
{
    "configs": {
        "harbor.test.com": {
            "auth": {
                "username": "admin",
                "password": "harbor12345",
                "auth": "",
                "identity_token": ""
            },
            "tls": {
                "ca_file": "",
                "cert_file": "",
                "key_file": "",
                "insecure_skip_verify": true
            }
        }
    },
    "mirrors": {
        "harbor.test.com": {
            "endpoint": [
                "https://harbor.test.com"
            ],
            "rewrite": {
                "^rancher/(.*)": "dockerhub_proxy/rancher/$1"
            }
        },
        "docker.io": {
            "endpoint": [
                "https://harbor.test.com"
            ],
            "rewrite": {
                "^rancher/(.*)": "dockerhub_proxy/rancher/$1"
            }
        }
    }
}
  3. Viewing the logs with sudo journalctl -u rke2-server -f suggests that the sandbox image has not been rewritten properly (the other images required for initialization were rewritten to dockerhub_proxy).
sudo journalctl -u rke2-server -f

Jun 04 08:31:47 k8s-prd-m1 rke2[30594]: time="2024-06-04T08:31:47+08:00" level=info msg="Pod for etcd not synced (pod sandbox not found), retrying"
Jun 04 08:32:07 k8s-prd-m1 rke2[30594]: time="2024-06-04T08:32:07+08:00" level=info msg="Pod for etcd not synced (pod sandbox not found), retrying"
Jun 04 08:32:27 k8s-prd-m1 rke2[30594]: time="2024-06-04T08:32:27+08:00" level=info msg="Pod for etcd not synced (pod sandbox not found), retrying"
Jun 04 08:32:47 k8s-prd-m1 rke2[30594]: time="2024-06-04T08:32:47+08:00" level=info msg="Pod for etcd not synced (pod sandbox not found), retrying"
Jun 04 08:33:07 k8s-prd-m1 rke2[30594]: time="2024-06-04T08:33:07+08:00" level=info msg="Pod for etcd not synced (pod sandbox not found), retrying"
  4. If rancher/mirrored-pause:3.6 had been pulled through the mirror and failed for some other reason, the 3.6 image would have been cached in the Harbor image repository. It was not cached, which indicates that rancher/mirrored-pause:3.6 was not rewritten properly.
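As a sanity check on the rule itself, the rewrite from the registries.yaml above can be simulated offline. This is a minimal sketch that assumes the rule is a regular-expression substitution applied to the repository path (without registry host or tag) and that the first matching rule wins; `apply_rewrites` and `go_to_python_refs` are hypothetical helpers, not part of RKE2 or containerd. It only shows what the pattern would produce for the pause image, not what containerd actually requested.

```python
import re

# Rewrite rule copied from the "docker.io" mirror in registries.yaml.
REWRITES = {r"^rancher/(.*)": "dockerhub_proxy/rancher/$1"}


def go_to_python_refs(replacement: str) -> str:
    """Convert Go-style $1 capture references to Python's \\g<1> form."""
    return re.sub(r"\$(\d+)", lambda m: rf"\g<{m.group(1)}>", replacement)


def apply_rewrites(repository: str, rewrites: dict[str, str]) -> str:
    """Apply the first matching rule (an assumption) to a repository path."""
    for pattern, replacement in rewrites.items():
        if re.search(pattern, repository):
            return re.sub(pattern, go_to_python_refs(replacement), repository)
    # No rule matched: the path is left unchanged.
    return repository


if __name__ == "__main__":
    print(apply_rewrites("rancher/mirrored-pause", REWRITES))
    # prints: dockerhub_proxy/rancher/mirrored-pause
```

Under these assumptions the pattern matches the pause image's repository path, so the rule as written is not what excludes it.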

Note: It is strange that only the sandbox_image in the config.toml configuration file has not been rewritten.

Thanks!

@brandond
Member

brandond commented Jun 4, 2024

Please show the containerd log, not the rke2 log. The information here doesn't show that it's not being rewritten or anything else; all I see here is that some pods aren't running.
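For anyone collecting the same evidence, a rough way to pull the relevant lines out of the containerd log is sketched below. The path `/var/lib/rancher/rke2/agent/containerd/containerd.log` is an assumption about where RKE2's embedded containerd writes its log on a default install, and `lines_mentioning` is a hypothetical helper name.

```python
from pathlib import Path

# Assumed location of the embedded containerd log on an RKE2 node.
CONTAINERD_LOG = Path("/var/lib/rancher/rke2/agent/containerd/containerd.log")


def lines_mentioning(log_path: Path, needle: str) -> list[str]:
    """Return the log lines that contain `needle`, e.g. an image name."""
    if not log_path.is_file():
        return []
    # errors="replace" keeps the scan going if the log has odd bytes.
    with log_path.open(errors="replace") as handle:
        return [line.rstrip("\n") for line in handle if needle in line]


if __name__ == "__main__":
    for line in lines_mentioning(CONTAINERD_LOG, "mirrored-pause"):
        print(line)
```

The filtered lines are only a starting point; the full log is still the more useful attachment for a bug report.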

@xzxiaoshan
Author

> Please show the containerd log, not the rke2 log. The information here doesn't show that it's not being rewritten or anything else, all I see here is that some pods aren't running.

I found an error in the containerd log and I'm not sure if it's related. I'm using CentOS 7.9.

Jun 04 09:06:47 k8s-prd-m1 containerd[1315]: time="2024-06-04T09:06:47.021547774+08:00" level=error msg="failed to load cni during init, please check CRI plugin status before setting up network for pods" error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"

@xzxiaoshan
Author

From what I found online, this error appears to be normal: "when checking the status of containerd, we can see errors related to CNI. This is because we first installed CNI plugins but have not yet installed the CNI plugin for k8s."

@brandond
Member

brandond commented Jun 4, 2024

Yes, that's normal. If you can't run pods due to a missing sandbox image, the CNI install pod won't have run yet.

Can you attach the whole containerd log?
