
[BUG] coredns loses customization after cluster or docker restart #1112

Open
fragolinux opened this issue Jul 28, 2022 · 6 comments
Labels
bug Something isn't working

Comments


fragolinux commented Jul 28, 2022

What did you do

How was the cluster created?
k3d cluster create --config cluster.yaml

# cluster.yaml (the network exists, is created externally)
# info: https://k3d.io/v5.4.4/usage/configfile/
apiVersion: k3d.io/v1alpha4 # this will change in the future as we make everything more stable
kind: Simple # internally, we also have a Cluster config, which is not yet available externally
metadata:
  name: $CLUSTER_NAME # name that you want to give to your cluster (will still be prefixed with `k3d-`)
kubeAPI: # same as `--api-port myhost.my.domain:6445` (where the name would resolve to 127.0.0.1)
  host: $LOCALDOMAIN # important for the `server` setting in the kubeconfig
  hostIP: "127.0.0.1" # where the Kubernetes API will be listening on
  hostPort: "6445" # where the Kubernetes API listening port will be mapped to on your host system
image: rancher/$IMAGE # same as `--image rancher/k3s:v1.20.4-k3s1`
network: $DOCKER_NETWORK # same as `--network my-custom-net`
volumes: # repeatable flags are represented as YAML lists
  - volume: $DOMAIN_STORAGE/rwo:/var/lib/rancher/k3s/storage # same as `--volume '/my/host/path:/path/in/node@server:0;agent:*'`
    nodeFilters:
      - all
  - volume: $DOMAIN_STORAGE/rwx:/var/lib/csi-local-hostpath # :shared # same as `--volume '/my/host/path:/path/in/node@server:0;agent:*'`
    nodeFilters:
      - all
  - volume: $HOME/.local/share/mkcert/rootCA.pem:/etc/ssl/certs/rootCA.pem
    nodeFilters:
      - all
ports:
  - port: 80:80 # same as `--port '80:80@loadbalancer'`
    nodeFilters:
      - loadbalancer
  - port: 443:443 # same as `--port '443:443@loadbalancer'`
    nodeFilters:
      - loadbalancer
registries:
  config: |
    mirrors:
      "local-registry":
        endpoint:
          - https://$LOCALDOMAIN:5555
    configs:
      "local-registry":
        tls:
          ca_file: "/etc/ssl/certs/rootCA.pem"
hostAliases: # /etc/hosts style entries to be injected into /etc/hosts in the node containers and in the NodeHosts section in CoreDNS
  - ip: $PUBLICIP
    hostnames:
      - $LOCALDOMAIN
options:
  k3d: # k3d runtime settings
    wait: true # wait for cluster to be usable before returning; same as `--wait` (default: true)
    timeout: "60s" # wait timeout before aborting; same as `--timeout 60s`
  k3s: # options passed on to K3s itself
    extraArgs: # additional arguments passed to the `k3s server|agent` command; same as `--k3s-arg`
      - arg: --tls-san=$LOCALDOMAIN
        nodeFilters:
          - server:0
      - arg: --disable=traefik
        nodeFilters:
          - server:0
  kubeconfig:
    updateDefaultKubeconfig: true # add new cluster to your default Kubeconfig; same as `--kubeconfig-update-default` (default: true)
    switchCurrentContext: true # also set current-context to the new cluster's context; same as `--kubeconfig-switch-context` (default: true)
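
The config above references shell-style variables ($CLUSTER_NAME, $LOCALDOMAIN, $IMAGE, ...). The issue doesn't show how they are filled in; one possible workflow (purely an assumption, not the author's confirmed setup) is to export them and render the file with envsubst before calling k3d:

# Assumed workflow, not shown in the issue. Some values are taken from the
# outputs below (dom1, e4t.local, 192.168.1.103, v1.21.14-k3s1); the rest are placeholders.
export CLUSTER_NAME=dom1
export LOCALDOMAIN=e4t.local
export PUBLICIP=192.168.1.103
export K3SVERSION=v1.21.14-k3s1
export IMAGE=k3s:"${K3SVERSION}"
export DOCKER_NETWORK=my-custom-net      # placeholder
export DOMAIN_STORAGE="$HOME/dom1-data"  # placeholder

envsubst < cluster.yaml > cluster.rendered.yaml   # envsubst is part of GNU gettext
k3d cluster create --config cluster.rendered.yaml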

What did you do afterwards?
Once the cluster was created, I checked the coredns configmap and it was OK; it contained:
NodeHosts:
----
192.168.1.103 e4t.local
192.168.65.2 host.k3d.internal
172.22.0.5 k3d-dom1-serverlb
172.22.0.3 k3d-dom1-tools
172.22.0.2 local-registry
172.22.0.4 k3d-dom1-server-0

I then restarted Docker (Desktop, on Mac); the container started, but the configmap was completely emptied. It now contains just this:

NodeHosts:
----
172.22.0.4 k3d-dom1-server-0
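
For reference, the NodeHosts data shown above can be read straight from the ConfigMap (namespace and name are the K3s defaults):

kubectl -n kube-system get configmap coredns -o jsonpath='{.data.NodeHosts}'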

What did you expect to happen

The configmap should survive a simple docker restart... I confirm this happens even when the k3s container is stopped and restarted.

Screenshots or terminal output

(screenshot attached)

Which OS & Architecture

arch: x86_64
cgroupdriver: cgroupfs
cgroupversion: "2"
endpoint: /var/run/docker.sock
filesystem: extfs
name: docker
os: Docker Desktop
ostype: linux
version: 20.10.17

Which version of k3d

k3d version v5.4.4
k3s version v1.23.8-k3s1 (default)

Please note that even the reported k3s version is wrong, as the cluster config has this env set:
K3SVERSION=v1.21.14-k3s1
export IMAGE=k3s:"${K3SVERSION}"

Which version of docker

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc., v0.8.2)
  compose: Docker Compose (Docker Inc., v2.6.1)
  extension: Manages Docker extensions (Docker Inc., v0.2.7)
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc., 0.6.0)
  scan: Docker Scan (Docker Inc., v0.17.0)

Server:
 Containers: 44
  Running: 9
  Paused: 0
  Stopped: 35
 Images: 70
 Server Version: 20.10.17
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runtime.v1.linux runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc version: v1.1.2-0-ga916309
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.10.104-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 2.921GiB
 Name: docker-desktop
 ID: UIUY:OMDB:ECAX:Q4QW:PPCY:U566:SK3J:CWXV:42GG:S7ZX:SJEJ:OVFB
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  e4t.local:5555
  hubproxy.docker.internal:5000
  127.0.0.0/8
 Live Restore Enabled: false

fragolinux added the bug (Something isn't working) label on Jul 28, 2022
fragolinux changed the title from "[BUG] coredns loses customization after cluster restart" to "[BUG] coredns loses customization after cluster or docker restart" on Jul 28, 2022

fragolinux commented Aug 1, 2022

Is there any way to implement in k3d the official "coredns-custom" ConfigMap, created whenever custom hosts or other CoreDNS-related mods are defined in config.yaml, instead of the current method that patches the default ConfigMap directly? That would probably solve the issue (a sketch follows the link below).

https://docs.digitalocean.com/products/kubernetes/how-to/customize-coredns/
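
For illustration, a coredns-custom ConfigMap in the style of the linked DigitalOcean docs might look like the sketch below; the e4t.local entry is taken from the NodeHosts output above, and whether the CoreDNS bundled with K3s actually imports such a ConfigMap depends on the K3s version:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-custom     # name and namespace as in the DigitalOcean example
  namespace: kube-system
data:
  hosts.override: |
    # extra host entries served by the CoreDNS hosts plugin
    hosts {
      192.168.1.103 e4t.local
      fallthrough
    }
EOF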

@chris13524

@fragolinux see #816

@ErikEngerd

Defining a custom DNS config would not fix my problem. I am trying to manage one k3d cluster from another k3d cluster. For this, I needed to add the server of the managed k3d cluster to the network of the cluster running argocd. After starting up the clusters using k3d cluster start, the config map shows the correct entries.

However, after restarting docker, the config map is reset as mentioned above.

Is there a way to prevent k3d from resetting the coredns configmap upon startup?

@chris-codeflow

@fragolinux Unfortunately this is expected behavior with regard to Rancher K3s and is not a k3d issue per se. Rancher K3s rewrites the manifest files whenever it is started and the documentation states that they shouldn't be altered:

Manifests for packaged components are managed by K3s, and should not be altered. The files are re-written to disk whenever K3s is started, in order to ensure their integrity.

When k3d creates a cluster (k3d cluster create), it injects entries into the CoreDNS ConfigMap by updating the manifest file /var/lib/rancher/k3s/server/manifests/coredns.yaml. However, when the container is restarted, K3s rewrites that manifest file (as seen in the log entry: level=info msg="Writing manifest: /var/lib/rancher/k3s/server/manifests/coredns.yaml"), and so all the NodeHosts entries in the ConfigMap are removed. Despite what the k3d documentation suggests, you can't even map a volume with a custom coredns.yaml file to this path, as K3s will still overwrite it (I've proved this myself).

A workaround for this issue is to run `k3d cluster stop <cluster name>` and then `k3d cluster start <cluster name>` to ensure that the NodeHosts entries are re-injected by k3d (see the commands sketched below).
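
In command form (the cluster name dom1 is inferred from the node names above):

k3d cluster stop dom1
k3d cluster start dom1    # NodeHosts entries are re-injected by k3d on start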

It is possible to disable the Rancher K3s CoreDNS manifests and replace them with your own but, unfortunately, k3d doesn't have an option for injecting the NodeHosts entries into a different manifest file (e.g., /var/lib/rancher/k3s/server/manifests/custom/mycoredns.yaml).
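
For illustration, disabling the packaged CoreDNS would look roughly like the sketch below; after that you would have to deploy CoreDNS (including any NodeHosts handling) yourself, which k3d does not automate. The cluster name is assumed:

# Sketch only: turn off the K3s-managed CoreDNS on the server node.
k3d cluster create dom1 --k3s-arg "--disable=coredns@server:0"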

@fragolinux (Author)

Solved 2 years ago 😅
I just save the coredns cm after the 1st start, and reapply it on restart (sketched below)
Thanks BTW!
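
In command form, the workaround amounts to something like this (the backup file name is arbitrary):

# right after the first successful start: save the populated ConfigMap
kubectl -n kube-system get configmap coredns -o yaml > coredns-backup.yaml

# after a Docker or cluster restart: put the saved entries back
kubectl apply -f coredns-backup.yaml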

@iwilltry42 (Member)

Host entries to the CoreDNS config are now managed via the coredns-custom configmap as per #1453 so they survive restarts of the cluster and host system.

This is released in https://github.com/k3d-io/k3d/releases/tag/v5.7.0
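
For anyone verifying this on v5.7.0 or later, the injected host entries should now live in the coredns-custom ConfigMap rather than in coredns itself (ConfigMap name taken from the comment above; the kube-system namespace is assumed):

kubectl -n kube-system get configmap coredns-custom -o yaml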
