
DNS not configured correctly on a Raspberry Pi cluster #1375

Closed

soapergem opened this issue Nov 19, 2020 · 5 comments

@soapergem

I'm having some trouble setting up Kubernetes with CoreDNS and Flannel on a cluster of four Raspberry Pis. After installing kubeadm on my master node and pulling the images, I initialized it with this command:

sudo kubeadm init --token-ttl=0 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.1.194

And then I installed Flannel v0.13.0 using this command:

kubectl apply -f https://rawgit.com/coreos/flannel/v0.13.0/Documentation/kube-flannel.yml

So far, so good. It spawns the Flannel DaemonSet on all nodes (although, incidentally, I sometimes have to run sudo ip link delete flannel.1 on each node to get it working), and I can launch containers. Unfortunately, DNS does not work in my containers. The /etc/resolv.conf files all point to 10.96.0.10, but this doesn't seem to work: if I kubectl exec into a running pod and run dig google.com, it just times out. (Whereas if I run dig @8.8.8.8 google.com it immediately returns a result, so at least I have Internet connectivity! That narrows it down to a cluster DNS problem.)

I was reading that you have to pass --pod-network-cidr=10.244.0.0/16 during setup in order for Flannel to work, and as far as I can tell Flannel itself is working. I'm just wondering if there's an additional parameter I'm missing that will get DNS working as well?
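
In case it helps with triage, these checks should separate "CoreDNS itself is down" from "packets can't reach 10.96.0.10" (names are the kubeadm defaults; <some-pod> stands for any running pod):

# Are the CoreDNS pods healthy, and does the service IP match resolv.conf?
kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide
kubectl -n kube-system get svc kube-dns     # ClusterIP should be 10.96.0.10

# Query the cluster resolver directly from inside a pod
kubectl exec -it <some-pod> -- dig @10.96.0.10 kubernetes.default.svc.cluster.local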

@pinfort

pinfort commented May 16, 2021

I have the same issue.

My environment is:

  • Raspberry Pi 4 Model B × 2 nodes
  • Ubuntu for Raspberry Pi

$ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.2 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.2 LTS"
--- truncated ---

  • Kubernetes version is v1.21.1
  • using CRI-O as the container runtime

$ sudo crictl version
Version: 0.1.0
RuntimeName: cri-o
RuntimeVersion: 1.21.0
RuntimeApiVersion: v1alpha2

  • flannel from master branch
    • image: quay.io/coreos/flannel:v0.14.0-rc1
  • kubeadm init command is
    • sudo kubeadm init --pod-network-cidr=10.244.0.0/16

Current behavior

dig google.com fails

dig google.com
; <<>> DiG 9.11.6-P1 <<>> google.com
;; global options: +cmd
;; connection timed out; no servers could be reached

dig @8.8.8.8 google.com succeeds

dig @8.8.8.8 google.com
google.com. 47 IN A 172.217.26.46

Expected behavior

dig google.com works fine.

Because of this, I cannot run apt-get install foo or other commands in pods.
Is there any information on how to make this work?

@soapergem
Author

In my case, because it was a development environment, I ended up turning off all of the firewall rules entirely with sudo ufw disable. This means every port is open on all of my nodes, so obviously this is not an approach that would work in production. But I never did figure out which internal port Kubernetes/Flannel uses to handle DNS resolution. I can tell you that it's not 53 on the host, since adding firewall rules for that port specifically had no effect.
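
If the default vxlan backend is in play, one plausible explanation is that DNS queries to 10.96.0.10 ride inside the overlay as UDP on port 8472, so host firewall rules for port 53 never match anything. A sketch of UFW rules that might be enough for a kubeadm + Flannel node (untested on this cluster; adjust to your topology):

sudo ufw allow 6443/tcp     # kube-apiserver (control-plane node)
sudo ufw allow 10250/tcp    # kubelet API (all nodes)
sudo ufw allow 8472/udp     # Flannel vxlan overlay -- carries pod traffic, including DNS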

@vincentmli
Contributor

@soapergem the coredns pod should probably log something about why it fails when the firewall is on, and an iptables trace can be your friend for tracking down firewall problems: https://youtu.be/9HNKRP7x57M
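
A minimal TRACE setup might look like this (assumes iptables-legacy; with the nftables backend, nft monitor trace is the equivalent):

# Log every chain a DNS packet traverses; watch with `dmesg -w` or `journalctl -kf`
sudo iptables -t raw -A PREROUTING -p udp --dport 53 -j TRACE
sudo iptables -t raw -A OUTPUT -p udp --dport 53 -j TRACE

# ...reproduce the timeout with dig from a pod, then remove the rules
sudo iptables -t raw -D PREROUTING -p udp --dport 53 -j TRACE
sudo iptables -t raw -D OUTPUT -p udp --dport 53 -j TRACE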

@pinfort

pinfort commented May 20, 2021

Finally, I resolved this issue with three steps; a combined sketch of all three follows the list.

  • Add --resolv-conf=/run/systemd/resolve/resolv.conf to KUBELET_EXTRA_ARGS

  • Use host-gw mode instead of vxlan for flannel

    After these two steps I still could not access CoreDNS; the final step below fixed everything.

  • Add a route on the worker nodes so they can reach service IPs via the control-plane node.

    • e.g. sudo ip route add 10.96.0.0/16 via $CONTROL_PLANE_NODE_IP dev eth0
    • 10.96.0.0/16 is the serviceSubnet on my cluster.
    • Before this, ip route on my worker node showed no route to 10.96.0.0/16; now the route exists and the service subnet is reachable.
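
Roughly, as shell commands (a sketch rather than verbatim what I ran; the ConfigMap name and pod label assume the stock kube-flannel.yml manifest, and /etc/default/kubelet is the Ubuntu/Debian location for KUBELET_EXTRA_ARGS):

# 1. Point kubelet at the real resolv.conf instead of systemd-resolved's 127.0.0.53 stub
echo 'KUBELET_EXTRA_ARGS=--resolv-conf=/run/systemd/resolve/resolv.conf' | sudo tee /etc/default/kubelet
sudo systemctl restart kubelet

# 2. Switch Flannel's backend from vxlan to host-gw, then restart the daemonset pods
kubectl -n kube-system edit cm kube-flannel-cfg    # set "Backend": {"Type": "host-gw"}
kubectl -n kube-system delete pod -l app=flannel

# 3. On each worker, route the service subnet via the control-plane node
sudo ip route add 10.96.0.0/16 via $CONTROL_PLANE_NODE_IP dev eth0

Note that existing pods keep their old resolv.conf until they are recreated, since kubelet injects DNS config at pod creation.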

Thanks for all of your support.

FYI: I'm using flannel v0.13.0 instead of v0.14.0-rc1 now.

@stale

stale bot commented Jan 25, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label Jan 25, 2023
stale bot closed this as completed Feb 16, 2023