Error "no default routes found" when using Air-gap installation #1144

Closed
floydljy opened this issue Nov 28, 2019 · 19 comments
Labels
kind/documentation Improvements or additions to documentation
Milestone

Comments

@floydljy

Version:
v1.0.0
Describe the bug
K3s fails to start when installed using the air-gap method.

FATA[2019-11-28T09:04:09.931286843+08:00] apiserver exited: unable to find suitable network address.error='no default routes found in "/proc/net/route" or "/proc/net/ipv6_route"'. Try to set the AdvertiseAddress directly or provide a valid BindAddress to fix this

To Reproduce

[aiops@7 ~]$ sudo INSTALL_K3S_SKIP_DOWNLOAD=true ./install.sh
[INFO]  Skipping k3s download and verify
which: no kubectl in (/sbin:/bin:/usr/sbin:/usr/bin)
[INFO]  Creating /usr/local/bin/kubectl symlink to k3s
which: no crictl in (/sbin:/bin:/usr/sbin:/usr/bin)
[INFO]  Creating /usr/local/bin/crictl symlink to k3s
[INFO]  Skipping /usr/local/bin/ctr symlink to k3s, command exists in PATH at /bin/ctr
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s.service
[INFO]  systemd: Enabling k3s unit
Created symlink from /etc/systemd/system/multi-user.target.wants/k3s.service to /etc/systemd/system/k3s.service.
[INFO]  systemd: Starting k3s
Job for k3s.service failed because the control process exited with error code. See "systemctl status k3s.service" and "journalctl -xe" for details.
[aiops@7 ~]$ sudo systemctl status -l k3s.service
● k3s.service - Lightweight Kubernetes
   Loaded: loaded (/etc/systemd/system/k3s.service; enabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Thu 2019-11-28 09:03:14 CST; 4s ago
     Docs: https://k3s.io
  Process: 30109 ExecStart=/usr/local/bin/k3s server (code=exited, status=1/FAILURE)
  Process: 30106 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
  Process: 30104 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
 Main PID: 30109 (code=exited, status=1/FAILURE)

Nov 28 09:03:14 7 k3s[30109]: --log-file-max-size uint           Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800)
Nov 28 09:03:14 7 k3s[30109]: --log-flush-frequency duration     Maximum number of seconds between log flushes (default 5s)
Nov 28 09:03:14 7 k3s[30109]: --logtostderr                      log to standard error instead of files (default true)
Nov 28 09:03:14 7 k3s[30109]: --skip-headers                     If true, avoid header prefixes in the log messages
Nov 28 09:03:14 7 k3s[30109]: --skip-log-headers                 If true, avoid headers when opening log files
Nov 28 09:03:14 7 k3s[30109]: --stderrthreshold severity         logs at or above this threshold go to stderr (default 2)
Nov 28 09:03:14 7 k3s[30109]: -v, --v Level                          number for the log level verbosity
Nov 28 09:03:14 7 k3s[30109]: --version version[=true]           Print version information and quit
Nov 28 09:03:14 7 k3s[30109]: --vmodule moduleSpec               comma-separated list of pattern=N settings for file-filtered logging
Nov 28 09:03:14 7 k3s[30109]: time="2019-11-28T09:03:14.937465904+08:00" level=fatal msg="apiserver exited: unable to find suitable network address.error='no default routes found in \"/proc/net/route\" or \"/proc/net/ipv6_route\"'. Try to set the AdvertiseAddress directly or provide a valid BindAddress to fix this"

Expected behavior
The K3s server should start.

Actual behavior
The server fails with the error “no default routes found”.

Additional context

  • ifconfig output
[aiops@7 ~]$ ifconfig -a
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        inet6 fe80::42:c8ff:fe4e:cadd  prefixlen 64  scopeid 0x20<link>
        ether 02:42:c8:4e:ca:dd  txqueuelen 0  (Ethernet)
        RX packets 82  bytes 8449 (8.2 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 86  bytes 23010 (22.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens192: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.100.7  netmask 255.255.255.0  broadcast 192.168.100.255
        inet6 fe80::e255:ed27:fe49:370e  prefixlen 64  scopeid 0x20<link>
        inet6 fe80::4ddc:440f:1a22:8def  prefixlen 64  scopeid 0x20<link>
        inet6 fe80::7ef5:38f0:1116:c541  prefixlen 64  scopeid 0x20<link>
        ether 00:50:56:90:dc:83  txqueuelen 1000  (Ethernet)
        RX packets 2825735  bytes 815531129 (777.7 MiB)
        RX errors 0  dropped 883  overruns 0  frame 0
        TX packets 50330  bytes 4290449 (4.0 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
  • ip route output
[aiops@7 ~]$ ip route
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.100.0/24 dev ens192 proto kernel scope link src 192.168.100.7 metric 100
@floydljy
Author

floydljy commented Nov 28, 2019

After running sudo ip route add default via 192.168.100.1, the installation succeeded. This still looks like a bug, though, because the default route is meaningless in my environment. Thanks.
Also, the default route is lost after a reboot.
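
For anyone who wants to check a node before installing, here is a minimal sketch of a pre-flight test. It only looks for a line beginning with "default" in `ip route` style output, which is how iproute2 prints a default route. The helper name `has_default_route` is made up for this example and is not part of k3s or iproute2.

```shell
#!/bin/sh
# Sketch: detect a default route in `ip route` style output read from stdin.
# has_default_route is a hypothetical helper name used only for illustration.
has_default_route() {
    # iproute2 prints a default route as a line starting with "default".
    grep -q '^default[[:space:]]'
}

# Example use on a live host (left as a comment so sourcing has no side effects):
#   ip -4 route show | has_default_route || echo "no IPv4 default route" >&2
```

Fed the `ip route` output from the report above, the function returns non-zero, because neither line starts with "default".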

@ranchersvet

gz#12542

@james-mchugh

Would love to hear some feedback from the dev team. It seems like a bit of a hole in the airgap install instructions if you cannot set up K3s in an environment that does not have a default route.

@Oats87
Member

Oats87 commented Jul 1, 2021

The default route requirement actually stems from a number of downstream components. For example, you need a default route to enable cluster networking with Flannel; otherwise, traffic won't pass through iptables, etc.

The default route doesn't necessarily need to be functional, but it does need to be in place.

@stale

stale bot commented Dec 28, 2021

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

@stale stale bot added the status/stale label Dec 28, 2021
@branttaylor

This is still an open issue

@stale stale bot removed the status/stale label Jan 4, 2022
@brandond
Member

brandond commented Jan 4, 2022

As noted above, this is a limitation of Kubernetes components upstream from K3s. At this time, it is required that you have a default route configured on your node. If your environment does not actually have anywhere to send traffic to, you can at the very least configure a dummy interface with a low-priority default route; something along the lines of:

ip link add dummy0 type dummy
ip link set dummy0 up
ip addr add 169.254.255.255/32 dev dummy0
ip route add default via 169.254.255.255 dev dummy0 metric 1000

@branttaylor

I don't know if I've done something wrong and need to start over, but when I try these commands, the final command fails with:

Error: Nexthop has invalid gateway

I'm running these commands while attached to no network, just entering them into the console.

@brandond
Member

brandond commented Jan 7, 2022

maybe alter that to

ip link add dummy0 type dummy
ip link set dummy0 up
ip addr add 169.254.255.254/31 dev dummy0
ip route add default via 169.254.255.255 dev dummy0 metric 1000

You may have to play with it a bit - I haven't tested this myself recently.
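
A note on why the /31 form pairs up the way it does: a /31 contains exactly two addresses, so the interface takes one and the gateway is the other. One plausible reading of the earlier "Nexthop has invalid gateway" error is that, with the /32, the gateway was the interface's own address, but that is an assumption and has not been verified here. The sketch below computes the other address of a /31 by flipping the lowest bit of the last octet. `peer_of_31` is a hypothetical helper name, and it does no input validation.

```shell
#!/bin/sh
# Sketch: print the other address in the /31 that contains the given IPv4 address.
# peer_of_31 is a hypothetical helper; it assumes a well-formed dotted-quad argument.
peer_of_31() {
    ip=$1
    last=${ip##*.}      # last octet, e.g. 254
    prefix=${ip%.*}     # first three octets, e.g. 169.254.255
    # The two addresses of a /31 differ only in the lowest bit of the last octet.
    echo "$prefix.$((last ^ 1))"
}

# Example (commented out): peer_of_31 169.254.255.254
```

With 169.254.255.254 as the interface address, this gives 169.254.255.255, which matches the gateway used in the commands above.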

@rancher-max
Contributor

We saw a similar issue to this today while doing some airgap testing. Full steps:

  1. Delete the default route:
    ip route del default

  2. Attempt to start k3s without changing anything else:

$ sudo ./k3s server --cluster-init --token=test --write-kubeconfig-mode 644 
INFO[0000] Acquiring lock file /var/lib/rancher/k3s/data/.lock 
INFO[0000] Preparing data dir /var/lib/rancher/k3s/data/7c3132493a60e11638d4b0a1f8fda3ee7a2bc62f078da402588e9e11305abb0c 
FATA[0000] unable to select an IP from default routes. 
  3. Attempt to start k3s by providing the node-ip:
$  sudo ./k3s server --cluster-init --token=test --write-kubeconfig-mode 644 --node-ip=aa.bb.cc.dd
INFO[0000] Acquiring lock file /var/lib/rancher/k3s/data/.lock 
INFO[0000] Preparing data dir /var/lib/rancher/k3s/data
. . .
INFO[0029] Flannel found PodCIDR assigned for node maxrke2-2 
FATA[0029] flannel exited: failed to get default interface: Unable to find default route
  4. Attempt to start k3s by providing both the node-ip and the flannel interface (previously the default route):
$  sudo ./k3s server --cluster-init --token=test --write-kubeconfig-mode 644 --node-ip=aa.bb.cc.dd --flannel-iface=eth0

This appeared to work, but logs were looping with metrics-server errors:

E0503 22:21:09.859107    2214 available_controller.go:524] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.43.18.235:443/apis/metrics.k8s.io/v1beta1: Get "https://10.43.18.235:443/apis/metrics.k8s.io/v1beta1": dial tcp 10.43.18.235:443: connect: network is unreachable
E0503 22:21:13.918048    2214 resource_quota_controller.go:413] unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
W0503 22:21:14.404009    2214 garbagecollector.go:707] failed to discover some groups: map[metrics.k8s.io/v1beta1:the server is currently unable to handle the request]

Also, kubectl top node did not work.

  5. Apply the workaround provided by @brandond in #1144 (comment):
$ sudo ip link add dummy0 type dummy
$ sudo ip link set dummy0 up
$ sudo ip addr add 169.254.255.254/31 dev dummy0
$ sudo ip route add default via 169.254.255.255 dev dummy0 metric 1000
  6. Start k3s again (ensure it was fully killed after any previous failed attempts).
    This time it works as expected and all functions appear to be operating. No errors in the logs, all nodes and pods are up and running, and no kubectl errors.

@taleodor

taleodor commented Oct 12, 2022

The following modification of the workaround worked for me:

$ sudo ip link add dummy0 type dummy
$ sudo ip link set dummy0 up
$ sudo ip addr add 192.168.3.254/31 dev dummy0
$ sudo ip route add default via 192.168.3.255 dev dummy0 metric 1000

Also, for CoreDNS, any 127.0.0.x nameserver entries should be commented out in /etc/resolv.conf and instead

nameserver 192.168.3.254

should be added.
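
The resolv.conf change described above could be sketched as a small filter. It reads resolv.conf content on stdin, comments out `nameserver 127.0.0.x` lines, and appends the replacement nameserver. `rewrite_resolv_conf` is a hypothetical name for this example. Note that on hosts where /etc/resolv.conf is generated by another service, a manual edit may be overwritten, so treat this as an illustration only.

```shell
#!/bin/sh
# Sketch: comment out loopback nameservers and append a replacement.
# rewrite_resolv_conf is a hypothetical helper; it reads stdin and writes stdout,
# so it never touches /etc/resolv.conf by itself.
rewrite_resolv_conf() {
    ns=$1
    # Prefix any "nameserver 127.0.0.x" line with "# ".
    sed -E 's/^([[:space:]]*nameserver[[:space:]]+127\.0\.0\.[0-9]+.*)$/# \1/'
    # Append the replacement nameserver as the last line.
    printf 'nameserver %s\n' "$ns"
}

# Example (commented out), writing to a new file for review before replacing anything:
#   rewrite_resolv_conf 192.168.3.254 < /etc/resolv.conf > /tmp/resolv.conf.new
```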

@stale

stale bot commented Apr 10, 2023

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

@stale stale bot added the status/stale label Apr 10, 2023
@dereknola dereknola added the kind/documentation Improvements or additions to documentation label Apr 10, 2023
@stale stale bot removed the status/stale label Apr 10, 2023
@caroline-suse-rancher caroline-suse-rancher added this to the Backlog milestone Apr 26, 2023
@dereknola
Member

Closing as documentation issue has been addressed.

@x-coder-L

x-coder-L commented Jul 5, 2023

I find

maybe alter that to

ip link add dummy0 type dummy
ip link set dummy0 up
ip addr add 169.254.255.254/31 dev dummy0
ip route add default via 169.254.255.255 dev dummy0 metric 1000

You may have to play with it a bit - I haven't tested this myself recently.

Hello, we saw a similar issue, but following those steps did not solve the problem. After I set up the dummy route and restarted k3s, it failed to start and reported the fatal message "unable to select an IP from default routes."
This is my route info:
[root@master route-test]# ip route
default via 169.254.255.255 dev dummy0 metric 1000
10.42.0.0/24 dev cni0 proto kernel scope link src 10.42.0.1
169.254.255.254/31 dev dummy0 proto kernel scope link src 169.254.255.254
192.168.169.0/24 dev ens33 proto kernel scope link src 192.168.169.12 metric 100

After checking the source code of k3s and k8s, I found that when k3s needs the hostname and IP, it calls ChooseHostInterface() in the "k8s.io/apimachinery/pkg/util/net" package. That package detects the default route and checks the IP with IsGlobalUnicast() from the net package, but IsGlobalUnicast() now returns false for 169.xx.xx.xx addresses. So I think a change involving IsGlobalUnicast() may be what makes the dummy route stop working. Is there any other way to solve this problem?
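
For context, Go's net.IP.IsGlobalUnicast is documented to exclude link-local unicast addresses, and 169.254.0.0/16 is the IPv4 link-local range, which is consistent with the observation above. A rough shell sketch for checking whether a candidate dummy address falls in that range follows. `is_link_local_v4` is a hypothetical helper name; it is a simple prefix match with no validation, not a port of the Go code.

```shell
#!/bin/sh
# Sketch: return 0 if the IPv4 address is in the link-local range 169.254.0.0/16.
# is_link_local_v4 is a hypothetical helper; it does not validate the address.
is_link_local_v4() {
    case $1 in
        169.254.*.*) return 0 ;;   # link-local: likely to be rejected during auto-detection
        *)           return 1 ;;
    esac
}

# Example (commented out):
#   is_link_local_v4 169.254.255.254 && echo "pick a different range for dummy0"
```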

@brandond
Member

brandond commented Jul 5, 2023

@x-coder-L you could choose a different address range for your dummy interface? I don't recall having issues with that address range in the past, but it is possible that upstream has changed their utility code and it now will not auto-detect the dummy interface when using that range.

@x-coder-L

x-coder-L commented Jul 6, 2023

@brandond Thanks for your reply. Sure, using a different address range for the dummy interface works. But we would still like to use an invalid IP address to avoid IP conflicts, so is there a better way to avoid the "no default routes found" problem? Also, maybe the dummy route example in the airgap docs needs to be updated?

@chaychoong

I find

maybe alter that to

ip link add dummy0 type dummy
ip link set dummy0 up
ip addr add 169.254.255.254/31 dev dummy0
ip route add default via 169.254.255.255 dev dummy0 metric 1000

You may have to play with it a bit - I haven't tested this myself recently.

Hello, we saw a similar issue, but following those steps did not solve the problem. After I set up the dummy route and restarted k3s, it failed to start and reported the fatal message "unable to select an IP from default routes."
This is my route info:
[root@master route-test]# ip route
default via 169.254.255.255 dev dummy0 metric 1000
10.42.0.0/24 dev cni0 proto kernel scope link src 10.42.0.1
169.254.255.254/31 dev dummy0 proto kernel scope link src 169.254.255.254
192.168.169.0/24 dev ens33 proto kernel scope link src 192.168.169.12 metric 100

After checking the source code of k3s and k8s, I found that when k3s needs the hostname and IP, it calls ChooseHostInterface() in the "k8s.io/apimachinery/pkg/util/net" package. That package detects the default route and checks the IP with IsGlobalUnicast() from the net package, but IsGlobalUnicast() now returns false for 169.xx.xx.xx addresses. So I think a change involving IsGlobalUnicast() may be what makes the dummy route stop working. Is there any other way to solve this problem?

@brandond is it worth updating the documentation here if 169.254.255.254/31 no longer works? Just stumbled upon this problem and thankfully this issue was here.

@brandond
Member

brandond commented Nov 2, 2023

yeah, we should update the docs to suggest a different range.

@shinchley

I ran into the above with an RKE2 air-gapped agent, so used this IP instead (reserved for examples and documents per https://en.wikipedia.org/wiki/Reserved_IP_addresses so should be safe to use in general):

ip link add dummy0 type dummy
ip link set dummy0 up
ip addr add 203.0.113.254/31 dev dummy0
ip route add default via 203.0.113.255 dev dummy0 metric 100
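
If you want to review the commands before running them as root, something like the following dry-run sketch may help. It only prints the four commands for a chosen dummy address and metric and does not change any network state. `dummy_route_cmds` is a hypothetical helper name; the interface name dummy0 and the /31 layout are taken from the snippets in this thread, and the default metric of 1000 is an assumption borrowed from the earlier examples.

```shell
#!/bin/sh
# Sketch: print (not run) the commands that set up a dummy default route.
# dummy_route_cmds is a hypothetical helper; usage: dummy_route_cmds ADDRESS [METRIC]
dummy_route_cmds() {
    addr=$1
    metric=${2:-1000}                 # assumed default, taken from earlier examples
    last=${addr##*.}
    gw="${addr%.*}.$((last ^ 1))"     # the other address in the same /31
    cat <<EOF
ip link add dummy0 type dummy
ip link set dummy0 up
ip addr add $addr/31 dev dummy0
ip route add default via $gw dev dummy0 metric $metric
EOF
}

# Example (commented out): review the output first, then pipe it to a root shell.
#   dummy_route_cmds 203.0.113.254 100
#   dummy_route_cmds 203.0.113.254 100 | sudo sh -e
```

These routes do not survive a reboot unless they are recreated at boot, as noted earlier in the thread.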
