From 5295a0f6c359ea6b51b90e91dc7eb2e1be368c51 Mon Sep 17 00:00:00 2001
From: Akihiro Suda
Date: Thu, 4 Apr 2024 15:16:39 +0900
Subject: [PATCH] troubleshooting.md: add `ethtool -K flannel.1 tx-checksum-ip-generic off` for NAT

When the public IP is behind NAT, the UDP checksum fields of the VXLAN packets can be corrupted.
In that case, try running the following commands to avoid corrupted checksums:

```bash
/usr/sbin/ethtool -K flannel.1 tx-checksum-ip-generic off
```

To automate the command above via udev, create `/etc/udev/rules.d/90-flannel.rules` as follows:

```
SUBSYSTEM=="net", ACTION=="add|change|move", ENV{INTERFACE}=="flannel.1", RUN+="/usr/sbin/ethtool -K flannel.1 tx-checksum-ip-generic off"
```

ref:
- flannel-io/flannel issue 1279
- kubernetes/kops PR 9074
- karmab/kcli@b1a8eff658d17cf4e28162f0fa2c8b2b10e5ad00

Signed-off-by: Akihiro Suda
---
 Documentation/kubernetes.md      |  3 ++-
 Documentation/troubleshooting.md | 21 +++++++++++++++++++++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/Documentation/kubernetes.md b/Documentation/kubernetes.md
index 166e81765..62697d460 100644
--- a/Documentation/kubernetes.md
+++ b/Documentation/kubernetes.md
@@ -32,7 +32,8 @@ Other options include [Kyverno](https://kyverno.io/policies/pod-security/) and [
 # Annotations
 
 * `flannel.alpha.coreos.com/public-ip`, `flannel.alpha.coreos.com/public-ipv6`: Define the used public IP of the node. If configured when Flannel starts it'll be used as the `public-ip` and `public-ipv6` flag.
-* `flannel.alpha.coreos.com/public-ip-overwrite`, `flannel.alpha.coreos.com/public-ipv6-overwrite`: Allows to overwrite the public IP of a node. Useful if the public IP can not determined from the node, e.G. because it is behind a NAT. It can be automatically set to a nodes `ExternalIP` using the [flannel-node-annotator](https://github.com/alvaroaleman/flannel-node-annotator)
+* `flannel.alpha.coreos.com/public-ip-overwrite`, `flannel.alpha.coreos.com/public-ipv6-overwrite`: Allows to overwrite the public IP of a node. Useful if the public IP can not determined from the node, e.G. because it is behind a NAT. It can be automatically set to a nodes `ExternalIP` using the [flannel-node-annotator](https://github.com/alvaroaleman/flannel-node-annotator).
+  See also the "NAT" section in [troubleshooting](./troubleshooting.md) if UDP checksums seem corrupted.
 
 ## Older versions of Kubernetes
 
diff --git a/Documentation/troubleshooting.md b/Documentation/troubleshooting.md
index 09d6645c4..1096f2428 100644
--- a/Documentation/troubleshooting.md
+++ b/Documentation/troubleshooting.md
@@ -39,6 +39,27 @@ Vagrant typically assigns two interfaces to all VMs. The first, for which all ho
 
 This may lead to problems with flannel. By default, flannel selects the first interface on a host. This leads to all hosts thinking they have the same public IP address. To prevent this issue, pass the `--iface=eth1` flag to flannel so that the second interface is chosen.
 
+## NAT
+When the public IP is behind NAT, the UDP checksum fields of the VXLAN packets can be corrupted.
+In that case, try running the following commands to avoid corrupted checksums:
+
+```bash
+/usr/sbin/ethtool -K flannel.1 tx-checksum-ip-generic off
+```
+
+To automate the command above via udev, create `/etc/udev/rules.d/90-flannel.rules` as follows:
+
+```
+SUBSYSTEM=="net", ACTION=="add|change|move", ENV{INTERFACE}=="flannel.1", RUN+="/usr/sbin/ethtool -K flannel.1 tx-checksum-ip-generic off"
+```
+
+
+
 ## Permissions
 Depending on the backend being used, flannel may need to run with super user permissions. Examples include creating VXLAN devices or programming routes. If you see errors similar to the following, confirm that the user running flannel has the right permissions (or try running with `sudo)`.
 * `Error adding route...`