
Support node CIDR mask config #488

Closed
g-gaston opened this issue Oct 25, 2021 · 7 comments
Labels: area/cni Kubernetes CNIs for EKS-A, kind/enhancement New feature or request, team/cli

@g-gaston (Member)

Right now, the kube-controller-manager is using the default --node-cidr-mask-size (/24 for IPv4 and /64 for IPv6).
Add the ability to configure this through the EKS-A cluster config CRD.
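
For reference, a minimal sketch of what this could look like in the cluster spec. The nodes.cidrMaskSize field name is an assumption for illustration only, not an existing API field; the idea is that it would be passed through to the kube-controller-manager's --node-cidr-mask-size flag:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: my-cluster
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.0.0/16
    services:
      cidrBlocks:
        - 10.96.0.0/12
    nodes:
      # hypothetical field: size of the per-node pod CIDR allocated by the controller manager
      cidrMaskSize: 26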

@g-gaston changed the title from "Support node CIDR mask" to "Support node CIDR mask config" on Oct 25, 2021
@jaxesn added this to the next milestone on Nov 8, 2021
@jaxesn modified the milestones: next, next+1 on Jan 10, 2022
@jaxesn modified the milestones: next+1, backlog on Jan 26, 2022
@g-gaston added the kind/enhancement, area/cni, and team/cli labels on Apr 25, 2022
@CharudathGopal commented May 13, 2022

We are trying to deploy an EKS Anywhere cluster with more than 500 nodes and are hitting this issue too.

We would appreciate any workarounds or suggestions to move ahead.
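
One way to confirm a cluster is hitting this limit (assuming the controller manager has simply run out of /24 ranges inside the pod CIDR) is to look for nodes left without a spec.podCIDR and for CIDRNotAvailable events from the node IPAM controller; a rough sketch:

kubectl get events -A --field-selector reason=CIDRNotAvailable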

@jaxesn modified the milestones: backlog, next on May 17, 2022
@jaxesn (Member) commented May 17, 2022

We are going to go ahead and look into this one and see if we can get something done quickly. We'll plan on having it in our late June release (0.10.0), but we could produce a dev build if you are interested in testing it earlier once we have something in place.

@CharudathGopal commented May 18, 2022

@jaxesn Please let us know when you have the fix, happy to give it a shot!

FYI, here is the command we used to set the CIDR annotation on the nodes, after which the Cilium pods came up as expected. Before this workaround, only 254 Cilium pods came up because the /24 mask was used.

kubectl annotate node --all --overwrite io.cilium.network.ipv4-pod-cidr=192.168.0.0/16

@jaxesn (Member) commented May 18, 2022

When you initially created the cluster with 500 nodes, what were some of the values of the Cilium annotation before you changed it? This information may still exist as spec.podCIDR(s) on the node objects.
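
Something along these lines should list what the controller manager originally handed out (just a sketch; adjust the output format as needed):

kubectl get nodes -o custom-columns=NAME:.metadata.name,PODCIDR:.spec.podCIDR,PODCIDRS:.spec.podCIDRs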

@jaxesn (Member) commented May 18, 2022

@CharudathGopal would you mind giving me a few more details on your network setup and what CIDR range and masks you would like to set?

@CharudathGopal

Here is a snapshot of the Cilium config.

auto-direct-node-routes                        false
bpf-lb-map-max                                 65536
bpf-map-dynamic-size-ratio                     0.0025
bpf-policy-map-max                             16384
cgroup-root                                    /run/cilium/cgroupv2
cilium-endpoint-gc-interval                    5m0s
cluster-id
cluster-name                                   poc-22
cni-chaining-mode                              portmap
custom-cni-conf                                false
debug                                          false
disable-cnp-status-updates                     true
enable-auto-protect-node-port-range            true
enable-bandwidth-manager                       false
enable-bpf-clock-probe                         true
enable-bpf-masquerade                          true
enable-endpoint-health-checking                true
enable-health-check-nodeport                   true
enable-health-checking                         true
enable-hubble                                  true
enable-ipv4                                    true
enable-ipv6                                    false
enable-l7-proxy                                true
enable-local-redirect-policy                   false
enable-metrics                                 true
enable-policy                                  default
enable-remote-node-identity                    true
enable-session-affinity                        true
enable-well-known-identities                   false
enable-xt-socket-fallback                      true
hubble-disable-tls                             false
hubble-listen-address                          :4244
hubble-socket-path                             /var/run/cilium/hubble.sock
hubble-tls-cert-file                           /var/lib/cilium/tls/hubble/server.crt
hubble-tls-client-ca-files                     /var/lib/cilium/tls/hubble/client-ca.crt
hubble-tls-key-file                            /var/lib/cilium/tls/hubble/server.key
identity-allocation-mode                       crd
install-iptables-rules                         true
ipam                                           kubernetes
kube-proxy-replacement                         probe
kube-proxy-replacement-healthz-bind-address
masquerade                                     true
monitor-aggregation                            medium
monitor-aggregation-flags                      all
monitor-aggregation-interval                   5s
node-port-bind-protection                      true
operator-api-serve-addr                        127.0.0.1:9234
operator-prometheus-serve-addr                 :6942
preallocate-bpf-maps                           false
prometheus-serve-addr                          :9090
proxy-prometheus-port                          9095
sidecar-istio-proxy-image                      cilium/istio_proxy
tunnel                                         geneve
wait-bpf-mount                                 false

With this configuration, Cilium pods were failing to come up after reaching 255 nodes, so I added a few more params:

cluster-pool-ipv4-cidr                         192.168.0.0/16
cluster-pool-ipv4-mask-size                    16
ipv4-pod-cidr                                  192.168.0.0/16
ipv4-range                                     192.168.0.0/16
allocate-node-cidrs                            true

This did not make much of a difference. Finally, after setting the annotation on the nodes using this command, the Cilium pods came up:

kubectl annotate node --all --overwrite io.cilium.network.ipv4-pod-cidr=192.168.0.0/16
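
A likely explanation for the cluster-pool-* settings having no visible effect is that they only apply when ipam is set to cluster-pool; with ipam set to kubernetes, as in the config above, Cilium takes each node's pod range from the node's spec.podCIDR, which the kube-controller-manager allocates according to --node-cidr-mask-size. To compare what each node was assigned with the annotation set above, something like this sketch can be used:

kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\t"}{.metadata.annotations.io\.cilium\.network\.ipv4-pod-cidr}{"\n"}{end}'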

@jaxesn (Member) commented May 18, 2022

After that annotation change, are pods running on all nodes? Could you send the results of "kubectl get pods -A -o wide"? Setting the pod CIDR range to the entire /16 block on all nodes seems like it shouldn't work, since all the nodes could potentially be trying to assign pods the same IPs as other nodes.

I think exposing the node CIDR mask makes a lot of sense, and @mitalipaygude is actively looking at what it will take to do that, but I want to make sure it would actually solve the problem in your environment. Are you thinking of leaving the pod CIDR the same, 192.168.0.0/16, and then changing the node CIDR mask to something like /28 to increase the number of available node ranges but limit the number of pods on each node? Or were you thinking of opening up your CIDR range to something like 10.0.0.0/8 to have more total IPs?
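
Rough arithmetic for reference: with the pod CIDR left at 192.168.0.0/16, the default /24 node mask yields 2^(24-16) = 256 per-node ranges of 256 addresses each, which is why the cluster tops out at around 255 nodes. A /28 mask would yield 2^(28-16) = 4096 per-node ranges, but each has only 16 addresses (roughly 14 usable pod IPs, depending on the CNI's reservations). Widening the pod CIDR to something like 10.0.0.0/8 while keeping the /24 mask would instead give 2^(24-8) = 65536 ranges of 256 addresses each.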
