Custom VPC domain-name affecting node lease #1457

Closed
voidense opened this issue Oct 6, 2023 · 6 comments


@voidense

voidense commented Oct 6, 2023

What happened:
We observed the same symptom reported by #1263, however this is not a duplicate because we're using the updated AMI and the issue still happened. We took a closer look at the fix for that issue -- https://github.com/awslabs/amazon-eks-ami/pull/1264/files (which went out with 20230501) -- and found that the command used, aws ec2 describe-instances --instance-ids $INSTANCE_ID --query 'Reservations[].Instances[].PrivateDnsName', actually returns the hostname with our custom domain as well.

Order of events:

  1. Node group provisioned.
  2. The VPC DNS option is enabled. We have a custom DHCP options set with our custom domain; call it "custom_domain".
  3. All nodes suddenly stopped joining.
    • error message sample
      I1005 21:32:17.410848      10 node_authorizer.go:260] NODE DENY: 'ip-10-176-27-215.ec2.internal' &authorizer.AttributesRecord{User:(*user.DefaultInfo)(<obfuscated>), Verb:"get", Namespace:"kube-node-lease", APIGroup:"coordination.k8s.io", APIVersion:"v1", Resource:"leases", Subresource:"", Name:"ip-10-176-27-215.custom_domain", ResourceRequest:true, Path:"/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/ip-10-176-27-215.custom_domain"}
      
  4. Re-provisioned the node (manually terminated it and let the Cluster Autoscaler bring up a new one; a CLI sketch follows this list). The issue was immediately resolved, and we saw the hostname pattern change back to *.ec2.internal. During this entire period there was no VPC option change.
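
For completeness, the manual re-provision in step 4 amounts to something like this (the node name and instance ID are placeholders):

    # Drain the affected node, then terminate it; Cluster Autoscaler brings up a replacement
    kubectl drain ip-10-176-27-215.ec2.internal --ignore-daemonsets --delete-emptydir-data
    aws ec2 terminate-instances --instance-ids i-0123456789abcdef0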

What you expected to happen:
No NODE DENY errors should appear.

How to reproduce it (as minimally and precisely as possible):
I'm not sure if this is deterministic, but the triggering condition seems to be:

  1. Provision a node group (newer than 20230501, so the fix is in).
  2. Update the VPC options to enable DNS and associate a DHCP options set that provides custom_domain (a CLI sketch follows this list).
  3. Wait for it to happen.
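
For reference, step 2 can be done with something like the following (the VPC and DHCP options set IDs are placeholders):

    # Enable DNS resolution and DNS hostnames on the VPC (one attribute per call)
    aws ec2 modify-vpc-attribute --vpc-id vpc-0123456789abcdef0 --enable-dns-support '{"Value":true}'
    aws ec2 modify-vpc-attribute --vpc-id vpc-0123456789abcdef0 --enable-dns-hostnames '{"Value":true}'
    # Associate a DHCP options set that carries the custom domain-name
    aws ec2 associate-dhcp-options --dhcp-options-id dopt-0123456789abcdef0 --vpc-id vpc-0123456789abcdef0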

Anything else we need to know?:

  • In a different cluster and on a different node (with the same setup), the EC2 console shows Hostname type as IP name: <ip>.custom_domain and Private IP DNS name (IPv4 only) as <ip>.ec2.internal. This mismatch is expected because we have that DHCP options set; however, the CLI command aws ec2 describe-instances --instance-ids $INSTANCE_ID --query 'Reservations[].Instances[].PrivateDnsName' (which is used in the fix) actually returns <ip>.custom_domain.
    • Reading the code comments, it seems (I might be wrong here) the fix assumes aws ec2 describe-instances ... PrivateDnsName returns .ec2.internal, but apparently that's not the case.
  • Our node lease renewal cadence is hourly (verified by the logs). However, this issue did not occur immediately, or within an hour, after the nodes were provisioned or the VPC option was changed -- it happened ~1 week afterwards. It seems a random node lease refresh event triggered it.

Environment:

  • AWS Region: us-east-1
  • Instance Type(s): m5a.4xlarge
  • EKS Platform version (use aws eks describe-cluster --name <name> --query cluster.platformVersion): eks.12
  • Kubernetes version (use aws eks describe-cluster --name <name> --query cluster.version): 1.23
  • AMI Version: amazon-eks-node-1.23-v20230711
  • Kernel (e.g. uname -a): Linux ip-10-176-42-253.custom_domain 5.4.249-163.359.amzn2.x86_64 #1 SMP Wed Jul 12 18:58:58 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
  • Release information (run cat /etc/eks/release on a node): BASE_AMI_ID="ami-018ae0f2e02aab38b" BUILD_TIME="Fri Jul 28 04:19:03 UTC 2023" BUILD_KERNEL="5.4.249-163.359.amzn2.x86_64" ARCH="x86_64"
@cartermckinnon
Member

I'm not clear on the exact behavior you're seeing. You mean that you're changing the PrivateDnsName after the instance launches and joins the cluster?

reading the code comments it seems (i might be wrong here) it's assuming aws ec2 describe-instances ... PrivateDnsName should return .ec2.internal, but apparently that's not the case

We're not assuming this. The PrivateDnsName is used to associate a Kubernetes user with IAM credentials (in the configmap/aws-auth). It doesn't matter if the PrivateDnsName uses a custom domain name.

our node lease renewal cadence is per-hour (verified by the logs).

Are you referring to the Lease object created by each kubelet? That should be updated every 10 seconds.
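
You can inspect a node's Lease directly to confirm the cadence; for example (the node name is illustrative):

    kubectl get lease -n kube-node-lease ip-10-176-27-215.ec2.internal -o yaml
    # spec.renewTime should advance roughly every 10 seconds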

@voidense
Author

voidense commented Oct 6, 2023

@cartermckinnon thanks for the reply.

You mean that you're changing the PrivateDnsName after the instance launches and joins the cluster?

The node group was already there and part of the cluster, with everything working fine. We changed the VPC DNS option (false to true), and after some time (1-2 weeks) we saw the issue happen. We didn't change the PrivateDnsName for any node directly; that VPC DNS change was the only change to the infrastructure.
To be clear on terminology, when I pull up "Instance summary" in the AWS console for one of the nodes, there are two fields:

  • field: Hostname type: IP name: some value --> let's call this "hostname"
  • field: Private IP DNS name (IPv4 only): some value --> let's call this "PrivateDnsName"

Other than the main failure itself, the other symptom we observed for the problematic node groups is the hostname being <ip>.custom_domain and the PrivateDnsName being <ip>.ec2.internal; however, aws ec2 describe-instances returns the former (i.e., with custom_domain). And a re-provision (destroy and recreate) actually brings these two to the same value, <ip>.ec2.internal, and everything starts working fine again.

We're not assuming this. The PrivateDnsName is used to associate a Kubernetes user with IAM credentials (in the configmap/aws-auth). It doesn't matter if the PrivateDnsName uses a custom domain name.

So, the error message reported by the API server is as follows:

I1005 21:32:17.410848      10 node_authorizer.go:260] NODE DENY: 'ip-10-176-27-215.ec2.internal' &authorizer.AttributesRecord{User:(*user.DefaultInfo)(<obfuscated>), Verb:"get", Namespace:"kube-node-lease", APIGroup:"coordination.k8s.io", APIVersion:"v1", Resource:"leases", Subresource:"", Name:"ip-10-176-27-215.custom_domain", ResourceRequest:true, Path:"/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/ip-10-176-27-215.custom_domain"}

The question is: who is producing ip-10-176-27-215.ec2.internal and who is producing ip-10-176-27-215.custom_domain? I presume this mismatch caused the NODE DENY problem.

Are you referring to the Lease object created by each kubelet? That should be updated every 10 seconds

Hmm, we see Registered Node event logs every hour. I'm not familiar with how this works; maybe it's a different thing than the Lease object? But the question remains: how come our issue didn't happen immediately, or after 10 seconds, or after an hour of the VPC DNS option flip, but at a random timestamp about 1.5 weeks later? This is also the second time we've observed this. The first time was in a test cluster, so we didn't take it too seriously.

@cartermckinnon
Member

cartermckinnon commented Oct 6, 2023

Changing your VPC's DNS settings can cause the PrivateDnsName to change. The domain-name specified in your DHCP options will only show up in your instance's PrivateDnsName if one of these is false:

  • enableDnsHostnames
  • enableDnsSupport

If both of these are true, the PrivateDnsName will not reflect the custom domain-name.
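
You can verify both attributes with something like this (the VPC ID is a placeholder):

    aws ec2 describe-vpc-attribute --vpc-id vpc-0123456789abcdef0 --attribute enableDnsSupport
    aws ec2 describe-vpc-attribute --vpc-id vpc-0123456789abcdef0 --attribute enableDnsHostnames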


My guess as to why you're seeing something break a long time after you change these options is because the aws-iam-authenticator permanently caches each instance's PrivateDnsName: https://github.com/kubernetes-sigs/aws-iam-authenticator/blob/e847a7b5792b51307b358dd106c84fe8c38b4461/pkg/ec2provider/ec2provider.go#L120

Your kubelet will continue to use whatever the PrivateDnsName was when the instance was bootstrapped, and the aws-iam-authenticator will use the value cached at that same time.

Eventually though, the aws-iam-authenticator instances will be recycled, at which point the current (now different) PrivateDnsName will be used to make the authz decision.

Even if the aws-iam-authenticator freshened its cache regularly, the kubelet would not pick up changes in the PrivateDnsName; so changing this is currently not supported.
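
As a rough way to spot nodes at risk, you can compare each node's registered name to the instance's current PrivateDnsName. A sketch, assuming the usual AWS providerID format (aws:///<az>/<instance-id>) and CLI credentials for the account:

    # Flag nodes whose Kubernetes name no longer matches the current PrivateDnsName
    for node in $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}'); do
      id=$(kubectl get node "$node" -o jsonpath='{.spec.providerID}' | awk -F/ '{print $NF}')
      current=$(aws ec2 describe-instances --instance-ids "$id" \
        --query 'Reservations[].Instances[].PrivateDnsName' --output text)
      [ "$node" != "$current" ] && echo "MISMATCH: node=$node current=$current"
    done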

@voidense
Author

voidense commented Oct 6, 2023

My guess as to why you're seeing something break a long time after you change these options is because the aws-iam-authenticator permanently caches each instance's PrivateDnsName: https://github.com/kubernetes-sigs/aws-iam-authenticator/blob/e847a7b5792b51307b358dd106c84fe8c38b4461/pkg/ec2provider/ec2provider.go#L120

Your kubelet will continue to use whatever the PrivateDnsName was when the instance was bootstrapped, and the aws-iam-authenticator will use the value cached at that same time.

Eventually though, the aws-iam-authenticator instances will be recycled, at which point the current (now different) PrivateDnsName will be used to make the authz decision.

Even if the aws-iam-authenticator freshened its cache regularly, the kubelet would not pick up changes in the PrivateDnsName; so changing this is currently not supported.

Got it. This seems to be the most probable root cause.

Would this essentially mean that if I flip my VPC DNS settings in a way that changes the PrivateDnsName, my existing node group is doomed to break if I keep it running long enough for the aws-iam-authenticator instances to be recycled?

This whole thing seems a bit wrong to me: in the context of EKS node groups, we're sensitive to a VPC DNS change. Shouldn't there be some kind of canonical, static identity for a node that's used to authenticate to the control plane, instead of a (relatively speaking) volatile one like the PrivateDnsName? On the other hand, if there are valid reasons for this, we should at least recommend in the docs that you shouldn't make certain VPC DNS changes for a running node group -- or, if you have to, that you re-create your node group ASAP after doing so -- if that makes sense.

@cartermckinnon
Member

cartermckinnon commented Oct 6, 2023

Shouldn't there be some kind of canonical, static identity for a node that's used to authenticate to the control plane

yep, exactly! I’m working on this for a future Kubernetes version on EKS. But for now, the DNS name is in the critical path by default.

@cartermckinnon
Member

Going to close this as it's a known issue.

@cartermckinnon closed this as not planned on Oct 10, 2023