Custom VPC domain-name affecting node lease #1457

Closed
voidense opened this issue Oct 6, 2023 · 6 comments


@voidense

voidense commented Oct 6, 2023

What happened:
We observed the same symptom reported by #1263, however this is not a duplicate because we're using the updated AMI and the issue still happened. We took a closer look at the fix for that issue -- https://github.com/awslabs/amazon-eks-ami/pull/1264/files (which went out with 20230501) -- and found that the command used, aws ec2 describe-instances --instance-ids $INSTANCE_ID --query 'Reservations[].Instances[].PrivateDnsName', actually returns the hostname with our custom domain as well.

Order of events:

  1. Node group provisioned.
  2. The VPC DNS option is enabled. We have a custom DHCP options set with our custom domain; call it "custom_domain".
  3. All nodes suddenly stopped joining.
    • error message sample
      I1005 21:32:17.410848      10 node_authorizer.go:260] NODE DENY: 'ip-10-176-27-215.ec2.internal' &authorizer.AttributesRecord{User:(*user.DefaultInfo)(<obfuscated>), Verb:"get", Namespace:"kube-node-lease", APIGroup:"coordination.k8s.io", APIVersion:"v1", Resource:"leases", Subresource:"", Name:"ip-10-176-27-215.custom_domain", ResourceRequest:true, Path:"/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/ip-10-176-27-215.custom_domain"}
      
  4. Re-provisioned the node (manually terminated it and let the Cluster Autoscaler bring up a new one; a CLI sketch follows this list). The issue was immediately resolved, and we saw the hostname pattern change back to *.ec2.internal. During this entire period there was no VPC option change.
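
For completeness, the manual re-provision in step 4 amounts to something like this (the node name and instance ID are placeholders):

    # Drain the affected node, then terminate it; Cluster Autoscaler brings up a replacement
    kubectl drain ip-10-176-27-215.ec2.internal --ignore-daemonsets --delete-emptydir-data
    aws ec2 terminate-instances --instance-ids i-0123456789abcdef0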

What you expected to happen:
No NODE DENY errors should appear.

How to reproduce it (as minimally and precisely as possible):
I'm not sure if this is deterministic, but the triggering condition seems to be:

  1. Provision a node group (newer than 20230501, so the fix is in).
  2. Update the VPC options to enable DNS and associate a DHCP options set that provides custom_domain (a CLI sketch follows this list).
  3. Wait for it to happen.
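
For reference, step 2 can be done with something like the following (the VPC and DHCP options set IDs are placeholders):

    # Enable DNS resolution and DNS hostnames on the VPC (one attribute per call)
    aws ec2 modify-vpc-attribute --vpc-id vpc-0123456789abcdef0 --enable-dns-support '{"Value":true}'
    aws ec2 modify-vpc-attribute --vpc-id vpc-0123456789abcdef0 --enable-dns-hostnames '{"Value":true}'
    # Associate a DHCP options set that carries the custom domain-name
    aws ec2 associate-dhcp-options --dhcp-options-id dopt-0123456789abcdef0 --vpc-id vpc-0123456789abcdef0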

Anything else we need to know?:

  • In a different cluster and on a different node (with the same setup), the EC2 console shows Hostname type as IP name: <ip>.custom_domain and Private IP DNS name (IPv4 only) as <ip>.ec2.internal. This mismatch is expected because we have that DHCP options set; however, the CLI command aws ec2 describe-instances --instance-ids $INSTANCE_ID --query 'Reservations[].Instances[].PrivateDnsName' (which is used in the fix) actually returns <ip>.custom_domain.
    • Reading the code comments, it seems (I might be wrong here) the fix assumes aws ec2 describe-instances ... PrivateDnsName returns .ec2.internal, but apparently that's not the case.
  • Our node lease renewal cadence is hourly (verified by the logs). However, this issue did not occur immediately, or within an hour, after the nodes were provisioned or the VPC option was changed -- it happened ~1 week afterwards. It seems a random node lease refresh event triggered it.

Environment:

  • AWS Region: us-east-1
  • Instance Type(s): m5a.4xlarge
  • EKS Platform version (use aws eks describe-cluster --name <name> --query cluster.platformVersion): eks.12
  • Kubernetes version (use aws eks describe-cluster --name <name> --query cluster.version): 1.23
  • AMI Version: amazon-eks-node-1.23-v20230711
  • Kernel (e.g. uname -a): Linux ip-10-176-42-253.custom_domain 5.4.249-163.359.amzn2.x86_64 #1 SMP Wed Jul 12 18:58:58 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
  • Release information (run cat /etc/eks/release on a node): BASE_AMI_ID="ami-018ae0f2e02aab38b" BUILD_TIME="Fri Jul 28 04:19:03 UTC 2023" BUILD_KERNEL="5.4.249-163.359.amzn2.x86_64" ARCH="x86_64"
@cartermckinnon
Member

I'm not clear on the exact behavior you're seeing. You mean that you're changing the PrivateDnsName after the instance launches and joins the cluster?

reading the code comments it seems (i might be wrong here) it's assuming aws ec2 describe-instances ... PrivateDnsName should return .ec2.internal, but apparently that's not the case

We're not assuming this. The PrivateDnsName is used to associate a Kubernetes user with IAM credentials (in the configmap/aws-auth). It doesn't matter if the PrivateDnsName uses a custom domain name.

our node lease renewal cadence is per-hour (verified by the logs).

Are you referring to the Lease object created by each kubelet? That should be updated every 10 seconds.
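
You can inspect a node's Lease directly to confirm the cadence; for example (the node name is illustrative):

    kubectl get lease -n kube-node-lease ip-10-176-27-215.ec2.internal -o yaml
    # spec.renewTime should advance roughly every 10 seconds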

@voidense
Author

voidense commented Oct 6, 2023

@cartermckinnon thanks for the reply.

You mean that you're changing the PrivateDnsName after the instance launches and joins the cluster?

The node group was already there and part of the cluster, with everything working fine. We changed the VPC DNS option (false to true), and after some time (1-2 weeks) we saw the issue happen. We didn't change the PrivateDnsName for any node directly; that VPC DNS change was the only change to the infrastructure.
To be clear on terminology, when I pull up "Instance summary" in the AWS console for one of the nodes, there are two fields:

  • field: Hostname type: IP name: some value --> let's call this "hostname"
  • field: Private IP DNS name (IPv4 only): some value --> let's call this "PrivateDnsName"

Other than the main failure itself, the other symptom we observed for the problematic node groups is the hostname being <ip>.custom_domain and the PrivateDnsName being <ip>.ec2.internal; however, aws ec2 describe-instances returns the former (i.e., with custom_domain). And a re-provision (destroy and recreate) actually brings these two to the same value, <ip>.ec2.internal, and everything starts working fine again.

We're not assuming this. The PrivateDnsName is used to associate a Kubernetes user with IAM credentials (in the configmap/aws-auth). It doesn't matter if the PrivateDnsName uses a custom domain name.

So, the error message reported by the API server is as follows:

I1005 21:32:17.410848      10 node_authorizer.go:260] NODE DENY: 'ip-10-176-27-215.ec2.internal' &authorizer.AttributesRecord{User:(*user.DefaultInfo)(<obfuscated>), Verb:"get", Namespace:"kube-node-lease", APIGroup:"coordination.k8s.io", APIVersion:"v1", Resource:"leases", Subresource:"", Name:"ip-10-176-27-215.custom_domain", ResourceRequest:true, Path:"/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/ip-10-176-27-215.custom_domain"}

The question is: who is producing ip-10-176-27-215.ec2.internal and who is producing ip-10-176-27-215.custom_domain? I presume this mismatch caused the NODE DENY problem.

Are you referring to the Lease object created by each kubelet? That should be updated every 10 seconds

Hmm, we see Registered Node event logs every hour. I'm not familiar with how this works; maybe it's a different thing than the Lease object? But the question remains: how come our issue didn't happen immediately, or after 10 seconds, or after an hour of the VPC DNS option flip, but at a random timestamp about 1.5 weeks later? This is also the second time we've observed this. The first time was in a test cluster, so we didn't take it too seriously.

@cartermckinnon
Member

cartermckinnon commented Oct 6, 2023

Changing your VPC's DNS settings can cause the PrivateDnsName to change. The domain-name specified in your DHCP options will only show up in your instance's PrivateDnsName if one of these is false:

  • enableDnsHostnames
  • enableDnsSupport

If both of these are true, the PrivateDnsName will not reflect the custom domain-name.
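
You can verify both attributes with something like this (the VPC ID is a placeholder):

    aws ec2 describe-vpc-attribute --vpc-id vpc-0123456789abcdef0 --attribute enableDnsSupport
    aws ec2 describe-vpc-attribute --vpc-id vpc-0123456789abcdef0 --attribute enableDnsHostnames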


My guess as to why you're seeing something break a long time after you change these options is because the aws-iam-authenticator permanently caches each instance's PrivateDnsName: https://github.com/kubernetes-sigs/aws-iam-authenticator/blob/e847a7b5792b51307b358dd106c84fe8c38b4461/pkg/ec2provider/ec2provider.go#L120

Your kubelet will continue to use whatever the PrivateDnsName was when the instance was bootstrapped, and the aws-iam-authenticator will use the value cached at that same time.

Eventually though, the aws-iam-authenticator instances will be recycled, at which point the current (now different) PrivateDnsName will be used to make the authz decision.

Even if the aws-iam-authenticator freshened its cache regularly, the kubelet would not pick up changes in the PrivateDnsName; so changing this is currently not supported.
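
As a rough way to spot nodes at risk, you can compare each node's registered name to the instance's current PrivateDnsName. A sketch, assuming the usual AWS providerID format (aws:///<az>/<instance-id>) and CLI credentials for the account:

    # Flag nodes whose Kubernetes name no longer matches the current PrivateDnsName
    for node in $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}'); do
      id=$(kubectl get node "$node" -o jsonpath='{.spec.providerID}' | awk -F/ '{print $NF}')
      current=$(aws ec2 describe-instances --instance-ids "$id" \
        --query 'Reservations[].Instances[].PrivateDnsName' --output text)
      [ "$node" != "$current" ] && echo "MISMATCH: node=$node current=$current"
    done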

@voidense
Author

voidense commented Oct 6, 2023

My guess as to why you're seeing something break a long time after you change these options is because the aws-iam-authenticator permanently caches each instance's PrivateDnsName: https://github.com/kubernetes-sigs/aws-iam-authenticator/blob/e847a7b5792b51307b358dd106c84fe8c38b4461/pkg/ec2provider/ec2provider.go#L120

Your kubelet will continue to use whatever the PrivateDnsName was when the instance was bootstrapped, and the aws-iam-authenticator will use the value cached at that same time.

Eventually though, the aws-iam-authenticator instances will be recycled, at which point the current (now different) PrivateDnsName will be used to make the authz decision.

Even if the aws-iam-authenticator freshened its cache regularly, the kubelet would not pick up changes in the PrivateDnsName; so changing this is currently not supported.

Got it. This seems to be the most probable root cause.

Would this essentially mean that if I flip my VPC DNS settings in a way that changes the PrivateDnsName, my existing node group is doomed to break if I keep it running long enough for the aws-iam-authenticator instances to be recycled?

This whole thing seems a bit wrong to me: in the context of EKS node groups, we're sensitive to a VPC DNS change. Shouldn't there be some kind of canonical, static identity for a node that's used to authenticate to the control plane, instead of a (relatively speaking) volatile one like the PrivateDnsName? On the other hand, if there are valid reasons for this, we should at least recommend in the docs that you shouldn't make certain VPC DNS changes for a running node group -- or, if you have to, that you re-create your node group ASAP after doing so -- if that makes sense.

@cartermckinnon
Member

cartermckinnon commented Oct 6, 2023

Shouldn't there be some kind of canonical, static identity for a node that's used to authenticate to the control plane

yep, exactly! I’m working on this for a future Kubernetes version on EKS. But for now, the DNS name is in the critical path by default.

@cartermckinnon
Member

Going to close this as it's a known issue.

@cartermckinnon closed this as not planned on Oct 10, 2023