Allow cluster autoscaler to scale down game server pods #2747

Closed
roberthbailey opened this issue Sep 23, 2022 · 13 comments · Fixed by #2754
@roberthbailey
Member

Is your feature request related to a problem? Please describe.
[This is paraphrased from a Slack conversation]

As described in https://agones.dev/site/docs/advanced/scheduling-and-autoscaling/#cluster-autoscaler-1, we currently prevent the cluster autoscaler from removing game server pods by adding the annotation "cluster-autoscaler.kubernetes.io/safe-to-evict": "false".

For game servers that are OK with being terminated (even while in the allocated state), we should make it possible for the cluster autoscaler to remove the pods so it can compact the cluster by removing underutilized nodes. Note that this imposes a pretty strict burden on the game server and its configuration: graceful termination must be set properly, and the game server must actually be able to terminate gracefully and finish player sessions within the configured time. As a result, we should err on the side of not enabling this by default and consider it an advanced use case.

Describe the solution you'd like
A Helm configuration option to change the boolean value of this particular annotation.

Note that removing the annotation will not have the desired effect: the cluster autoscaler doesn't know that Agones-managed pods are replicated, and since it treats them as un-replicated bare pods it will not remove them. Instead, we need to change the annotation to "cluster-autoscaler.kubernetes.io/safe-to-evict": "true" to let the autoscaler know that it is safe to remove the pods.
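
For illustration only, the requested behavior boils down to flipping the value that gets stamped on each game server pod. The Helm value name below is hypothetical (no such chart option exists today); the annotation name and its "true"/"false" values are the cluster autoscaler's, as referenced above:

# Hypothetical Helm values snippet -- the key name is illustrative only.
gameservers:
  safeToEvict: true

# Resulting annotation on each game server pod when the option is enabled:
metadata:
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"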

Describe alternatives you've considered
Instead of allowing the cluster autoscaler to compact the cluster and move workloads, we could write a custom controller that looks at the cluster and shuts down any game servers that can be evicted, compacting the cluster that way. This would largely duplicate the functionality of the cluster autoscaler, but we could potentially add some Agones-aware logic to better choose which game servers to evict.

Additional context
n/a

@roberthbailey roberthbailey added the kind/feature New features for Agones label Sep 23, 2022
@roberthbailey
Member Author

Some notes to myself:

@roberthbailey roberthbailey self-assigned this Sep 23, 2022
@markmandel
Member

markmandel commented Sep 23, 2022

Fun question for this - is there a case for being able to make this change to the Pod annotation on a per-GameServer / per-Fleet basis, rather than for all of Agones?

Given that one would need graceful termination for any GameServer that is safe to evict - in a heterogeneous Fleet environment it might be useful to specify this at the GameServer level (and thereby at the Fleet level), so that those that have graceful termination can be marked safe to evict, and those that don't are marked as not safe to evict?

WDYT?

This might need to be feature flagged, but something like:

apiVersion: "agones.dev/v1"
kind: GameServer
metadata:
  name: "xonotic"
spec:
  autoscaler:
     safeToEvict: true # defaults to false
  ports:
    - name: default
      containerPort: 26000
  template:
    spec:
      containers:
      - name: xonotic
        image: gcr.io/agones-images/xonotic-example:0.8

@mtcode

mtcode commented Sep 23, 2022

@markmandel It's a good suggestion; I'm open to this being specified on a fleet level instead of globally. There may be cases where the additional flexibility is beneficial, such as the heterogeneous fleet deployments you mentioned.

@roberthbailey
Member Author

I actually like that idea, since it moves the configuration for the autoscaler and the game server close together, making it easier for a single person to understand the relationship. You don't have to ask the cluster admin who installs and configures Agones to do anything differently; you just specify your fleet differently, and that is the same place you would put the proper graceful termination configuration, which makes it less likely that you set this new value but not the other one.

@roberthbailey
Member Author

I'm starting to look into implementing this and ran across an interesting question - how do you feel about adding the field to the game server vs. adding it to the fleet?

Here is the game server example @markmandel gave above:

apiVersion: "agones.dev/v1"
kind: GameServer
metadata:
  name: "xonotic"
spec:
  autoscaler:
     safeToEvict: true # defaults to false
  ports:
    - name: default
      containerPort: 26000
  template:
    spec:
      containers:
      - name: xonotic
        image: gcr.io/agones-images/xonotic-example:0.8

Here is that translated to a fleet:

apiVersion: "agones.dev/v1"
kind: Fleet
metadata:
  name: "xonotic-fleet"
spec:
  replicas: 2
  scheduling: Packed
  template:
    spec:
      autoscaler:
        safeToEvict: true # defaults to false
      ports:
        - name: default
          containerPort: 26000
      template:
        spec:
          containers:
          - name: xonotic
            image: gcr.io/agones-images/xonotic-example:0.8

If we instead added the field to the fleet itself, the example would look more like:

apiVersion: "agones.dev/v1"
kind: Fleet
metadata:
  name: "xonotic-fleet"
spec:
  replicas: 2
  scheduling: Packed
  autoscaler:
    safeToEvict: true # defaults to false
  template:
    spec:
      ports:
        - name: default
          containerPort: 26000
      template:
        spec:
          containers:
          - name: xonotic
            image: gcr.io/agones-images/xonotic-example:0.8

It feels to me like autoscaler-related settings fit better into a fleet spec, similar to how a fleet has a scheduling field that relates to spreading vs. packing game servers. This is reinforced by looking at the code block I linked to earlier, where we only apply the annotation if the scheduling is set to Packed:

if gs.Spec.Scheduling == apis.Packed {
	// This means that the autoscaler cannot remove the Node that this Pod is on.
	// (and evict the Pod in the process)
	pod.ObjectMeta.Annotations["cluster-autoscaler.kubernetes.io/safe-to-evict"] = "false"
}

WDYT about adding the field to the game server spec vs. the fleet spec? Are there any cases where we would want to be able to set this on a game server outside of the context of a fleet?

@markmandel
Member

I would argue that it makes the most sense to have it on the GameServer, since that is the most flexible implementation (maybe someone doesn't actually use a Fleet at all?).

But to note: when referencing Scheduling, a GameServer does actually have a scheduling field in GameServerSpec - it just mainly gets populated by the Fleet, because in the code you referenced we need to know the scheduling value at the GameServer level. So either way, I expect you will need to add it to the GameServer.

func (gsSet *GameServerSet) GameServer() *GameServer {
	gs := &GameServer{
		ObjectMeta: *gsSet.Spec.Template.ObjectMeta.DeepCopy(),
		Spec:       *gsSet.Spec.Template.Spec.DeepCopy(),
	}
	gs.Spec.Scheduling = gsSet.Spec.Scheduling

That being said, I'm surprised we only set it on Packed 🤔 (although I wrote it, so go figure) - I guess the theory being that on-prem you likely wouldn't have an autoscaler, since you'd have static machine configurations.

@roberthbailey
Member Author

But to note: when referencing Scheduling, a GameServer does actually have a scheduling field in GameServerSpec - it just mainly gets populated by the Fleet, because in the code you referenced we need to know the scheduling value at the GameServer level.

Huh, this isn't documented in https://agones.dev/site/docs/reference/gameserver/ (where I was looking), but when I look at the source, there it is: https://github.com/googleforgames/agones/blob/main/pkg/apis/agones/v1/gameserver.go#L152

@roberthbailey
Member Author

One other way to go about this would be to only write the cluster-autoscaler.kubernetes.io/safe-to-evict annotation if it isn't explicitly set in the game server's pod metadata.

Right now, our code will overwrite it even if it is set explicitly, e.g.

apiVersion: "agones.dev/v1"
kind: Fleet
metadata:
  name: simple-game-server
spec:
  replicas: 2
  template:
    spec:
      ports:
      - name: default
        containerPort: 7654
      template:
        metadata:
          annotations:
            cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
        spec:
          containers:
          - name: simple-game-server
            image: gcr.io/agones-images/simple-game-server:0.13

produces the same result as

apiVersion: "agones.dev/v1"
kind: Fleet
metadata:
  name: simple-game-server-2
spec:
  replicas: 2
  template:
    spec:
      ports:
      - name: default
        containerPort: 7654
      template:
        spec:
          containers:
          - name: simple-game-server
            image: gcr.io/agones-images/simple-game-server:0.13

If we only set the annotation when it isn't already explicitly set, then the first YAML example here would work as expected, without us needing to introduce any new fields into the Agones API.
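
As a rough sketch of that approach (reusing the gs / pod names and the apis.Packed check from the Agones snippet quoted earlier; this is illustrative, not the actual change):

// Sketch only: default the annotation to "false" for Packed scheduling,
// but leave it alone if the pod template metadata already sets it.
const safeToEvictAnnotation = "cluster-autoscaler.kubernetes.io/safe-to-evict"

if gs.Spec.Scheduling == apis.Packed {
	if _, exists := pod.ObjectMeta.Annotations[safeToEvictAnnotation]; !exists {
		// This means that the autoscaler cannot remove the Node that this Pod is on.
		pod.ObjectMeta.Annotations[safeToEvictAnnotation] = "false"
	}
}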

@roberthbailey
Member Author

Or going one step further: since the cluster autoscaler won't remove what it believes is an unreplicated pod (i.e. not created via a ReplicaSet, ReplicationController, DaemonSet, Job, or StatefulSet), we could stop setting this annotation altogether. Even without the annotation set to "false", the cluster autoscaler won't evict the pods in order to remove nodes, and if someone then wanted to let the cluster autoscaler know that it is safe to remove the pods (because they are part of a fleet and are in fact replicated), the annotation could be set explicitly in the pod template, again without needing to increase the surface area of the Agones API.

@roberthbailey
Member Author

For my last two points, I guess I'm arguing that I would rather expose the power of k8s directly instead of wrapping it with extra fields that we need to implement, document, and explain to users. Neither of those alternatives requires us to extend the Agones API, but both still enable the desired functionality.

@markmandel
Member

Or going one step further: since the cluster autoscaler won't remove what it believes is an unreplicated pod (i.e. not created via a ReplicaSet, ReplicationController, DaemonSet, Job, or StatefulSet), we could stop setting this annotation altogether.

AFAIK this functionality isn't documented, but the annotation is - so I'd rather be explicit in this case.

One other way to go about this would be to only write the cluster-autoscaler.kubernetes.io/safe-to-evict annotation if it isn't explicitly set in the game server's pod metadata.

This is also a really good idea. I keep forgetting *Templates contain metadata sections.

...and can confirm we do copy it into the Pod on creation!

func (gs *GameServer) Pod(sidecars ...corev1.Container) (*corev1.Pod, error) {
	pod := &corev1.Pod{
		ObjectMeta: *gs.Spec.Template.ObjectMeta.DeepCopy(),
		Spec:       *gs.Spec.Template.Spec.DeepCopy(),
	}

So yes, if we add a check to see whether it's already set and, if so, don't overwrite it, that would be good to go.

Then it's just a documentation issue (FAQ? Autoscaling? Somewhere else).

@roberthbailey
Member Author

I've created #2754 with the code changes. I'll move it from draft to ready for review once I have a chance to add the documentation updates.

@roberthbailey
Member Author

Then it's just a documentation issue (FAQ? Autoscaling? Somewhere else).

I don't think this fits well into the FAQ, so I expanded the section in the scheduling and autoscaling page where we mention the use of this annotation. I think that makes the most sense. This is an advanced feature and folks who run across this behavior will likely be looking for where we document our use of that annotation, at which point they will run across the workaround if they don't want us to set it for them.
