
Trim unneeded fields stored in informers to reduce memory footprint #6317

Merged: 1 commit merged into antrea-io:main on May 15, 2024

Conversation

@tnqn (Member) commented May 10, 2024:

Unused fields stored in informers can occupy a significant amount of memory. The informer provides a transform function that allows mutating objects before they are stored, via which we can trim unneeded fields.

managedFields is never used, regardless of object type, so it is trimmed for all objects. In addition, we trim some unneeded fields from Pod objects, since their unused fields take up the most space.

This can reduce the memory usage of 1k simple Pods by approximately 5 MB.
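Conceptually, the wiring looks like the minimal sketch below, assuming a standard client-go setup (the helper name trimManagedFields is illustrative; the actual trimmers in this PR live in pkg/util/k8s/transform.go):

```go
package informerutil

import (
	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/client-go/tools/cache"
)

// trimManagedFields clears metadata.managedFields, which consumers of the
// informer cache never read, before the object is stored.
func trimManagedFields(obj interface{}) (interface{}, error) {
	accessor, err := meta.Accessor(obj)
	if err != nil {
		// Non-object items (e.g. tombstones) pass through unchanged.
		return obj, nil
	}
	accessor.SetManagedFields(nil)
	return obj, nil
}

// registerTransform attaches the trimmer to an informer. SetTransform must
// be called before the informer is started.
func registerTransform(informer cache.SharedIndexInformer) error {
	return informer.SetTransform(cache.TransformFunc(trimManagedFields))
}
```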


Test:
Measured the memory usage of antrea-controller with 2k, 4k, and 6k Pods, using the following Pod template:

```yaml
  template:
    metadata:
      labels:
        app: client
    spec:
      containers:
      - image: k8s.gcr.io/e2e-test-images/agnhost:2.29
        imagePullPolicy: IfNotPresent
        name: agnhost
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      nodeName: antrea-agent-simulator-0
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
```

After creating 2k Pods, memory usage increases from 78.9 MB to 104.5 MB (+25.6 MB).
After trimming managedFields only, it decreases to 100.6 MB (-3.9 MB).
After trimming managedFields and ownerReferences, it decreases to 99.5 MB (-5 MB).
After trimming all fields covered by the patch, it decreases to 94.5 MB (-10 MB), reducing the Pod-attributable increase by about 40% (10 / 25.6 ≈ 0.39).

After creating 4k Pods, memory usage increases to 130.3 MB (+51.4 MB).
After trimming all fields covered by the patch, it decreases to 108.1 MB (-22.2 MB), reducing the Pod-attributable increase by about 43% (22.2 / 51.4 ≈ 0.43).

After creating 6k Pods, memory usage increases to 157.6 MB (+78.7 MB).
After trimming all fields covered by the patch, it decreases to 119.3 MB (-38.3 MB), reducing the Pod-attributable increase by about 48% (38.3 / 78.7 ≈ 0.49).

[Figure: memory usage of antrea-controller vs. number of Pods, before and after trimming]

@tnqn (Member, Author) commented May 10, 2024:

/test-all
/test-multicast-e2e
/test-multicluster-e2e
/test-ipv6-e2e
/test-ipv6-only-e2e

@tnqn tnqn added the action/release-note Indicates a PR that should be included in release notes. label May 10, 2024
@tnqn tnqn requested review from antoninbas and jianjuns May 10, 2024 15:01
@jianjuns (Contributor) previously approved these changes May 10, 2024 and left a comment:

Looks great!
Could we also trim for the local Pod informer?

@antoninbas (Contributor) replied:

> Looks great! Could we also trim for the local Pod informer?

Based on the code, that already seems to be the case.

Comment on lines 37 to 39:

```go
// Trim common fields for all objects.
accessor.SetManagedFields(nil)
accessor.SetOwnerReferences(nil)
```
@antoninbas (Contributor) commented:

Does this make using Update in our controllers dangerous? If these fields are present and are trimmed, and we later do an Update (not a Patch) in a controller, will the contents of these fields be lost?
I may be missing something, but we do apply this trimmer to our CRDs. It is possible that some third-party software could integrate with Antrea, with controllers that manage the lifecycle of Antrea CRDs. If these third-party controllers leverage owner references for lifecycle management (which is common and encouraged by kubebuilder, AFAIK), the field could be trimmed by the Antrea controller in storage, breaking garbage collection.

In other words, is it only safe to do this for resources that we never Update?
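A minimal sketch of the failure mode being raised, assuming a typical lister/client setup (the function name riskyUpdate and the object names are illustrative):

```go
package example

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	corev1listers "k8s.io/client-go/listers/core/v1"
)

// riskyUpdate shows how an Update built from a trimmed cache copy could
// wipe server-side fields: Update sends the whole object, so an
// ownerReferences list trimmed to nil in the cache would replace the real
// one on the API server, breaking garbage collection.
func riskyUpdate(ctx context.Context, client kubernetes.Interface, podLister corev1listers.PodLister) error {
	cached, err := podLister.Pods("default").Get("example")
	if err != nil {
		return err
	}
	updated := cached.DeepCopy() // ownerReferences already trimmed in the cache
	if updated.Labels == nil {
		updated.Labels = map[string]string{}
	}
	updated.Labels["touched"] = "true"
	// A Patch that only names the fields it changes would avoid this.
	_, err = client.CoreV1().Pods("default").Update(ctx, updated, metav1.UpdateOptions{})
	return err
}
```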

@tnqn (Member, Author) replied:

Good catch. I agree we shouldn't do this for all objects, to avoid risk and complication. Among the objects we watch, Pods are the only ones likely to have these fields set, so it would be safer to move this to TrimPod.

@antoninbas (Contributor) replied:

Sounds good.
I am also OK with doing this for other objects if we ensure that we only ever use Patch, but there is always a risk that we make a mistake in the future, so it should probably only be done if there is a noticeable reduction in memory usage.
For example, we could do it for Services if we replaced our usage of Update with Patch in the ServiceExternalIP controller, as sketched below.
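A sketch of the Patch-based alternative mentioned here (the annotation key is made up for illustration):

```go
package example

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

// patchService sends only the fields named in the patch body, so fields
// trimmed from the cached copy are never written back to the API server.
func patchService(ctx context.Context, client kubernetes.Interface, namespace, name string) error {
	patch := []byte(`{"metadata":{"annotations":{"example.io/external-ip-node":"node-1"}}}`)
	_, err := client.CoreV1().Services(namespace).Patch(ctx, name, types.MergePatchType, patch, metav1.PatchOptions{})
	return err
}
```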

@tnqn (Member, Author) replied:

According to the test results, trimming ownerReferences is not that noticeable. I moved it to the Pod-specific trimmer and didn't create type-specific trimmers for the other types.
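A sketch of what the Pod-specific trimmer might look like after this change (the real implementation is in pkg/util/k8s/transform.go; the exact set of trimmed fields here is illustrative):

```go
package k8s

import (
	corev1 "k8s.io/api/core/v1"
)

// TrimPod clears fields of a Pod that the controllers never read.
// Clearing ownerReferences is safe here only because Pods are never sent
// back to the API server with Update.
func TrimPod(obj interface{}) (interface{}, error) {
	pod, ok := obj.(*corev1.Pod)
	if !ok {
		return obj, nil
	}
	pod.ManagedFields = nil
	pod.OwnerReferences = nil
	// Illustrative examples of bulky, unused Pod fields.
	pod.Spec.Volumes = nil
	pod.Spec.Tolerations = nil
	pod.Status.Conditions = nil
	return pod, nil
}
```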

pkg/util/k8s/transform.go (discussion resolved)
@tnqn (Member, Author) commented May 11, 2024:

> Looks great! Could we also trim for the local Pod informer?
>
> Based on the code, it already seems to be the case

Yes, it has been applied to the local Pod informer.
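For reference, a sketch of how a node-local Pod informer could get the same treatment (NewFilteredPodInformer and SetTransform are real client-go APIs; TrimPod stands in for the trimmer sketched earlier):

```go
package example

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/fields"
	coreinformers "k8s.io/client-go/informers/core/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// newLocalPodInformer watches only Pods scheduled to this node and trims
// them before they enter the cache.
func newLocalPodInformer(client kubernetes.Interface, nodeName string) (cache.SharedIndexInformer, error) {
	informer := coreinformers.NewFilteredPodInformer(
		client,
		metav1.NamespaceAll,
		12*time.Hour, // resync period
		cache.Indexers{},
		func(options *metav1.ListOptions) {
			options.FieldSelector = fields.OneTermEqualSelector("spec.nodeName", nodeName).String()
		},
	)
	// TrimPod refers to the Pod trimmer sketched above.
	if err := informer.SetTransform(TrimPod); err != nil {
		return nil, err
	}
	return informer, nil
}
```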

@tnqn (Member, Author) commented May 14, 2024:

@jianjuns @antoninbas I moved ownerReference removal to TrimPod and updated the test results.

```go
	return obj, nil
}
// Trim common fields for all objects.
accessor.SetManagedFields(nil)
```
@antoninbas (Contributor) commented:

So just to confirm, we don't have the same issue for managedFields as for ownerReferences?
Is it because of this:

> Note that setting the managedFields to an empty list will not reset the field. This is on purpose, so managedFields never get stripped by clients not aware of the field.

From https://kubernetes.io/docs/reference/using-api/server-side-apply/#clearing-managedfields

If yes, maybe we could add a comment explaining why this is safe?

@tnqn (Member, Author) replied:

Sure, added a few notes and explanations.
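For illustration, the kind of note being discussed, attached to the trimming code shown above (the exact wording in the merged change may differ):

```go
// Trim managedFields for all watched objects. This is safe even for
// objects that may be written back with Update: the API server
// deliberately ignores an attempt to set managedFields to an empty list,
// so the server-side value is never stripped by clients unaware of the
// field. See:
// https://kubernetes.io/docs/reference/using-api/server-side-apply/#clearing-managedfields
accessor.SetManagedFields(nil)
```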

Unused fields stored in informers can occupy significant memory.
The informer provides a transform function to allow mutating the
objects before they are stored, via which we can trim unneeded fields.

managedFields is never used, regardless of object type, so it is
trimmed for all objects. In addition, we trim some unneeded fields
from Pod objects, since their unused fields take up the most space.

This can reduce the memory usage of 1k simple Pods by approximately 5 MB.

Signed-off-by: Quan Tian <quan.tian@broadcom.com>
@tnqn (Member, Author) commented May 15, 2024:

/test-all

@antoninbas (Contributor) left a comment:

LGTM
I assume the impact of trimming on CPU usage is minimal, even in the antrea-controller?

@tnqn (Member, Author) commented May 15, 2024:

> LGTM. I assume the impact of trimming on CPU usage is minimal, even in the antrea-controller?

Yes, I confirmed that trimming a Pod takes around 40.48 ns/op, hence the cost is negligible.
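A minimal sketch of how such a number could be measured with Go's testing package (the benchmark name and the sample Pod fields are illustrative; TrimPod refers to the trimmer sketched earlier):

```go
package k8s

import (
	"testing"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// Run with: go test -bench=BenchmarkTrimPod
func BenchmarkTrimPod(b *testing.B) {
	pod := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "bench", Namespace: "default"},
		Spec:       corev1.PodSpec{NodeName: "node-1"},
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// Trimming is idempotent, so reusing the same object gives a
		// rough per-call cost without DeepCopy dominating the timing.
		if _, err := TrimPod(pod); err != nil {
			b.Fatal(err)
		}
	}
}
```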

@tnqn tnqn merged commit dcf9b4c into antrea-io:main May 15, 2024
54 of 58 checks passed
@tnqn tnqn deleted the trim-informer-objects branch May 15, 2024 16:15
@luolanzone luolanzone added this to the Antrea v2.1 release milestone Jul 16, 2024