Add support for short-circuiting in AntreaProxy #4815

Merged 1 commit on May 18, 2023

Contributor

@hongliangl hongliangl commented Apr 6, 2023

Short-circuiting ensures that traffic from Pod/Node clients to a Service's
external addresses behaves the same way as traffic from external clients to
those addresses.

External clients do not need to know which Nodes have local Endpoints, because
the load balancer handles this for them. For Pod/Node clients, however, when the
externalTrafficPolicy of the Service is set to "Local", the Service is
unreachable from Nodes without a local Endpoint. With this PR, even when
externalTrafficPolicy is set to "Local", Pod/Node clients on Nodes without local
Endpoints can still reach the Service, because Endpoints are selected from the
whole cluster.
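
The selection rule described above can be sketched in Go as follows. This is only an illustration of the decision, not the code in this PR: AntreaProxy realizes the behavior with OpenFlow flows, and the `Endpoint` type, the `selectEndpoints` function, its parameters, and the addresses are all hypothetical names chosen for the example.

```go
package main

import "fmt"

// Endpoint is a hypothetical, simplified view of a Service backend.
type Endpoint struct {
	Addr    string
	IsLocal bool // true when the Endpoint runs on this Node
}

// selectEndpoints returns the candidate Endpoints for a new connection.
// policyLocal is true when externalTrafficPolicy is "Local".
// fromExternal is true for traffic that entered the Node from outside the
// cluster, and false for Pod/Node clients.
func selectEndpoints(all []Endpoint, policyLocal, fromExternal bool) []Endpoint {
	if !policyLocal || !fromExternal {
		// "Cluster" policy, or a short-circuited Pod/Node client:
		// every Endpoint in the cluster is a candidate.
		return all
	}
	// External client with "Local" policy: only local Endpoints qualify.
	var local []Endpoint
	for _, ep := range all {
		if ep.IsLocal {
			local = append(local, ep)
		}
	}
	return local
}

func main() {
	// A Node without any local Endpoint (illustrative addresses).
	eps := []Endpoint{
		{Addr: "10.10.1.7:8080", IsLocal: false},
		{Addr: "10.10.2.9:8080", IsLocal: false},
	}
	fmt.Println(len(selectEndpoints(eps, true, true)))  // 0: external client, nothing local
	fmt.Println(len(selectEndpoints(eps, true, false))) // 2: Pod/Node client uses cluster Endpoints
}
```

With "Local" policy, the external client gets no candidate on this Node, while the Pod/Node client still gets both Endpoints, which is the behavior change the description refers to.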

@hongliangl hongliangl requested review from tnqn and wenyingd April 6, 2023 10:33
@hongliangl
Contributor Author

@tnqn @wenyingd Could you help with a rough first review? I modified the InstallServiceFlows method in this draft PR to implement short-circuiting in AntreaProxy, and I'm not sure whether it is a good approach. Could you share your thoughts? Thanks.

Member

@tnqn tnqn left a comment

@hongliangl I think you should first write a clear description of what this PR tries to do, for better understanding. The term short-circuiting is too obscure. Even a few weeks ago we had totally different understandings of it, and I'm not sure whether we have reached a consensus yet before I take a deep look at the code, not to mention other reviewers.

pkg/agent/openflow/client.go (outdated, resolved)
pkg/agent/openflow/client.go (outdated, resolved)
@hongliangl hongliangl force-pushed the 20230328-short-circuit branch 2 times, most recently from 988c0a5 to e6b33de April 16, 2023 12:00
@hongliangl hongliangl marked this pull request as ready for review April 16, 2023 12:00
@hongliangl hongliangl added the area/proxy and action/release-note labels Apr 16, 2023
@vicky-liu vicky-liu added this to the Antrea v1.12 release milestone May 4, 2023
@jianjuns
Contributor

jianjuns commented May 5, 2023

@hongliangl : the commit is "Add support for ExternalIP in AntreaProxy"? Have you pushed the right commit?

@hongliangl
Contributor Author

> @hongliangl : the commit is "Add support for ExternalIP in AntreaProxy"? Have you pushed the right commit?

I pushed the wrong commit. I've just updated it.

pkg/agent/openflow/client.go (outdated, resolved)
pkg/agent/openflow/client.go (outdated, resolved)
pkg/agent/openflow/pipeline.go (outdated, resolved)
pkg/agent/openflow/pipeline.go (outdated, resolved)
@hongliangl hongliangl force-pushed the 20230328-short-circuit branch 2 times, most recently from 1326571 to 0f3fc2b May 6, 2023 02:07
@hongliangl hongliangl requested review from jianjuns and tnqn May 6, 2023 02:08
@luolanzone
Contributor

@hongliangl I would suggest removing "See kubernetes/kubernetes#108526 for more information." from the commit message. Every force-pushed commit will show up in that issue's history.

@hongliangl hongliangl force-pushed the 20230328-short-circuit branch 4 times, most recently from 194f5db to 133d264 May 11, 2023 08:18
@tnqn
Member

tnqn commented May 11, 2023

@hongliangl
Contributor Author

> @hongliangl I think NodePort should respect ExternalTrafficPolicy according to https://github.com/kubernetes/kubernetes/blob/122a459dcbf7b2317ac5bd3793ed94a400bd4a77/staging/src/k8s.io/api/core/v1/types.go#L4746-L4749

Since the behavior is defined in the API comment, we should implement the proxy as it describes.

pkg/agent/openflow/client.go (outdated, resolved)
@hongliangl hongliangl force-pushed the 20230328-short-circuit branch 2 times, most recently from cff9001 to 3f0a45c May 12, 2023 03:12
@tnqn
Member

tnqn commented May 16, 2023

The dependency has been merged, could you rebase?

@hongliangl
Contributor Author

> The dependency has been merged, could you rebase?

Sure, I'll rebase it right now.

Member

@tnqn tnqn left a comment

LGTM overall

pkg/agent/openflow/pipeline.go (resolved)
pkg/agent/proxy/proxier.go (outdated, resolved)
pkg/agent/proxy/proxier_test.go (outdated, resolved)
pkg/agent/proxy/proxier_test.go (outdated, resolved)
pkg/agent/proxy/proxier_test.go (outdated, resolved)
pkg/agent/proxy/proxier_test.go (outdated, resolved)
pkg/agent/proxy/proxier_test.go (outdated, resolved)
pkg/agent/proxy/proxier_test.go (outdated, resolved)
@hongliangl hongliangl force-pushed the 20230328-short-circuit branch 2 times, most recently from 3184f14 to 6eaf6b0 May 16, 2023 12:23
@hongliangl hongliangl requested a review from tnqn May 16, 2023 12:26

clientIP, err := probeClientIPFromPod(data, pod, busyboxContainerName, url)
require.NoError(t, err, errMsg)
require.Equal(t, clientIP, expectedClientIPs[idx])
Member

Is the clientIP expected to change?

Contributor Author

For a non-host-network Endpoint, the clientIP should always be the same (the source Pod IP). For a host-network Endpoint, if the Endpoint is on the local Node, the clientIP should be the source Pod IP; if the Endpoint is on a remote Node, the clientIP will be the transparent interface IP. In this test, we have two host-network Endpoints, and both of them can be selected for a Pod client even when ExternalTrafficPolicy is Local.

Member

What's the transparent interface IP?

Contributor Author

It should be the Node IP. For example, if the Endpoint is a host-network Endpoint on another Node, the packet is first forwarded to the local Node's host network and is then SNATed before being sent to the remote Node. The SNATed source IP should be the local Node IP.
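
The three cases discussed in this thread can be summarized in a small Go sketch. It is a hedged illustration only: `clientIPSeenByEndpoint` and its parameters are hypothetical names, the addresses are example values, and the real behavior comes from the datapath (OpenFlow flows and host-network SNAT), not from a function like this.

```go
package main

import "fmt"

// clientIPSeenByEndpoint returns the source address an Endpoint observes for a
// Pod -> Service connection, following the cases described in this thread.
// The function and its parameters are illustrative assumptions.
func clientIPSeenByEndpoint(podIP, localNodeIP string, hostNetwork, endpointOnLocalNode bool) string {
	if hostNetwork && !endpointOnLocalNode {
		// Remote host-network Endpoint: the packet goes through the local
		// Node's host network and is SNATed to the local Node IP.
		return localNodeIP
	}
	// Pod-network Endpoint (local or remote), or host-network Endpoint on the
	// same Node: the source Pod IP is preserved.
	return podIP
}

func main() {
	pod, node := "10.10.0.10", "192.168.77.100"
	fmt.Println(clientIPSeenByEndpoint(pod, node, false, false)) // 10.10.0.10
	fmt.Println(clientIPSeenByEndpoint(pod, node, true, true))   // 10.10.0.10
	fmt.Println(clientIPSeenByEndpoint(pod, node, true, false))  // 192.168.77.100
}
```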

Member

OK, it seems the SNAT is not necessary, but I understand it's not related to this PR.

Contributor Author

For example, assume that there are two Nodes:
Node A: 192.168.77.100, Pod CIDR 10.10.0.0/24
Node B: 192.168.77.101, Pod CIDR 10.10.1.0/24
Service: 10.96.0.9:8080 with two Endpoints, 192.168.77.100:8080 and 192.168.77.101:8080
Pod on Node A: 10.10.0.10

Previously, if externalTrafficPolicy was Local, only Endpoint 192.168.77.100:8080 would be selected for Pod -> Service traffic, and the clientIP was 10.10.0.10. Now, Endpoint 192.168.77.101:8080 can also be selected. In that case:

  • on the Pod network: 10.10.0.10 -> 10.96.0.9
  • on the host network: 10.10.0.10 -> 192.168.77.101
  • on the network between Nodes: 192.168.77.100 -> 192.168.77.101 (SNATed so that reply packets can be returned correctly), so the clientIP is 192.168.77.100.

As a result, the clientIP is not deterministic: it could be 10.10.0.10 or 192.168.77.100.
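
The address rewriting in the example above can be traced with a short Go sketch. It is a minimal illustration under the assumptions of this comment (host-network Endpoints whose IPs are Node IPs); the `Hop` type and `tracePath` function are hypothetical names, not part of Antrea.

```go
package main

import "fmt"

// Hop records the source and destination address of the packet on one segment.
type Hop struct {
	Segment string
	Src     string
	Dst     string
}

// tracePath lists the packet addresses for a Pod -> Service connection when the
// selected Endpoint is a host-network Endpoint, so endpointIP is a Node IP.
func tracePath(podIP, serviceIP, localNodeIP, endpointIP string) []Hop {
	hops := []Hop{
		// Before load balancing: the Pod addresses the Service IP.
		{Segment: "Pod network", Src: podIP, Dst: serviceIP},
		// After DNAT to the selected Endpoint.
		{Segment: "host network", Src: podIP, Dst: endpointIP},
	}
	if endpointIP != localNodeIP {
		// Remote Endpoint: SNAT to the local Node IP so replies can return.
		hops = append(hops, Hop{Segment: "between Nodes", Src: localNodeIP, Dst: endpointIP})
	}
	return hops
}

func main() {
	for _, ep := range []string{"192.168.77.100", "192.168.77.101"} {
		hops := tracePath("10.10.0.10", "10.96.0.9", "192.168.77.100", ep)
		last := hops[len(hops)-1]
		fmt.Printf("Endpoint %s sees clientIP %s\n", ep, last.Src)
	}
}
```

The source address of the last hop is what the Endpoint reports as clientIP, which is why a test that expects one fixed value per request can fail once both Endpoints are eligible.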

tnqn previously approved these changes May 17, 2023
Member

@tnqn tnqn left a comment

LGTM

@tnqn
Member

tnqn commented May 17, 2023

/test-all

Short-circuiting is used to ensure that the traffic from Pod/Node clients to
external addresses behaves the same way as the traffic from external clients to
external addresses.

External clients do not need to consider which Nodes have local Endpoints, as the
load balancer handles this for them. However, for Pod/Node clients, when the
externalTrafficPolicy of the Service is set to "Local", it will not work on Nodes
without an Endpoint. With this PR, even when the externalTrafficPolicy is set
to "Local", Pod/Node clients without local Endpoints can still work by selecting
Endpoints from the cluster.

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
@hongliangl
Contributor Author

> LGTM

@tnqn I only renamed a variable; there are no other changes.

@hongliangl
Contributor Author

/test-all

@tnqn
Member

tnqn commented May 18, 2023

/test-windows-proxyall-e2e
/test-windows-conformance
/test-windows-e2e

@tnqn
Member

tnqn commented May 18, 2023

/test-windows-conformance

@tnqn tnqn merged commit 46d2a3c into antrea-io:main May 18, 2023
@hongliangl hongliangl deleted the 20230328-short-circuit branch May 18, 2023 12:33
@hongliangl
Contributor Author

Thanks for merging this PR.

ceclinux pushed a commit to ceclinux/antrea that referenced this pull request Jun 5, 2023
Labels
action/release-note: Indicates a PR that should be included in release notes.
area/proxy: Issues or PRs related to proxy functions in Antrea
5 participants