[HELM] Fix Kubernetes Routing Issue #13450

Open

wants to merge 3 commits into master

Conversation


@piby180 commented Jun 20, 2024

This PR aims to partially solve the issue mentioned here

With publishNotReadyAddresses set to true on the Service resource, Kubernetes publishes the DNS address regardless of whether the pod is in the Ready state. The responsibility for deciding when to start sending requests to a newly restarted server/broker then lies solely with Pinot.

The following errors will be fixed by this PR:

Unable to resolve host pinot-server-2.pinot-server-headless.pinot-dev.svc.cluster.local
or 
Unable to resolve host pinot-broker-2.pinot-server-headless.pinot-dev.svc.cluster.local
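
For reference, a minimal sketch of a headless server Service with this flag set (the name, selector, and port below are illustrative, not the chart's exact rendered output):

apiVersion: v1
kind: Service
metadata:
  name: pinot-server-headless
spec:
  clusterIP: None                   # headless: per-pod DNS records, no virtual IP
  publishNotReadyAddresses: true    # publish pod DNS records even before the pod is Ready
  selector:
    app: pinot-server               # illustrative selector
  ports:
    - name: netty
      port: 8098                    # illustrative server port

With clusterIP: None and publishNotReadyAddresses: true, per-pod names such as pinot-server-2.pinot-server-headless.<namespace>.svc.cluster.local resolve as soon as the pod has an IP, rather than only after its readiness probe passes.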

@abhioncbr
Collaborator

Does this mean that, with this change, pods will receive requests before they reach the Ready state?

@piby180
Author

piby180 commented Jun 20, 2024

  1. Currently, Pinot assumes the server is online as soon as all of its segments are marked online in ZooKeeper, and it immediately starts sending requests to the newly restarted server.
  2. Kubernetes, however, marks the server pod as "Ready" only after its first readiness probe succeeds.
  3. Since Pinot sends requests through the headless service, the DNS name of each server must resolve for those requests to succeed. So all queries made to Pinot between "the time the server is online per ZooKeeper" and "the time Kubernetes marks it Ready" fail with:
Unable to resolve host pinot-server-2.pinot-server-headless.pinot-dev.svc.cluster.local
or 
Unable to resolve host pinot-broker-2.pinot-server-headless.pinot-dev.svc.cluster.local
  4. This causes downtime in Pinot whenever a pod (server or broker) is restarted, lasting seconds or minutes depending on how the probes are configured.

With this change, the DNS name of each server always resolves regardless of whether the pod is in the "Ready" state. Pinot, however, only sends requests to the server once it is online according to ZooKeeper.

@abhioncbr
Collaborator


Thanks for the explanation. I just checked, and ZooKeeper does the same for its headless service.
Question: should this be added only to the headless services, or to all services?

@piby180
Author

piby180 commented Jun 20, 2024


Changing just the headless services is sufficient; this has been tested on our cluster.
The "normal" service endpoints are used by the ingress resource.

If you agree, I can remove this change from the "headful" (non-headless) services.

@abhioncbr
Collaborator


In my opinion, it should be only in headless svc, but I suggest we wait for others' input as well.

abhioncbr requested a review from xiangfu0 on June 21, 2024
Contributor

@zhtaoxiang left a comment


LGTM

Contributor

@soumitra-st left a comment


:shipit:

@@ -29,6 +29,7 @@ metadata:
{{- include "pinot.brokerLabels" . | nindent 4 }}
spec:
type: {{ .Values.broker.external.type }}
publishNotReadyAddresses: true
Contributor


let's make the default false for the external service here.

Author


done
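
For context, one way such a default-to-false toggle could look in the template is sketched below; the values key broker.external.publishNotReadyAddresses is a hypothetical name used only for illustration, not the chart's actual key:

spec:
  type: {{ .Values.broker.external.type }}
  {{- if .Values.broker.external.publishNotReadyAddresses }}
  publishNotReadyAddresses: true
  {{- end }}

If the value is false or the key is omitted, the field is simply not rendered, which matches the Kubernetes default of false.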

@xiangfu0
Contributor

xiangfu0 commented Aug 16, 2024

I think we should default this to true only for the headless services and the server service, and to false for all the others.

@xiangfu0
Contributor

xiangfu0 commented Aug 16, 2024

Pinot handles all the internal readiness checks for routing requests to servers, so it's OK to expose the server/minion IPs as soon as the pod is up, even if it's not yet Ready.
Controllers and brokers start up fairly fast, and you don't want them to receive requests until they are up, so it's fine to wait for the health check to finish before marking them online.
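
A rough sketch of how the corresponding values.yaml defaults might look under that policy; the key names and nesting below are assumptions for illustration, not the chart's actual structure:

# values.yaml (hypothetical keys, reflecting the suggested defaults)
server:
  service:
    publishNotReadyAddresses: true     # Pinot routes to servers based on ZooKeeper state
  headless:
    publishNotReadyAddresses: true     # per-pod DNS should resolve before the pod is Ready
broker:
  external:
    publishNotReadyAddresses: false    # brokers start fast; wait for the readiness probe
controller:
  external:
    publishNotReadyAddresses: false    # same reasoning as for brokers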

@@ -27,6 +27,7 @@ metadata:
{{- include "pinot.brokerLabels" . | nindent 4 }}
spec:
type: {{ .Values.broker.service.type }}
publishNotReadyAddresses: true
Contributor


default to false

Author


done

@@ -29,6 +29,7 @@ metadata:
{{- include "pinot.controllerLabels" . | nindent 4 }}
spec:
type: {{ .Values.controller.external.type }}
publishNotReadyAddresses: true
Contributor


default to false

Author


done

@@ -27,6 +27,7 @@ metadata:
{{- include "pinot.controllerLabels" . | nindent 4 }}
spec:
type: {{ .Values.controller.service.type }}
publishNotReadyAddresses: true
Contributor


default to false

Author


done

@codecov-commenter

codecov-commenter commented Aug 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 64.00%. Comparing base (59551e4) to head (398a40a).
Report is 1094 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #13450      +/-   ##
============================================
+ Coverage     61.75%   64.00%   +2.25%     
- Complexity      207     1539    +1332     
============================================
  Files          2436     2596     +160     
  Lines        133233   143295   +10062     
  Branches      20636    21948    +1312     
============================================
+ Hits          82274    91718    +9444     
+ Misses        44911    44833      -78     
- Partials       6048     6744     +696     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) ⬆️
integration 100.00% <ø> (+99.99%) ⬆️
integration1 100.00% <ø> (+99.99%) ⬆️
integration2 0.00% <ø> (ø)
java-11 63.95% <ø> (+2.24%) ⬆️
java-21 63.90% <ø> (+2.27%) ⬆️
skip-bytebuffers-false 63.97% <ø> (+2.22%) ⬆️
skip-bytebuffers-true 63.88% <ø> (+36.15%) ⬆️
temurin 64.00% <ø> (+2.25%) ⬆️
unittests 64.00% <ø> (+2.25%) ⬆️
unittests1 55.61% <ø> (+8.71%) ⬆️
unittests2 34.47% <ø> (+6.74%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.

@piby180
Author

piby180 commented Sep 26, 2024

@xiangfu0 Can we merge this?
