feat(health): Add custom health check for KServe Inference Service resources #14177
Signed-off-by: rachitchauhan43 <rachitchauhan43@gmail.com>
77826fb to 0678c46
end
end
end
if status_true == 3 and status_false == 0 and status_unknown == 0 then
Is status_true == 3 necessary and sufficient? I see there are four condition types which may increment status_true: IngressReady, PredictorConfigurationReady, PredictorReady, PredictorRouteReady. Do we need all 5, or is some subset sufficient?
We need all 5 to be true. Added that.
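The counting rule discussed in this thread can be sketched as follows. This is a minimal illustration, not the merged health.lua: the helper name `classify` and the `REQUIRED` table are hypothetical, and it assumes the fifth condition type is `Ready` (it appears in the expected test messages) alongside the four named in the review. Mapping any False condition to Degraded and anything else that is not fully ready to Progressing is likewise an assumption, chosen to match the test cases in this PR.

```lua
-- Hypothetical helper: classify an InferenceService from status.conditions.
-- REQUIRED holds the four types named in the review plus "Ready"
-- (assumed to be the fifth type the author refers to).
local REQUIRED = {
  IngressReady = true,
  PredictorConfigurationReady = true,
  PredictorReady = true,
  PredictorRouteReady = true,
  Ready = true,
}
local REQUIRED_COUNT = 5

local function classify(conditions)
  local status_true, status_false, status_unknown = 0, 0, 0
  for _, condition in ipairs(conditions or {}) do
    if condition.status == "True" and REQUIRED[condition.type] then
      status_true = status_true + 1
    elseif condition.status == "False" then
      status_false = status_false + 1
    elseif condition.status == "Unknown" then
      status_unknown = status_unknown + 1
    end
  end

  -- Healthy only when every required condition is True and nothing else
  -- reports False or Unknown.
  if status_true == REQUIRED_COUNT and status_false == 0 and status_unknown == 0 then
    return "Healthy"
  elseif status_false > 0 then
    return "Degraded"
  end
  -- Unknown conditions, or conditions not yet reported, count as in progress.
  return "Progressing"
end
```

Checking the count against a named set, rather than a bare number, makes it harder for the threshold and the list of condition types to drift apart, which is the concern raised in the review.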
resource_customizations/serving.kserve.io/InferenceService/health.lua
…lth.lua Co-authored-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com> Signed-off-by: Rachit Chauhan <rachitchauhan43@gmail.com>
Codecov Report
Patch and project coverage have no change.

@@           Coverage Diff           @@
##           master   #14177   +/-   ##
=======================================
  Coverage   49.62%   49.62%
=======================================
  Files         256      256
  Lines       43800    43800
=======================================
  Hits        21736    21736
  Misses      19932    19932
  Partials     2132     2132
#- healthStatus:
#    status: Progressing
#    message: "PredictorConfigurationReady is Unknown\nPredictorReady is Unknown, since RevisionMissing. Configuration \"hello-world-predictor-default\" is waiting for a Revision to become ready.\nPredictorRouteReady is Unknown, since RevisionMissing. Configuration \"hello-world-predictor-default\" is waiting for a Revision to become ready.\nReady is Unknown, since RevisionMissing. Configuration \"hello-world-predictor-default\" is waiting for a Revision to become ready.\n"
#  inputPath: testdata/progressing.yaml
#- healthStatus:
#    status: Degraded
#    message: "IngressReady is False, since Predictor ingress not created.\nPredictorConfigurationReady is False, since RevisionFailed. Revision \"helloworld-00002\" failed with message: Container failed with: container exited with no error.\nPredictorReady is False, since RevisionFailed. Revision \"helloworld-00002\" failed with message: Container failed with: container exited with no error.\nReady is False, since Predictor ingress not created.\n"
#  inputPath: testdata/degraded.yaml
Wanna uncomment these and see if we get past tests?
Yes. Just testing one by one. Once this healthy one passes I will uncomment others too.
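For reference, the expected `message` strings in these test cases follow a regular pattern: one line per condition that is not True, in the form `<type> is <status>`, optionally followed by `, since <reason>.` and the condition's message. A minimal sketch of that formatting is below; the function name `describe_conditions` is hypothetical, and the code is inferred from the expected strings rather than copied from health.lua.

```lua
-- Hypothetical helper: build the health message from conditions that are
-- False or Unknown, in a format consistent with the expected test messages.
local function describe_conditions(conditions)
  local msg = ""
  for _, condition in ipairs(conditions or {}) do
    if condition.status == "False" or condition.status == "Unknown" then
      msg = msg .. condition.type .. " is " .. condition.status
      -- reason and message are optional fields on a Kubernetes condition.
      if condition.reason ~= nil and condition.reason ~= "" then
        msg = msg .. ", since " .. condition.reason .. "."
      end
      if condition.message ~= nil and condition.message ~= "" then
        msg = msg .. " " .. condition.message
      end
      msg = msg .. "\n"
    end
  end
  return msg
end
```

Conditions whose status is True contribute nothing, so a fully healthy resource yields an empty string here.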
Thanks, @rachitchauhan43!
…sources (argoproj#14177)

* Initial commit for ISVC health check
  Signed-off-by: rachitchauhan43 <rachitchauhan43@gmail.com>
* Adding test for health check and incorporating review comment.
  Signed-off-by: rachitchauhan43 <rachitchauhan43@gmail.com>
* Adding test for degraded state
  Signed-off-by: rachitchauhan43 <rachitchauhan43@gmail.com>
* Testing only healthy scenario
  Signed-off-by: rachitchauhan43 <rachitchauhan43@gmail.com>
* Update resource_customizations/serving.kserve.io/InferenceService/health.lua
  Co-authored-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
  Signed-off-by: Rachit Chauhan <rachitchauhan43@gmail.com>
* Uncommenting rest fo the tests
  Signed-off-by: rachitchauhan43 <rachitchauhan43@gmail.com>

---------
Signed-off-by: rachitchauhan43 <rachitchauhan43@gmail.com>
Signed-off-by: Rachit Chauhan <rachitchauhan43@gmail.com>
Co-authored-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
Context
This PR adds a custom health check for KServe InferenceService resources.