
[Obs AI Assistant] knowledge base integration tests #189000

Merged
merged 27 commits into elastic:main on Aug 5, 2024

Conversation

@neptunian (Contributor) commented Jul 23, 2024

Closes #188999

  • integration tests for the knowledge base API
  • adds a new config field, modelId, for internal use, to override the ELSER model id (a config sketch follows this list)
  • refactors knowledgeBaseService.setup() to fix a bug where, if the model failed to install when calling ml.putTrainedModel, we would get stuck polling and retrying the install. We assumed the first error thrown, when the model does not yet exist, would only happen once, after which we would return true or false and poll for whether installation had finished. But the installation itself could fail, causing getTrainedModelsStats to keep throwing while we kept trying to install the model. Now the user immediately gets an error if the model fails to install, and no polling happens.
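
For illustration, the internal model-id override could be wired into the API integration test config roughly as sketched below. The setting name (xpack.observabilityAIAssistant.modelId) and the tiny test model id are assumptions for the sketch, not values taken from this PR.

```ts
// Hypothetical FTR config sketch. The setting name and the tiny model id are
// assumptions, not copied from this PR.
import { FtrConfigProviderContext } from '@kbn/test';

export default async function ({ readConfigFile }: FtrConfigProviderContext) {
  const baseConfig = await readConfigFile(require.resolve('../common/config'));

  return {
    ...baseConfig.getAll(),
    kbnTestServer: {
      ...baseConfig.get('kbnTestServer'),
      serverArgs: [
        ...baseConfig.get('kbnTestServer.serverArgs'),
        // Point the knowledge base at a small test model instead of the real ELSER model.
        '--xpack.observabilityAIAssistant.modelId=pt_tiny_elser',
      ],
    },
  };
}
```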

@obltmachine

🤖 GitHub comments


Just comment with:

  • /oblt-deploy : Deploy a Kibana instance using the Observability test environments.
  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@neptunian (Contributor Author)

/ci

@neptunian (Contributor Author)

/ci

@neptunian neptunian marked this pull request as ready for review July 25, 2024 00:45
@neptunian neptunian requested review from a team as code owners July 25, 2024 00:45
@neptunian neptunian added the release_note:skip (Skip the PR/issue when compiling release notes) label Jul 25, 2024
@kibanamachine (Contributor)

Flaky Test Runner Stats

🎉 All tests passed! - kibana-flaky-test-suite-runner#6629

[✅] x-pack/test/observability_ai_assistant_api_integration/enterprise/config.ts: 25/25 tests passed.

see run history

@jgowdyelastic (Member) left a comment

Added a comment, but overall ML changes LGTM

if (!license.hasAtLeast('enterprise')) {
  return defaultModelId;
}
if (configModelId) {
Member

should we move this up before the license check?

Contributor Author

I put it below just to make sure someone without an enterprise license could not try to use it, but I'm not sure it matters. And because, at the moment, we are only testing the enterprise functionality, where using the assistant is allowed. Though I'm not sure why we need to call doInit and create the ES assets in the first place if the license is not enterprise.

Member

I think we await the license check here just to make sure it has completed. From the looks of it, we don't really care whether the license is valid - but then why even return a model id if it is not? So I think we should move the license check out of this code and throw an invalid license error further down. WDYT?

Contributor Author

It sounds like you and Søren had a similar conversation https://github.com/elastic/kibana/pull/181700/files#r1579341084

It seems like we were catching the invalid license error that ML was throwing, but we didn't want to log it. So we check whether the license is enterprise and just return the "default model id", even though it is of no use. I was also thinking we should not call doInit, or should stop getModelId from being called further up, if they don't have an enterprise license.

I moved the config check before the license check and added back a comment that had been removed, which I think was helpful.
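
For reference, the reordering described here might look roughly like the following; the function shape and helper names are assumptions for the sketch, not code copied from the PR.

```ts
// Sketch of the discussed ordering: the internal config override is checked
// before the license. Helper names and shapes are illustrative assumptions.
async function getModelId({
  configModelId,
  getLicense,
  resolveElserId,
  defaultModelId,
}: {
  configModelId?: string;
  getLicense: () => Promise<{ hasAtLeast(level: 'enterprise'): boolean }>;
  resolveElserId: () => Promise<string>; // e.g. asks ML for the recommended ELSER model id
  defaultModelId: string;
}): Promise<string> {
  // The internal override (the new modelId config field) wins outright,
  // so it is checked before the license.
  if (configModelId) {
    return configModelId;
  }

  // Await the license so we know the check has completed; without an
  // enterprise license, return the default id instead of logging an ML error.
  const license = await getLicense();
  if (!license.hasAtLeast('enterprise')) {
    return defaultModelId;
  }

  return resolveElserId();
}
```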

  await ml.testResources.cleanMLSavedObjects();
});

it('returns correct status after knowledge base is setup', async () => {
Member

should we test what happens before the model is imported?
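
For illustration, a pre-setup check in the FTR suite might look roughly like this; the route path and response shape are assumptions, not verified against this PR's tests.

```ts
// Hypothetical sketch: route path and response shape are assumptions.
// `supertest` and `expect` come from the surrounding FTR test context.
it('reports the knowledge base as not ready before the model is installed', async () => {
  const response = await supertest
    .get('/internal/observability_ai_assistant/kb/status')
    .set('kbn-xsrf', 'foo')
    .expect(200);

  expect(response.body.ready).to.be(false);
});
```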

@neptunian (Contributor Author) Jul 31, 2024

It seems, during setup, we assume that no installation error is going to happen. The call to create the model throws an error that isn't caught:

{
  name: 'ResponseError',
  message: 'action_request_validation_exception\n' +
    '\tRoot causes:\n' +
    '\t\taction_request_validation_exception: Validation Failed: 1: [model_type] must be set if [definition] is not defined.;2: [inference_config] must not be null.;'
}

This causes pRetry to call itself again and repeat the process until it finally fails after 12 attempts. We display a toast with a 500 internal server error:
(screenshot of the error toast omitted)

These validation errors should not happen, given that elser2 should already have these properties set somewhere, but I think we should check the call to create the model and stop if it fails for whatever reason. If I understand correctly, pRetry is mainly checking whether a model has finished installing, and we depend on errors from the getTrainedModels status call for that?

@dgieselaar (Member) Jul 31, 2024

What I think happens is (and maybe we can e.g. use named functions to clarify this):

  • (1) Check whether model is installed
    -- 1A. If this check throws an error, and it's a resource_not_found_exception or status_exception, install the model, and re-evaluate step (1). This is what happens on a clean slate.
    -- 1B. If this check does not throw, but returns false (the model is installing, but not fully defined), retry (1) in n seconds.
    -- 1C. if this check does not throw, and returns true, consider the model successfully installed and available for deployment, and continue to step (2)
  • (2) Deploy installed model.
    -- 2A. If this throws with a resource_not_found_exception or status_exception, catch the error and continue to (3).
    -- 2B. If any other error, fail and exit the process.
  • (3) Check if model has been successfully deployed
    -- 3A. If successfully deployed, exit process with a 200
    -- 3B. If not successfully deployed, re-evaluate (3) in n seconds

To answer your question, I think we don't really retry the installModel call - we retry the calls that determine whether the model is installed. This is a consequence of the fact that a model install can be in progress. One thing that might clear this up is to move the installModel call out of the pRetry. Perhaps it should be something like (a rough sketch follows the list):

  • installModelIfDoesNotExist()
  • pollForModelInstallCompleted()
  • deployModelIfNeeded()
  • pollForModelDeploymentCompleted()
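
A rough sketch of that four-phase shape, with the phase checks injected as callbacks so the example stays self-contained; only the step numbering comes from the comment above, while the bodies and retry counts are illustrative assumptions.

```ts
import pRetry from 'p-retry';

// Illustrative only: the phase checks are injected so the sketch is self-contained.
interface ModelSetupDeps {
  installModelIfDoesNotExist: () => Promise<void>;
  isModelInstalled: () => Promise<boolean>;
  deployModelIfNeeded: () => Promise<void>;
  isModelDeployed: () => Promise<boolean>;
}

async function setupModel(deps: ModelSetupDeps) {
  // (1)/1A: install once; any failure here propagates immediately instead of
  // being swallowed by the polling loop.
  await deps.installModelIfDoesNotExist();

  // 1B/1C: poll until the model is fully defined (pollForModelInstallCompleted).
  await pRetry(
    async () => {
      if (!(await deps.isModelInstalled())) {
        throw new Error('model is still installing');
      }
    },
    { retries: 12 }
  );

  // (2): start a deployment unless one is already running (deployModelIfNeeded).
  await deps.deployModelIfNeeded();

  // (3): poll until the deployment reports as started (pollForModelDeploymentCompleted).
  await pRetry(
    async () => {
      if (!(await deps.isModelDeployed())) {
        throw new Error('model deployment has not completed');
      }
    },
    { retries: 12 }
  );
}
```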

@neptunian (Contributor Author) Jul 31, 2024

For (1), that makes sense and aligns with the behaviour I'm seeing. The problem arises when the model cannot install: we get "stuck" in 1A and call installModel repeatedly. This happens when putTrainedModel() inside installModel throws, which means getTrainedModels() will continue to throw because the install never started.

Thanks, I'll see if I can clear it up and add error handling for when the model install can't start / throws.
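
One way that install-time error handling could be expressed is sketched below; the client shape and error inspection are illustrative assumptions, not the actual Kibana code.

```ts
// Illustrative: tolerate "already exists"-style errors so an in-progress or
// completed install is not treated as a failure, but surface anything else
// (e.g. a validation error from putTrainedModel) immediately.
async function installModelIfDoesNotExist({
  putTrainedModel,
  modelId,
}: {
  putTrainedModel: (modelId: string) => Promise<void>;
  modelId: string;
}) {
  try {
    await putTrainedModel(modelId);
  } catch (error) {
    const type = (error as { body?: { error?: { type?: string } } }).body?.error?.type;

    if (type !== 'resource_already_exists_exception') {
      // A genuine install failure: stop here instead of leaving the caller
      // polling for an install that never started.
      throw error;
    }
  }
}
```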

@neptunian (Contributor Author) Jul 31, 2024

@dgieselaar I updated (1), the model installation process. Let me know if that's easier to read.

@kibanamachine (Contributor)

Flaky Test Runner Stats

🎉 All tests passed! - kibana-flaky-test-suite-runner#6669

[✅] x-pack/test/observability_ai_assistant_api_integration/enterprise/config.ts: 25/25 tests passed.

see run history

@dgieselaar (Member) left a comment

Thanks a ton, this is great!

@neptunian (Contributor Author)

/ci

@kibana-ci (Collaborator) commented Aug 5, 2024

💚 Build Succeeded

  • Buildkite Build
  • Commit: 2471e7c
  • Kibana Serverless Image: docker.elastic.co/kibana-ci/kibana-serverless:pr-189000-2471e7c1f094

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@neptunian neptunian merged commit f18224c into elastic:main Aug 5, 2024
23 checks passed
@kibanamachine kibanamachine added the v8.16.0 and backport:skip (This commit does not require backporting) labels Aug 5, 2024
Labels
backport:skip (This commit does not require backporting), ci:project-deploy-observability (Create an Observability project), release_note:skip (Skip the PR/issue when compiling release notes), Team:Obs AI Assistant, v8.16.0
Development

Successfully merging this pull request may close these issues.

[Observability AI Assistant] integration tests for knowledge base api