[MD]Research on opensearch_service.ts and client management to determine the connection strategy to support multiple datasource #1721

seraphjiang · 2022-06-12T03:44:34Z

@zhongnan to fill in detail

zhongnansu · 2022-06-15T08:24:19Z

The plain fact is that we need to create multiple clients, each holding its own connections, in order to talk to multiple OpenSearch clusters. The open questions are:

1. Where to initialize the client?

2. How to manage clients in an efficient way? (Out of scope for this thread)

While my previous PoC #1499 doesn't come up with a clear solution for question 2, it does provide some insight into question 1.

Let's first take a look at how the default OpenSearch search strategy retrieves a client to talk to OpenSearch. The default search strategy calls the core API, which creates a "child" client for each request. Since a client created with .child() shares its parent's connection pool, this is still efficient.

OpenSearch-Dashboards/src/plugins/data/server/search/opensearch_search/opensearch_search_strategy.ts

Line 75 in beb46f5

context.core.opensearch.client.asCurrentUser.search(params),

OpenSearch-Dashboards/src/core/server/opensearch/client/scoped_cluster_client.ts

Lines 55 to 60 in 89d3872

    
    export class ScopedClusterClient implements IScopedClusterClient {
      constructor(
        public readonly asInternalUser: OpenSearchClient,
        public readonly asCurrentUser: OpenSearchClient
      ) {}
    }

OpenSearch-Dashboards/src/core/server/opensearch/client/cluster_client.ts

Lines 91 to 97 in 89d3872

    
    asScoped(request: ScopeableRequest) {
      const scopedHeaders = this.getScopedHeaders(request);
      const scopedClient = this.rootScopedClient.child({
        headers: scopedHeaders,
      });
      return new ScopedClusterClient(this.asInternalUser, scopedClient);
    }
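
As a rough illustration of why per-request child clients stay cheap, here is a toy model in TypeScript. `ToyClient` and `ConnectionPool` are hypothetical stand-ins written for this sketch, not the real OpenSearch client classes; the only behavior modeled is that a child reuses its parent's pool and overrides just the headers.

```typescript
// Toy model only: these classes are NOT the real OpenSearch client.
// They illustrate why child clients are cheap: the connection pool is shared.

class ConnectionPool {
  // Counts how many pools were ever created, to make sharing observable.
  static created = 0;

  constructor(public readonly node: string) {
    ConnectionPool.created += 1;
  }
}

type Headers = Record<string, string>;

class ToyClient {
  private constructor(
    private readonly pool: ConnectionPool,
    public readonly headers: Headers
  ) {}

  // A root client owns a fresh connection pool.
  static create(node: string, headers: Headers = {}): ToyClient {
    return new ToyClient(new ConnectionPool(node), headers);
  }

  // A child reuses the parent's pool and only overrides per-request headers.
  child(options: { headers: Headers }): ToyClient {
    return new ToyClient(this.pool, { ...this.headers, ...options.headers });
  }

  sharesPoolWith(other: ToyClient): boolean {
    return this.pool === other.pool;
  }
}

// One root client per cluster, one child per incoming request.
const root = ToyClient.create('https://localhost:9200');
const requestA = root.child({ headers: { authorization: 'Basic userA' } });
const requestB = root.child({ headers: { authorization: 'Basic userB' } });
```

In this model, scoping a request costs one small object rather than a new set of connections, which is the property the default search strategy relies on.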

To support multiple data sources, we also need a way to create and retrieve multiple clients. There are two options:

1. Initialize the client in the data plugin -> `search_strategy`, the same as the PoC code.

OpenSearch-Dashboards/src/plugins/data/server/search/opensearch_search/ext_opensearch_search_strategy.ts

Lines 87 to 93 in beb46f5

    
    const client = new Client({
      node: url,
      auth: {
        username,
        password,
      }
    });

2. Initialize the client in core, and expose core APIs for modules to retrieve clients, similar to what I did in the PoC zengyan-amazon#2.

The second approach is preferred for the following reasons:

- It follows the same paradigm as the default search strategy: the client is accessible from core through something like context.core.opensearch.dataSourceClient.asDataSourceUser.search(param).
- When other internal and external plugins want to use the multiple data source feature, they get a much cleaner interface by just calling context.core.dataSourceClient. As an example, this is how an external OSD plugin uses the default OpenSearch client: https://github.com/opensearch-project/dashboards-reports/blob/73331021e8e496f82fa80c15d7258ae18fba319d/dashboards-reports/server/routes/reportDefinition.ts#L53
- Decoupling it from the data plugin gives us more flexibility when implementing client features such as client pooling, or when onboarding other types of data sources that require different types of clients, because search (data plugin) and client management (OSD core) are logically separated.

Furthermore, a bit about implementation. We can either:

- Wire it into the existing core -> opensearch_service, similar to scoped client creation, or
- Create a new service called datasource_service in core.

Which one is better is an implementation-level detail that we can decide later.
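
To make the "initialize in core, retrieve from modules" idea more concrete, here is a minimal sketch of what a core-owned client lookup could look like. Every name in it (`DataSourceClientRegistry`, `DataSourceConfig`, `getClient`, the fake factory) is hypothetical and does not exist in OpenSearch Dashboards; caching one client per data source is just one possible choice, since client management itself is out of scope for this thread.

```typescript
// Hypothetical sketch: none of these names exist in OpenSearch Dashboards core.

interface DataSourceConfig {
  id: string;
  endpoint: string;
  username: string;
  password: string;
}

// Stand-in for a real OpenSearch client.
interface DataSourceClient {
  readonly endpoint: string;
  search(params: { index: string }): Promise<string>;
}

type ClientFactory = (config: DataSourceConfig) => DataSourceClient;

class DataSourceClientRegistry {
  private readonly clients = new Map<string, DataSourceClient>();

  constructor(
    private readonly configs: Map<string, DataSourceConfig>,
    private readonly createClient: ClientFactory
  ) {}

  // Returns the cached client for a data source, creating it on first use.
  getClient(dataSourceId: string): DataSourceClient {
    const cached = this.clients.get(dataSourceId);
    if (cached) return cached;

    const config = this.configs.get(dataSourceId);
    if (!config) throw new Error(`Unknown data source: ${dataSourceId}`);

    const client = this.createClient(config);
    this.clients.set(dataSourceId, client);
    return client;
  }
}

// A fake factory so the sketch runs without a cluster; it counts creations.
let clientsCreated = 0;
const fakeFactory: ClientFactory = (config) => {
  clientsCreated += 1;
  return {
    endpoint: config.endpoint,
    search: async ({ index }) => `searched ${index} on ${config.endpoint}`,
  };
};

const registry = new DataSourceClientRegistry(
  new Map([
    ['ds-1', { id: 'ds-1', endpoint: 'https://cluster-a:9200', username: 'u', password: 'p' }],
  ]),
  fakeFactory
);
```

With something like this living in core, the data plugin and any other plugin would ask for a client by data source id instead of constructing one with `new Client(...)` inside a search strategy.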

seraphjiang · 2022-06-16T03:24:32Z

Thanks @zhongnansu for putting the information together.

I like the preferred solution of initializing the client in a core API, which other plugins could use without changing too much code.

Here are my thoughts on the two implementation options.

Wiring it into the existing opensearch_service makes sense to me if we only want to support OpenSearch, in the short term as well as the long term. It doesn't seem like too much work for plugin developers to follow the convention and make their plugins compatible with multiple data sources. The API signature is clear:

context.core.opensearch.dataSourceClient.asDataSourceUser.search(param)

Creating a new datasource_service seems to allow supporting data source types other than OpenSearch in the future. However, the API signature is not as clear and straightforward to me:

context.core.datasource_service.dataSourceClient.asDataSourceUser.search(param)
or
context.core.datasource_service.opensearch.dataSourceClient.asDataSourceUser.search(param)
or
context.core.datasource_service.mysql.dataSourceClient.asDataSourceUser.search(param)
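
The difference between the candidate signatures can be sketched with types. The interfaces and stub clients below are hypothetical and exist only to compare the two shapes; neither shape was finalized in this thread.

```typescript
// Hypothetical types for illustration; neither shape was finalized in this thread.
interface SearchClient {
  search(params: { index: string }): Promise<{ handledBy: string }>;
}

interface DataSourceClient {
  asDataSourceUser: SearchClient;
}

// Shape A: wired into the existing opensearch service.
interface ContextA {
  core: { opensearch: { dataSourceClient: DataSourceClient } };
}

// Shape B: a separate service, keyed by data source type.
interface ContextB {
  core: {
    datasource_service: {
      opensearch: { dataSourceClient: DataSourceClient };
      mysql: { dataSourceClient: DataSourceClient };
    };
  };
}

// Records which stub handled each call so the routing is observable.
const calls: string[] = [];
const stubClient = (label: string): DataSourceClient => ({
  asDataSourceUser: {
    search: async (params) => {
      calls.push(`${label}:${params.index}`);
      return { handledBy: label };
    },
  },
});

const contextA: ContextA = {
  core: { opensearch: { dataSourceClient: stubClient('opensearch') } },
};

const contextB: ContextB = {
  core: {
    datasource_service: {
      opensearch: { dataSourceClient: stubClient('opensearch') },
      mysql: { dataSourceClient: stubClient('mysql') },
    },
  },
};

// A plugin author would write one of these, depending on the shape chosen.
void contextA.core.opensearch.dataSourceClient.asDataSourceUser.search({ index: 'logs' });
void contextB.core.datasource_service.mysql.dataSourceClient.asDataSourceUser.search({ index: 'orders' });
```

Shape A keeps the call path short but ties the API to OpenSearch; shape B adds a level of nesting in exchange for room to add other data source types.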

@zengyan-amazon any comments?

> Furthermore, a bit about implementation. We can
>
> Wire it in the existing core -> opensearch_service, similar as scoped client creation
>
> Or we can just create a new service called datasource_service in core.

@zhongnansu other than the open question above, are we clear to close this research task with a conclusion and move to the design and implementation phase?

seraphjiang assigned zhongnansu Jun 12, 2022

seraphjiang added the v3.0.0 label Jun 12, 2022

seraphjiang added the dashboards anywhere and multiple datasource labels Jun 12, 2022

seraphjiang changed the title ~~Research on opensearch_service.ts and client management to determine the connection strategy to support multiple datasource~~ [MD]Research on opensearch_service.ts and client management to determine the connection strategy to support multiple datasource Jun 13, 2022

zhongnansu closed this as completed Jul 27, 2022

zhongnansu mentioned this issue Jul 28, 2022

[MD] Client Management Epic #2003

Closed

26 tasks
