Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding MapperServiceProvider for plugins #16110

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dzane17
Copy link
Contributor

@dzane17 dzane17 commented Sep 27, 2024

Description

This PR introduces a new functional interface, MapperServiceProvider, to support the Query Insights plugin in accessing field types for search requests. The IndicesService holds the index mappings for each index, but rather than exposing the entire service to plugins, this interface allows them to retrieve the MapperService for specific indices through the getMapperService(Index index) method.

Related Issues

RFC opensearch-project/query-insights#69
PR opensearch-project/query-insights#130

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@dzane17 dzane17 added the backport 2.x Backport to 2.x branch label Sep 27, 2024
Copy link
Contributor

✅ Gradle check result for 9a5dac6: SUCCESS

Copy link

codecov bot commented Sep 27, 2024

Codecov Report

Attention: Patch coverage is 44.44444% with 10 lines in your changes missing coverage. Please review.

Project coverage is 71.92%. Comparing base (8ddb3ee) to head (7c9a8e2).
Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
server/src/main/java/org/opensearch/node/Node.java 50.00% 5 Missing ⚠️
...in/java/org/opensearch/indices/IndicesService.java 0.00% 4 Missing ⚠️
...ava/org/opensearch/plugins/MappingAwarePlugin.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #16110      +/-   ##
============================================
+ Coverage     71.86%   71.92%   +0.05%     
- Complexity    64447    64485      +38     
============================================
  Files          5288     5289       +1     
  Lines        301438   301455      +17     
  Branches      43552    43553       +1     
============================================
+ Hits         216628   216818     +190     
+ Misses        67034    66880     -154     
+ Partials      17776    17757      -19     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

✅ Gradle check result for 3ce3c96: SUCCESS

@reta
Copy link
Collaborator

reta commented Sep 28, 2024

@dzane17 could you please provide more context behind this change: why do you need to expose the IndicesService to plugins? thank you

@reta
Copy link
Collaborator

reta commented Sep 30, 2024

@reta thanks for the feedback, I added to the PR description

Thanks @dzane17 , it still does not answer the question (but leads to more questions):

  • why you are trying to export the IndicesService but referring to MapperService?
  • in which context you need access to MapperService?

Could you please share the reference to the pull request or associated issue so it would be possible to understand the problem you are solving that asks for such solution.

Copy link
Contributor

❌ Gradle check result for de087da: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Collaborator

@jainankitk jainankitk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the PR title to reflect the latest changes?

@dzane17 dzane17 changed the title Add IndicesAwarePlugin class Adding MapperServiceProvider for plugins Sep 30, 2024
@dzane17 dzane17 force-pushed the indices-service branch 2 times, most recently from be47a7c to 7c9a8e2 Compare September 30, 2024 22:30
@dzane17
Copy link
Contributor Author

dzane17 commented Sep 30, 2024

@reta

  1. MapperService is per index so instead of passing multiple instances, I originally passed the entire IndicesService which contains MapperService for each index. However with @jainankitk's suggestion of MapperServiceProvider, we are now exposing a single method instead of IndicesService.
  2. Field type is needed for query shape in Query Insights. Query shape generation occurs after shard search, on the coordinator node. See onRequestEnd() here.

I have linked the related RFC and PR above.

server/src/main/java/org/opensearch/node/Node.java Outdated Show resolved Hide resolved
if (indexService == null) {
return null;
} else {
return indexService.mapperService();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to include small test for this new method? Also, for the change from index as string to index as object

Copy link
Contributor

❕ Gradle check result for 7c9a8e2: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.backwards.MixedClusterClientYamlTestSuiteIT.test {p0=cluster.health/10_basic/cluster health with closed index}

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Signed-off-by: David Zane <davizane@amazon.com>
Copy link
Contributor

github-actions bot commented Oct 1, 2024

❌ Gradle check result for 51f28cd: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

*/
@FunctionalInterface
@PublicApi(since = "2.18.0")
public interface MapperServiceProvider {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this is the best way to do it, that too exposing this inside a plugin package. It basically tells anyone to implement a MapperServiceProvider implementation, but underneath it returns a concrete MapperService class
If the only intention is to expose this MapperService somehow, one way would be to pass it inside MapperAwarePlugin via param Function<Index, MapperService> .

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats a fair point @sgup432. I was confused whether to directly use function or create interface. I am fine with either since single method interface is technically a functional specification. @reta - do you have any thoughts here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jainankitk I think we should not pursue this route, see please #16110 (comment)

Comment on lines +59 to +69
Client client,
ClusterService clusterService,
ThreadPool threadPool,
ResourceWatcherService resourceWatcherService,
ScriptService scriptService,
NamedXContentRegistry xContentRegistry,
Environment environment,
NodeEnvironment nodeEnvironment,
NamedWriteableRegistry namedWriteableRegistry,
IndexNameExpressionResolver indexNameExpressionResolver,
Supplier<RepositoriesService> repositoriesServiceSupplier,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why we want to pass all these params. Any plugin already extends Plugin class which already provides all these classes.

* @opensearch.internal
*/
@PublicApi(since = "2.18.0")
public interface MappingAwarePlugin {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds more like MappingServiceAwarePlugin ?

NamedWriteableRegistry namedWriteableRegistry,
IndexNameExpressionResolver indexNameExpressionResolver,
Supplier<RepositoriesService> repositoriesServiceSupplier,
MapperServiceProvider mapperServiceProvider
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned this doesn't look right. One other way I think of doing this(ie exposing MappingService) is by defining an interface sitting inside core(and not under plugins package), and one of the implementation being MapperService. As then anyone trying to implement this plugin also have a ability to pass their own/other implementation of MapperService and use that instead.

@reta
Copy link
Collaborator

reta commented Oct 1, 2024

I have linked the related RFC and PR above.

Thanks @dzane17 , MapperService is contextual component, I believe what you need is to use getGuiceServiceClasses extension instead of createComponents:

    @Override
    public Collection<Class<? extends LifecycleComponent>> getGuiceServiceClasses() {
        return List.of(QueryInsightsService.class, QueryInsightsListener.class);
    }

In this case, you could inject IndicesService directly and create QueryInsightsService / QueryInsightsListener (when appropriate):

@Inject
    public QueryInsightsService(
        final ClusterSettings clusterSettings,
        final ThreadPool threadPool,
        final Client client,
        final MetricsRegistry metricsRegistry,
        final NamedXContentRegistry namedXContentRegistry,
        final IndicesService ndicesService
    ) {
  }

(same applies to QueryInsightsListener)

@jainankitk
Copy link
Collaborator

I have linked the related RFC and PR above.

Thanks @dzane17 , MapperService is contextual component, I believe what you need is to use getGuiceServiceClasses extension instead of createComponents:

    @Override
    public Collection<Class<? extends LifecycleComponent>> getGuiceServiceClasses() {
        return List.of(QueryInsightsService.class, QueryInsightsListener.class);
    }

In this case, you could inject IndicesService directly and create QueryInsightsService / QueryInsightsListener (when appropriate):

@Inject
    public QueryInsightsService(
        final ClusterSettings clusterSettings,
        final ThreadPool threadPool,
        final Client client,
        final MetricsRegistry metricsRegistry,
        final NamedXContentRegistry namedXContentRegistry,
        final IndicesService ndicesService
    ) {
  }

(same applies to QueryInsightsListener)

Thanks @reta for this suggestion. Definitely something we should consider. Although, QueryInsightsListener is added to the SearchOperationsListener list in the core when returned via createComponents. Wondering if the same behavior persists with injection using GuiceServiceClasses

@reta
Copy link
Collaborator

reta commented Oct 1, 2024

Wondering if the same behavior persists with injection using GuiceServiceClasses

It has to, if not - that would be a good fix to have, thanks @dzane17

@jainankitk
Copy link
Collaborator

Wondering if the same behavior persists with injection using GuiceServiceClasses

It has to, if not - that would be a good fix to have, thanks @dzane17

Okay, we will need to add pluginLifecycleComponents to the SearchRequestOperationsCompositeListenerFactory list.
While I don't mind making the below change, seems bit weird to me having Listener implement the LifecycleComponent. Also, this might have other complications like the pluginLifecycleComponents needing injector instance, which in turn requires instance of SearchRequestOperationsCompositeListenerFactory - b.bind(SearchRequestOperationsCompositeListenerFactory.class).toInstance(searchRequestOperationsCompositeListenerFactory);

// register all standard SearchRequestOperationsCompositeListenerFactory to the SearchRequestOperationsCompositeListenerFactory
            final SearchRequestOperationsCompositeListenerFactory searchRequestOperationsCompositeListenerFactory =
                new SearchRequestOperationsCompositeListenerFactory(
                    Stream.concat(
                        Stream.of(
                            searchRequestStats,
                            searchRequestSlowLog,
                            searchTaskRequestOperationsListener,
                            queryGroupRequestOperationListener
                        ),
                        pluginComponents.stream()
                            .filter(p -> p instanceof SearchRequestOperationsListener)
                            .map(p -> (SearchRequestOperationsListener) p)
                    ).toArray(SearchRequestOperationsListener[]::new),
                       pluginLifecycleComponents.stream()
                            .filter(p -> p instanceof SearchRequestOperationsListener)
                            .map(p -> (SearchRequestOperationsListener) p)
                    ).toArray(SearchRequestOperationsListener[]::new)
                );

@reta - Given the amount of changes in core should be pretty small (limited to Node), is it possible for you to take a quick stab, in case we are missing something?

@reta
Copy link
Collaborator

reta commented Oct 2, 2024

Okay, we will need to add pluginLifecycleComponents to the SearchRequestOperationsCompositeListenerFactory list.

@jainankitk why do you need that? The QueryInsightsListener could implement the LifecycleComponent and that should be sufficient

@jainankitk
Copy link
Collaborator

Okay, we will need to add pluginLifecycleComponents to the SearchRequestOperationsCompositeListenerFactory list.

@jainankitk why do you need that? The QueryInsightsListener could implement the LifecycleComponent and that should be sufficient

@reta - How else will the QueryInsightsListener get notified of the search events, if the QueryInsightsListener is not returned as part of pluginComponents anymore. Below change was added to include the QueryInsightsListener for search events:

pluginComponents.stream()
                            .filter(p -> p instanceof SearchRequestOperationsListener)
                            .map(p -> (SearchRequestOperationsListener) p)

@reta
Copy link
Collaborator

reta commented Oct 2, 2024

@reta - How else will the QueryInsightsListener get notified of the search events, if the QueryInsightsListener is not returned as part of pluginComponents anymore.

@jainankitk that is exactly the issue that @dzane17 brought up and we may need to fix: the instances of classes returned by getGuiceServiceClasses() could implement SearchRequestOperationsListener as well but this fact is completely ignored at the moment.

@dzane17
Copy link
Contributor Author

dzane17 commented Oct 2, 2024

@reta @jainankitk I originally intended to solve this by moving the creation of SearchRequestOperationsCompositeListenerFactory to later in node.java so that we can capture listeners from both pluginComponents and pluginLifecycleComponents.

            final SearchRequestOperationsCompositeListenerFactory searchRequestOperationsCompositeListenerFactory =
                new SearchRequestOperationsCompositeListenerFactory(
                    Stream.concat(
                        Stream.of(
                            searchRequestStats,
                            searchRequestSlowLog,
                            searchTaskRequestOperationsListener,
                            queryGroupRequestOperationListener
                        ),
                        Stream.concat(
                            pluginComponents.stream()
                                .filter(SearchRequestOperationsListener.class::isInstance)
                                .map(SearchRequestOperationsListener.class::cast),
                            pluginLifecycleComponents.stream()
                                .filter(SearchRequestOperationsListener.class::isInstance)
                                .map(SearchRequestOperationsListener.class::cast)
                        )
                    ).toArray(SearchRequestOperationsListener[]::new)
                );

However, pluginLifecycleComponents & getGuiceServiceClasses() are called after searchRequestOperationsCompositeListenerFactory (and all other injectables) are already bound to the instance here.

We face a paradox where getGuiceServiceClasses() relies on dependency injection, but we are now attempting to return an object to be bound (QueryInsightsListener) from getGuiceServiceClasses().

@reta
Copy link
Collaborator

reta commented Oct 2, 2024

We face a paradox where getGuiceServiceClasses() relies on dependency injection, but we are now attempting to return an object to be bound (QueryInsightsListener) from getGuiceServiceClasses().

Oh I see, thanks @dzane17 , that's unfortunate, let me think it through a bit

@cwperks
Copy link
Member

cwperks commented Oct 2, 2024

Okay, we will need to add pluginLifecycleComponents to the SearchRequestOperationsCompositeListenerFactory list.

@jainankitk why do you need that? The QueryInsightsListener could implement the LifecycleComponent and that should be sufficient

@reta - How else will the QueryInsightsListener get notified of the search events, if the QueryInsightsListener is not returned as part of pluginComponents anymore. Below change was added to include the QueryInsightsListener for search events:

pluginComponents.stream()
                            .filter(p -> p instanceof SearchRequestOperationsListener)
                            .map(p -> (SearchRequestOperationsListener) p)

@jainankitk @dzane17 Idk if this would be helpful as well, but Security creates a SearchOperationListener within onIndexModule: https://github.com/opensearch-project/security/blob/830b341453908332e4004d519ac2e0b099dca1f6/src/main/java/org/opensearch/security/OpenSearchSecurityPlugin.java#L754-L837

The use-case for Security is Fls/Dls/FieldMasking feature which cluster administrators use to hide sensitive data stored in the clusters to users. Security must intercept search operations to apply the rules.

See also my previous comment here

@reta
Copy link
Collaborator

reta commented Oct 2, 2024

The use-case for Security is Fls/Dls/FieldMasking feature which cluster administrators use to hide sensitive data stored in the clusters to users. Security must intercept search operations to apply the rules.

Thanks @cwperks , I think onIndexModule is quite a viable alternative as well

@jainankitk
Copy link
Collaborator

The use-case for Security is Fls/Dls/FieldMasking feature which cluster administrators use to hide sensitive data stored in the clusters to users. Security must intercept search operations to apply the rules.

Thanks @cwperks , I think onIndexModule is quite a viable alternative as well

@cwperks - Please correct me if I am wrong, but onIndexModule can only be leveraged for listening shard level search events, not request level. We need the search request listener for what we are trying to achieve here! cc: @reta

@cwperks
Copy link
Member

cwperks commented Oct 3, 2024

@cwperks - Please correct me if I am wrong, but onIndexModule can only be leveraged for listening shard level search events, not request level. We need the search request listener for what we are trying to achieve here! cc: @reta

Correct, its shard-level. To my knowledge, the only request-level extension point is the handlerWrapper, but that is reserved for the security plugin to perform auth.

My knowledge of how the internals of Search works is limited, but my understanding is that a SearchRequest lands on a coordinator node that then uses the routing table to figure out where to route the internal shard search requests to. It then sends out shard search requests to various nodes and collects the results before returning the results to the client.

Is it possible to distinguish the request that lands on the coordinator node vs. the shard requests?

iirc on the transport layer its the difference between indices:data/read/search vs indices:data/read/search[s]

Edit: All transport actions can be intercepted with ActionPlugin.getActionFilters() extension point, but there are challenges resolving a generic ActionRequest -> Indices being acted upon.

@reta
Copy link
Collaborator

reta commented Oct 3, 2024

@cwperks - Please correct me if I am wrong, but onIndexModule can only be leveraged for listening shard level search events, not request level. We need the search request listener for what we are trying to achieve here! cc: @reta

Got it, thanks @jainankitk , so here is the way we could make it work, which is a bit cumbersome but solves the dependency problem, the QueryInsightsService in injected as the service class:

    @Override
    public Collection<Class<? extends LifecycleComponent>> getGuiceServiceClasses() {
        return List.of(QueryInsightsService.class);
    }

Since QueryInsightsListener is created before QueryInsightsService (and it is injectable), we could inject QueryInsightsListener right there and initialize it with the indicesService or QueryInsightsService (fe using setter) since none of this services are needed during QueryInsightsListener initialization:

@Inject
    public QueryInsightsService(
        final ClusterSettings clusterSettings,
        final ThreadPool threadPool,
        final Client client,
        final MetricsRegistry metricsRegistry,
        final NamedXContentRegistry namedXContentRegistry,
        final IndicesService indicesService,
        final QueryInsightsListener listener
    ) {
  }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants