[PROPOSAL] Client Support For Extensions #55

Xtansia · 2023-03-23T20:53:33Z

What/Why?

Prior Art

Extensions Proposal: OpenSearch Extensibility OpenSearch#2447
Clients Generation from API Specs: [PROPOSAL] Use API specification to generate clients #19
- Java client generation proposal: [PROPOSAL] Experiment with using Smithy API spec to generate missing APIs opensearch-java#284
- Generator prototypes:

What are you proposing?

Allowing the creation of thin-clients for extensions which will be composable with the core client in every supported programming language. Extensions will publish their REST interfaces in the shape of OpenAPI specs, a generator will then consume the spec and output a client that can be composed with the core client. Manual work and intervention will be minimized by automating as much of the process as possible, providing build-and-test tooling such as CI workflows, so that both OpenSearch project-owned extensions and externally developed extensions can benefit with uniform support.

What users have asked for this feature?

There have been many requests for complete plugin support in the clients. Because extensions are an evolution of plugins, and plugins will be migrated into extensions over time, these requests are a direct indicator of the need for extension support:

What problems are you trying to solve?

When a user wants to invoke an extension from a programming language of their choice, they currently lack an easy and reliable method to do so. At present, a user would need to invoke a “raw” HTTP client directly, implementing any additional authentication alongside their core OpenSearch client. In some cases a language's OpenSearch client exposes a “raw” request method, which alleviates some of the duplication. However, the current solutions lack any definition of what endpoints are available, their request shapes, strong typing of query parameters, or any documentation.
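To make that gap concrete, here is a minimal sketch contrasting a hand-assembled "raw" request with a typed one. Every name in it is an assumption made for illustration: `RawRequest`, `ListDetectorsRequest`, the `/_extensions/_example/detectors` path, and the `size` and `from` parameters are not taken from any existing client or extension.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical stand-in for what a "raw" request method accepts today:
// a verb and a hand-assembled path, with no checking of parameters.
final class RawRequest {
    final String method;
    final String pathAndQuery;

    RawRequest(String method, String pathAndQuery) {
        this.method = method;
        this.pathAndQuery = pathAndQuery;
    }
}

// Hypothetical typed request of the kind a generated thin-client could expose.
// The endpoint and its parameters are invented for illustration only.
final class ListDetectorsRequest {
    private final int size;
    private final int from;

    ListDetectorsRequest(int size, int from) {
        if (size < 0 || from < 0) {
            throw new IllegalArgumentException("size and from must be non-negative");
        }
        this.size = size;
        this.from = from;
    }

    // Lowers the typed request to the raw form in one well-defined place.
    RawRequest toRaw() {
        Map<String, String> query = new LinkedHashMap<>();
        query.put("size", Integer.toString(size));
        query.put("from", Integer.toString(from));
        StringJoiner joined = new StringJoiner("&", "?", "");
        for (Map.Entry<String, String> entry : query.entrySet()) {
            joined.add(entry.getKey() + "=" + entry.getValue());
        }
        return new RawRequest("GET", "/_extensions/_example/detectors" + joined);
    }
}
```

With the raw form, a typo in the path or a non-numeric `size` only surfaces at runtime as an HTTP error. With the typed form, the compiler and the constructor catch both before a request is sent.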

What is the extension owner experience going to be?

Extension owners will author an OpenAPI REST specification for their extension. A code generator will then consume it and output the complete compilable & runnable source code for a high level thin-client that is composable with the core client in every supported programming language. Automations such as CI workflows will be provided to streamline the process of generating and publishing a given thin-client, and there will be minimal ongoing maintenance overhead by the extension owner for the generated thin-client.

Example flow of adding a new API:

New endpoint is implemented in the extension
Endpoint definition is added to the extension’s OpenAPI specification
New extension version is tagged and released publishing the updated spec
Client generation workflows are automatically triggered consuming the spec
Clients are generated containing the new endpoint
Finally one of:
1. New client versions are released automatically
2. Workflows automatically create pull requests to include the new endpoint in the checked-in source of the clients, allowing manual review and management of how the clients are versioned if desired.
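The "spec in, client out" step of this flow can be sketched as a toy generator. This is only an illustration under stated assumptions: `EndpointSpec` is a hypothetical, heavily simplified stand-in for one operation in an OpenAPI document, `ThinClientGenerator` is not one of the actual generators, and the `Transport` type named in the emitted source stands in for whatever stable core interface the clients end up exposing.

```java
import java.util.List;

// Hypothetical, heavily simplified view of one operation in an OpenAPI spec.
final class EndpointSpec {
    final String operationId;
    final String httpMethod;
    final String path;

    EndpointSpec(String operationId, String httpMethod, String path) {
        this.operationId = operationId;
        this.httpMethod = httpMethod;
        this.path = path;
    }
}

// Toy generator: emits the source of a client class with one method per endpoint.
final class ThinClientGenerator {
    static String generate(String className, List<EndpointSpec> endpoints) {
        StringBuilder out = new StringBuilder();
        out.append("final class ").append(className).append(" {\n");
        out.append("    private final Transport transport;\n\n");
        out.append("    ").append(className)
           .append("(Transport transport) { this.transport = transport; }\n");
        for (EndpointSpec endpoint : endpoints) {
            out.append("\n    String ").append(endpoint.operationId).append("() {\n");
            out.append("        return transport.perform(\"").append(endpoint.httpMethod)
               .append("\", \"").append(endpoint.path).append("\");\n");
            out.append("    }\n");
        }
        return out.append("}\n").toString();
    }
}
```

Adding an entry to the endpoint list and re-running `generate` yields a client with the extra method, which is the property the automated workflows would rely on.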

Are there any security considerations?

At this time, there are no specific security considerations related to this proposal.

Are there any breaking changes to the API?

No, there are no breaking changes to the API, as this relates to an entirely new development.

What is the user experience going to be?

Users will have an easy and reliable way to invoke any extension from their preferred programming language. They will also have access to all new APIs or API updates immediately after an extension is released. Furthermore, because the clients are thin and per-extension, users will be able to pull in only the extensions they need.

Example flow:

User has installed the security extension into their OpenSearch cluster, and now wants to interact with it in an application they are building
The user adds a dependency on the extension’s thin-client to their application in the programming language of their choosing, which transitively pulls in a dependency on the core OpenSearch client & transport.
The user can immediately begin developing their application using the client to interact with the extension in a reliable and well-defined manner.

Are there any breaking changes to the User Experience?

Previously, a subset of plugin APIs was included directly in the language clients, with coverage differing between languages. As extensions will be published as separate packages rather than bundled into the core clients, this will be a breaking change for the relatively small set of plugins that were covered and will now be migrated to extensions. This is relatively minor, as coverage for plugin APIs was generally poor if not non-existent.

Why should it be built? Any reason not to?

Building this proposal brings value to the OpenSearch community by providing a high-standard solution that supports both first-party and third-party extensions uniformly, increasing the "feel good factor" for third-party developers. It also lowers the barrier to entry and increases velocity, scalability, and reliability with well-thought-out tooling and automation. Not building this proposal could limit the flexibility, ease of use, and integration of extensions for the end-users.

What will it take to execute?

The language client maintainers will need to:

Ensure the core clients provide a stable interface for interacting with the low-level REST transport layer, potentially as a separate library (JAR, RubyGem, NuPkg, Crate, etc.) which the core client also depends on. This will involve integration tests of the core interface to aid reliability guarantees.
Provide code generators that accept OpenAPI REST specifications and output client code targeting the aforementioned stable interface. This work is already underway as part of [PROPOSAL] Use API specification to generate clients #19.
Provide prebuilt automations, such as GitHub Actions, that enable the running of the code generators in a quick and easy manner.
Provide documentation and a guide for how to use the provided tooling to configure and set up the client generation for an extension.
Provide a template repository for a simple extension that uses the provided automations to generate a full suite of clients, so that extension builders can easily adopt this paradigm.
Provide integration tests that make use of the template extension to validate the code generators and the stable core interface.

Extension owners will need to:

Provide OpenAPI REST specifications for the APIs provided by the extension.
Integrate the generators & automations into their workflows to generate their thin-clients.
- For opensearch-project owned extensions this and the following point will be done in collaboration with both the clients maintainers and the build automation maintainers.
Publish the thin-clients to their desired package managers.

Questions:

Why independent thin-clients?

Allows the core client for each language to remain as lightweight as possible while enforcing separation of concerns.
End-users can pull in the exact extensions they need and nothing else. This matters, for example, where small-footprint, fast cold-start applications are desired, such as AWS Lambda functions or Docker images.
Allows extensions and their clients to be versioned completely independently of the server and the core clients.

Will the explosion in number of thin-clients cause issues?

The number of thin-clients, as we multiply the number of extensions by the number of supported languages, will almost certainly be huge in the not too distant future. However, we can mitigate as much of the burden as possible by providing high-quality prebuilt automations (e.g. GitHub Actions) to streamline the process of generating and publishing a given thin-client. As they will in the general case be 100% generated, there will be essentially zero ongoing maintenance overhead for the extension owner. Extension owners can further reduce any overhead by taking a piecemeal approach to which language clients they generate and publish depending on demand (e.g. a machine learning extension may primarily care about Python).

Why not bundle extension clients into the core client?

As it will be possible for independent developers to create extensions, it would not be feasible to include client code for all extensions in the core client. We would end up having to draw some kind of delineation, which would naturally end up being only first-party (opensearch-project owned) extensions or a subset thereof. This would in turn require supporting both the directly bundled approach and the externally generated thin-client approach. There will likely be many first-party extensions as well, so it would not be scalable to include them all in one client, which would lead to further disparity of treatment between extensions.

Why OpenAPI specifications?

Defining the extension's API in a ubiquitous spec language such as OpenAPI enables developers to generate clients more easily, and allows users to use off-the-shelf tooling to generate clients or interact with the APIs as they desire. It can also be used in a spec-first approach where the spec is used to generate the necessary scaffolding to set up the routes and actions in the extension itself.
Using other spec languages such as Smithy as the basis for code generation was considered in the context of opensearch-project/opensearch-java#284; however, OpenAPI was chosen as it's a de facto standard within the community.

Can this proposal be extended to support other types of specifications besides OpenAPI?

At this time, OpenAPI is the chosen specification language for this proposal due to its wide adoption and tooling support. Other specification types such as Smithy often have mechanisms to be converted into OpenAPI, potentially allowing extension authors to write their specifications in something other than OpenAPI and merely publish the converted output.

How will this proposal affect developers who are not familiar with OpenAPI?

OpenAPI is a widely adopted standard with plenty of resources available for learning, so it should not be a significant barrier to entry for most developers.

Will extension owners be able to extend upon their generated thin-clients?

As extension owners will have full control over the generation and publishing of their clients, they will be able to do anything they like with regard to modifying or extending them. The primary recommended manner in which extension owners would be able to extend upon the generated clients would be for them to create a new package/library that depends on the generated client. In their higher-level library they could implement any new logic or features necessary and recommend it as the definitive client for the extension, with users still free to use the simple generated client if desired. There may be other solutions such as checking-in the generated client source to Git and making any necessary additions within the same library.

Any further questions?

dblock · 2023-03-24T11:53:06Z

I like it!

wbeckler · 2023-03-24T14:31:21Z

I'm curious if @reta or @saratvemulapalli or @peternied have thoughts?

reta · 2023-03-24T16:07:04Z

@Xtansia certainly a huge +1. We have been discussing this subject in the past, and the plugins / extensions would benefit the most from using OpenAPI as the universal specification format (compared to Smithy, for example, there is no need to learn a new tool). I think this is also a strong argument to use OpenAPI for the core clients (the POCs we have are good starting points).

Regarding the thin-client integration, the consensus we've reached so far is that opensearch-java should be the only recommended way to connect to OpenSearch from Java. We could use ServiceLoaders to discover and load the thin clients, something like (very roughly):

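import java.util.ServiceLoader;

public class OpenSearchClient {
    // Very rough: look up the requested thin client on the classpath via ServiceLoader.
    public <A> A plugin(Class<A> type) { return ServiceLoader.load(type).iterator().next(); }
    public <E> E extension(Class<E> type) { return ServiceLoader.load(type).iterator().next(); }
}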

So whenever the thin-client for a plugin and/or extension is on the classpath, it could be used (the wiring details could be laid out later on, this is just a high-level idea).

But this is Java only, I think for other languages (Rust/Python/Go/Ruby/...), we would need to work on finding out the "pluggable" options or just use different approaches altogether. This is probably the only unanswered question I have at the moment.

peternied · 2023-03-24T18:48:39Z

@Xtansia +1 Great presentation on the topic! I have a few questions after reading it:

Is the OpenSearch-Project responsible for managing code generation? What about distribution?
Will pre-release extensions and extension features be compatible with thin clients?
Have you considered non-open-source extensions?
With multiple thin clients loaded at once, could using datatypes and APIs from other extensions cause issues with type duplication or incompatibility?
How will extension developers be prompted to author OpenAPI specs to enable thin client generation?

Xtansia · 2023-03-27T23:16:48Z

Thanks all for the feedback and questions. I've done my best to answer your questions as I understood them, feel free to ask follow-up questions or clarify if I've misunderstood.

@reta

I think for other languages (Rust/Python/Go/Ruby/...), we would need to work on finding out the "pluggable" options or just use different approaches altogether. This is probably the only unanswered question I have at the moment.

I didn't want to get too deep into language-specific suggestions, as I'm not up to speed on what's idiomatic for all the languages. But the most basic approach here is something along the lines of an IOpenSearchClient interface that the extension client takes as a constructor argument. C# & Rust at least have some mechanisms for nicer ergonomics around this, essentially being able to define extension methods so that you could define OpenSearchClient.Security() -> SecurityExtensionClient from outside the library that owns OpenSearchClient.
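As a minimal sketch of that constructor-argument approach in Java (which has no extension methods, so the plain constructor form is what is available there): the `IOpenSearchClient` and `SecurityExtensionClient` names come from the comment above, while the single `performRequest` method, the `whoAmI` operation, and the `/_extensions/_security/whoami` path are placeholders invented for illustration.

```java
import java.util.Objects;

// Hypothetical minimal view of the core client that extension clients depend on.
interface IOpenSearchClient {
    String performRequest(String method, String path);
}

// Hypothetical thin-client: it owns no connection logic and borrows the core client.
final class SecurityExtensionClient {
    private final IOpenSearchClient core;

    SecurityExtensionClient(IOpenSearchClient core) {
        this.core = Objects.requireNonNull(core, "core");
    }

    // The path below is a placeholder, not a documented endpoint.
    String whoAmI() {
        return core.performRequest("GET", "/_extensions/_security/whoami");
    }
}

final class CompositionDemo {
    public static void main(String[] args) {
        // A fake core client that echoes the request, standing in for a real transport.
        IOpenSearchClient fake = (method, path) -> method + " " + path;
        SecurityExtensionClient security = new SecurityExtensionClient(fake);
        System.out.println(security.whoAmI());
    }
}
```

Because the thin-client depends only on the interface, the same class works against the real core client in production and against a fake in tests.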

@peternied

Is the OpenSearch-Project responsible for managing code generation? What about distribution?

This depends on how you define "managing code generation" and "distribution", so if you could be more specific that would be great.
In general the OpenSearch project would be responsible for implementing and distributing the code generators for the language clients we currently support along with associated reusable automation/CI workflows. The project would not be responsible for the execution of the generators nor the hosting of the final generated artifacts, except for where the project owns the given extension.
So an external third-party extension developer will be responsible for their own running of generation and distributing their client artifacts, but making use of the tooling we'll provide.

Will pre-release extensions and extension features be compatible with thin clients?

Could you please expand on this question, by "extension features" is that like feature flags?
For pre-release extensions, as the extension owners will control generation it'd be reasonable to expect they could generate and publish pre-release clients.

Have you considered non-open-source extensions?

As the extension owner controls the actual generation & publish step, they would be able to take our open tooling and run it inside their private CI and publish their thin clients to internal package registries if they desired.

With multiple thin clients loaded at once, could using data types and APIs from other extensions cause issues with type duplication or incompatibility?

This is a potential concern, however we can define any truly global types in the core client/library, and handle the mapping in the code generators so that they output a reference to the shared type rather than recreating the definition.
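One way the generators could handle that mapping is sketched below, assuming a simple lookup from schema name to an existing core class. `TypeResolver` and all the package and type names used with it are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Toy type resolver for a generator: schema names that the core library already
// defines resolve to the shared class; everything else becomes a local type.
final class TypeResolver {
    private final Map<String, String> sharedTypes = new HashMap<>();
    private final String extensionPackage;

    TypeResolver(String extensionPackage) {
        this.extensionPackage = extensionPackage;
    }

    // Registers a schema name as owned by the core library.
    TypeResolver share(String schemaName, String coreClassName) {
        sharedTypes.put(schemaName, coreClassName);
        return this;
    }

    // Returns the fully qualified Java type the generator should reference.
    String resolve(String schemaName) {
        String shared = sharedTypes.get(schemaName);
        return shared != null ? shared : extensionPackage + "." + schemaName;
    }
}
```

Two extensions whose specs both mention a shared schema would then emit references to the same core class, so values of that type can pass between their clients without conversion.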

How will extension developers be prompted to author OpenAPI specs to enable thin client generation?

In the general case this will fall under a larger umbrella item of complete documentation for "Getting Started Developing An Extension" and associated examples.
For those that have begun development or are soon to begin within the OpenSearch project, it'll more likely be a push (either just a nudge, or some assistance in initial authoring) from the clients maintainers as we roll out the generators and want to get the adoption going.

peternied · 2023-03-29T13:06:24Z

Thanks for the thoughtful response, just a couple follow ups

Is the OpenSearch-Project responsible for managing code generation? What about distribution?

This depends on how you define "managing code generation" and "distribution", so if you could be more specific that would be great.

I think some of the following is covered by your response, but it would help me understand the future if it was clearly laid out in this or another future proposal

Let us consider OpenSearch extensions, like anomaly detection, called AD for short. Since it is part of the OpenSearch Project, it's hosted inside of github.com/opensearch-project/AD. The AD extension offers new APIs, a perfect fit for a thin client. Since there will be N different language clients, who is responsible for generating the clients?

Honing in on the AD client for python. Does this client get generated and checked into github.com/opensearch-project/AD-py-client, somewhere else, or not under direct source control? What if the AD extension team/contributors want to add improvements to the client, how do they do that?

Moving to distribution, how is the thin client consumed? For python the typical place is pypi. How does the AD thin client for python get registered and updated?

Do these answers change for an extension produced outside of the OpenSearch Project?

Will pre-release extensions and extension features be compatible with thin clients?

Could you please expand on this question, by "extension features" is that like feature flags?

Not feature flags, I mean will the process allow for release vs snapshot builds of thin clients?

Xtansia · 2023-03-29T22:07:23Z

Thanks for clarifying and expanding, hope this clears things up a bit more.

Let us consider OpenSearch extensions, like anomaly detection, called AD for short. Since it is part of the OpenSearch Project, it's hosted inside of github.com/opensearch-project/AD. The AD extension offers new APIs, a perfect fit for a thin client. Since there will be N different language clients, who is responsible for generating the clients?

In this example, the AD maintainers/owners would be responsible for the triggering of the generation, as they'd be in the best position to make decisions around when the API of the extension has changed sufficiently. In general, I'd expect this to be almost entirely automated, whether triggered via tag push or a manual workflow run that regenerates all clients at once thus requiring minimal specific knowledge from the extension maintainers.
It would just not be scalable for the core clients maintainers to directly monitor and own all extensions' generation. However, the clients maintainers would provide ongoing support and guidance for the process as well as taking care of the onboarding.

Honing in on the AD client for python. Does this client get generated and checked into github.com/opensearch-project/AD-py-client, somewhere else, or not under direct source control? What if the AD extension team/contributors want to add improvements to the client, how do they do that?

I see this as a choice to be made by the AD maintainers. It should be possible to take any of these approaches with the tooling the clients maintainers will provide:

Keep client source code ephemeral, having no extra repos or checked in code. Essentially a completely invisible process.
Have a mono-repo of checked in client source code for all languages. Not really an ideal approach, but should still be possible, as it may be appealing to external devs.
Have n repos each with checked in source code of the language client, allowing transparency of changes, adding extra libraries or making improvements.
Some combination of the above, e.g. 7 out of 8 languages are "invisibly" generated, but Python specifically is checked in with extras.

In essence, I believe the tooling, GitHub Actions, etc. provided should be composable and flexible such that the extension maintainers can use them in their workflow as they please, even if there may end up being a "golden path" that's recommended within the OpenSearch project.

Moving to distribution, how is the thin client consumed? For python the typical place is pypi. How does the AD thin client for python get registered and updated?

In general, they will be consumed in whatever is the standard for the language, e.g. PyPI for Python, NuGet for .NET, Maven for Java.
The clients maintainers (working with the build automation maintainers) will provide automations for publishing the clients to their given artifact registry. These already exist for the core clients, so this will mostly mean re-using or expanding upon existing automations.
There will be some difference in requirements for OpenSearch project extensions versus external ones, as we have requirements around security, artifact signing and separation of ownership, whereas external devs will more than likely just need a relatively simple GitHub workflow they can plug an npm API key into, for example. So we'll require that support from the build maintainers to aid in provisioning of access to our artifact registry accounts for repos within the OpenSearch project and ensuring we meet our requirements.

Not feature flags, I mean will the process allow for release vs snapshot builds of thin clients?

Unfortunately, some languages and artifact registries do not have a good concept of a "snapshot build", so it may not be reasonable to actively publish a snapshot build of a given client. It will certainly be possible to have prerelease versions that publish e.g. explicitly versioned betas or release candidates (1.2.0-beta.1, 1.0.0-rc.3 etc.).
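A publishing workflow could tell the two kinds of release apart by inspecting the version string. The sketch below assumes versions follow the MAJOR.MINOR.PATCH shape with an optional hyphenated prerelease suffix, as in the examples above. `ClientVersion` is a hypothetical helper that ignores build metadata and does not implement version ordering.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ClientVersion {
    // MAJOR.MINOR.PATCH with an optional "-prerelease" suffix, e.g. 1.2.0-beta.1.
    private static final Pattern FORMAT =
        Pattern.compile("(\\d+)\\.(\\d+)\\.(\\d+)(?:-([0-9A-Za-z.-]+))?");

    final String core;        // e.g. "1.2.0"
    final String prerelease;  // e.g. "beta.1", or null for a final release

    private ClientVersion(String core, String prerelease) {
        this.core = core;
        this.prerelease = prerelease;
    }

    static ClientVersion parse(String text) {
        Matcher m = FORMAT.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a supported version: " + text);
        }
        String core = m.group(1) + "." + m.group(2) + "." + m.group(3);
        return new ClientVersion(core, m.group(4));
    }

    // True for betas, release candidates, and any other suffixed version.
    boolean isPrerelease() {
        return prerelease != null;
    }
}
```

A workflow step could use `isPrerelease()` to decide, for instance, whether to publish under a registry's prerelease channel or as a regular release.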

