gRPC support for client Search API #69

Open
amberzsy opened this issue May 29, 2024 · 6 comments
Labels
enhancement New feature or request

Comments

@amberzsy

amberzsy commented May 29, 2024

Description

Following the significant performance improvement observed in the experiment/POC for node-to-node communication, we want to extend those benefits (proposal here) to client-server communication.

Next Steps

Run a POC, similar to the one for the node-to-node search API, to measure performance improvements for client-server communication.

Target: the search API.

Related Issues

Proposal gRPC for clients
RFC
META Protobuf for search api

@amberzsy amberzsy changed the title [PROPOSAL] gRPC support for client Search API gRPC support for client Search API May 29, 2024
@dblock
Member

dblock commented Jun 24, 2024

Catch All Triage - 1 2 3 4 5 6

@dblock dblock added enhancement New feature or request and removed untriaged labels Jun 24, 2024
@andrross
Member

@amberzsy Do you have any proof-of-concept code you can share for any experiments you have tried here?

@amberzsy
Author

amberzsy commented Aug 2, 2024

Code for client <-> server protobuf integration is here

The phase I POC checks whether REST + proto gives any performance improvement before we convert to, or add, a gRPC service (which will come with a gRPC client that uses proto).

Summary

  1. We see a 20~30% improvement with relatively large payloads.
  2. The response proto is not included. Most of the saving there is expected to come from the _source field; however, without schema enforcement we can only convert it directly to byte[]. We need a strategy for schema management, which will also be needed for the index API.

Next Step
The same practice can be applied to vector search, which usually carries a large number of embeddings. I will explore and POC the vector search API to see whether there is any improvement.
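To illustrate why embeddings look like a plausible candidate, here is a minimal sketch comparing the size of one vector as JSON text against the same vector as little-endian 32-bit floats, which is the payload layout of a packed `repeated float` in protobuf. The 768-dimension vector, the `vector` field name, and the helper names are assumptions for illustration, not part of the POC; 32-bit floats also lose some precision compared with JSON numbers.

```python
import json
import random
import struct


def embedding_as_json(vector):
    """Serialize the vector the way a JSON request body would carry it."""
    return json.dumps({"vector": vector}).encode("utf-8")


def embedding_as_packed_floats(vector):
    """Serialize as little-endian float32 values, 4 bytes per dimension."""
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack_floats(data):
    """Inverse of embedding_as_packed_floats."""
    count = len(data) // 4
    return list(struct.unpack(f"<{count}f", data))


random.seed(0)
# Hypothetical 768-dimension embedding with values in [-1, 1].
vector = [random.uniform(-1.0, 1.0) for _ in range(768)]

json_bytes = embedding_as_json(vector)
packed_bytes = embedding_as_packed_floats(vector)
print("json bytes:", len(json_bytes), "packed bytes:", len(packed_bytes))
```

This only compares payload size; whether it translates into end-to-end latency gains for vector search is what the POC would need to measure.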

Test Env

  1. 3-node production-environment cluster.
  2. OpenSearch (OS) version 2.14.1
  3. Test client: here
  4. Query types: Boolean, Match, Term
  5. Index settings: here
  6. Proto: SearchRequest, MatchQuery, BooleanQuery, TermQuery. With proto, we can skip the JSON payload parsing, which iterates over each token and generates the corresponding internal object, including the parser function in RestSearchAction and fromXContent in each query builder.
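As a rough illustration of the last point, the sketch below hand-encodes a tiny message in the protobuf wire format (varint tag, varint length, raw bytes) and decodes it by reading a tag and a length and slicing, rather than scanning text token by token. The `TermsQuery` message, its field numbers, and the helper names are hypothetical and are not the POC's actual schema; a real integration would use generated protobuf classes.

```python
import json

# Hypothetical message, for illustration only:
#   message TermsQuery { string field = 1; repeated string values = 2; }


def encode_varint(n):
    """Encode a non-negative int as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf, pos):
    """Return (value, next position) for the varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_terms_query(field, values):
    """Emit field 1 once and field 2 once per value, all length-delimited."""
    out = bytearray()
    for number, text in [(1, field)] + [(2, v) for v in values]:
        data = text.encode("utf-8")
        out += encode_varint((number << 3) | 2)  # wire type 2 = length-delimited
        out += encode_varint(len(data))
        out += data
    return bytes(out)


def decode_terms_query(buf):
    """Decode the message; assumes every field is length-delimited (wire type 2)."""
    field, values, pos = "", [], 0
    while pos < len(buf):
        tag, pos = decode_varint(buf, pos)
        length, pos = decode_varint(buf, pos)
        text = buf[pos:pos + length].decode("utf-8")
        pos += length
        if tag >> 3 == 1:
            field = text
        elif tag >> 3 == 2:
            values.append(text)
    return field, values


# 4,000 filter terms, encoded both ways for comparison.
terms = [f"user-{i}" for i in range(4000)]
as_json = json.dumps({"terms": {"user_id": terms}}).encode("utf-8")
as_proto = encode_terms_query("user_id", terms)
print("json bytes:", len(as_json), "proto bytes:", len(as_proto))
```

The decoder never inspects the characters inside a value; it jumps from one length prefix to the next, which is the kind of work the JSON parsing path cannot avoid.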

Result:
image
image

Result correctness validation

  • query with 1 result returned
  • query with 2 results returned
  • query with 3 results returned
  • query with all results returned
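A check along these lines could be scripted roughly as follows. This is a minimal sketch, assuming the hits from the REST/JSON path and the proto path are both available as lists of dicts with `_id` and `_score`; the `hits_match` helper, the tolerance, and the sample data are hypothetical and are not taken from the test client.

```python
def hits_match(rest_hits, proto_hits, score_tolerance=1e-6):
    """Return True when both paths returned the same documents in the same order."""
    if len(rest_hits) != len(proto_hits):
        return False
    for rest_hit, proto_hit in zip(rest_hits, proto_hits):
        if rest_hit["_id"] != proto_hit["_id"]:
            return False
        if abs(rest_hit["_score"] - proto_hit["_score"]) > score_tolerance:
            return False
    return True


# Hypothetical sample: the same query answered through both paths.
rest_hits = [{"_id": "1", "_score": 1.25}, {"_id": "7", "_score": 0.5}]
proto_hits = [{"_id": "1", "_score": 1.25}, {"_id": "7", "_score": 0.5}]
print(hits_match(rest_hits, proto_hits))
```

Running the same comparison for queries expected to return 1, 2, 3, and all documents covers the cases listed above.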

Resource consumption with 4K filter terms
CPU:
image

@dblock
Member

dblock commented Aug 2, 2024

These are some great numbers! cc: @backslasht @Pallavi-AWS

How do we get this into the product/production?

@Pallavi-AWS
Member

Thanks @amberzsy, great to see your work in open source.

@dblock, @andrross will work with @amberzsy on productizing protobuf for client and node-to-node communications.

@Pallavi-AWS
Member

@amberzsy @andrross @msfroh it would be great to have a proposal on phased productization. We could have an experimental feature in 2.17 covering only node-to-node communication, focusing on the search response, and follow up in 2.18 with client-to-server and node-to-node communication covering both search request and response. Thanks again for your work here.
