Optimize reduceToTopK in ResultUtil by removing pre-filling and reducing peek calls #2146

junqiu-lei · 2024-09-24T23:28:18Z

Description

This PR improves the performance of the reduceToTopK method by eliminating unnecessary pre-filling of the priority queue and reducing redundant peek() calls. It also adds null-safety checks.
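For readers who want the idea in code, here is a minimal sketch of a top-k reduction that fills the heap lazily and peeks once per candidate. It is not the plugin's implementation: the class name `TopKSketch`, the `List<Map<Integer, Float>>` shape for the per-leaf results, and the tie handling are assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical stand-in for the method discussed in this PR; not the plugin's code.
final class TopKSketch {

    /** Trims every per-leaf map in place so that only entries scoring in the global top k remain. */
    static void reduceToTopK(List<Map<Integer, Float>> perLeafResults, int k) {
        if (perLeafResults == null || k <= 0) {
            return; // null-safety guard: nothing to reduce
        }
        // Min-heap of scores: the head is the weakest score among the current top k.
        PriorityQueue<Float> topK = new PriorityQueue<>(k);
        for (Map<Integer, Float> leaf : perLeafResults) {
            if (leaf == null) {
                continue;
            }
            for (Float score : leaf.values()) {
                if (score == null) {
                    continue;
                }
                if (topK.size() < k) {
                    topK.offer(score);            // grow lazily, no pre-filling with sentinel values
                } else if (score > topK.peek()) { // a single peek per candidate
                    topK.poll();
                    topK.offer(score);
                }
            }
        }
        if (topK.size() < k) {
            return; // fewer than k results overall: keep everything
        }
        final float minScore = topK.peek();
        for (Map<Integer, Float> leaf : perLeafResults) {
            if (leaf != null) {
                // Entries tied with minScore are kept, so more than k entries can survive.
                leaf.entrySet().removeIf(e -> e.getValue() == null || e.getValue() < minScore);
            }
        }
    }
}
```

The maps passed in must be mutable, because the sketch removes entries in place rather than building a new result.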

Related Issues

Closes #2145

Check List

New functionality includes testing.
New functionality has been documented.
API changes companion pull request created.
Commits are signed per the DCO using --signoff.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

src/main/java/org/opensearch/knn/index/query/ExactSearcher.java

…ing peek calls Signed-off-by: Junqiu Lei <junqiu@amazon.com>

junqiu-lei · 2024-09-25T18:04:59Z

After syncing offline with @navneet1v and @jmazanec15, we are now focusing on optimizing the reduceToTopK function in ResultUtil. I have updated the PR accordingly.

shatejas

@junqiu-lei Did we compare removing the map with this approach to see which is faster, or is that not possible? I feel the code would be simpler that way, if it also yields better results.

Removing the map: see if we can preserve the order of the results for each leaf instead of converting them to a map.

We might first have to check whether the KNNResults array is in order. That way we can do something similar to what Lucene does: just pick the top elements and stop at k in reduceToTopK.
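A small sketch of the order check mentioned above, assuming each leaf's scores are available as a plain `float[]` (an assumption; the actual result type in the plugin may differ). The names `LeafOrderSketch`, `isSortedDescending`, and `firstK` are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helpers for illustration; not part of the plugin.
final class LeafOrderSketch {

    /** Returns true if the scores never increase from one position to the next. */
    static boolean isSortedDescending(float[] scores) {
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[i - 1]) {
                return false;
            }
        }
        return true;
    }

    /** For a leaf already sorted best-first, the top k is simply its first k entries. */
    static float[] firstK(float[] sortedScores, int k) {
        int keep = Math.min(Math.max(k, 0), sortedScores.length);
        return Arrays.copyOf(sortedScores, keep);
    }
}
```

If the check holds for every leaf, no heap over all candidates is needed: each leaf can contribute at most its first k entries.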

junqiu-lei · 2024-09-25T19:50:56Z

> @junqiu-lei Did we compare removing the map with this approach to see which is faster, or is that not possible? I feel the code would be simpler that way, if it also yields better results.
>
> Removing the map: see if we can preserve the order of the results for each leaf instead of converting them to a map.
>
> We might first have to check whether the KNNResults array is in order. That way we can do something similar to what Lucene does: just pick the top elements and stop at k in reduceToTopK.

I think this is another place we can optimize. No, we have not compared the two approaches so far. For this PR, we could use a micro-benchmark to measure the improvement, if possible.

jmazanec15

So, I think the optimal way to do this would be to require that each perLeafResults comes in order from best scores to worst scores. Then a simple algorithm can just scan the top of each set of results and take only the top k. That being said, that would require changing how results are returned in the different methods. Do you think we should do this, @navneet1v?
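The scan described above amounts to a k-way merge over leaves that are already sorted best-first. A minimal sketch follows, under the assumption that each leaf is a `List` of (docId, score) pairs; the `Hit` type and `mergeTopK` are hypothetical names, not existing plugin APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch of the proposal; not the plugin's code.
final class SortedLeafMergeSketch {

    /** Hypothetical result type: a document id and its score. */
    static final class Hit {
        final int docId;
        final float score;

        Hit(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /** Merges leaves that are each sorted by descending score and stops after k hits. */
    static List<Hit> mergeTopK(List<List<Hit>> sortedLeaves, int k) {
        List<Hit> out = new ArrayList<>();
        if (sortedLeaves == null || k <= 0) {
            return out;
        }
        // Each heap entry is {leaf index, position in that leaf}.
        // Entries are ordered by the score at that position, highest first.
        PriorityQueue<int[]> heads = new PriorityQueue<>(
            (x, y) -> Float.compare(
                sortedLeaves.get(y[0]).get(y[1]).score,
                sortedLeaves.get(x[0]).get(x[1]).score));
        for (int i = 0; i < sortedLeaves.size(); i++) {
            List<Hit> leaf = sortedLeaves.get(i);
            if (leaf != null && !leaf.isEmpty()) {
                heads.add(new int[] {i, 0});
            }
        }
        // Repeatedly take the best remaining head, then advance within that leaf.
        while (out.size() < k && !heads.isEmpty()) {
            int[] head = heads.poll();
            List<Hit> leaf = sortedLeaves.get(head[0]);
            out.add(leaf.get(head[1]));
            if (head[1] + 1 < leaf.size()) {
                heads.add(new int[] {head[0], head[1] + 1});
            }
        }
        return out;
    }
}
```

With n leaves, the heap never holds more than n entries, and the loop stops after k hits, so at most k + n elements are ever inspected regardless of how many results each leaf returned.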

Signed-off-by: Junqiu Lei <junqiu@amazon.com>

junqiu-lei · 2024-09-27T21:22:48Z

> So, I think the optimal way to do this would be to require that each perLeafResults comes in order from best scores to worst scores. Then a simple algorithm can just scan the top of each set of results and take only the top k. That being said, that would require changing how results are returned in the different methods. Do you think we should do this, @navneet1v?

I can raise another PR for this part of the optimization.

navneet1v · 2024-09-27T21:30:48Z

> So, I think the optimal way to do this would be to require that each perLeafResults comes in order from best scores to worst scores. Then a simple algorithm can just scan the top of each set of results and take only the top k. That being said, that would require changing how results are returned in the different methods. Do you think we should do this, @navneet1v?

Yes, I think we should change the return types.

…ing peek calls (#2146) Signed-off-by: Junqiu Lei <junqiu@amazon.com> (cherry picked from commit e0c3afe)

…ing peek calls (#2146) (#2164) Signed-off-by: Junqiu Lei <junqiu@amazon.com> (cherry picked from commit e0c3afe) Co-authored-by: Junqiu Lei <junqiu@amazon.com>

junqiu-lei added the enhancement label Sep 24, 2024

junqiu-lei self-assigned this Sep 24, 2024

junqiu-lei requested review from heemin32, navneet1v, VijayanB, vamshin, jmazanec15, naveentatikonda, martin-gaievski, ryanbogan and luyuncheng as code owners September 24, 2024 23:28

junqiu-lei force-pushed the optimize-exact-search branch from 98c5a4b to e988ccf Compare September 24, 2024 23:31

navneet1v reviewed Sep 25, 2024

src/main/java/org/opensearch/knn/index/query/ExactSearcher.java (outdated, resolved)

navneet1v reviewed Sep 25, 2024

src/main/java/org/opensearch/knn/index/query/ExactSearcher.java (outdated, resolved)

navneet1v reviewed Sep 25, 2024

src/main/java/org/opensearch/knn/index/query/ExactSearcher.java (outdated, resolved)

navneet1v reviewed Sep 25, 2024

src/main/java/org/opensearch/knn/index/query/ExactSearcher.java (outdated, resolved)

Optimize reduceToTopK in ResultUtil by removing pre-filling and reducing peek calls

7d1ad23

Signed-off-by: Junqiu Lei <junqiu@amazon.com>

junqiu-lei force-pushed the optimize-exact-search branch from e988ccf to 7d1ad23 Compare September 25, 2024 17:57

junqiu-lei changed the title ~~Optimize searchTopK method in ExactSearcher~~ Optimize reduceToTopK in ResultUtil by removing pre-filling and reducing peek calls Sep 25, 2024

shatejas reviewed Sep 25, 2024

jmazanec15 reviewed Sep 26, 2024

jmazanec15 previously approved these changes Sep 27, 2024

Merge branch 'main' into optimize-exact-search

05c4b2f

Signed-off-by: Junqiu Lei <junqiu@amazon.com>

junqiu-lei dismissed jmazanec15’s stale review via 05c4b2f September 27, 2024 21:21

jmazanec15 approved these changes Sep 27, 2024

navneet1v approved these changes Sep 27, 2024

junqiu-lei merged commit e0c3afe into opensearch-project:main Sep 27, 2024
30 checks passed

junqiu-lei deleted the optimize-exact-search branch September 27, 2024 21:47

junqiu-lei added the backport 2.x label Sep 27, 2024

opensearch-trigger-bot bot pushed a commit that referenced this pull request Sep 27, 2024

Optimize reduceToTopK in ResultUtil by removing pre-filling and reducing peek calls (#2146)

a9af127

Signed-off-by: Junqiu Lei <junqiu@amazon.com> (cherry picked from commit e0c3afe)

opensearch-trigger-bot bot mentioned this pull request Sep 27, 2024

[Backport 2.x] Optimize reduceToTopK in ResultUtil by removing pre-filling and reducing peek calls #2164

Merged
