Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add faster scaling composite hash value encoding for remote path #13251

Merged
merged 2 commits into from
Apr 17, 2024

Conversation

ashking94
Copy link
Member

@ashking94 ashking94 commented Apr 17, 2024

Description

Description copied from #13155.

This is one of the tasks to achieve #12589 as part of the feature request #12567. This is only an increment step to conclude the optimised prefix path work proposed in the feature request.

In this PR, we have introduced a composite encoding for the 64 bits hash value. We use multiple attributes like index uuid, shard id, data category (segment/translog) and data type (data/metadata/lock_files) for generating a 64 bit hash value. Now, based on tests performed during the development, we have found a encoding that works best when the index shard counts or indexing rate increases on a cluster. The encoding uses 1st 6 bits to generate URL safe base 64 character and uses rest of the 14 bits as binary string equivalent. This encoded value as the prefix or infix depending of the path type selected at the cluster level. Currently, we are making this encoding value as default. We have levers in code to set the hash algorithm in index metadata to allow us in future to give a dynamic cluster setting that can be used to set the hash algorithm or the hash value encoding from multiple options.

Simulation steps done using AWS S3 -

  1. Create S3 bucket and upload at very high rate at all possible characters from URL safe base64 charset. This will warm up and increase the overall maximum request rate capacity for the prefix. We see throttling for couple of hours until the throttling stops. Please note at this point we have not started our opensearch cluster.
  2. Now we start our opensearch cluster with the same bucket we use above. We see we are able to support multiple active shards. Now, if we get more indexes (or more shards) created gradually, we are able to scale up S3 request rate capacity faster with binary encoding than the base64 encoding. The time is considerably lower in comparison to base64.

This PR was reverted in #13244 due to breaking change. However, these changes have not been released in any version yet. In comparison to the previous PR, we are adding ExperimentalApi annotations to these breaking changes.

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • [ ] Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • [ ] Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Ashish Singh <ssashish@amazon.com>
Signed-off-by: Ashish Singh <ssashish@amazon.com>
Copy link
Contributor

❌ Gradle check result for 1dc3f78: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@ashking94
Copy link
Member Author

The breaking changes that can been seen in the Detect breaking changes PR checklist item is not present in 2.x branch yet. This is only there in main and in this PR we have renamed the field names.

@ashking94
Copy link
Member Author

❌ Gradle check result for 1dc3f78: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Flaky test - #13207

Copy link
Contributor

✅ Gradle check result for 1dc3f78: SUCCESS

Copy link

codecov bot commented Apr 17, 2024

Codecov Report

Attention: Patch coverage is 89.28571% with 3 lines in your changes are missing coverage. Please review.

Project coverage is 71.34%. Comparing base (b15cb0c) to head (1dc3f78).
Report is 184 commits behind head on main.

Files Patch % Lines
.../org/opensearch/index/remote/RemoteStoreUtils.java 72.72% 1 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #13251      +/-   ##
============================================
- Coverage     71.42%   71.34%   -0.09%     
- Complexity    59978    60591     +613     
============================================
  Files          4985     5040      +55     
  Lines        282275   285432    +3157     
  Branches      40946    41335     +389     
============================================
+ Hits         201603   203629    +2026     
- Misses        63999    64945     +946     
- Partials      16673    16858     +185     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@gbbafna gbbafna merged commit 02f9d74 into opensearch-project:main Apr 17, 2024
78 of 80 checks passed
ashking94 added a commit to ashking94/OpenSearch that referenced this pull request Apr 23, 2024
@ashking94 ashking94 added the backport 2.x Backport to 2.x branch label Apr 27, 2024
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-13251-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 02f9d74ec7746ebe9f6e71fe86b127813a4e4daa
# Push it to GitHub
git push --set-upstream origin backport/backport-13251-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-13251-to-2.x.

@ashking94
Copy link
Member Author

Auto backport failed, raising a manual backport.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants