Enable MongoDB sink with unique fields in MongoDB Sharded cluster #135

Open: davidch93 wants to merge 1 commit into master

davidch93

Updates:

  • Add a new toggle (isUseFilterInValueDoc) to MongoDbUpdate that enables use of the filter field provided by Debezium.
  • Add a new class (MongoDbUniqueFieldHandler) that provides the default operations with isUseFilterInValueDoc enabled.
  • Update MongoDbHandler and ChangeStreamHandler so that their default operations run with isUseFilterInValueDoc disabled.
  • Add some new unit tests.

Problem Statement

Our use case is migrating data to a new MongoDB sharded cluster using CDC (Debezium). While replicating data to the new cluster, we hit the following shard key error:

Failed to target upsert by query :: could not extract exact shard key

Some collections in our existing sharded cluster have unique fields enabled, and we use the same unique fields in the new sharded cluster. The error occurs because the MongoDB Sink connector does not include the unique fields when upserting data.

Solution

Debezium includes a filter field in its payload, for example:

{
  "after": null,
  "patch": "{\"$v\": 1,\"$set\": {\"updated_at\": {\"$date\": 1678606630819}}}",
  "filter": "{\"transaction_id\": {\"$numberLong\": \"1234\"},\"_id\": {\"$oid\": \"63a3edd806add30266cb831f\"}}"
}

We therefore want to use the value of this filter field as the query filter when upserting data, so that the upsert carries the unique fields.
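The toggle behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the class `UpsertFilterSelector`, the method `selectFilter`, the flattened `Map<String, String>` representation of the value document, and the fallback to a key-based filter when `filter` is missing are all assumptions made for the example.

```java
import java.util.Map;

/** Hypothetical helper (not part of this PR): chooses the query filter for an upsert. */
public final class UpsertFilterSelector {

    private UpsertFilterSelector() {}

    /**
     * @param valueDoc            Debezium value document, simplified here to
     *                            field name -> raw JSON string (an assumption)
     * @param keyFilter           filter derived from the record key, e.g. {"_id": ...}
     * @param useFilterInValueDoc mirrors the isUseFilterInValueDoc toggle
     * @return the JSON filter string to use for the upsert
     */
    public static String selectFilter(Map<String, String> valueDoc,
                                      String keyFilter,
                                      boolean useFilterInValueDoc) {
        // Toggle disabled: keep filtering on the record key only.
        if (!useFilterInValueDoc) {
            return keyFilter;
        }
        // Toggle enabled: prefer the Debezium "filter" field, which carries
        // the unique fields (transaction_id in the payload above).
        String filter = valueDoc.get("filter");
        // Assumed fallback: use the key filter when no filter is present.
        return (filter == null || filter.isEmpty()) ? keyFilter : filter;
    }
}
```

With the payload shown above and the toggle enabled, the selected filter would be the `filter` string containing both `transaction_id` and `_id`. With the toggle disabled, only the key-based filter is used.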

davidch93 changed the title from "Enable MongoDB sink with unique fields" to "Enable MongoDB sink with unique fields in MongoDB Sharded" on Mar 15, 2023
davidch93 changed the title from "Enable MongoDB sink with unique fields in MongoDB Sharded" to "Enable MongoDB sink with unique fields in MongoDB Sharded cluster" on Mar 15, 2023

ankit-gautam23 left a comment


LGTM
