Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No WHERE pushdown for ObjectID (NAME) based queries. #169

Open
Climax777 opened this issue Nov 10, 2023 · 5 comments · May be fixed by #170
Open

No WHERE pushdown for ObjectID (NAME) based queries. #169

Climax777 opened this issue Nov 10, 2023 · 5 comments · May be fixed by #170

Comments

@Climax777
Copy link

Closed issue #163 implies we cannot get WHERE clause pushdown on ObjectID (NAME) types queries.

For example, in the provided example in https://www.enterprisedb.com/docs/mongo_data_adapter/latest/08c_example_where_pushdown/ querying for a specific one causes a collection scan (this is an objectID generated in the insert):
select * from emp where _id = '654ca5772cff3e18f66ce383'

Resulting explain analyze:

"QUERY PLAN"
"Foreign Scan on public.emp (actual time=4.406..4.416 rows=1 loops=1)"
"  Output: eid"
"  Filter: (emp._id = '654ca5772cff3e18f66ce383'::name)"
"  Rows Removed by Filter: 3"
"  Foreign Namespace: edb.emp"
"Planning Time: 0.103 ms"
"Execution Time: 5.383 ms"

Note 3 rows were returned from the foreign end, which means the OID query hasn't been pushed down. This, to me, is a big problem since there are many common scenarios where this is required to be filtered at the foreign end.

@sheim-dev
Copy link

This is in fact a big issue in several of my common use cases as well.

@Climax777
Copy link
Author

Climax777 commented Nov 13, 2023

This is in fact a big issue in several of my common use cases as well.

I'm working on it, though I have a lot of learning to do. I'm not familiar with the PostgreSQL C-APIs. But it seems doable.

In my poking around I saw that some columns do in fact filter and others not, even though they are both NAME typed. And also _id never works

@vaibhavdalvi93
Copy link

@Climax777 , thank you for working on PR #170. I am NOT in favour of pull request 170 because this looks unsafe to me.
The postgres_fdw and other fdw don’t allow this. The postgres_fdw don’t allow because of following reason:

Note: The C and POSIX locales may behave differently depending on the database encoding.

On all platforms, the collations named default, C, and POSIX are available.
Additional collations may be available depending on operating system support.
The default collation selects the LC_COLLATE and LC_CTYPE values specified at database
creation time. The C and POSIX collations both specify “traditional C” behavior, in which
only the ASCII letters “A” through “Z” are treated as letters, and sorting is done strictly
by character code byte values.

For more details please refer this.

For homogeneous database, if there is risk then for heterogeneous looks more unsafe to me. Also, we don’t consider LOCALE at the time of forming mongo query pipeline.

If we do proposed change then that will be applicable to all columns not only _id one. So, to achieve one thing, we can't take risk of non-reliable results.

If we want to push-down non-default collation then need to form the query pipeline with collation method/operator. Also, need to maintain the collation mapping table for Postgres and MongoDB, along with checking it’s availability. This is going to difficult from maintainability and there is risk of potential bugs.

Following documents can be referred to learn about collation in MongoDB.

Collation

db.collection.find()

If you're agree with the explanation, could you close this issue and PR request too.

Thanks,
Vaibhav

@Climax777
Copy link
Author

Climax777 commented Jan 5, 2024 via email

@vaibhavdalvi93
Copy link

@Climax777 , I can understand your concern.

surely collation should not affect or be affected by objectIds? Should this then only be an exception case for this type perhaps?

The _id field can be of any BSON data type i.e. not necessary the _id field data type is always of ObjectId type. For your reference following is the line from MongoDB office documentation:
The _id field may contain values of any BSON data type, other than an array, regex, or undefined.

For more details on _id field please refer this document.

So, we can't make collation exception for _id field because it's NOT guarantee that it's always a ObjectId type. We could do so if _id field is of type ObjectId only.

Hope, it's clear to you. If not, feel free to revert back.

Thanks,
Vaibhav

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants