
Require that keys for 'put' / 'delete' match the 'dataset_id' of the batch #552

Merged (11 commits) into master on Jan 28, 2015

Conversation

@tseaver (Contributor) commented Jan 15, 2015

Fixes #447.
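
For readers skimming: the core change is that Batch.put() / Batch.delete() now reject keys whose dataset_id does not match the batch's. A rough, hypothetical illustration of that invariant (not the literal diff; how the two IDs should be compared is discussed later in this thread):

class Batch(object):
    """Trimmed-down illustration; the real Batch also tracks mutations."""

    def __init__(self, dataset_id, connection):
        self.dataset_id = dataset_id
        self.connection = connection

    def put(self, entity):
        if entity.key.dataset_id != self.dataset_id:
            raise ValueError('Key must be from same dataset as batch')
        # ...otherwise, add the entity to the batch's mutation as before.

    def delete(self, key):
        if key.dataset_id != self.dataset_id:
            raise ValueError('Key must be from same dataset as batch')
        # ...otherwise, add the key to the batch's mutation as before.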

@googlebot added the label "cla: yes" (This human has signed the Contributor License Agreement.) on Jan 15, 2015
@tseaver added the label "api: datastore" (Issues related to the Datastore API.) on Jan 15, 2015
@coveralls commented (Coverage Status)

Coverage remained the same when pulling b6ccf75 on tseaver:447-require_key_dataset_ids_match_branch into 12ac983 on GoogleCloudPlatform:master.

@dhermes (Contributor) commented Jan 15, 2015

Shall we also ditch _get_dataset_id_from_keys and just rely on this behavior?

@tseaver (Contributor, Author) commented Jan 15, 2015

_get_dataset_id_from_keys is also used by datastore.get, and it is what figures out which dataset_id to use in datastore.put and datastore.delete when no batch / transaction is already in play.
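
For context, a helper along these lines is what is being discussed; this is a sketch under the assumption that every key carries a dataset ID, not necessarily the exact code:

def _get_dataset_id_from_keys(keys):
    """Infer a single dataset ID from a non-empty list of keys."""
    dataset_id = keys[0].dataset_id
    # Sanity check: every key must agree with the first one.
    for key in keys[1:]:
        if key.dataset_id != dataset_id:
            raise ValueError('All keys must share the same dataset ID.')
    return dataset_id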

@dhermes (Contributor) commented Jan 15, 2015

OK, so let's not remove it. But it doesn't make sense to check the keys at the beginning and then again as each Key / Entity is passed to Batch.put() / Batch.delete().

Proposal (a rough sketch follows this list):

  • Stop using _get_dataset_id_from_keys in put() and delete().
  • Introduce an optional dataset_id=None parameter to put() and delete().
  • Fall back to the implicit dataset ID if none is passed.
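
A minimal sketch of the proposed shape (the _implicit_environ module is the one referenced later in this thread; the error type and trailing comments are placeholders, not the final implementation):

from gcloud.datastore import _implicit_environ


def put(entities, connection=None, dataset_id=None):
    if not entities:
        return
    # Fall back to the implicit dataset ID when none is passed explicitly.
    if dataset_id is None:
        dataset_id = _implicit_environ.DATASET_ID
    if dataset_id is None:
        raise EnvironmentError('Dataset ID could not be inferred.')
    # ...hand the entities and the resolved dataset_id off to a Batch.


def delete(keys, connection=None, dataset_id=None):
    if not keys:
        return
    if dataset_id is None:
        dataset_id = _implicit_environ.DATASET_ID
    if dataset_id is None:
        raise EnvironmentError('Dataset ID could not be inferred.')
    # ...hand the keys and the resolved dataset_id off to a Batch.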

@tseaver (Contributor, Author) commented Jan 15, 2015

SGTM: For consistency, get() should work the same way.

@dhermes (Contributor) commented Jan 15, 2015

But get() is not tied to a batch. What do you mean by "the same way"?

@tseaver (Contributor, Author) commented Jan 15, 2015

I just mean that the API should be consistent: get() should take an optional dataset_id, and fall back to the implicit one, rather than deriving it from the keys (it should still check the keys to ensure that they all match).
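
Something like the following shape, presumably (a sketch only; get()'s other parameters are omitted, and the plain == comparison here is revisited later in the thread):

from gcloud.datastore import _implicit_environ


def get(keys, connection=None, dataset_id=None):
    if not keys:
        return []
    # Prefer an explicit dataset_id, then the implicit one, rather than
    # deriving it from the keys...
    if dataset_id is None:
        dataset_id = _implicit_environ.DATASET_ID
    # ...but still require that every key agrees with the resolved value.
    for key in keys:
        if key.dataset_id != dataset_id:
            raise ValueError('Key dataset ID does not match: %r' % (key,))
    # ...then fetch the entities over the connection as before.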

@dhermes (Contributor) commented Jan 15, 2015

SGTM

@tseaver (Contributor, Author) commented Jan 15, 2015

I'm adding a variant on the "inferred connection / dataset ID" bit: if there is an active batch or transaction, and you don't pass a connection / dataset_id to datastore.get() / datastore.put() / datastore.delete(), then we use the values set on that batch or transaction.
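
A sketch of that fallback, assuming a Batch.current() accessor (referenced later in this thread) that returns None when no batch is active, plus connection / dataset_id attributes on the batch; the helper name and module paths are assumptions, not this PR's code:

from gcloud.datastore import _implicit_environ
from gcloud.datastore.batch import Batch


def _resolve_connection_and_dataset_id(connection=None, dataset_id=None):
    """Hypothetical helper: explicit args win, then the active batch /
    transaction, then the implicit environment."""
    current = Batch.current()
    if connection is None:
        connection = current.connection if current else _implicit_environ.CONNECTION
    if dataset_id is None:
        dataset_id = current.dataset_id if current else _implicit_environ.DATASET_ID
    return connection, dataset_id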

@coveralls commented (Coverage Status)

Coverage remained the same when pulling 9006e61 on tseaver:447-require_key_dataset_ids_match_branch into be8f5fb on GoogleCloudPlatform:master.

@dhermes (Contributor) commented Jan 15, 2015

RE: Doing the right thing comparing dataset_ids, instead of waiting on #528, how do you feel about just moving regression.datastore._compare_dataset_ids into gcloud.datastore.api?

@dhermes (Contributor) commented Jan 15, 2015

RE: the variant on the implicit fallback: that seems like the right move. Reviewing now.

"""
if not entities:
return

connection = connection or _implicit_environ.CONNECTION
connection = _require_connection(connection)
dataset_id = _require_dataset_id(dataset_id)


@tseaver (Contributor, Author) commented Jan 15, 2015

I forgot regression.datastore._compare_dataset_ids was there, and implemented a stripped-down variant in Batch._match_dataset_id (1b87a7e)

@dhermes (Contributor) commented Jan 15, 2015

Yes, I just saw it in the review. Unfortunately it would give false positives like e~foo == s~foo and s~foo == s~notfoo~foo. (These are not likely to be possible values in the wild, but better safe than sorry.)

@tseaver (Contributor, Author) commented Jan 15, 2015

For a given "project ID" (the un-prefixed dataset ID), is it possible to have more than one prefix? I'm pretty sure that the project IDs themselves will never have an embedded ~: why else would they have picked it to delimit the "location"-based IDs?

At any rate, we could move the regression.datastore._compare_dataset_ids code into helpers, and call it from within _match_dataset_id (which wants to raise an error).

@dhermes (Contributor) commented Jan 15, 2015

I'm fairly certain the two cases I described are impossible, but the extra effort required is minimal since I've already written the code and the tests.

Moving it to helpers SGTM; take a peek at dhermes@90d5063 for _dataset_ids_equal and Test__dataset_ids_equal.
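
For readers who don't want to click through, a comparison along these lines avoids the false positives mentioned above (a sketch, not necessarily the exact code in that commit):

_DATASET_ID_PREFIXES = ('s~', 'e~')


def _dataset_ids_equal(dataset_id1, dataset_id2):
    """Compare dataset IDs, tolerating an 's~' / 'e~' prefix on either side."""
    if dataset_id1 == dataset_id2:
        return True
    if dataset_id1.startswith(_DATASET_ID_PREFIXES):
        # A prefixed ID can only match the bare (un-prefixed) form.
        return dataset_id1[2:] == dataset_id2
    if dataset_id2.startswith(_DATASET_ID_PREFIXES):
        return dataset_id1 == dataset_id2[2:]
    return False

With this, 's~foo' still matches 'foo', while both 'e~foo' vs. 's~foo' and 's~foo' vs. 's~notfoo~foo' correctly come out unequal.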

@coveralls commented (Coverage Status)

Coverage remained the same when pulling ae880b4 on tseaver:447-require_key_dataset_ids_match_branch into be8f5fb on GoogleCloudPlatform:master.

@tseaver (Contributor, Author) commented Jan 15, 2015

Are you working to merge dhermes/google-cloud-python@90d5063? If so, we should update _match_dataset_id to use your helper function afterwards.

@dhermes (Contributor) commented Jan 15, 2015

I am holding off on it until #528 has some sort of resolution, since it doesn't solve an actual problem, just gives "nice" behavior.

@dhermes (Contributor) commented Jan 15, 2015

RE: Impossibility of ordering

I think you're correct that without pathological things like

key = Key('Foo')
key._dataset_id = None

the last one can't occur.


I spoke too soon in my recommendation. As implemented, the following will fail but should not:

from gcloud import datastore
from gcloud.datastore.key import Key

datastore.set_defaults(dataset_id='foo')

k = Key('Kind', 10, dataset_id='bar')
datastore.delete([k])

With this, api.delete() will create a new Batch with dataset_id='foo', and an error will then be raised when Batch.delete() is called on that new batch.

This is why the ordering

  • dataset_id if passed explicitly
  • Batch.current().dataset_id
  • entities[0].key.dataset_id
  • _implicit_environ.DATASET_ID

is necessary (a sketch of this resolution order follows).
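
Sketched out (the helper name, imports, and error type below are placeholders and assumptions, not this PR's code):

from gcloud.datastore import _implicit_environ
from gcloud.datastore.batch import Batch


def _resolve_dataset_id(dataset_id, entities):
    # 1. An explicitly passed dataset_id always wins.
    if dataset_id is not None:
        return dataset_id
    # 2. Otherwise, an already-active batch / transaction.
    current = Batch.current()
    if current is not None:
        return current.dataset_id
    # 3. Otherwise, the dataset ID carried by the keys themselves.
    if entities:
        return entities[0].key.dataset_id
    # 4. Finally, the implicit environment.
    if _implicit_environ.DATASET_ID is not None:
        return _implicit_environ.DATASET_ID
    raise EnvironmentError('Dataset ID could not be determined.')

Under that order, the datastore.delete([k]) example above creates its batch with 'bar' (taken from the key) rather than 'foo' (the implicit default), so no error is raised.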

@tseaver (Contributor, Author) commented Jan 22, 2015

This is why the ordering

  • dataset_id if passed explicitly
  • Batch.current().dataset_id
  • entities[0].key.dataset_id
  • _implicit_environ.DATASET_ID

I think the last is a "can't get there" case, because keys will always have a dataset_id. If that is so, I can update the PR to match.

@dhermes (Contributor) commented Jan 22, 2015

It's not the Keys we're worried about, it's the Batch (more than just Key uses the default). See the datastore.set_defaults scenario I described in my comment above.

@tseaver (Contributor, Author) commented Jan 27, 2015

@dhermes: d5d64 adds the keys back per your comment.

@coveralls commented (Coverage Status)

Coverage remained the same at 100.0% when pulling d5d6400 on tseaver:447-require_key_dataset_ids_match_branch into 6bbb144 on GoogleCloudPlatform:master.

"""Infer a dataset ID from the environment, if not passed explicitly.

Order or precedence:

This comment was marked as spam.

@dhermes (Contributor) commented Jan 27, 2015

@tseaver Sorry for the email spam, I accidentally commented on the commit and then moved the feedback to the PR.

Commits added to the pull request (commit messages):

Incorporates feedback from @dhermes:

- #552 (comment)

- #552 (comment)
  Instead, caller passes only the first key.

Incorporates feedback from @dhermes:

- #552 (comment)

@tseaver (Contributor, Author) commented Jan 27, 2015

@dhermes no worries.

@coveralls commented (Coverage Status)

Coverage remained the same at 100.0% when pulling db4c728 on tseaver:447-require_key_dataset_ids_match_branch into 6bbb144 on GoogleCloudPlatform:master.

@dhermes (Contributor) commented Jan 27, 2015

@tseaver Everything looks good except moving _match_dataset_id to helpers and using the implementation / tests from dhermes@90d5063.

@tseaver (Contributor, Author) commented Jan 28, 2015

@dhermes PTAL

@coveralls commented (Coverage Status)

Coverage remained the same at 100.0% when pulling 49977da on tseaver:447-require_key_dataset_ids_match_branch into 6bbb144 on GoogleCloudPlatform:master.

@dhermes (Contributor) commented Jan 28, 2015

LGTM

tseaver added a commit that referenced this pull request Jan 28, 2015
…h_branch

Require that keys for 'put' / 'delete' match the 'dataset_id' of the batch
@tseaver merged commit 2c75763 into googleapis:master on Jan 28, 2015
@tseaver deleted the 447-require_key_dataset_ids_match_branch branch on January 28, 2015, 17:26

Successfully merging this pull request may close these issues.

Batch (and Transaction) dataset_id should match that of Key / Query objects when added