subscription.pull() should return a list of objects rather than tuples. #913

jgeewax · 2015-06-06T17:03:15Z

Copied from comments in #910

Currently, when pulling messages via a subscription, I'd write:

received = subscription.pull()
for ack_id, message in received:
  print ack_id
  print message['attr1']
  print message.data  # ?
  print message.id  # ?

I think it'd be better to have the return value be an iterable of some more complex object (maybe a pubsub.Message?):

messages = subscription.pull()
for message in messages:
  print message.get_ack_id()  # ReceivedMessage.ackId (string)
  print message['attr1']  # PubsubMessage.attributes (dict)
  print message.get_data()  # PubsubMessage.data (bytes)
  print message.get_id()  # corresponds to PubsubMessage.messageId (string)

The text was updated successfully, but these errors were encountered:

dhermes · 2015-06-06T17:09:37Z

What is the point of message.get_data() when we can just say message.data?
@tseaver Are there any issues with making ack_id an optional attribute of Message?

jgeewax · 2015-06-06T17:11:37Z

What is the point of message.get_data() when we can just say message.data?

I'm fine with just .data.

Are there any issues with making ack_id an optional attribute of Message?

Can we ack multiple messages at a time? that is, could the following be possible in our current implementation?

received = subscription.pull()
for ack_id, messages in received:
  for message in messages:
    print message.data
  acknowledge(ack_id)

tseaver · 2015-06-09T16:04:06Z

Per the spec, each message in a multi-message pull response has its own ackId, but that ID is not part of the message object, which has its own messageId.

It is kind of like the difference between the "envelope From" and the "payload From: header" in SMTP. The "transfer agent" cares about the "envelope" ackId (because it must use it to acknowlege the message in order to prevent re-transmission), while the "normal" part of the application would care about the messageId.

jgeewax · 2015-06-09T16:17:29Z

Sure, so the idea then is... if we had a Message class, we could put the ackId into the instance, and even provide an acknowledge() method that handles this, right?

tseaver · 2015-06-09T16:49:36Z

We do have a message class, but putting the acknowledge method on it would be wrong: it is much more closely tied in the API to the subscription (again, "envelope" vs. "payload").

jgeewax · 2015-06-09T16:54:30Z

I get that it is logically a separate thing, however conceptually, the only thing I can acknowledge is a single message, right?

I'm OK with the acknowledge method living elsewhere, but wouldn't it make sense to put the ackId into the message?

tseaver · 2015-06-09T17:26:34Z

The bits of the application that deal with the actual message payload are going to be very different from the part that does the pull + acknowlege part: the latter is likely to be "frameworky".

I think keeping the ackId out of the "payload" is actually a win.

jgeewax · 2015-06-09T18:41:47Z

Can you give any more background on why that's a win? If we put it in the message the list returns a list of Message objects. If we don't it's a list of tuples, that has more specific unpacking requirements.

So the code might look like:

for message in subscription.pull():
  acknowledgement_system.acknowledge_receipt(message.ack_id)
  # or even...
  acknowledgement_system.acknowledge_receipt(message)
  message_processing_system.process_message(message)

In other words, I'm saying "I want to acknowledge the receipt of a message", treating the ackId as something that's part of the message. The "payload" is message.data... So a Message object is the envelope, and you can get the payload by looking at .data ..

Is this crazy?

tseaver · 2015-06-09T19:29:30Z

In my mind the "payload" is the data, attrs, and messageId (analogous to headers and body of an SMTP message), while the "envelope" is the ackId.

I'm sorry that I don't seem to be communicating here my sense of the importance of separation of concerns, in addition to staying as close a reasonably possible to the concepts used by the back-end API:

The attributes of the current Message class all come from the PubsubMessage resource, and represent non-ephemeral data about the given message. They are the same whether the message is obtained via a pull() on a subscription, or via a push to an endpoint.
The ackId is part of the ReceivedMessage resource (along with the message, which is the PubsubMessage): it is only relevant for pull(): in push, the endpoint's HTTP response code acts as the acknowedgement.

Except in toy examples, the code dealing with pull() and acknowledge() will not care about the message object itself: it will dispatch it to other code which does care about the message, and then send the acknowlegement. The dispatched-to code which handles the message object itself won't care about the ackId at all.

jgeewax · 2015-06-09T19:38:03Z

OK, and if another attribute comes back with the ReceivedMessage (maybe an expiration or something of that nature) ? Will we continue unpacking tuples still?

It seems like if ReceivedMessage had more than just an ackId + PubsubMessage, we'd have wrapped it in an object. So my main point is...

It seems like it'd be nice if .pull() gets back a list of objects.
Those objects have ways to get the things you want from them
- Envelope, aka ackId
- Payload, aka PubsubMessage

Since the ackId is the only piece of data, and it has a 1-to-1 relationship with the PubsubMessage itself (I asked about this above), it seemed reasonable to put this together and say .pull() returns an object that let's you "get" the different things you want.

tseaver · 2015-06-10T16:36:11Z

OK, and if another attribute comes back with the ReceivedMessage (maybe an expiration or something of that nature) ? Will we continue unpacking tuples still?

"We" don't unpack them -- the caller does. If it makes sense, we could bundle the two as a namedtuple, which would allow the caller to access the items so:

for received in subscription.pull():
    if 'log_me' in received.message.attrs:
        print received.message.data
    acknowledge(received.ack_id)

jgeewax · 2015-06-10T16:55:15Z

I'd be fine with the named tuple, but then it leads to the question: why would we use a named tuple here instead of an object for ReceivedMessage?

Are we making the decision that "if there's only N properties and 0 methods, we should use a named tuple"?

jgeewax · 2015-06-10T16:56:22Z

Also, using your example code, I'm a tiny bit confused why that's preferable to:

for message in subscription.pull():
    if 'log_me' in message.attrs:
        print message.data
    acknowledge(message.ack_id)

Is there some example you can share as to why that is a horrible idea ?

tseaver · 2015-06-10T16:59:13Z

I've tried to share it above: the Message class we have now is a transport-neutral mapping of the PubsubMessage structure. Capturing the "pull-only" bits of ReceivedMessage in a separate class (or the namedtuple) maintains that clarity / separation of concerns.

jgeewax · 2015-06-10T17:09:44Z

So what about the part asking about the named tuple vs object?

dhermes · 2015-12-31T05:30:11Z

These discussion seems to have died. Closing out for now (feel free to re-open if feature is necessary)

…les#913) * Fixes for non-ASCII encodings * Adds test for UTF * Style fix

jgeewax added the api: pubsub Issues related to the Pub/Sub API. label Jun 6, 2015

jgeewax mentioned this issue Jun 6, 2015

Attempt at implementing clients for Pub/Sub #910

Merged

dhermes closed this as completed Dec 31, 2015

JustinBeckwith assigned dhermes Feb 1, 2021

parthea pushed a commit that referenced this issue Jul 6, 2023

Fixes for text encoding [(#913)](GoogleCloudPlatform/python-docs-samp…

78f4bc9

…les#913) * Fixes for non-ASCII encodings * Adds test for UTF * Style fix

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

subscription.pull() should return a list of objects rather than tuples. #913

subscription.pull() should return a list of objects rather than tuples. #913

jgeewax commented Jun 6, 2015

dhermes commented Jun 6, 2015

jgeewax commented Jun 6, 2015

tseaver commented Jun 9, 2015

jgeewax commented Jun 9, 2015

tseaver commented Jun 9, 2015

jgeewax commented Jun 9, 2015

tseaver commented Jun 9, 2015

jgeewax commented Jun 9, 2015

tseaver commented Jun 9, 2015

jgeewax commented Jun 9, 2015

tseaver commented Jun 10, 2015

jgeewax commented Jun 10, 2015

jgeewax commented Jun 10, 2015

tseaver commented Jun 10, 2015

jgeewax commented Jun 10, 2015

dhermes commented Dec 31, 2015

subscription.pull() should return a list of objects rather than tuples. #913

subscription.pull() should return a list of objects rather than tuples. #913

Comments

jgeewax commented Jun 6, 2015

dhermes commented Jun 6, 2015

jgeewax commented Jun 6, 2015

tseaver commented Jun 9, 2015

jgeewax commented Jun 9, 2015

tseaver commented Jun 9, 2015

jgeewax commented Jun 9, 2015

tseaver commented Jun 9, 2015

jgeewax commented Jun 9, 2015

tseaver commented Jun 9, 2015

jgeewax commented Jun 9, 2015

tseaver commented Jun 10, 2015

jgeewax commented Jun 10, 2015

jgeewax commented Jun 10, 2015

tseaver commented Jun 10, 2015

jgeewax commented Jun 10, 2015

dhermes commented Dec 31, 2015