Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subscription.pull() should return a list of objects rather than tuples. #913

Closed
jgeewax opened this issue Jun 6, 2015 · 16 comments
Closed
Assignees
Labels
api: pubsub Issues related to the Pub/Sub API.

Comments

@jgeewax
Copy link
Contributor

jgeewax commented Jun 6, 2015

Copied from comments in #910

Currently, when pulling messages via a subscription, I'd write:

received = subscription.pull()
for ack_id, message in received:
  print ack_id
  print message['attr1']
  print message.data  # ?
  print message.id  # ?

I think it'd be better to have the return value be an iterable of some more complex object (maybe a pubsub.Message?):

messages = subscription.pull()
for message in messages:
  print message.get_ack_id()  # ReceivedMessage.ackId (string)
  print message['attr1']  # PubsubMessage.attributes (dict)
  print message.get_data()  # PubsubMessage.data (bytes)
  print message.get_id()  # corresponds to PubsubMessage.messageId (string)
@jgeewax jgeewax added the api: pubsub Issues related to the Pub/Sub API. label Jun 6, 2015
@dhermes
Copy link
Contributor

dhermes commented Jun 6, 2015

  1. What is the point of message.get_data() when we can just say message.data?
  2. @tseaver Are there any issues with making ack_id an optional attribute of Message?

@jgeewax
Copy link
Contributor Author

jgeewax commented Jun 6, 2015

What is the point of message.get_data() when we can just say message.data?

I'm fine with just .data.

Are there any issues with making ack_id an optional attribute of Message?

Can we ack multiple messages at a time? that is, could the following be possible in our current implementation?

received = subscription.pull()
for ack_id, messages in received:
  for message in messages:
    print message.data
  acknowledge(ack_id)

@tseaver
Copy link
Contributor

tseaver commented Jun 9, 2015

Per the spec, each message in a multi-message pull response has its own ackId, but that ID is not part of the message object, which has its own messageId.

It is kind of like the difference between the "envelope From" and the "payload From: header" in SMTP. The "transfer agent" cares about the "envelope" ackId (because it must use it to acknowlege the message in order to prevent re-transmission), while the "normal" part of the application would care about the messageId.

@jgeewax
Copy link
Contributor Author

jgeewax commented Jun 9, 2015

Sure, so the idea then is... if we had a Message class, we could put the ackId into the instance, and even provide an acknowledge() method that handles this, right?

@tseaver
Copy link
Contributor

tseaver commented Jun 9, 2015

We do have a message class, but putting the acknowledge method on it would be wrong: it is much more closely tied in the API to the subscription (again, "envelope" vs. "payload").

@jgeewax
Copy link
Contributor Author

jgeewax commented Jun 9, 2015

I get that it is logically a separate thing, however conceptually, the only thing I can acknowledge is a single message, right?

I'm OK with the acknowledge method living elsewhere, but wouldn't it make sense to put the ackId into the message?

@tseaver
Copy link
Contributor

tseaver commented Jun 9, 2015

The bits of the application that deal with the actual message payload are going to be very different from the part that does the pull + acknowlege part: the latter is likely to be "frameworky".

I think keeping the ackId out of the "payload" is actually a win.

@jgeewax
Copy link
Contributor Author

jgeewax commented Jun 9, 2015

Can you give any more background on why that's a win? If we put it in the message the list returns a list of Message objects. If we don't it's a list of tuples, that has more specific unpacking requirements.

So the code might look like:

for message in subscription.pull():
  acknowledgement_system.acknowledge_receipt(message.ack_id)
  # or even...
  acknowledgement_system.acknowledge_receipt(message)
  message_processing_system.process_message(message)

In other words, I'm saying "I want to acknowledge the receipt of a message", treating the ackId as something that's part of the message. The "payload" is message.data... So a Message object is the envelope, and you can get the payload by looking at .data ..

Is this crazy?

@tseaver
Copy link
Contributor

tseaver commented Jun 9, 2015

In my mind the "payload" is the data, attrs, and messageId (analogous to headers and body of an SMTP message), while the "envelope" is the ackId.

I'm sorry that I don't seem to be communicating here my sense of the importance of separation of concerns, in addition to staying as close a reasonably possible to the concepts used by the back-end API:

  • The attributes of the current Message class all come from the PubsubMessage resource, and represent non-ephemeral data about the given message. They are the same whether the message is obtained via a pull() on a subscription, or via a push to an endpoint.
  • The ackId is part of the ReceivedMessage resource (along with the message, which is the PubsubMessage): it is only relevant for pull(): in push, the endpoint's HTTP response code acts as the acknowedgement.

Except in toy examples, the code dealing with pull() and acknowledge() will not care about the message object itself: it will dispatch it to other code which does care about the message, and then send the acknowlegement. The dispatched-to code which handles the message object itself won't care about the ackId at all.

@jgeewax
Copy link
Contributor Author

jgeewax commented Jun 9, 2015

OK, and if another attribute comes back with the ReceivedMessage (maybe an expiration or something of that nature) ? Will we continue unpacking tuples still?

It seems like if ReceivedMessage had more than just an ackId + PubsubMessage, we'd have wrapped it in an object. So my main point is...

  • It seems like it'd be nice if .pull() gets back a list of objects.
  • Those objects have ways to get the things you want from them
    • Envelope, aka ackId
    • Payload, aka PubsubMessage

Since the ackId is the only piece of data, and it has a 1-to-1 relationship with the PubsubMessage itself (I asked about this above), it seemed reasonable to put this together and say .pull() returns an object that let's you "get" the different things you want.

@tseaver
Copy link
Contributor

tseaver commented Jun 10, 2015

OK, and if another attribute comes back with the ReceivedMessage (maybe an expiration or something of that nature) ? Will we continue unpacking tuples still?

"We" don't unpack them -- the caller does. If it makes sense, we could bundle the two as a namedtuple, which would allow the caller to access the items so:

for received in subscription.pull():
    if 'log_me' in received.message.attrs:
        print received.message.data
    acknowledge(received.ack_id)

@jgeewax
Copy link
Contributor Author

jgeewax commented Jun 10, 2015

I'd be fine with the named tuple, but then it leads to the question: why would we use a named tuple here instead of an object for ReceivedMessage?

Are we making the decision that "if there's only N properties and 0 methods, we should use a named tuple"?

@jgeewax
Copy link
Contributor Author

jgeewax commented Jun 10, 2015

Also, using your example code, I'm a tiny bit confused why that's preferable to:

for message in subscription.pull():
    if 'log_me' in message.attrs:
        print message.data
    acknowledge(message.ack_id)

Is there some example you can share as to why that is a horrible idea ?

@tseaver
Copy link
Contributor

tseaver commented Jun 10, 2015

I've tried to share it above: the Message class we have now is a transport-neutral mapping of the PubsubMessage structure. Capturing the "pull-only" bits of ReceivedMessage in a separate class (or the namedtuple) maintains that clarity / separation of concerns.

@jgeewax
Copy link
Contributor Author

jgeewax commented Jun 10, 2015

So what about the part asking about the named tuple vs object?

@dhermes
Copy link
Contributor

dhermes commented Dec 31, 2015

These discussion seems to have died. Closing out for now (feel free to re-open if feature is necessary)

@dhermes dhermes closed this as completed Dec 31, 2015
parthea pushed a commit that referenced this issue Jul 6, 2023
…les#913)

* Fixes for non-ASCII encodings

* Adds test for UTF

* Style fix
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: pubsub Issues related to the Pub/Sub API.
Projects
None yet
Development

No branches or pull requests

3 participants