Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: enable reuse of FromData instance #6853

Closed
wants to merge 0 commits into from
Closed

Conversation

GLGDLY
Copy link

@GLGDLY GLGDLY commented Aug 1, 2022

I have been facing a situation that an API I was requesting would have some retryable error especially with form-data requests. While with reusing the FormData instance in my request, the RuntimeError was raised. In this case, I was confused that why the FormData instance would raise error upon re-gen.

With looking into the source code of aiohttp, I cannot discover any possible errors that may occurs with the reuse of FormData instance, especially the MultipartWriter seems to be designed with the ability to write() for more than once. As a result, I tried the change in this pr in my own code to test and proved that the resulting form-data to be send is correctly outputted.

To work more, with the fact that MultipartWriter seems to be designed for more than one time usage, the FormData may also accept further adding fields into the writer after already generated form data for a certain number of times. With this, it might be more flexible for we to deal with form-data request.

What do these changes do?

  • do not raise RuntimeError with the reuse of FormData instance but returning self._writer
  • after already generated form data for a certain number of times, allow adding more fields to further-update the MultipartWriter instance

Are there changes in behavior for the user?

Users may reuse the FormData instance without the need of creating a new instance for every form-data requests.

Related issue number

n/a

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt
    • The format is <Name> <Surname>.
    • Please keep alphabetical order, the file is sorted by names.
  • Add a new news fragment into the CHANGES folder
    • name it <issue_id>.<type> for example (588.bugfix)
    • if you don't have an issue_id change it to the pr id after creating the pr
    • ensure type is one of the following:
      • .feature: Signifying a new feature.
      • .bugfix: Signifying a bug fix.
      • .doc: Signifying a documentation improvement.
      • .removal: Signifying a deprecation or removal of public API.
      • .misc: A ticket has been closed, but it is not of interest to users.
    • Make sure to use full sentences with correct case and punctuation, for example: "Fix issue with non-ascii contents in doctest text files."

@psf-chronographer psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Aug 1, 2022
@@ -132,8 +131,8 @@ def _gen_form_urlencoded(self) -> payload.BytesPayload:

def _gen_form_data(self) -> multipart.MultipartWriter:
"""Encode a list of fields using the multipart/form-data MIME format"""
if self._is_processed:
raise RuntimeError("Form data has been processed already")
if not self._fields:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is safe to reprocess since it looks like get_payload supports streaming payloads which may only be able to be consumed once

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, if we want to support this (and I've no idea if we do), then we'd probably need to subclass this, or have different logic depending on the type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an open issue about trying to resend form data on a redirect, which is probably something we want to look at sometime. So, still not sure about the implementation of this, but the test atleast should probably be updated to show the redirect issue, rather than reusing a form in 2 completely different requests (which I'm not sure we want to support).

#5577

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is safe to reprocess since it looks like get_payload supports streaming payloads which may only be able to be consumed once

Just realized this review today Imao.

Some thoughts to this. To my understanding, FormData is just a wrapper class for generating the MultipartWriter object. In which _gen_from_data is a helper method for generating Payload objects, from our value passed to add_field (when _is_multipart). It is weird to me that MultipartWriter.write() is actually reusable in current code, but this FormData wrapper class is not. That's why I think _gen_from_data should directly return the generated MultipartWriter object when it is called for multiple times.

Yet, I do agree with your concern here about the safety issue, but I believe this safety issue you are worrying about actually is an already-existent issue in current code, based on the recallable MultipartWriter.write(). Quick reproduction to what I have mentioned: mpwriter.append(io.BytesIO(b"...")); await session.put("https://bing.com/", data=mpwriter),which bing.com 302 redirects me to a local domain, mpwriter.write() is then called again upon redirect and cause IO operation on closed file (that io.BytesIO).

I agree that this is a critical issue and thus I will be working on it to solve together with this FormData reuse. However, this PR is old, many new changes have been made over time, and the objectives have changed if we would like to solve the issue above. I might be opening another PR today after work to try fix and discuss this.

Yet, thanks for your review, I will let you know when the new PR is ready.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an open issue about trying to resend form data on a redirect, which is probably something we want to look at sometime. So, still not sure about the implementation of this, but the test atleast should probably be updated to show the redirect issue, rather than reusing a form in 2 completely different requests (which I'm not sure we want to support).

#5577

I might be opening another PR here based on some new observations (see my comments above), will let you know when the new PR is ready.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random thought: If we know the writer is not callable twice, should we consider enabling expect100 by default? That would increase the chance of a server giving us the redirect before we write the body, which would atleast solve the issue in some cases...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random thought: If we know the writer is not callable twice, should we consider enabling expect100 by default? That would increase the chance of a server giving us the redirect before we write the body, which would atleast solve the issue in some cases...

emm... I'm not sure if enabling Expect 100 by default is a good approach, cuz for my understanding it would somehow increase latency (and some web servers may not have the mechanism to handle Expect)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Dreamsorcerer @bdraco new PR is here #9201; Will update the test coverage issue and the CHANGES file later, but can take a look first, thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bot:chronographer:provided There is a change note present in this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants