Parser fails on data with mixed encodings #63

Closed
enigmathix opened this issue Apr 4, 2023 · 0 comments

When a multipart body contains a base64-encoded part followed by a plain-text part, the parser tries to decode the plain-text part as if it were base64, which raises an error. For example:

from io import BytesIO
import multipart

def on_field(field):
    print('field', field)

def on_file(file):
    print('file', file)

data = b'--foo\r\nContent-Type: text/plain; charset="UTF-8"\r\nContent-Disposition: form-data; name=field1\r\nContent-Transfer-Encoding: base64\r\n\r\nw6k=\r\n--foo\r\nContent-Type: text/plain; charset="UTF-8"\r\nContent-Disposition: form-data; name=field2\r\n\r\nsome text\r\n\r\n--foo--'

headers = {'Content-Type': 'multipart/form-data; boundary="foo"', 'Content-Length': str(len(data))}
multipart.parse_form(headers, BytesIO(data), on_field, on_file)

Output:

field Field(field_name=b'field1', value=b'\xc3\xa9')
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/multipart/decoders.py", line 60, in write
    decoded = base64.b64decode(val)
              ^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/base64.py", line 88, in b64decode
    return binascii.a2b_base64(s, strict_mode=validate)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
binascii.Error: Incorrect padding

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/christophe/xxxx.py", line 13, in <module>
    multipart.parse_form(headers, BytesIO(data), on_field, on_file)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/multipart/multipart.py", line 1884, in parse_form
    parser.write(buff)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/multipart/multipart.py", line 1776, in write
    return self.parser.write(data)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/multipart/multipart.py", line 1058, in write
    l = self._internal_write(data, data_len)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/multipart/multipart.py", line 1327, in _internal_write
    data_callback('part_data')
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/multipart/multipart.py", line 1104, in data_callback
    self.callback(name, data, marked_index, i)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/multipart/multipart.py", line 584, in callback
    func(data, start, end)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/multipart/multipart.py", line 1665, in on_part_data
    bytes_processed = vars.writer.write(data[start:end])
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/multipart/decoders.py", line 62, in write
    raise DecodeError('There was an error raised while decoding '
multipart.exceptions.DecodeError: There was an error raised while decoding base64-encoded data.

The problem is that the parser tries to decode "some text" as base64, even though the second part has no Content-Transfer-Encoding header and is plain text.

Kludex added the bug label Feb 12, 2024
jhnstrk added a commit to jhnstrk/python-multipart that referenced this issue Mar 26, 2024
The contents of the headers dict weren't cleared when a new part started. If the second part didn't overwrite a header value, the value from the previous part would appear in the wrong part.
Kludex closed this as completed in 8b85d35 Apr 18, 2024
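
For readers who want to see the failure mode described in the commit message in isolation, here is a minimal sketch. It is a toy model, not the library's code: `decode_parts` and its `reset_headers` flag are hypothetical names made up for illustration, and the parts are given as already-split `(headers, body)` pairs instead of being parsed from a raw multipart body.

```python
import base64
import binascii


def decode_parts(parts, reset_headers):
    """Toy model (hypothetical, not python-multipart's implementation).

    parts is a list of (headers_dict, body_bytes) pairs. When reset_headers
    is False, headers from one part leak into the next, which mimics the
    behaviour reported in this issue.
    """
    headers = {}
    decoded = []
    for part_headers, body in parts:
        if reset_headers:
            # Start every part with a clean header dict.
            headers = {}
        headers.update(part_headers)
        if headers.get("Content-Transfer-Encoding") == "base64":
            # validate=True rejects input that is not well-formed base64.
            body = base64.b64decode(body, validate=True)
        decoded.append(body)
    return decoded


# Same two fields as in the report: one base64 part, then one plain-text part.
parts = [
    ({"Content-Transfer-Encoding": "base64"}, b"w6k="),
    ({}, b"some text"),
]

# With per-part reset, the plain-text part is passed through untouched.
print(decode_parts(parts, reset_headers=True))

# Without it, the stale base64 header is applied to the plain-text part.
try:
    decode_parts(parts, reset_headers=False)
except binascii.Error as exc:
    print("stale header caused:", exc)
```

The only difference between the two calls is whether the header dict is recreated at the start of each part, which is the change the commit message describes.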