multipart-1.2.1/CHANGELOG.rst

=========
Changelog
=========

This project follows Semantic Versioning (``major.minor.patch``), with the
exception that behavior changes are allowed in minor releases as long as the
change corrects behavior to match documentation, specification or expectation.
In other words: Bugfixes do not count as backward incompatible changes, even if
they technically change behavior from *incorrect* to *correct* and may break
applications that rely on *incorrect* or *undefined* behavior or *undocumented*
APIs.

Release 1.2
===========

This release improves error handling, documentation and performance, fixes
several parser edge-cases and adds new functionality. API changes are backwards
compatible.

* feat: Split up `MultipartError` into more specific exceptions and add HTTP
  status code hints. All exceptions are subclasses of `MultipartError`.
* feat: New `parse_form_data(ignore_errors)` parameter to throw exceptions in
  non-strict mode, or suppress exceptions in strict mode. Default behavior does
  not change (throw in strict-mode, ignore in non-strict mode).
* feat: New `is_form_request(environ)` helper.
* feat: New specialized `content_disposition_[un]quote` functions.
* feat: `parse_options_header()` can now use different unquote functions. The
  default does not change.
* fix: `parse_form_data()` no longer checks the request method and the new
  `is_form_request` function also ignores it. All methods can carry parse-able
  form data, including unknown methods. The only reliable way is to check the
  `Content-Type` header, which both functions do.
* fix: First boundary not detected if separated by chunk border.
* fix: Allow CRLF in front of first boundary, even in strict mode.
* fix: Fail fast if first boundary is broken or part of the preamble.
* fix: Fail if stream ends without finding any boundary at all.
* fix: Use modern WHATWG quoting rules for field names and filenames (#60).
  Legacy quoting is still supported as a fallback.
* fix: `MultiDict.get(index=999)` should return default value, not throw
  IndexError.
* docs: Lots of work on docs and docstrings.
* perf: Multiple small performance improvements
* build: Require Python 3.8

Release 1.1
===========

This release could have been a patch release, but some of the fixes include
changes in behavior to match documentation or specification. None of them
should be a surprise or have an impact on real-world clients, though. Existing
apps should be able to upgrade without issues.

* fix: Fail faster on input with invalid line breaks (#55)
* fix: Allow empty segment names (#56)
* fix: Avoid ResourceWarning when using parse_form_data (#57)
* fix: MultipartPart now always has a sensible content type.
* fix: Actually check parser state on context manager exit.
* fix: Honor Content-Length header, if present.
* perf: Reduce overhead for small segments (-21%)
* perf: Reduce write overhead for large uploads (-2%)

Release 1.0
===========

This release introduces a completely new, fast, non-blocking
``PushMultipartParser`` parser, which now serves as the basis for all other
parsers.

* feat: new ``PushMultipartParser`` parser.
* change: Parser is stricter by default and now rejects clearly broken input.
  This should not affect data sent by actual browsers or HTTP clients, but may
  break some artificial unit tests.

  * Fail on invalid line-breaks in headers or around boundaries.
  * Fail on invalid header names.

* change: Default charset for segment headers and text fields changed to
  ``utf8``, as recommended by W3C HTTP.
* change: Default disk and memory limits for ``MultipartParser`` increased, but
  multiple other limits were introduced to allow finer control. Check if the
  new defaults still fit your needs.
* change: Several undocumented APIs were deprecated or removed, some of which
  were not strictly private but should only be used by the parser itself, not
  by applications.

Release 0.2
===========

This release dropped support for Python versions below ``3.6``. Stay on
``multipart-0.1`` if you need Python 2.5+ support.

Patch 0.2.5
-----------

* security: Don't test semicolon separators in urlencoded data (#33)
* build: Add python-requires directive, indicating Python 3.5 or later is
  required and preventing older Pythons from attempting to download this
  version (#32)
* fix: Add official support for Python 3.10-3.12 (#38, #48)
* fix: Default value of ``copy_file`` should be ``2 ** 16``, not ``2 * 16`` (#41)
* docs: Update URL for Bottle (#42)

Patch 0.2.4
-----------

* fix: Consistently decode non-utf8 URL-encoded form-data

Patch 0.2.3
-----------

* fix: Import MutableMapping from collections.abc (#23)
* fix: Allow stream to contain data before first boundary (#25)
* tests: Fix a few more ResourceWarnings in the test suite (#24)

Patch 0.2.2
-----------

* fix: ResourceWarnings on Python 3 (#21)

Patch 0.2.1
-----------

* fix: empty payload (#20)

Release 0.1
===========

First release

multipart-1.2.1/LICENSE

Copyright (c) 2010-2024, Marcel Hellkamp

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

multipart-1.2.1/MAINTAINERS.rst

* Marcel Hellkamp
* Colin Watson

multipart-1.2.1/README.rst

=================================
Python multipart/form-data parser
=================================

.. image:: https://github.com/defnull/multipart/actions/workflows/test.yaml/badge.svg
    :target: https://github.com/defnull/multipart/actions/workflows/test.yaml
    :alt: Tests Status

.. image:: https://img.shields.io/pypi/v/multipart.svg
    :target: https://pypi.python.org/pypi/multipart/
    :alt: Latest Version

.. image:: https://img.shields.io/pypi/l/multipart.svg
    :target: https://pypi.python.org/pypi/multipart/
    :alt: License

.. _HTML5: https://html.spec.whatwg.org/multipage/form-control-infrastructure.html#multipart-form-data
.. _RFC7578: https://www.rfc-editor.org/rfc/rfc7578
.. _WSGI: https://peps.python.org/pep-3333
.. _ASGI: https://asgi.readthedocs.io/en/latest/
.. _SansIO: https://sans-io.readthedocs.io/
.. _asyncio: https://docs.python.org/3/library/asyncio.html

This module provides a fast incremental non-blocking parser for
``multipart/form-data`` [HTML5_, RFC7578_], as well as blocking alternatives for
easier use in WSGI_ or CGI applications:

* **PushMultipartParser**: Fast SansIO_ (incremental, non-blocking) parser
  suitable for ASGI_, asyncio_ and other IO, time or memory constrained
  environments.
* **MultipartParser**: Streaming parser that reads from a byte stream and yields
  memory- or disk-buffered `MultipartPart` instances.
* **WSGI Helper**: High-level functions and containers for WSGI_ or CGI
  applications with support for both `multipart` and `urlencoded` form
  submissions.

Features
========

* Pure python single file module with no dependencies.
* Optimized for both blocking and non-blocking applications.
* 100% test coverage with test data from actual browsers and HTTP clients.
* High throughput and low latency (see `benchmarks `_).
* Predictable memory and disk resource consumption via fine grained limits.
* Strict mode: Spend less time parsing malicious or broken inputs.

Scope and compatibility
=======================

All parsers in this module implement ``multipart/form-data`` as defined by
HTML5_ and RFC7578_, supporting all modern browsers or HTTP clients in use
today. Legacy browsers (e.g. IE6) are supported to some degree, but only if the
required workarounds do not impact performance or security. In detail this
means:

* Just ``multipart/form-data``, not suitable for email parsing.
* No ``multipart/mixed`` support (deprecated in RFC7578_).
* No ``base64`` or ``quoted-printable`` transfer encoding (deprecated in RFC7578_).
* No ``encoded-word`` or ``name=_charset_`` encoding markers (deprecated in HTML5_).
* No support for clearly broken clients (e.g. invalid line breaks or headers).
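
To make the scope above more concrete, the following standard-library sketch builds a small well-formed ``multipart/form-data`` body: parts are separated by ``--boundary`` delimiters, every line break is CRLF, and field names and filenames are quoted with the limited percent-encoding that modern browsers use. The helpers ``quote_option`` and ``build_form_data`` are hypothetical names written for this illustration only; they are not part of this module's API, and the boundary string is an arbitrary assumption.

```python
def quote_option(value: str) -> str:
    """Quote a field name or filename the way modern browsers do:
    percent-encode CR, LF and double quotes, then wrap in double quotes."""
    value = value.replace("\r", "%0D").replace("\n", "%0A").replace('"', "%22")
    return '"' + value + '"'


def build_form_data(fields, boundary: str) -> bytes:
    """Encode (name, filename, content_type, payload) tuples.

    Hypothetical helper for illustration, not part of the multipart module.
    """
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, filename, content_type, payload in fields:
        disposition = "form-data; name=" + quote_option(name)
        if filename is not None:
            disposition += "; filename=" + quote_option(filename)
        lines.append(delimiter)
        lines.append(("Content-Disposition: " + disposition).encode("utf8"))
        if content_type is not None:
            lines.append(("Content-Type: " + content_type).encode("utf8"))
        lines.append(b"")  # an empty line ends the header section
        lines.append(payload)
    lines.append(delimiter + b"--")  # the final delimiter ends with "--"
    # All line breaks must be CRLF; bare LF is what "broken clients" send.
    return b"\r\n".join(lines) + b"\r\n"


body = build_form_data(
    [
        ("title", None, None, "Hello Wörld".encode("utf8")),
        ("upload", 'my "file".txt', "text/plain", b"file content"),
    ],
    boundary="example-boundary",
)
print(body.decode("utf8"))
```

A body like this, together with a ``Content-Type: multipart/form-data; boundary=example-boundary`` request header, is the kind of input the parsers in this module are designed to accept.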
Installation
============

``pip install multipart``

Documentation
=============

Examples and API documentation can be found at: https://multipart.readthedocs.io/

License
=======

.. __: https://github.com/defnull/multipart/raw/master/LICENSE

Code and documentation are available under MIT License (see LICENSE__).

multipart-1.2.1/multipart.py

# -*- coding: utf-8 -*-
"""
This module provides multiple parsers for RFC-7578 `multipart/form-data`,
both low-level for framework authors and high-level for WSGI application
developers.

Copyright (c) 2010-2024, Marcel Hellkamp
License: MIT (see LICENSE file)
"""


__author__ = "Marcel Hellkamp"
__version__ = '1.2.1'
__license__ = "MIT"
__all__ = ["MultipartError", "ParserLimitReached", "ParserError",
           "StrictParserError", "ParserStateError", "is_form_request",
           "parse_form_data", "MultipartParser", "MultipartPart",
           "PushMultipartParser", "MultipartSegment"]


import re
from io import BytesIO
from typing import Iterator, Union, Optional, Tuple, List
from urllib.parse import parse_qs
from wsgiref.headers import Headers
from collections.abc import MutableMapping as DictMixin
import tempfile
import functools
from math import inf


##
### Exceptions
##


class MultipartError(ValueError):
    """ Base class for all parser errors or warnings """

    #: Suitable HTTP status code for this exception
    http_status = 500  # Internal Error


class ParserError(MultipartError):
    """ Detected invalid input """
    http_status = 415  # Unsupported Media Type


class StrictParserError(ParserError):
    """ Detected unusual input while parsing in strict mode """
    http_status = 415  # Unsupported Media Type


class ParserLimitReached(MultipartError):
    """ Parser reached one of the configured limits """
    http_status = 413  # Request Entity Too Large


class ParserStateError(MultipartError):
    """ Parser reached an invalid state (e.g.
use after close) """ http_status = 500 # Internal Error ############################################################################## ################################ Helper & Misc ############################### ############################################################################## # Some of these were copied from bottle: https://bottlepy.org class MultiDict(DictMixin): """ A dict that stores multiple values per key. Most dict methods return the last value by default. There are special methods to get all values. """ def __init__(self, *args, **kwargs): self.dict = {} for arg in args: if hasattr(arg, 'items'): for k, v in arg.items(): self[k] = v else: for k, v in arg: self[k] = v for k, v in kwargs.items(): self[k] = v def __len__(self): return len(self.dict) def __iter__(self): return iter(self.dict) def __contains__(self, key): return key in self.dict def __delitem__(self, key): del self.dict[key] def __str__(self): return str(self.dict) def __repr__(self): return repr(self.dict) def keys(self): return self.dict.keys() def __getitem__(self, key): return self.dict[key][-1] def __setitem__(self, key, value): self.append(key, value) def append(self, key, value): """ Add an additional value to a key. """ self.dict.setdefault(key, []).append(value) def replace(self, key, value): """ Replace all values for a key with a single value. """ self.dict[key] = [value] def getall(self, key): """ Return a list with all values for a key. The list may be empty. """ return self.dict.get(key) or [] def get(self, key, default=None, index=-1): try: return self.dict[key][index] except (KeyError, IndexError): return default def iterallitems(self): """ Yield (key, value) pairs with repeating keys for each value. 
""" for key, values in self.dict.items(): for value in values: yield key, value def to_bytes(data, enc="utf8"): if isinstance(data, str): data = data.encode(enc) return data def copy_file(stream, target, maxread=-1, buffer_size=2 ** 16): """ Read from :stream and write to :target until :maxread or EOF. """ size, read = 0, stream.read while True: to_read = buffer_size if maxread < 0 else min(buffer_size, maxread - size) part = read(to_read) if not part: return size target.write(part) size += len(part) class _cached_property: """ A property that is only computed once per instance and then replaces itself with an ordinary attribute. Deleting the attribute resets the property. """ def __init__(self, func): functools.update_wrapper(self, func) self.func = func def __get__(self, obj, cls): if obj is None: return self # pragma: no cover value = obj.__dict__[self.func.__name__] = self.func(obj) return value # ------------- # Header Parser # ------------- # ASCII minus control or special chars _token="[a-zA-Z0-9-!#$%&'*+.^_`|~]+" _re_istoken = re.compile("^%s$" % _token, re.ASCII) # A token or quoted-string (simple qs | token | slow qs) _value = r'"[^\\"]*"|%s|"(?:\\.|[^"])*"' % _token # A "; key=value" pair from content-disposition header _option = r'; *(%s) *= *(%s)' % (_token, _value) _re_option = re.compile(_option) def header_quote(val): """ Quote header option values if necessary. Note: This is NOT the way modern browsers quote field names or filenames in Content-Disposition headers. See :func:`content_disposition_quote` """ if _re_istoken.match(val): return val return '"' + val.replace("\\", "\\\\").replace('"', '\\"') + '"' def header_unquote(val, filename=False): """ Unquote header option values. Note: This is NOT the way modern browsers quote field names or filenames in Content-Disposition headers. 
    See :func:`content_disposition_unquote`
    """
    if val[0] == val[-1] == '"':
        val = val[1:-1]
        # fix ie6 bug: full path --> filename
        if filename and (val[1:3] == ":\\" or val[:2] == "\\\\"):
            val = val.split("\\")[-1]
        return val.replace("\\\\", "\\").replace('\\"', '"')
    return val


def content_disposition_quote(val):
    """ Quote field names or filenames for Content-Disposition headers the
    same way modern browsers do it (see WHATWG HTML5 specification).
    """
    val = val.replace("\r", "%0D").replace("\n", "%0A").replace('"', "%22")
    return '"' + val + '"'


def content_disposition_unquote(val, filename=False):
    """ Unquote field names or filenames from Content-Disposition headers.

    Legacy quoting mechanisms are detected to some degree and also supported,
    but there are rare ambiguous edge cases where we have to guess. If in doubt,
    this function assumes a modern browser and follows the WHATWG HTML5
    specification (limited percent-encoding, no backslash-encoding).
    """

    if '"' == val[0] == val[-1]:
        val = val[1:-1]
        if '\\"' in val:  # Legacy backslash-escaped quoted strings
            val = val.replace("\\\\", "\\").replace('\\"', '"')
        elif "%" in val:  # Modern (HTML5) limited percent-encoding
            val = val.replace("%0D", "\r").replace("%0A", "\n").replace("%22", '"')
        # ie6/windows bug: full path instead of just filename
        if filename and (val[1:3] == ":\\" or val[:2] == "\\\\"):
            val = val.rpartition("\\")[-1]
    elif "%" in val:  # Modern (HTML5) limited percent-encoding
        val = val.replace("%0D", "\r").replace("%0A", "\n").replace("%22", '"')
    return val


def parse_options_header(header, options=None, unquote=header_unquote):
    """ Parse Content-Type (or similar) headers into a primary value
    and an options-dict.

    Note: For Content-Disposition headers you need a different unquote
    function. See `content_disposition_unquote`.
    """
    i = header.find(";")
    if i < 0:
        return header.lower().strip(), {}

    options = options or {}
    for key, val in _re_option.findall(header, i):
        key = key.lower()
        options[key] = unquote(val, key == "filename")
    return header[:i].lower().strip(), options


##############################################################################
################################## SansIO Parser #############################
##############################################################################

# Parser states as constants
_PREAMBLE = "PREAMBLE"
_HEADER = "HEADER"
_BODY = "BODY"
_COMPLETE = "END"


class PushMultipartParser:
    def __init__(
        self,
        boundary: Union[str, bytes],
        content_length=-1,
        max_header_size=4096 + 128,  # 4KB should be enough for everyone
        max_header_count=8,  # RFC 7578 allows just 3
        max_segment_size=inf,  # unlimited
        max_segment_count=inf,  # unlimited
        header_charset="utf8",
        strict=False,
    ):
        """A push-based (incremental, non-blocking) parser for multipart/form-data.

        In `strict` mode, the parser will be less forgiving and bail out
        more quickly when presented with strange or invalid input, avoiding
        unnecessary work caused by broken or malicious clients. Fatal errors
        will always trigger exceptions, even in non-strict mode.

        The various limits are meant as safeguards and exceeding any of those
        limits will trigger a :exc:`ParserLimitReached` exception.

        :param boundary: The multipart boundary as found in the Content-Type header.
        :param content_length: Expected input size in bytes, or -1 if unknown.
        :param max_header_size: Maximum length of a single header line (name and value).
        :param max_header_count: Maximum number of headers per segment.
        :param max_segment_size: Maximum size of a single segment body.
        :param max_segment_count: Maximum number of segments.
        :param header_charset: Charset for header names and values.
        :param strict: Enables additional format and sanity checks.
        """
        self.boundary = to_bytes(boundary)
        self.content_length = content_length
        self.header_charset = header_charset
        self.max_header_size = max_header_size
        self.max_header_count = max_header_count
        self.max_segment_size = max_segment_size
        self.max_segment_count = max_segment_count
        self.strict = strict

        self._delimiter = b"\r\n--" + self.boundary

        # Internal parser state
        self._parsed = 0
        self._fieldcount = 0
        self._buffer = bytearray()
        self._current = None
        self._state = _PREAMBLE

        #: True if the parser reached the end of the multipart stream, stopped
        #: parsing due to an :attr:`error`, or :meth:`close` was called.
        self.closed = False
        #: A :exc:`MultipartError` instance if parsing failed.
        self.error: Optional[MultipartError] = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.close(check_complete=not exc_type)

    def parse(
        self, chunk: Union[bytes, bytearray]
    ) -> Iterator[Union["MultipartSegment", bytearray, None]]:
        """Parse a chunk of data and yield as many result objects as possible
        with the data given.

        For each multipart segment, the parser will emit a single instance
        of :class:`MultipartSegment` with all headers already present,
        followed by zero or more non-empty `bytearray` instances containing
        parts of the segment body, followed by a single `None` signaling the
        end of the current segment.

        The returned iterator will stop if more data is required or if the
        end of the multipart stream was detected. The iterator must be fully
        consumed before parsing the next chunk. End of input can be signaled
        by parsing an empty chunk or closing the parser. This is important to
        verify the multipart message was parsed completely and the last
        segment is actually complete.

        Format errors or exceeded limits will trigger :exc:`MultipartError`.
""" try: assert isinstance(chunk, (bytes, bytearray)) if not chunk: self.close() return if self.closed: raise ParserStateError("Parser closed") if self.content_length > -1: available = self._parsed + len(self._buffer) + len(chunk) if self.content_length < available: raise ParserError("Content-Length limit exceeded") if self._state is _COMPLETE: if self.strict: raise StrictParserError("Unexpected data after end of multipart stream") return delimiter = self._delimiter d_len = len(delimiter) buffer = self._buffer buffer += chunk # In-place append bufferlen = len(buffer) offset = 0 while True: if self._state is _PREAMBLE: # Scan for first delimiter (CRLF prefix is optional here) index = buffer.find(delimiter[2:], offset) if index > -1: # Boundary must be at position zero, or start with CRLF if index > 0 and not (index >= 2 and buffer[index-2:index] == b"\r\n"): raise ParserError("Unexpected byte in front of first boundary") next_start = index + d_len tail = buffer[next_start-2 : next_start] if tail == b"\r\n": # Normal delimiter found self._current = MultipartSegment(self) self._state = _HEADER offset = next_start continue elif tail == b"--": # First is also last delimiter offset = next_start self._state = _COMPLETE break # parsing complete elif tail[0:1] == b"\n": # Broken client or legacy test case raise ParserError("Invalid line break after first boundary") elif len(tail) == 2: raise ParserError("Unexpected byte after first boundary") elif self.strict and bufferlen >= d_len: # No boundary in first chunk -> Fail fast in strict mode # and do not waste time consuming a legacy preamble. 
                        raise StrictParserError("Boundary not found in first chunk")

                    # Delimiter not found, skip data until we find one
                    offset = bufferlen - (d_len + 2)
                    break  # wait for more data

                elif self._state is _HEADER:
                    # Find end of header line
                    nl = buffer.find(b"\r\n", offset)

                    if nl > offset:  # Non-empty header line
                        self._current._add_headerline(buffer[offset:nl])
                        offset = nl + 2
                        continue
                    elif nl == offset:  # Empty header line -> End of header section
                        self._current._close_headers()
                        yield self._current
                        self._state = _BODY
                        offset += 2
                        continue
                    else:  # No CRLF found -> Ask for more data
                        if buffer.find(b"\n", offset) != -1:
                            raise ParserError("Invalid line break in segment header")
                        if bufferlen - offset > self.max_header_size:
                            raise ParserLimitReached("Maximum segment header length exceeded")
                        break  # wait for more data

                elif self._state is _BODY:
                    # Ensure there is enough data in buffer to fit a delimiter
                    if offset + d_len + 2 > bufferlen:
                        break  # wait for more data

                    # Scan for delimiter (CRLF + boundary + (CRLF or '--'))
                    index = buffer.find(delimiter, offset)
                    if index > -1:
                        next_start = index + d_len + 2
                        tail = buffer[next_start-2 : next_start]

                        if tail == b"\r\n" or tail == b"--":
                            if index > offset:
                                self._current._update_size(index - offset)
                                yield buffer[offset:index]

                            offset = next_start
                            self._current._mark_complete()
                            yield None  # End of segment

                            if tail == b"--":  # Last delimiter
                                self._state = _COMPLETE
                                break
                            else:  # Normal delimiter
                                self._current = MultipartSegment(self)
                                self._state = _HEADER
                                continue

                    # Keep enough in buffer to account for a partial delimiter at
                    # the end, but emit the rest.
chunk_end = bufferlen - (d_len + 1) assert chunk_end > offset # Always true self._current._update_size(chunk_end - offset) yield buffer[offset:chunk_end] offset = chunk_end break # wait for more data else: # pragma: no cover raise RuntimeError(f"Unexpected internal state: {self._state}") # We ran out of data, or reached the end if offset > 0: self._parsed += offset buffer[:] = buffer[offset:] except Exception as err: if not self.error: self.error = err self.close(check_complete=False) raise def close(self, check_complete=True): """ Close this parser if not already closed. :param check_complete: Raise :exc:`ParserError` if the parser did not reach the end of the multipart stream yet. """ self.closed = True self._current = None del self._buffer[:] if check_complete and not self._state is _COMPLETE: err = ParserError("Unexpected end of multipart stream (parser closed)") if not self.error: self.error = err raise err class MultipartSegment: """ A :class:`MultipartSegment` represents the header section of a single multipart part and provides convenient access to part headers and other details (e.g. :attr:`name` and :attr:`filename`). Each segment also tracks its own content :attr:`size` while the :class:`PushMultipartParser` processes more data, and is marked as :attr:`complete` as soon as the next multipart border is found. Segments do not store or buffer any of their content data, though. """ #: List of headers as name/value pairs with normalized (Title-Case) names. headerlist: List[Tuple[str, str]] #: The 'name' option of the `Content-Disposition` header. Always a string, #: but may be empty. name: str #: The optional 'filename' option of the `Content-Disposition` header. filename: Optional[str] #: The cleaned up `Content-Type` segment header, if present. The value is #: lower-cased and header options (e.g. charset) are removed. content_type: Optional[str] #: The 'charset' option of the `Content-Type` header, if present. 
charset: Optional[str] #: Segment body size (so far). Will be updated during parsing. size: int #: If true, the segment content was fully parsed and the size value is final. complete: bool def __init__(self, parser: PushMultipartParser): """ Private constructor, used by :class:`PushMultipartParser` """ self._parser = parser if parser._fieldcount+1 > parser.max_segment_count: raise ParserLimitReached("Maximum segment count exceeded") parser._fieldcount += 1 self.headerlist = [] self.size = 0 self.complete = 0 self.name = None self.filename = None self.content_type = None self.charset = None self._clen = -1 self._size_limit = parser.max_segment_size def _add_headerline(self, line: bytearray): assert line and self.name is None parser = self._parser if line[0] in b" \t": # Multi-line header value if not self.headerlist or parser.strict: raise StrictParserError("Unexpected segment header continuation") prev = ": ".join(self.headerlist.pop()) line = prev.encode(parser.header_charset) + b" " + line.strip() if len(line) > parser.max_header_size: raise ParserLimitReached("Maximum segment header length exceeded") if len(self.headerlist) >= parser.max_header_count: raise ParserLimitReached("Maximum segment header count exceeded") try: name, col, value = line.decode(parser.header_charset).partition(":") name = name.strip() if not col or not name: raise ParserError("Malformed segment header") if " " in name or not name.isascii() or not name.isprintable(): raise ParserError("Invalid segment header name") except UnicodeDecodeError as err: raise ParserError("Segment header failed to decode", err) self.headerlist.append((name.title(), value.strip())) def _close_headers(self): assert self.name is None for h,v in self.headerlist: if h == "Content-Disposition": dtype, args = parse_options_header(v, unquote=content_disposition_unquote) if dtype != "form-data": raise ParserError("Invalid Content-Disposition segment header: Wrong type") if "name" not in args and self._parser.strict: 
                    raise StrictParserError("Invalid Content-Disposition segment header: Missing name option")
                self.name = args.get("name", "")
                self.filename = args.get("filename")
            elif h == "Content-Type":
                self.content_type, args = parse_options_header(v)
                self.charset = args.get("charset")
            elif h == "Content-Length" and v.isdecimal():
                self._clen = int(v)

        if self.name is None:
            raise ParserError("Missing Content-Disposition segment header")

    def _update_size(self, bytecount: int):
        assert self.name is not None and not self.complete
        self.size += bytecount
        if self._clen >= 0 and self.size > self._clen:
            raise ParserError("Segment Content-Length exceeded")
        if self.size > self._size_limit:
            raise ParserLimitReached("Maximum segment size exceeded")

    def _mark_complete(self):
        assert self.name is not None and not self.complete
        if self._clen >= 0 and self.size != self._clen:
            raise ParserError("Segment size does not match Content-Length header")
        self.complete = True

    def header(self, name: str, default=None):
        """Return the value of a header if present, or a default value."""
        compare = name.title()
        for header in self.headerlist:
            if header[0] == compare:
                return header[1]
        if default is KeyError:
            raise KeyError(name)
        return default

    def __getitem__(self, name):
        """Return a header value if present, or raise :exc:`KeyError`."""
        return self.header(name, KeyError)


##############################################################################
################################## Multipart #################################
##############################################################################


class MultipartParser(object):
    def __init__(
        self,
        stream,
        boundary,
        content_length=-1,
        charset="utf8",
        strict=False,
        buffer_size=1024 * 64,
        header_limit=8,
        headersize_limit=1024 * 4 + 128,  # 4KB
        part_limit=128,
        partsize_limit=inf,  # unlimited
        spool_limit=1024 * 64,  # Keep fields up to 64KB in memory
        memory_limit=1024 * 64 * 128,  # spool_limit * part_limit
        disk_limit=inf,  # unlimited
        mem_limit=0,
        memfile_limit=0,
    ):
        """A parser that reads from a `multipart/form-data` encoded byte stream
        and yields :class:`MultipartPart` instances.

        The parser acts as a lazy iterator and will only read and parse as much
        data as needed to return the next part. Results are cached and the same
        part can be requested multiple times without extra cost.

        :param stream: A readable byte stream or any other object that implements
          a :meth:`read(size) ` method.
        :param boundary: The multipart boundary as found in the Content-Type header.
        :param charset: Default charset for headers and text fields.
        :param strict: Enables additional format and sanity checks.
        :param buffer_size: Chunk size when reading from the source stream.
        :param header_limit: Maximum number of headers per part.
        :param headersize_limit: Maximum length of a single header line (name and value).
        :param part_limit: Maximum number of parts.
        :param partsize_limit: Maximum content size of a single part.
        :param spool_limit: Parts up to this size are buffered in memory and count
          towards `memory_limit`. Larger parts are spooled to temporary files on
          disk and count towards `disk_limit`.
        :param memory_limit: Maximum size of all memory-buffered parts. Should
          be smaller than ``spool_limit * part_limit`` to have an effect.
        :param disk_limit: Maximum size of all disk-buffered parts.
        """
        self.stream = stream
        self.boundary = boundary
        self.content_length = content_length
        self.charset = charset
        self.strict = strict
        self.buffer_size = buffer_size
        self.header_limit = header_limit
        self.headersize_limit = headersize_limit
        self.part_limit = part_limit
        self.partsize_limit = partsize_limit
        self.memory_limit = mem_limit or memory_limit
        self.spool_limit = min(memfile_limit or spool_limit, self.memory_limit)
        self.disk_limit = disk_limit

        self._done = []
        self._part_iter = None

    def __iter__(self):
        """ Parse the multipart stream and yield :class:`MultipartPart`
        instances as soon as they are available.
""" if not self._part_iter: self._part_iter = self._iterparse() if self._done: yield from self._done for part in self._part_iter: self._done.append(part) yield part def parts(self): """ Parse the entire multipart stream and return all :class:`MultipartPart` instances as a list. """ return list(self) def get(self, name, default=None): """ Return the first part with a given name, or the default value if no matching part exists. """ for part in self: if name == part.name: return part return default def get_all(self, name): """ Return all parts with the given name. """ return [p for p in self if p.name == name] def _iterparse(self): read = self.stream.read bufsize = self.buffer_size mem_used = disk_used = 0 readlimit = self.content_length part = None parser = PushMultipartParser( boundary=self.boundary, content_length=self.content_length, max_header_count=self.header_limit, max_header_size=self.headersize_limit, max_segment_count=self.part_limit, max_segment_size=self.partsize_limit, header_charset=self.charset, ) with parser: while not parser.closed: if readlimit >= 0: chunk = read(min(bufsize, readlimit)) readlimit -= len(chunk) else: chunk = read(bufsize) for event in parser.parse(chunk): if isinstance(event, MultipartSegment): part = MultipartPart( buffer_size=self.buffer_size, memfile_limit=self.spool_limit, charset=self.charset, segment=event, ) elif event: part._write(event) if part.is_buffered(): if part.size + mem_used > self.memory_limit: raise ParserLimitReached("Memory limit reached") elif part.size + disk_used > self.disk_limit: raise ParserLimitReached("Disk limit reached") else: if part.is_buffered(): mem_used += part.size else: disk_used += part.size part._mark_complete() yield part part = None class MultipartPart(object): """ A :class:`MultipartPart` represents a fully parsed multipart part and provides convenient access to part headers and other details (e.g. 
:attr:`name` and :attr:`filename`) as well as its memory- or disk-buffered binary or text content. """ def __init__( self, buffer_size=2**16, memfile_limit=2**18, charset="utf8", segment: "MultipartSegment" = None, ): """ Private constructor, used by :class:`MultipartParser` """ self._segment = segment #: A file-like buffer holding the part's binary content, or None if this #: part was closed via :meth:`close`. self.file = BytesIO() #: Part size in bytes. self.size = 0 #: Part name. self.name = segment.name #: Part filename (if defined). self.filename = segment.filename #: Charset as defined in the part header, or the parser default charset. self.charset = segment.charset or charset #: All part headers as a list of (name, value) pairs. self.headerlist = segment.headerlist self.memfile_limit = memfile_limit self.buffer_size = buffer_size @_cached_property def headers(self) -> Headers: """ A convenient dict-like object holding all part headers. """ return Headers(self._segment.headerlist) @_cached_property def disposition(self) -> str: """ The value of the `Content-Disposition` part header. """ return self._segment.header("Content-Disposition") @_cached_property def content_type(self) -> str: """ Cleaned up content type provided for this part, or a sensible default (`application/octet-stream` for files and `text/plain` for text fields). """ return self._segment.content_type or ( "application/octet-stream" if self.filename else "text/plain") def _write(self, chunk): self.size += len(chunk) self.file.write(chunk) if self.size > self.memfile_limit: old = self.file self.file = tempfile.TemporaryFile() self.file.write(old.getvalue()) self._write = self._write_nocheck def _write_nocheck(self, chunk): self.size += len(chunk) self.file.write(chunk) def _mark_complete(self): self.file.seek(0) def is_buffered(self): """ Return true if :attr:`file` is memory-buffered, or false if the part was larger than the `spool_limit` and content was spooled to temporary files on disk.
""" return isinstance(self.file, BytesIO) @property def value(self): """Return the entire payload as a decoded text string. Warning: this may consume a lot of memory, check :attr:`size` first. """ return self.raw.decode(self.charset) @property def raw(self): """Return the entire payload as a raw byte string. Warning: this may consume a lot of memory, check :attr:`size` first. """ pos = self.file.tell() self.file.seek(0) val = self.file.read() self.file.seek(pos) return val def save_as(self, path): """ Save a copy of this part to `path` and return the number of bytes written. """ with open(path, "wb") as fp: pos = self.file.tell() try: self.file.seek(0) size = copy_file(self.file, fp, buffer_size=self.buffer_size) finally: self.file.seek(pos) return size def close(self): """ Close :attr:`file` and set it to `None` to free up resources. """ if self.file: self.file.close() self.file = None ############################################################################## #################################### WSGI #################################### ############################################################################## def is_form_request(environ): """ Return True if the environ represents a form request that can be parsed with :func:`parse_form_data`. Checks for a compatible `Content-Type` header. """ content_type = environ.get("CONTENT_TYPE", "") return content_type.split(";", 1)[0].strip().lower() in ( "multipart/form-data", "application/x-www-form-urlencoded", "application/x-url-encoded" ) def parse_form_data( environ, charset="utf8", strict=False, ignore_errors=None, **kwargs): """ Parses both types of form data (multipart and url-encoded) from a WSGI environment and returns two :class:`MultiDict` instances, one for text form fields (strings) and one for file uploads (:class:`MultipartPart` instances). Text fields that are too big to fit into memory limits are treated as file uploads with no filename.
In case of an url-encoded form request, the total request body size is limited by `memory_limit`. Larger requests will trigger an error. :param environ: A WSGI environment dictionary. Only `wsgi.input`, `CONTENT_TYPE` and `CONTENT_LENGTH` are used. :param charset: The default charset used to decode headers and text fields. :param strict: Enables additional format and sanity checks. :param ignore_errors: If True, suppress all exceptions. The returned results may be empty or incomplete. If False, then exceptions are not suppressed. A value of None (default) throws exceptions in strict mode but suppresses errors in non-strict mode. :param kwargs: Additional keyword arguments are forwarded to :class:`MultipartParser`. This is particularly useful to change the default parser limits. :raises MultipartError: See `ignore_errors` parameters. """ forms, files = MultiDict(), MultiDict() try: stream = environ.get("wsgi.input") if not stream: if strict: raise StrictParserError("No 'wsgi.input' in WSGI environment") stream = BytesIO() content_type = environ.get("CONTENT_TYPE", "") if not content_type: if strict: raise StrictParserError("Missing Content-Type header") return forms, files try: content_length = int(environ.get("CONTENT_LENGTH", -1)) except ValueError: raise ParserError("Invalid Content-Length header") content_type, options = parse_options_header(content_type) kwargs["charset"] = charset = options.get("charset", charset) if content_type == "multipart/form-data": boundary = options.get("boundary", "") if not boundary: raise ParserError("Missing boundary for multipart/form-data") for part in MultipartParser(stream, boundary, content_length, **kwargs): if part.filename or not part.is_buffered(): files.append(part.name, part) else: # TODO: Big form-fields go into the files dict. Really? 
forms.append(part.name, part.value) part.close() elif content_type in ( "application/x-www-form-urlencoded", "application/x-url-encoded", ): mem_limit = kwargs.get("memory_limit", kwargs.get("mem_limit", 1024*64*128)) if content_length > -1: if content_length > mem_limit: raise ParserLimitReached("Memory limit exceeded") data = stream.read(min(mem_limit, content_length)) if len(data) < content_length: raise ParserError("Unexpected end of data stream") else: data = stream.read(mem_limit + 1) if len(data) > mem_limit: raise ParserLimitReached("Memory limit exceeded") data = data.decode(charset) data = parse_qs(data, keep_blank_values=True, encoding=charset) for key, values in data.items(): for value in values: forms.append(key, value) elif strict: raise StrictParserError("Unsupported Content-Type") except MultipartError: if ignore_errors is None: ignore_errors = not strict if not ignore_errors: for _, part in files.iterallitems(): part.close() raise return forms, files ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1732869534.3160682 multipart-1.2.1/pyproject.toml0000644000000000000000000000260614722276636013421 0ustar00[build-system] requires = ["flit_core >=3.9,<4"] build-backend = "flit_core.buildapi" [project] name = "multipart" requires-python = ">=3.8" dynamic = ["version"] license = {text = "MIT License"} description = "Parser for multipart/form-data" readme = "README.rst" authors = [ {name = "Marcel Hellkamp", email = "marc@gsites.de"}, ] classifiers = [ "Development Status :: 6 - Mature", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Topic :: Internet :: WWW/HTTP", "Topic :: Internet :: WWW/HTTP :: Dynamic Content", "Topic :: Internet :: WWW/HTTP :: Dynamic Content :: CGI Tools/Libraries", "Topic :: Internet :: WWW/HTTP :: WSGI", "Programming Language :: Python :: 3", ] [project.urls] PyPI = "https://pypi.org/project/multipart/" Homepage = "https://multipart.readthedocs.io/" Documentation = 
"https://multipart.readthedocs.io/" Changelog = "https://multipart.readthedocs.io/en/latest/changelog.html" Source = "https://github.com/defnull/multipart" Issues = "https://github.com/defnull/multipart/issues" [project.optional-dependencies] dev = [ "pytest", "pytest-cov", "build", "twine", ] docs = [ "sphinx>=8,<9", "sphinx-autobuild", ] [tool.flit.sdist] include = [ "test/*.py", "README.rst", "MAINTAINERS.rst", "CHANGELOG.rst", "LICENSE" ] [tool.pytest.ini_options] addopts = "-ra" testpaths = [ "test" ] ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1727974302.7122154 multipart-1.2.1/test/__init__.py0000644000000000000000000000000014677545637013570 0ustar00././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1727974302.7122154 multipart-1.2.1/test/__main__.py0000644000000000000000000000042514677545637013564 0ustar00import unittest import pathlib import sys if __name__ == '__main__': suite = unittest.defaultTestLoader.discover(pathlib.Path(__file__).parent.parent) result = unittest.TextTestRunner(verbosity=0).run(suite) sys.exit((result.errors or result.failures) and 1 or 0) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1731774273.4053965 multipart-1.2.1/test/test_header_utils.py0000644000000000000000000000421014716143501015521 0ustar00# -*- coding: utf-8 -*- import functools import unittest import multipart class TestHeaderParser(unittest.TestCase): def test_token_unquote(self): unquote = multipart.header_unquote self.assertEqual('foo', unquote('"foo"')) self.assertEqual('foo"bar', unquote('"foo\\"bar"')) self.assertEqual('ie.exe', unquote('"\\\\network\\ie.exe"', True)) self.assertEqual('ie.exe', unquote('"c:\\wondows\\ie.exe"', True)) unquote = multipart.content_disposition_unquote self.assertEqual('foo', unquote('"foo"')) self.assertEqual('foo"bar', unquote('foo%22bar')) self.assertEqual('foo"bar', unquote('"foo%22bar"')) self.assertEqual('foo"bar', 
unquote('"foo\\"bar"')) self.assertEqual('ie.exe', unquote('"\\\\network\\ie.exe"', True)) self.assertEqual('ie.exe', unquote('"c:\\wondows\\ie.exe"', True)) def test_token_quote(self): quote = multipart.header_quote self.assertEqual(quote('foo'), 'foo') self.assertEqual(quote('foo"bar'), '"foo\\"bar"') quote = multipart.content_disposition_quote self.assertEqual(quote('foo'), '"foo"') self.assertEqual(quote('foo"bar'), '"foo%22bar"') def test_options_parser(self): parse = multipart.parse_options_header head = 'form-data; name="Test"; ' self.assertEqual(parse(head+'filename="Test.txt"')[0], 'form-data') self.assertEqual(parse(head+'filename="Test.txt"')[1]['name'], 'Test') self.assertEqual(parse(head+'filename="Test.txt"')[1]['filename'], 'Test.txt') self.assertEqual(parse(head+'FileName="Te\\"s\\\\t.txt"')[1]['filename'], 'Te"s\\t.txt') self.assertEqual(parse(head+'filename="C:\\test\\bla.txt"')[1]['filename'], 'bla.txt') self.assertEqual(parse(head+'filename="\\\\test\\bla.txt"')[1]['filename'], 'bla.txt') self.assertEqual(parse(head+'filename="täst.txt"')[1]['filename'], 'täst.txt') parse = functools.partial(multipart.parse_options_header, unquote=multipart.content_disposition_unquote) self.assertEqual(parse(head+'FileName="Te%22s\\\\t.txt"')[1]['filename'], 'Te"s\\\\t.txt') ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1731773105.9496706 multipart-1.2.1/test/test_legacy_parser.py0000644000000000000000000002051514716141262015701 0ustar00# -*- coding: utf-8 -*- from .utils import BaseParserTest import unittest import base64 import os.path, tempfile from io import BytesIO import multipart as multipart from multipart import to_bytes #TODO: bufsize=10, line=1234567890--boundary\n #TODO: bufsize < len(boundary) (should not be possible) #TODO: bufsize = len(boundary)+5 (edge case) #TODO: At least one test per possible exception (100% coverage) class TestMultipartParser(BaseParserTest): def test_copyfile(self): source = 
BytesIO(to_bytes('abc')) target = BytesIO() self.assertEqual(multipart.copy_file(source, target), 3) target.seek(0) self.assertEqual(target.read(), to_bytes('abc')) def test_big_file(self): ''' If the size of an uploaded part exceeds memfile_limit, it is written to disk. ''' test_file = 'abc'*1024 parser = self.parser( '--foo\r\n', 'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', test_file, '\r\n--foo\r\n', 'Content-Disposition: form-data; name="file2"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', test_file + 'a', '\r\n--foo\r\n', 'Content-Disposition: form-data; name="file3"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', test_file*2, '\r\n--foo--', memfile_limit=len(test_file)) self.assertEqual(parser.get('file1').file.read(), to_bytes(test_file)) self.assertTrue(parser.get('file1').is_buffered()) self.assertEqual(parser.get('file2').file.read(), to_bytes(test_file + 'a')) self.assertFalse(parser.get('file2').is_buffered()) self.assertEqual(parser.get('file3').file.read(), to_bytes(test_file*2)) self.assertFalse(parser.get('file3').is_buffered()) def test_get_all(self): ''' Test the get() and get_all() methods. ''' p = self.parser('--foo\r\n', 'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', 'abc'*1024, '\r\n--foo\r\n', 'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', 'def'*1024, '\r\n--foo--') self.assertEqual(p.get('file1').file.read(), to_bytes('abc'*1024)) self.assertEqual(p.get('file2'), None) self.assertEqual(len(p.get_all('file1')), 2) self.assertEqual(p.get_all('file1')[1].file.read(), to_bytes('def'*1024)) self.assertEqual(p.get_all('file1'), p.parts()) def test_file_seek(self): ''' The file object should be readable without a seek(0).
''' test_file = 'abc'*1024 p = self.parser( '--foo\r\n', 'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', test_file, '\r\n--foo--') self.assertEqual(p.get('file1').file.read(), to_bytes(test_file)) self.assertEqual(p.get('file1').value, test_file) def test_unicode_value(self): ''' The .value property always returns unicode ''' test_file = 'abc'*1024 p = self.parser('--foo\r\n', 'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', test_file, '\r\n--foo--') self.assertEqual(p.get('file1').file.read(), to_bytes(test_file)) self.assertEqual(p.get('file1').value, test_file) self.assertTrue(hasattr(p.get('file1').value, 'encode')) def test_save_as(self): ''' save_as stores data in a file keeping the file position. ''' def tmp_file_name(): # create a temporary file name (on Python 2.6+ NamedTemporaryFile # with delete=False could be used) fd, fname = tempfile.mkstemp() f = os.fdopen(fd) f.close() return fname test_file = 'abc'*1024 p = self.parser('--foo\r\n', 'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', test_file, '\r\n--foo--') self.assertEqual(p.get('file1').file.read(1024), to_bytes(test_file)[:1024]) tfn = tmp_file_name() p.get('file1').save_as(tfn) tf = open(tfn, 'rb') self.assertEqual(tf.read(), to_bytes(test_file)) tf.close() self.assertEqual(p.get('file1').file.read(), to_bytes(test_file)[1024:]) def test_part_header(self): ''' HTTP allows headers to be multiline. 
''' p = self.parser('--foo\r\n', 'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', "xxx", '\r\n--foo--') part = p.get("file1") self.assertEqual(part.file.read(), b"xxx") self.assertEqual(part.size, 3) self.assertEqual(part.name, "file1") self.assertEqual(part.filename, "random.png") self.assertEqual(part.charset, "utf8") self.assertEqual(part.headerlist, [ ('Content-Disposition','form-data; name="file1"; filename="random.png"'), ('Content-Type','image/png') ]) self.assertEqual(part.headers["CoNtEnT-TyPe"], "image/png") self.assertEqual(part.disposition, 'form-data; name="file1"; filename="random.png"') self.assertEqual(part.content_type, "image/png") def test_multiline_header(self): ''' HTTP allows headers to be multiline. ''' test_file = to_bytes('abc'*1024) test_text = u'Test text\n with\r\n ümläuts!' p = self.parser('--foo\r\n', 'Content-Disposition: form-data;\r\n', '\tname="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', test_file, '\r\n--foo\r\n', 'Content-Disposition: form-data;\r\n', ' name="text"\r\n', '\r\n', test_text, '\r\n--foo--') self.assertEqual(p.get('file1').file.read(), test_file) self.assertEqual(p.get('file1').filename, 'random.png') self.assertEqual(p.get('text').value, test_text) def test_disk_limit(self): with self.assertRaises(multipart.MultipartError): self.write_field("file1", 'x'*1025, filename="foo.bin") self.write_end() self.parser(spool_limit=10, disk_limit=1024) def test_spool_limit(self): self.write_field("file1", 'x'*1024, filename="foo.bin") self.write_field("file2", 'x'*1025, filename="foo.bin") self.write_end() p = self.parser(spool_limit=1024) self.assertTrue(p.get("file1").is_buffered()) self.assertFalse(p.get("file2").is_buffered()) def test_spool_limit_nocheck_write_func(self): self.write_field("file1", 'x'*10240, filename="foo.bin") self.write_end() p = self.parser(spool_limit=1024, buffer_size=1024) # A large upload should trigger 
the fast _write_nocheck path self.assertEqual(p.get("file1")._write, p.get("file1")._write_nocheck) def test_memory_limit(self): self.write_field("file1", 'x'*1024, filename="foo.bin") self.write_end() p = self.parser(memory_limit=1024) self.assertTrue(p.get("file1").is_buffered()) self.reset() self.write_field("file1", 'x'*1024, filename="foo.bin") self.write_field("file2", 'x', filename="foo.bin") self.write_end() with self.assertMultipartError("Memory limit reached"): p = self.parser(memory_limit=1024) def test_content_length(self): self.write_field("file1", 'x'*1024, filename="foo.bin") self.write_end() clen = len(self.get_buffer_copy().getvalue()) # Correct content length list(self.parser(content_length=clen)) # Short content length with self.assertMultipartError("Unexpected end of multipart stream"): list(self.parser(content_length=clen-1)) # Large content length (we don't care) list(self.parser(content_length=clen+1)) def test_segment_close_twice(self): self.write_field("file1", 'x'*1024, filename="foo.bin") self.write_end() # Correct content length file1 = self.parser().get("file1") self.assertFalse(file1.file.closed) file1.close() self.assertFalse(file1.file) file1.close() # Do nothing ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1727974302.7122154 multipart-1.2.1/test/test_multdict.py0000644000000000000000000000330414677545637014727 0ustar00# -*- coding: utf-8 -*- import unittest import multipart as multipart class TestMultiDict(unittest.TestCase): def test_init(self): md = multipart.MultiDict([("a", "1")], {"a": "2"}, a="3") self.assertEqual(md.dict, {"a": ["1", "2", "3"]}) def test_append(self): md = multipart.MultiDict() md["a"] = "1" md["a"] = "2" md.append("a", "3") md.update(a="4") self.assertEqual(md.dict, {"a": ["1", "2", "3", "4"]}) def test_behaves_like_dict(self): md = multipart.MultiDict([("a", "1"), ("a", "2")]) self.assertTrue("a" in md) self.assertFalse("b" in md) self.assertTrue("a" in md.keys()) 
self.assertEqual(list(md), ["a"]) del md["a"] self.assertTrue("a" not in md) def test_access_last(self): md = multipart.MultiDict([("a", "1"), ("a", "2")]) self.assertEqual(md["a"], "2") self.assertEqual(md.get("a"), "2") self.assertEqual(md.get("b"), None) def test_replace(self): md = multipart.MultiDict([("a", "1"), ("a", "2")]) md.replace("a", "3") self.assertEqual(md.dict, {"a": ["3"]}) def test_str_repr(self): md = multipart.MultiDict([("a", "1"), ("a", "2")]) self.assertEqual(str(md), str(md.dict)) self.assertEqual(repr(md), repr(md.dict)) def test_access_index(self): md = multipart.MultiDict([("a", "1"), ("a", "2")]) self.assertEqual(md.get("a", index=0), "1") def test_access_all(self): md = multipart.MultiDict([("a", "1"), ("a", "2")]) self.assertEqual(md.getall("a"), ["1", "2"]) self.assertEqual(list(md.iterallitems()), [("a", "1"), ("a", "2")]) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1731774273.4053965 multipart-1.2.1/test/test_push_parser.py0000644000000000000000000013464214716143501015421 0ustar00# -*- coding: utf-8 -*- """ Tests for the PushMultipartParser all other parsers are based on. 
""" from contextlib import contextmanager import unittest from base64 import b64decode import multipart def assertStrict(text): def decorator(func): def wrapper(self): func(self, strict=False) with self.assertRaisesRegex(multipart.MultipartError, text): func(self, strict=True) return wrapper return decorator class PushTestBase(unittest.TestCase): def setUp(self): self.parser = None self.reset() self.events = [] @contextmanager def assertParseError(self, errortext): with self.assertRaises(multipart.MultipartError) as r: yield fullmsg = " ".join(map(str, r.exception.args)) self.assertTrue(errortext in fullmsg, f"{errortext!r} not in {fullmsg!r}") def reset(self, **ka): ka.setdefault("boundary", "boundary") self.parser = multipart.PushMultipartParser(**ka) self.events = [] return self def parse(self, *chunks): events = [] for chunk in chunks: events += list(self.parser.parse(multipart.to_bytes(chunk))) self.events += events return events def compact_events(self): current = None data = [] for event in self.events: if isinstance(event, multipart.MultipartSegment): current = event elif event: data.append(event) else: yield current, b''.join(data) current = None data = [] if current: yield current, b''.join(data) def get_segment(self, index_or_name): allnames = [] for i, (segment, body) in enumerate(self.compact_events()): allnames.append(segment.name) if index_or_name == i or index_or_name == segment.name: return segment, body self.fail(f"Segment {index_or_name!r} not found in {allnames!r}") class TestPushParser(PushTestBase): def test_data_after_terminator(self): self.parse(b"--boundary--") self.parse(b"junk") # Fine self.reset(strict=True) self.parse(b"--boundary--") with self.assertRaises(multipart.MultipartError): self.parse(b"junk") def test_eof_before_clen(self): self.reset(content_length=100) self.parse(b"--boundary") with self.assertParseError("Unexpected end of multipart stream (parser closed)"): self.parse(b"") def test_data_after_eof(self): 
self.parse(b"--boundary--") assert self.parser._state == multipart._COMPLETE assert not self.parser.closed self.parse(b"") assert self.parser.closed with self.assertParseError("Parser closed"): self.parse(b"junk") def test_eof_before_terminator(self): self.parse(b"--boundary") with self.assertParseError("Unexpected end of multipart stream"): self.parse(b"") def test_data_after_clen(self): self.reset(content_length=12) with self.assertParseError("Content-Length limit exceeded"): self.parse(b"--boundary\r\njunk") def test_clen_match(self): self.reset(content_length=12) self.parse(b"--boundary--") assert self.parser._state is multipart._COMPLETE def test_junk_before(self): with self.assertParseError("Unexpected byte in front of first boundary"): self.parse(b"junk--boundary--") @assertStrict("Unexpected data after end of multipart stream") def test_junk_after(self, strict): self.reset(strict=strict) self.parse(b"--boundary--") self.parse(b"junk") def test_close_before_end(self): self.parse(b"--boundary") with self.assertParseError("Unexpected end of multipart stream"): self.parser.close() def test_autoclose(self): with self.parser: self.parse(b"--boundary--") self.reset() with self.assertParseError("Unexpected end of multipart stream (parser closed)"): with self.parser: self.parse(b"--boundary") def test_invalid_NL_delimiter(self): with self.assertParseError("Invalid line break after first boundary"): self.parse(b"--boundary\nfoo") def test_invalid_NL_header(self): with self.assertParseError("Invalid line break in segment header"): self.parse(b"--boundary\r\nfoo:bar\nbar:baz") def test_header_size_limit(self): self.reset(max_header_size=1024) self.parse(b"--boundary\r\n") with self.assertParseError("Maximum segment header length exceeded"): self.parse(b"Header: " + b"x" * (1024)) self.reset(max_header_size=1024, strict=True) self.parse(b"--boundary\r\n") with self.assertRaisesRegex( multipart.MultipartError, "Maximum segment header length exceeded" ): 
self.parse(b"Header: " + b"x" * (1024) + b"\r\n") def test_header_count_limit(self): self.reset(max_header_count=10) self.parse(b"--boundary\r\n") for i in range(10): self.parse(b"Header: value\r\n") with self.assertParseError("Maximum segment header count exceeded"): self.parse(b"Header: value\r\n") @assertStrict("Unexpected segment header continuation") def test_header_continuation(self, strict): self.reset(strict=strict) self.parse(b"--boundary\r\n") self.parse(b"Content-Disposition: form-data;\r\n") self.parse(b'\tname="foo"\r\n') parts = self.parse(b"\r\ndata\r\n--boundary--") self.assertEqual( [("Content-Disposition", 'form-data; name="foo"')], parts[0].headerlist ) self.assertEqual(b"data", parts[1]) def test_header_continuation_first(self): self.parse(b"--boundary\r\n") with self.assertParseError("Unexpected segment header continuation"): self.parse(b"\tbad: header\r\n\r\ndata\r\n--boundary--") def test_header_continuation_long(self): self.reset(max_header_size=1024) self.parse(b"--boundary\r\n") self.parse(b"Header: " + b"v" * 1000 + b"\r\n") with self.assertParseError("Maximum segment header length exceeded"): self.parse(b"\tmoooooooooooooooooooooooooore value\r\n") def test_header_bad_name(self): self.reset() with self.assertParseError("Malformed segment header"): self.parse(b"--boundary\r\nno-colon\r\n\r\ndata\r\n--boundary--") self.reset() with self.assertParseError("Malformed segment header"): self.parse(b"--boundary\r\n:empty-name\r\n\r\ndata\r\n--boundary--") for badchar in (b" ", b"\0", b"\r", b"\n", "ö".encode("utf8")): self.reset() with self.assertParseError("Invalid segment header name"): self.parse( b"--boundary\r\ninvalid%sname:value\r\n\r\ndata\r\n--boundary--" % badchar ) self.reset() with self.assertParseError("Segment header failed to decode"): self.parse( b"--boundary\r\ninvalid\xc3\x28:value\r\n\r\ndata\r\n--boundary--" ) def test_header_wrong_segment_subtype(self): with self.assertParseError("Invalid Content-Disposition segment header: 
Wrong type"): self.parse( b"--boundary\r\nContent-Disposition: mixed\r\n\r\ndata\r\n--boundary--" ) def test_segment_empty_name(self): self.parse(b"--boundary\r\n") parts = self.parse(b"Content-Disposition: form-data; name\r\n\r\n") self.assertEqual(parts[0].name, "") self.parse(b"\r\n--boundary\r\n") parts = self.parse(b"Content-Disposition: form-data; name=\r\n\r\n") self.assertEqual(parts[0].name, "") self.parse(b"\r\n--boundary\r\n") parts = self.parse(b'Content-Disposition: form-data; name=""\r\n\r\n') self.assertEqual(parts[0].name, "") @assertStrict("Invalid Content-Disposition segment header: Missing name option") def test_segment_missing_name(self, strict): self.reset(strict=strict) self.parse(b"--boundary\r\n") parts = self.parse(b"Content-Disposition: form-data;\r\n\r\n") print(parts) self.assertEqual(parts[0].name, "") def test_segment_count_limit(self): self.reset(max_segment_count=1) self.parse(b"--boundary\r\n") self.parse(b"Content-Disposition: form-data; name=foo\r\n") self.parse(b"\r\n") with self.assertParseError("Maximum segment count exceeded"): self.parse(b"\r\n--boundary\r\n") def test_segment_size_limit(self): self.reset(max_segment_size=5) self.parse(b"--boundary\r\n") self.parse(b"Content-Disposition: form-data; name=foo\r\n") self.parse(b"\r\n") with self.assertParseError("Maximum segment size exceeded"): self.parse(b"123456") self.parse(b"\r\n--boundary\r\n") def test_partial_parts(self): self.reset() self.assertEqual([], self.parse(b"--boundary\r\n")) self.assertEqual( [], self.parse(b'Content-Disposition: form-data; name="foo"\r\n') ) part = self.parse(b"\r\n")[0] self.assertEqual( [("Content-Disposition", 'form-data; name="foo"')], part.headerlist ) # Write enough body data to trigger a new part part = self.parse(b"body" * 10)[0] # Write partial boundary, should stay incomplete part = self.parse(b"more\r\n--boundary")[0] # Turn the incomplete boundary into a terminator parts = self.parse(b"--") self.assertIsNone(parts[-1]) def 
test_segment_clen(self): self.parse(b"--boundary\r\n") self.parse(b"Content-Disposition: form-data; name=foo\r\n") self.parse(b"Content-Length: 10\r\n") self.parse(b"\r\n") self.parse(b"x" * 10) self.parse(b"\r\n--boundary--") def test_segment_clen_exceeded(self): self.parse(b"--boundary\r\n") self.parse(b"Content-Disposition: form-data; name=foo\r\n") self.parse(b"Content-Length: 10\r\n") self.parse(b"\r\n") with self.assertParseError("Segment Content-Length exceeded"): self.parse(b"x" * 11) self.parse(b"\r\n--boundary--") def test_segment_clen_not_reached(self): self.parse(b"--boundary\r\n") self.parse(b"Content-Disposition: form-data; name=foo\r\n") self.parse(b"Content-Length: 10\r\n") self.parse(b"\r\n") with self.assertParseError("Segment size does not match Content-Length header"): self.parse(b"x" * 9) self.parse(b"\r\n--boundary--") def test_segment_handle_access(self): self.parse(b"--boundary\r\n") self.parse(b"Content-Disposition: form-data; name=foo; filename=bar.txt\r\n") self.parse(b"Content-Type: text/x-foo; charset=ascii\r\n") part = self.parse(b"\r\n")[0] self.assertEqual(part.header("Content-Type"), "text/x-foo; charset=ascii") self.assertEqual(part.header("CONTENT-Type"), "text/x-foo; charset=ascii") self.assertEqual(part["Content-Type"], "text/x-foo; charset=ascii") self.assertEqual(part["CONTENT-Type"], "text/x-foo; charset=ascii") self.assertEqual(part.name, "foo") self.assertEqual(part.filename, "bar.txt") self.assertEqual(part.header("Missing"), None) self.assertEqual(part.header("Missing", 5), 5) with self.assertRaises(KeyError): part["Missing"] def test_part_ends_after_header(self): with self.assertRaises(multipart.MultipartError), self.parser: self.parse('--boundary\r\n', 'Header: value\r\n', '\r\n--boundary--') def test_part_ends_in_header(self): with self.assertRaises(multipart.MultipartError), self.parser: self.parse('--boundary\r\n', 'Header: value', '\r\n--boundary--') def test_no_terminator(self): with 
self.assertRaises(multipart.MultipartError), self.parser: self.parse('--boundary\r\n', 'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', 'abc') def test_no_newline_after_content(self): with self.assertRaises(multipart.MultipartError), self.parser: self.parse('--boundary\r\n', 'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', 'abc', '--boundary--') def test_no_newline_after_middle_content(self): with self.parser: self.parse( '--boundary\r\n', 'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', 'abc', '--boundary\r\n' 'Content-Disposition: form-data; name="file2"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', 'abc\r\n', '--boundary--') segment, body = self.get_segment("file1") self.assertTrue(body.startswith(b"abc--boundary\r\n")) self.assertTrue(body.endswith(b"abc")) @assertStrict("Boundary not found in first chunk") def test_ignore_junk_before_start_boundary(self, strict): self.reset(strict=strict) self.parse('Lots of junk lots of junk', '\r\n--boundary\r\n' 'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', 'abc\r\n', '--boundary--') self.parser.close() def test_reject_boundary_in_preamble(self): """ The RFC defines that a boundary must not appear in segment bodies, but technically it is still allowed to appear in the preamble as long as it does not qualify as a full start delimiter (position zero, or separated from the preamble by CRLF). This is absurd: preambles are useless to begin with and the boundary appearing in the preamble is never intentional. Instead of silently skipping it (and the first segment), we assume a broken client and fail fast, even in non-strict mode. A clear error is better than silently losing data.
""" with self.assertParseError("Unexpected byte in front of first boundary"): self.parse( 'Preamble\n', '--boundary\r\n' 'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', 'abc\r\n', '--boundary--') self.reset() with self.assertParseError("Unexpected byte in front of first boundary"): self.parse('\n--boundary--') self.reset() with self.assertParseError("Unexpected byte after first boundary"): self.parse( '--boundaryy\r\n''--boundary\r\n' 'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', 'abc\r\n', '--boundary--') def test_accept_crln_before_start_boundary(self): """ While uncommon, a single \\r\\n before and after the first and last boundary should be accepted even in strict mode. """ self.reset(strict=True) self.parse('\r\n--boundary\r\n' 'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', 'abc\r\n', '--boundary--\r\n') def test_allow_junk_after_end_boundary(self): self.parse('--boundary--\r\njunk') self.reset() self.parse('--boundary\r\n' 'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', 'abc\r\n', '--boundary--\r\n', 'junk') def test_partial_start_boundary(self): self.parse('--boun', 'dary\r\n', 'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', 'abc\r\n', '--boundary--\r\n', 'junk') def test_tiny_chunks(self): payload = list(''.join(['--boundary\r\n', 'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n', 'Content-Type: image/png\r\n', '\r\n', 'abc\r\n', '--boundary--\r\n'])) for char in payload: self.parse(char) self.assertEqual(self.get_segment("file1")[1], b"abc") def test_no_boundary(self): with self.assertParseError("Unexpected end of multipart stream (parser closed)"): self.parse('Not a multipart message') self.parser.close() 
        # Strict mode should fail quicker
        self.reset(strict=True)
        with self.assertParseError("Boundary not found in first chunk"):
            self.parse('Not a multipart message')

    def test_no_start_boundary(self):
        with self.assertRaises(multipart.MultipartError), self.parser:
            self.parse('--bar\r\n','--nonsense\r\n'
                'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n',
                'Content-Type: image/png\r\n', '\r\n', 'abc\r\n', '--nonsense--')

    def test_no_end_boundary(self):
        with self.assertRaises(multipart.MultipartError):
            self.parse('--boundary\r\n',
                'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n',
                'Content-Type: image/png\r\n', '\r\n', 'abc\r\n')
            self.parser.close()

    def test_empty_part(self):
        self.parse('--boundary\r\n', '--boundary--')
        with self.assertRaises(multipart.MultipartError):
            self.parser.close()

    def test_invalid_header(self):
        with self.assertRaises(multipart.MultipartError):
            self.parse('--boundary\r\n',
                'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n',
                'Content-Type: image/png\r\n', 'Bad header\r\n', '\r\n', 'abc'*1024+'\r\n', '--boundary--')

    def test_content_length_to_small(self):
        with self.assertRaises(multipart.MultipartError):
            self.parse('--boundary\r\n',
                'Content-Disposition: form-data; name="file1"; filename="random.png"\r\n',
                'Content-Type: image/png\r\n', 'Content-Length: 111\r\n', '\r\n', 'abc'*1024, '\r\n--boundary--')

    def test_no_disposition_header(self):
        with self.assertRaises(multipart.MultipartError):
            self.parse('--boundary\r\n', 'Content-Type: image/png\r\n', '\r\n', 'abc'*1024+'\r\n', '--boundary--')

    def test_error_property(self):
        with self.assertRaises(multipart.MultipartError):
            self.parse('--boundary\r\njunk\r\n--boundary--')
        self.assertIsInstance(self.parser.error, multipart.ParserError)

    def test_error_twice(self):
        with self.assertRaises(multipart.ParserError):
            self.parse('--boundary\r\njunk\r\n--boundary--')
        first_error = self.parser.error
        self.assertIsInstance(first_error,
multipart.ParserError) # The first error should stick with self.assertRaises(multipart.ParserStateError): self.parse('more junk') self.assertIs(self.parser.error, first_error) with self.assertRaises(multipart.ParserError): self.parser.close() self.assertIs(self.parser.error, first_error) ''' The files used by the following test were taken from the werkzeug library test suite and are therefore partly copyrighted by the Werkzeug Team under BSD licence. See https://werkzeug.palletsprojects.com/ ''' browser_test_cases = {} browser_test_cases['firefox3-2png1txt'] = {'data': b64decode(b''' LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0xODY0NTQ2NTE3MTM1MTkzNDE5NTE1ODEwMzAx MDUNCkNvbnRlbnQtRGlzcG9zaXRpb246IGZvcm0tZGF0YTsgbmFtZT0iZmlsZTEiOyBmaWxlbmFt ZT0iYW5jaG9yLnBuZyINCkNvbnRlbnQtVHlwZTogaW1hZ2UvcG5nDQoNColQTkcNChoKAAAADUlI RFIAAAAQAAAAEAgGAAAAH/P/YQAAAARnQU1BAACvyDcFiukAAAAZdEVYdFNvZnR3YXJlAEFkb2Jl IEltYWdlUmVhZHlxyWU8AAABnUlEQVQ4y6VTMWvCQBS+qwEFB10KGaS1P6FDpw7SrVvzAwRRx04V Ck4K6iAoDhLXdhFcW9qhZCk4FQoW0gp2U4lQRDAUS4hJmn5Xgg2lsQ198PHu3b3vu5d3L9S2bfIf 47wOer1ewzTNtGEYBP48kUjkfsrb8BIAMb1cLovwRfi07wrYzcCr4/1/Am4FzzhzBGZeefR7E7vd 7j0Iu4wYjUYDBMfD0dBiMUQfstns3toKkHgF6EgmqqruW6bFiHcsxr70awVu63Q6NiOmUinquwfM dF1f28CVgCRJx0jMAQ1BEFquRn7CbYVCYZVbr9dbnJMohoIh9kViu90WEW9nMpmxu4JyubyF/VEs FiNcgCPyoyxiu7XhCPBzdU4s652VnUccbDabPLyN2C6VSmwdhFgel5DB84AJb64mEUlvmqadTKcv 40gkUkUsg1DjeZ7iRsrWgByP71T7/afxYrHIYry/eoBD9mxsaK4VRamFw2EBQknMAWGvRClNTpQJ AfkCxFNgBmiez1ipVA4hdgQcOD/TLfylKIo3vubgL/YBnIw+ioOMLtwAAAAASUVORK5CYIINCi0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tMTg2NDU0NjUxNzEzNTE5MzQxOTUxNTgxMDMwMTA1 DQpDb250ZW50LURpc3Bvc2l0aW9uOiBmb3JtLWRhdGE7IG5hbWU9ImZpbGUyIjsgZmlsZW5hbWU9 ImFwcGxpY2F0aW9uX2VkaXQucG5nIg0KQ29udGVudC1UeXBlOiBpbWFnZS9wbmcNCg0KiVBORw0K GgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAABGdBTUEAAK/INwWK6QAAABl0RVh0U29mdHdh cmUAQWRvYmUgSW1hZ2VSZWFkeXHJZTwAAAJRSURBVBgZpcHda81xHMDx9+d3fudYzuYw2RaZ5yTW olEiuZpCSjGJFEktUUr8A6ZxQZGHmDtqdrGUXHgoeZqSp1F2bLFWjtkOB8PZzvmd7+djv5XaBRfL 
6yVmxv+QjQeu7l25uuZYJmtxM0AVU8Wpw9RQU8w51AxzDqfKhFjwq6Mjdbj1RN0Zv2ZFzaloUdwr L2Is4r+y7hRwxs8G5mUzPxmrwcA8hvnmjIZtcxmr3Y09hHwzJZQvOAwwNZyCYqgaThVXMFzBCD7f Jfv8MpHiKvaV3ePV2f07fMwIiSeIGeYJJoao4HmCiIeIQzPXifY+paJqO4lZi/nWPZ/krabjvlNH yANMBAQiBiqgakQMCunbxHJviM9bQeZdBzHJUzKhguLJlQnf1BghAmZ4gImAgAjk++8jP56QmL2G XG8zsfFCz8skA1mQXKbaU3X8ISIgQsgDcun7FL7cJjFnLUMfLyLRr0SLS4hbhiup5Szd19rpFYKA ESKICCERoS95neyHmyTmbmAodQ4vGpAfmEn6YTtTahv4ODiRkGdOCUUAAUSE/uQNfqTaKFu4jvyn JiIxIzcwg/SjF1RsOk9R+QJMlZCvqvwhQFdbM4XvrynIVHpfn2ZSWYyhzHS+PUtSueUC0cQ0QmpG yE9197TUnwzq1DnUKbXSxOb6S7xtPkjngzbGVVbzvS/FjaGt9DU8xlRRJdTCMDEzRjuyZ1FwaFe9 j+d4eecaPd1dPxNTSlfWHm1v5y/EzBitblXp4JLZ5f6yBbOwaK5tsD+9c33jq/f8w2+mRSjOllPh kAAAAABJRU5ErkJggg0KLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0xODY0NTQ2NTE3MTM1 MTkzNDE5NTE1ODEwMzAxMDUNCkNvbnRlbnQtRGlzcG9zaXRpb246IGZvcm0tZGF0YTsgbmFtZT0i dGV4dCINCg0KZXhhbXBsZSB0ZXh0DQotLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLTE4NjQ1 NDY1MTcxMzUxOTM0MTk1MTU4MTAzMDEwNS0tDQo='''), 'boundary':'---------------------------186454651713519341951581030105', 'files': {'file1': (u'anchor.png', 'image/png', b64decode(b''' iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAABGdBTUEAAK/INwWK6QAAABl0RVh0 U29mdHdhcmUAQWRvYmUgSW1hZ2VSZWFkeXHJZTwAAAGdSURBVDjLpVMxa8JAFL6rAQUHXQoZpLU/ oUOnDtKtW/MDBFHHThUKTgrqICgOEtd2EVxb2qFkKTgVChbSCnZTiVBEMBRLiEmafleCDaWxDX3w 8e7dve+7l3cv1LZt8h/jvA56vV7DNM20YRgE/jyRSOR+ytvwEgAxvVwui/BF+LTvCtjNwKvj/X8C bgXPOHMEZl559HsTu93uPQi7jBiNRgMEx8PR0GIxRB+y2eze2gqQeAXoSCaqqu5bpsWIdyzGvvRr BW7rdDo2I6ZSKeq7B8x0XV/bwJWAJEnHSMwBDUEQWq5GfsJthUJhlVuv11uckyiGgiH2RWK73RYR b2cymbG7gnK5vIX9USwWI1yAI/KjLGK7teEI8HN1TizrnZWdRxxsNps8vI3YLpVKbB2EWB6XkMHz gAlvriYRSW+app1Mpy/jSCRSRSyDUON5nuJGytaAHI/vVPv9p/FischivL96gEP2bGxorhVFqYXD YQFCScwBYa9EKU1OlAkB+QLEU2AGaJ7PWKlUDiF2BBw4P9Mt/KUoije+5uAv9gGcjD6Kg4wu3AAA AABJRU5ErkJggg==''')), 'file2': (u'application_edit.png', 'image/png', b64decode(b''' iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAABGdBTUEAAK/INwWK6QAAABl0RVh0 
U29mdHdhcmUAQWRvYmUgSW1hZ2VSZWFkeXHJZTwAAAJRSURBVBgZpcHda81xHMDx9+d3fudYzuYw 2RaZ5yTWolEiuZpCSjGJFEktUUr8A6ZxQZGHmDtqdrGUXHgoeZqSp1F2bLFWjtkOB8PZzvmd7+dj v5XaBRfL6yVmxv+QjQeu7l25uuZYJmtxM0AVU8Wpw9RQU8w51AxzDqfKhFjwq6Mjdbj1RN0Zv2ZF zaloUdwrL2Is4r+y7hRwxs8G5mUzPxmrwcA8hvnmjIZtcxmr3Y09hHwzJZQvOAwwNZyCYqgaThVX MFzBCD7fJfv8MpHiKvaV3ePV2f07fMwIiSeIGeYJJoao4HmCiIeIQzPXifY+paJqO4lZi/nWPZ/k rabjvlNHyANMBAQiBiqgakQMCunbxHJviM9bQeZdBzHJUzKhguLJlQnf1BghAmZ4gImAgAjk++8j P56QmL2GXG8zsfFCz8skA1mQXKbaU3X8ISIgQsgDcun7FL7cJjFnLUMfLyLRr0SLS4hbhiup5Szd 19rpFYKAESKICCERoS95neyHmyTmbmAodQ4vGpAfmEn6YTtTahv4ODiRkGdOCUUAAUSE/uQNfqTa KFu4jvynJiIxIzcwg/SjF1RsOk9R+QJMlZCvqvwhQFdbM4XvrynIVHpfn2ZSWYyhzHS+PUtSueUC 0cQ0QmpGyE9197TUnwzq1DnUKbXSxOb6S7xtPkjngzbGVVbzvS/FjaGt9DU8xlRRJdTCMDEzRjuy Z1FwaFe9j+d4eecaPd1dPxNTSlfWHm1v5y/EzBitblXp4JLZ5f6yBbOwaK5tsD+9c33jq/f8w2+m RSjOllPhkAAAAABJRU5ErkJggg=='''))}, 'forms': {'text': u'example text'}} browser_test_cases['firefox3-2pnglongtext'] = {'data': b64decode(b''' LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0xNDkwNDA0NDczOTc4NzE5MTAzMTc1NDcxMTc0 OA0KQ29udGVudC1EaXNwb3NpdGlvbjogZm9ybS1kYXRhOyBuYW1lPSJmaWxlMSI7IGZpbGVuYW1l PSJhY2NlcHQucG5nIg0KQ29udGVudC1UeXBlOiBpbWFnZS9wbmcNCg0KiVBORw0KGgoAAAANSUhE UgAAABAAAAAQCAYAAAAf8/9hAAAABGdBTUEAAK/INwWK6QAAABl0RVh0U29mdHdhcmUAQWRvYmUg SW1hZ2VSZWFkeXHJZTwAAAKfSURBVDjLpZPrS1NhHMf9O3bOdmwDCWREIYKEUHsVJBI7mg3FvCxL 09290jZj2EyLMnJexkgpLbPUanNOberU5taUMnHZUULMvelCtWF0sW/n7MVMEiN64AsPD8/n83uu cQDi/id/DBT4Dolypw/qsz0pTMbj/WHpiDgsdSUyUmeiPt2+V7SrIM+bSss8ySGdR4abQQv6lrui 6VxsRonrGCS9VEjSQ9E7CtiqdOZ4UuTqnBHO1X7YXl6Daa4yGq7vWO1D40wVDtj4kWQbn94myPGk CDPdSesczE2sCZShwl8CzcwZ6NiUs6n2nYX99T1cnKqA2EKui6+TwphA5k4yqMayopU5mANV3lNQ TBdCMVUA9VQh3GuDMHiVcLCS3J4jSLhCGmKCjBEx0xlshjXYhApfMZRP5CyYD+UkG08+xt+4wLVQ ZA1tzxthm2tEfD3JxARH7QkbD1ZuozaggdZbxK5kAIsf5qGaKMTY2lAU/rH5HW3PLsEwUYy+YCcE RmIjJpDcpzb6l7th9KtQ69fi09ePUej9l7cx2DJbD7UrG3r3afQHOyCo+V3QQzE35pvQvnAZukk5 zL5qRL59jsKbPzdheXoBZc4saFhBS6AO7V4zqCpiawuptwQG+UAa7Ct3UT0hh9p9EnXT5Vh6t4C2 
2QaUDh6HwnECOmcO7K+6kW49DKqS2DrEZCtfuI+9GrNHg4fMHVSO5kE7nAPVkAxKBxcOzsajpS4Y h4ohUPPWKTUh3PaQEptIOr6BiJjcZXCwktaAGfrRIpwblqOV3YKdhfXOIvBLeREWpnd8ynsaSJoy ESFphwTtfjN6X1jRO2+FxWtCWksqBApeiFIR9K6fiTpPiigDoadqCEag5YUFKl6Yrciw0VOlhOiv v/Ff8wtn0KzlebrUYwAAAABJRU5ErkJggg0KLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0x NDkwNDA0NDczOTc4NzE5MTAzMTc1NDcxMTc0OA0KQ29udGVudC1EaXNwb3NpdGlvbjogZm9ybS1k YXRhOyBuYW1lPSJmaWxlMiI7IGZpbGVuYW1lPSJhZGQucG5nIg0KQ29udGVudC1UeXBlOiBpbWFn ZS9wbmcNCg0KiVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAABGdBTUEAAK/INwWK 6QAAABl0RVh0U29mdHdhcmUAQWRvYmUgSW1hZ2VSZWFkeXHJZTwAAAJvSURBVDjLpZPrS5NhGIf9 W7YvBYOkhlkoqCklWChv2WyKik7blnNris72bi6dus0DLZ0TDxW1odtopDs4D8MDZuLU0kXq61Ci jSIIasOvv94VTUfLiB74fXngup7nvrnvJABJ/5PfLnTTdcwOj4RsdYmo5glBWP6iOtzwvIKSWstI 0Wgx80SBblpKtE9KQs/We7EaWoT/8wbWP61gMmCH0lMDvokT4j25TiQU/ITFkek9Ow6+7WH2gwsm ahCPdwyw75uw9HEO2gUZSkfyI9zBPCJOoJ2SMmg46N61YO/rNoa39Xi41oFuXysMfh36/Fp0b7bA fWAH6RGi0HglWNCbzYgJaFjRv6zGuy+b9It96N3SQvNKiV9HvSaDfFEIxXItnPs23BzJQd6DDEVM 0OKsoVwBG/1VMzpXVWhbkUM2K4oJBDYuGmbKIJ0qxsAbHfRLzbjcnUbFBIpx/qH3vQv9b3U03IQ/ HfFkERTzfFj8w8jSpR7GBE123uFEYAzaDRIqX/2JAtJbDat/COkd7CNBva2cMvq0MGxp0PRSCPF8 BXjWG3FgNHc9XPT71Ojy3sMFdfJRCeKxEsVtKwFHwALZfCUk3tIfNR8XiJwc1LmL4dg141JPKtj3 WUdNFJqLGFVPC4OkR4BxajTWsChY64wmCnMxsWPCHcutKBxMVp5mxA1S+aMComToaqTRUQknLTH6 2kHOVEE+VQnjahscNCy0cMBWsSI0TCQcZc5ALkEYckL5A5noWSBhfm2AecMAjbcRWV0pUTh0HE64 TNf0mczcnnQyu/MilaFJCae1nw2fbz1DnVOxyGTlKeZft/Ff8x1BRssfACjTwQAAAABJRU5ErkJg gg0KLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0xNDkwNDA0NDczOTc4NzE5MTAzMTc1NDcx MTc0OA0KQ29udGVudC1EaXNwb3NpdGlvbjogZm9ybS1kYXRhOyBuYW1lPSJ0ZXh0Ig0KDQotLWxv bmcgdGV4dA0KLS13aXRoIGJvdW5kYXJ5DQotLWxvb2thbGlrZXMtLQ0KLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0xNDkwNDA0NDczOTc4NzE5MTAzMTc1NDcxMTc0OC0tDQo='''), 'boundary':'---------------------------14904044739787191031754711748', 'files': {'file1': (u'accept.png', 'image/png', b64decode(b''' iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAABGdBTUEAAK/INwWK6QAAABl0RVh0 
U29mdHdhcmUAQWRvYmUgSW1hZ2VSZWFkeXHJZTwAAAKfSURBVDjLpZPrS1NhHMf9O3bOdmwDCWRE IYKEUHsVJBI7mg3FvCxL09290jZj2EyLMnJexkgpLbPUanNOberU5taUMnHZUULMvelCtWF0sW/n 7MVMEiN64AsPD8/n83uucQDi/id/DBT4Dolypw/qsz0pTMbj/WHpiDgsdSUyUmeiPt2+V7SrIM+b Sss8ySGdR4abQQv6lrui6VxsRonrGCS9VEjSQ9E7CtiqdOZ4UuTqnBHO1X7YXl6Daa4yGq7vWO1D 40wVDtj4kWQbn94myPGkCDPdSesczE2sCZShwl8CzcwZ6NiUs6n2nYX99T1cnKqA2EKui6+TwphA 5k4yqMayopU5mANV3lNQTBdCMVUA9VQh3GuDMHiVcLCS3J4jSLhCGmKCjBEx0xlshjXYhApfMZRP 5CyYD+UkG08+xt+4wLVQZA1tzxthm2tEfD3JxARH7QkbD1ZuozaggdZbxK5kAIsf5qGaKMTY2lAU /rH5HW3PLsEwUYy+YCcERmIjJpDcpzb6l7th9KtQ69fi09ePUej9l7cx2DJbD7UrG3r3afQHOyCo +V3QQzE35pvQvnAZukk5zL5qRL59jsKbPzdheXoBZc4saFhBS6AO7V4zqCpiawuptwQG+UAa7Ct3 UT0hh9p9EnXT5Vh6t4C22QaUDh6HwnECOmcO7K+6kW49DKqS2DrEZCtfuI+9GrNHg4fMHVSO5kE7 nAPVkAxKBxcOzsajpS4Yh4ohUPPWKTUh3PaQEptIOr6BiJjcZXCwktaAGfrRIpwblqOV3YKdhfXO IvBLeREWpnd8ynsaSJoyESFphwTtfjN6X1jRO2+FxWtCWksqBApeiFIR9K6fiTpPiigDoadqCEag 5YUFKl6Yrciw0VOlhOivv/Ff8wtn0KzlebrUYwAAAABJRU5ErkJggg==''')), 'file2': (u'add.png', 'image/png', b64decode(b''' iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAABGdBTUEAAK/INwWK6QAAABl0RVh0 U29mdHdhcmUAQWRvYmUgSW1hZ2VSZWFkeXHJZTwAAAJvSURBVDjLpZPrS5NhGIf9W7YvBYOkhlko qCklWChv2WyKik7blnNris72bi6dus0DLZ0TDxW1odtopDs4D8MDZuLU0kXq61CijSIIasOvv94V TUfLiB74fXngup7nvrnvJABJ/5PfLnTTdcwOj4RsdYmo5glBWP6iOtzwvIKSWstI0Wgx80SBblpK tE9KQs/We7EaWoT/8wbWP61gMmCH0lMDvokT4j25TiQU/ITFkek9Ow6+7WH2gwsmahCPdwyw75uw 9HEO2gUZSkfyI9zBPCJOoJ2SMmg46N61YO/rNoa39Xi41oFuXysMfh36/Fp0b7bAfWAH6RGi0Hgl WNCbzYgJaFjRv6zGuy+b9It96N3SQvNKiV9HvSaDfFEIxXItnPs23BzJQd6DDEVM0OKsoVwBG/1V MzpXVWhbkUM2K4oJBDYuGmbKIJ0qxsAbHfRLzbjcnUbFBIpx/qH3vQv9b3U03IQ/HfFkERTzfFj8 w8jSpR7GBE123uFEYAzaDRIqX/2JAtJbDat/COkd7CNBva2cMvq0MGxp0PRSCPF8BXjWG3FgNHc9 XPT71Ojy3sMFdfJRCeKxEsVtKwFHwALZfCUk3tIfNR8XiJwc1LmL4dg141JPKtj3WUdNFJqLGFVP C4OkR4BxajTWsChY64wmCnMxsWPCHcutKBxMVp5mxA1S+aMComToaqTRUQknLTH62kHOVEE+VQnj ahscNCy0cMBWsSI0TCQcZc5ALkEYckL5A5noWSBhfm2AecMAjbcRWV0pUTh0HE64TNf0mczcnnQy 
u/MilaFJCae1nw2fbz1DnVOxyGTlKeZft/Ff8x1BRssfACjTwQAAAABJRU5ErkJggg=='''))}, 'forms': {'text': u'--long text\r\n--with boundary\r\n--lookalikes--'}} browser_test_cases['opera8-2png1txt'] = {'data': b64decode(b''' LS0tLS0tLS0tLS0tekVPOWpRS21MYzJDcTg4YzIzRHgxOQ0KQ29udGVudC1EaXNwb3NpdGlvbjog Zm9ybS1kYXRhOyBuYW1lPSJmaWxlMSI7IGZpbGVuYW1lPSJhcnJvd19icmFuY2gucG5nIg0KQ29u dGVudC1UeXBlOiBpbWFnZS9wbmcNCg0KiVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9h AAAABGdBTUEAAK/INwWK6QAAABl0RVh0U29mdHdhcmUAQWRvYmUgSW1hZ2VSZWFkeXHJZTwAAAHY SURBVDjLlVLPS1RxHJynpVu7KEn0Vt+2l6IO5qGCIsIwCPwD6hTUaSk6REoUHeoQ0qVAMrp0COpY 0SUIPVRgSl7ScCUTst6zIoqg0y7lvpnPt8MWKuuu29w+hxnmx8dzzmE5+l7mxk1u/a3Dd/ejDjSs II/m3vjJ9MF0yt93ZuTkdD0CnnMO/WOnmsxsJp3yd2zfvA3mHOa+zuHTjy/zojrvHX1YqunAZE9M lpUcZAaZQBNIZUg9XdPBP5wePuEO7eyGQXg29QL3jz3y1oqwbvkhCuYEOQMp/HeJohCbICMUVwr0 DvZcOnK9u7GmQNmBQLJCgORxkneqRmAs0BFmDi0bW9E72PPda/BikwWi0OEHkNR14MrewsTAZF+l AAWZEH6LUCwUkUlntrS1tiG5IYlEc6LcjYjSYuncngtdhakbM5dXlhgTNEMYLqB9q49MKgsPjTBX ntVgkDNIgmI1VY2Q7QzgJ9rx++ci3ofziBYiiELQEUAyhB/D29M3Zy+uIkDIhGYvgeKvIkbHxz6T evzq6ut+ANh9fldetMn80OzZVVdgLFjBQ0tpEz68jcB4ifx3pQeictVXIEETnBPCKMLEwBIZAPJD 767V/ETGwsjzYYiC6vzEP9asLo3SGuQvAAAAAElFTkSuQmCCDQotLS0tLS0tLS0tLS16RU85alFL bUxjMkNxODhjMjNEeDE5DQpDb250ZW50LURpc3Bvc2l0aW9uOiBmb3JtLWRhdGE7IG5hbWU9ImZp bGUyIjsgZmlsZW5hbWU9ImF3YXJkX3N0YXJfYnJvbnplXzEucG5nIg0KQ29udGVudC1UeXBlOiBp bWFnZS9wbmcNCg0KiVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAABGdBTUEAAK/I NwWK6QAAABl0RVh0U29mdHdhcmUAQWRvYmUgSW1hZ2VSZWFkeXHJZTwAAAJvSURBVDjLhZNNSFRR FIC/N++9eWMzhkl/ZJqFMQMRFvTvImkXSdKiVRAURBRRW1eZA9EqaNOiFlZEtQxKyrJwUS0K+qEQ zaTE/AtLHR3HmffuvafFNINDWGdz7z2c7+Nyzr2WiFAIffaMBDW1+B0diAgYgxiDiCDG4DU1QfcL os+fWAXGYUGIUsXiAliUFER+sBAhVCIIVB7QGtEat1oTbcwVz2LMfwR+gPg+oY0bEa3x6sHdUoVd niMUj0M2i/j+PwVJa2QUu7YWp34D7mqNWdNApD6Ks24dpvcL4gfJRQXevbutjI4lGRzCS9iYukPo 5dvxVqWQvn6k/2uyoudd60LGEhG43VBGyI4j2ADZ7vDJ8DZ9Img4hw4cvO/3UZ1vH3p7lrWRLwGV neD4y6G84NaOYSoTVYIFIiAGvXI3OWctJv0TW03jZb5gZSfzl9YBpMcIzUwdzQsuVR9EyR3TeCqm 
6w5jZiZQMz8xsxOYzDTi50AMVngJNgrnUweRbwMPiLpHrOJDOl9Vh6HD7GyO52qa0VPj6MwUJpNC 5mYQS/DUJLH3zzRp1cqN8YulTUyODBBzt4X6Ou870z2I8ZHsHJLLYNQ8jusQ6+2exJf9BfivKdAy mKZiaVdodhBRAagAjIbgzxp20lwb6Vp0jADYkQO6IpHfuoqInSJUVoE2HrpyRQ1tic2LC9p3lSHW Ph2rJfL1MeVP2weWvHp8s3ziNZ49i1q6HrR1YHGBNnt1dG2Z++gC4TdvrqNkK1eHj7ljQ/ujHx6N yPw8BFIiKPmNpKar7P7xb/zyT9P+o7OYvzzYSUt8U+TzxytodixEfgN3CFlQMNAcMgAAAABJRU5E rkJggg0KLS0tLS0tLS0tLS0tekVPOWpRS21MYzJDcTg4YzIzRHgxOQ0KQ29udGVudC1EaXNwb3Np dGlvbjogZm9ybS1kYXRhOyBuYW1lPSJ0ZXh0Ig0KDQpibGFmYXNlbCDDtsOkw7wNCi0tLS0tLS0t LS0tLXpFTzlqUUttTGMyQ3E4OGMyM0R4MTktLQ0K'''), 'boundary':'----------zEO9jQKmLc2Cq88c23Dx19', 'files': {'file1': (u'arrow_branch.png', 'image/png', b64decode(b''' iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAABGdBTUEAAK/INwWK6QAAABl0RVh0 U29mdHdhcmUAQWRvYmUgSW1hZ2VSZWFkeXHJZTwAAAHYSURBVDjLlVLPS1RxHJynpVu7KEn0Vt+2 l6IO5qGCIsIwCPwD6hTUaSk6REoUHeoQ0qVAMrp0COpY0SUIPVRgSl7ScCUTst6zIoqg0y7lvpnP t8MWKuuu29w+hxnmx8dzzmE5+l7mxk1u/a3Dd/ejDjSsII/m3vjJ9MF0yt93ZuTkdD0CnnMO/WOn msxsJp3yd2zfvA3mHOa+zuHTjy/zojrvHX1YqunAZE9MlpUcZAaZQBNIZUg9XdPBP5wePuEO7eyG QXg29QL3jz3y1oqwbvkhCuYEOQMp/HeJohCbICMUVwr0DvZcOnK9u7GmQNmBQLJCgORxkneqRmAs 0BFmDi0bW9E72PPda/BikwWi0OEHkNR14MrewsTAZF+lAAWZEH6LUCwUkUlntrS1tiG5IYlEc6Lc jYjSYuncngtdhakbM5dXlhgTNEMYLqB9q49MKgsPjTBXntVgkDNIgmI1VY2Q7QzgJ9rx++ci3ofz iBYiiELQEUAyhB/D29M3Zy+uIkDIhGYvgeKvIkbHxz6Tevzq6ut+ANh9fldetMn80OzZVVdgLFjB Q0tpEz68jcB4ifx3pQeictVXIEETnBPCKMLEwBIZAPJD767V/ETGwsjzYYiC6vzEP9asLo3SGuQv AAAAAElFTkSuQmCC''')), 'file2': (u'award_star_bronze_1.png', 'image/png', b64decode(b''' iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAABGdBTUEAAK/INwWK6QAAABl0RVh0 U29mdHdhcmUAQWRvYmUgSW1hZ2VSZWFkeXHJZTwAAAJvSURBVDjLhZNNSFRRFIC/N++9eWMzhkl/ ZJqFMQMRFvTvImkXSdKiVRAURBRRW1eZA9EqaNOiFlZEtQxKyrJwUS0K+qEQzaTE/AtLHR3Hmffu vafFNINDWGdz7z2c7+Nyzr2WiFAIffaMBDW1+B0diAgYgxiDiCDG4DU1QfcLos+fWAXGYUGIUsXi AliUFER+sBAhVCIIVB7QGtEat1oTbcwVz2LMfwR+gPg+oY0bEa3x6sHdUoVdniMUj0M2i/j+PwVJ 
a2QUu7YWp34D7mqNWdNApD6Ks24dpvcL4gfJRQXevbutjI4lGRzCS9iYukPo5dvxVqWQvn6k/2uy oudd60LGEhG43VBGyI4j2ADZ7vDJ8DZ9Img4hw4cvO/3UZ1vH3p7lrWRLwGVneD4y6G84NaOYSoT VYIFIiAGvXI3OWctJv0TW03jZb5gZSfzl9YBpMcIzUwdzQsuVR9EyR3TeCqm6w5jZiZQMz8xsxOY zDTi50AMVngJNgrnUweRbwMPiLpHrOJDOl9Vh6HD7GyO52qa0VPj6MwUJpNC5mYQS/DUJLH3zzRp 1cqN8YulTUyODBBzt4X6Ou870z2I8ZHsHJLLYNQ8jusQ6+2exJf9BfivKdAymKZiaVdodhBRAagA jIbgzxp20lwb6Vp0jADYkQO6IpHfuoqInSJUVoE2HrpyRQ1tic2LC9p3lSHWPh2rJfL1MeVP2weW vHp8s3ziNZ49i1q6HrR1YHGBNnt1dG2Z++gC4TdvrqNkK1eHj7ljQ/ujHx6NyPw8BFIiKPmNpKar 7P7xb/zyT9P+o7OYvzzYSUt8U+TzxytodixEfgN3CFlQMNAcMgAAAABJRU5ErkJggg=='''))}, 'forms': {'text': u'blafasel öäü'}} browser_test_cases['webkit3-2png1txt'] = {'data': b64decode(b''' LS0tLS0tV2ViS2l0Rm9ybUJvdW5kYXJ5amRTRmhjQVJrOGZ5R055Ng0KQ29udGVudC1EaXNwb3Np dGlvbjogZm9ybS1kYXRhOyBuYW1lPSJmaWxlMSI7IGZpbGVuYW1lPSJndGstYXBwbHkucG5nIg0K Q29udGVudC1UeXBlOiBpbWFnZS9wbmcNCg0KiVBORw0KGgoAAAANSUhEUgAAABQAAAAUCAYAAACN iR0NAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAAN1wAADdcBQiibeAAAABl0RVh0U29mdHdhcmUA d3d3Lmlua3NjYXBlLm9yZ5vuPBoAAANnSURBVDiNldJ9aJVVHAfw7znPuS/PvW4405WbLWfbsBuN bramq5Tp7mLqIFPXINlwpAitaCAPjWKgBdXzR2TBpEZoadAyCVGndttCFNxqLXORK7x3y704NlzX zfs8d89znuf0R/fKk03xHvjCOZxzPpzzO4cIIZBuC6nsGYmRrwFMWVw0hxV+PDVH0gVDKvNSRgZf rm5+QCISOi58pY1MXhm1uHg+rPDfabqnoxJpKQ2snf/gwgKY3ut4pfodX/lTGwokRt4AgLTAkMoK 3cz7enVJg/fyTCdGE/3gwsTo+LBu2+J82qDE6IEXyrd7YvYwbpgjyPOtQHTikvhz+NKgsNGWFhhS WU3uwqWPBx9aRwfjPTCFgXx5JY50tumWKbaFFS7uGQypLINKZH/tukb/kN6DSSOCFfO3oqu/3biZ iH0ZVvjF1Np7AiVG31sdXO/P8GfhqtaLbE8BqOlBZ++xuMXFbudaljxBDnNJHbZlFwF407bFh6kr hFRW7Jcztlc9Uee5HD+DaWsCTy/YgbaOvZpl2Y1hhU87QVLxvpQpMfpzfeXuZfmLA/Rw1wdaZOS3 Pm7aNQDGJUZ/qatqKs5etIj03TiKQv8aaFOWOHRm30+nm4zS229DmVs6Ulm6OW/50iD9G1Hsqnrb t2lNwyoXYwMAPnk4N1D4aO4qEtW6wagHeZ4SfNP1mW6Zdt1c5WEE8Lll5qKCQbdiGIh/h+JlK6Wi xcHM4z2fb9tUtkOO6hdw3Yzi2axdON33xaxuzLSGFf7HXCA1Dav+5Nn2Kyd7DyYK5bXw0QWIJM4j 7rqGmvKd8gwZw5D+I3K8jyGhmzj366lpi4uWOz0gEUIgpDKPxGjr/VlLanZubJknXLMYiH8Pjccw 
K26C27Oouu8tfHysWbs6HnkxrPATdwVTLaSyzW63+8BLzzX6H1lSSrtjBzFpRPBkZi0mrk3Z7Z2t P5xqMiruhP0PTKL5EqMnSgKr87eUvSqPGf3Ipsux53CDpie0QFjhf90NhBDiVlJ1LaqmcqXq2l/7 aU7826E94rWjQb3iXbYXgAzAC8ADwI1//zF1OkQIAUIIBSAlc6tfpkjr52XTj4SFi937eP3MmDAB 2I5YyaT63AmyuVDHmAAQt0FOzARg/aeGhBCS3EjnCBygMwKAnXL+AdDkiZ/xYgR3AAAAAElFTkSu QmCCDQotLS0tLS1XZWJLaXRGb3JtQm91bmRhcnlqZFNGaGNBUms4ZnlHTnk2DQpDb250ZW50LURp c3Bvc2l0aW9uOiBmb3JtLWRhdGE7IG5hbWU9ImZpbGUyIjsgZmlsZW5hbWU9Imd0ay1uby5wbmci DQpDb250ZW50LVR5cGU6IGltYWdlL3BuZw0KDQqJUE5HDQoaCgAAAA1JSERSAAAAFAAAABQIBgAA AI2JHQ0AAAAEc0JJVAgICAh8CGSIAAAACXBIWXMAAA3XAAAN1wFCKJt4AAAAGXRFWHRTb2Z0d2Fy ZQB3d3cuaW5rc2NhcGUub3Jnm+48GgAAAzVJREFUOI2tlM9rG0cUxz8zu7OzsqhtyTIONDG2g9ue UnIwFEqCwYUeTC+99u5T/4FAKKUEeuh/4FPvOZXiWw3GpRRcGjW0h1KwLLe4juOspJUlS95frwft CkdJbh347o95bz+8mfedVSLC/zncNwUeKnVfw4YD6yncBXCgnsJeBruPRPZf952arPCBUhUL216p tLm0vGxmq1X3rbk5AC6CgE67nTQbjTgaDHauYOtrkfYbgV8o9SHw/crKytR7d+5YDXhzc2hjEBGy OCZutciU4s+nT68ajcYl8MlXIj+9AnygVMXA4draWqVWqaBLJcz09ChLBBGBXHEYImlK0G5zcHDQ juF2UakuyBa2l27dmqqWywxOTpAkIWq1iILgFWVxzOXREZVymaXFxSkL2wVHFw0w1m6urq7asF7H sZa01SINAiQIyIp7q0XaapEEAcp1CZ884Z3VVWus3Xyo1P1xlzVsvL2wYJLTUwhDdBiiHAedL1EV +yxCJoJkGTpJkDAkOj3l5o0b5vD4eAPYd3M7rM+WSq7qdLCAOjtD+z46y1DXgJkIZNmIHUWj3E6H melp14H1cYUZ3J31fZyTE1zA7fVw+n0cERSg8v2RUS5pPqeArNtlZmGBwqtjY+skwYig80lXBCff 5OvANFeSxzIRojge5+j8Uu9dXOD5Pt6o41jAz1W69uznMQ8wgOf79LpdNNTHwBT22r1ebDwPt0h8 DbQAFTADGGvp9PtxCntjYAa7zW43wVpca3HyZZsJaAF0C/k+4vs0wzDJYHcMfCSyHyfJzq/n50NT raKVwhl1H3cCpAsphVut8tvz58M4SXaKn8X4pFzB1lG/P2gOBuhaDYxBJhqR5e8Yg56f53gwoNHr Da9gq+CMz7JSauoz+HgFvr1trX+vXPZKUYSbJCMTA+K6xMYw8Dx+7Pfjw+Fw+Dt8/h38ALwQkeg6 cAaoLcLyp/BlVam1dz3PWdDaqbkjdwVpymmaZn9FUXouUn8M3zyDJvAC+PclYA6dBmpA5SO4dxM+ mIf3fVgCGMLfz+CPf+CXPfgZCIFz4ExEkpeWfH0opZzcKYUsI38nIy5D4BK4kgnAfwLblOaQdQsS AAAAAElFTkSuQmCCDQotLS0tLS1XZWJLaXRGb3JtQm91bmRhcnlqZFNGaGNBUms4ZnlHTnk2DQpD b250ZW50LURpc3Bvc2l0aW9uOiBmb3JtLWRhdGE7IG5hbWU9InRleHQiDQoNCnRoaXMgaXMgYW5v 
dGhlciB0ZXh0IHdpdGggw7xtbMOkw7x0cw0KLS0tLS0tV2ViS2l0Rm9ybUJvdW5kYXJ5amRTRmhj QVJrOGZ5R055Ni0tDQo='''), 'boundary':'----WebKitFormBoundaryjdSFhcARk8fyGNy6', 'files': {'file1': (u'gtk-apply.png', 'image/png', b64decode(b''' iVBORw0KGgoAAAANSUhEUgAAABQAAAAUCAYAAACNiR0NAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz AAAN1wAADdcBQiibeAAAABl0RVh0U29mdHdhcmUAd3d3Lmlua3NjYXBlLm9yZ5vuPBoAAANnSURB VDiNldJ9aJVVHAfw7znPuS/PvW4405WbLWfbsBuNbramq5Tp7mLqIFPXINlwpAitaCAPjWKgBdXz R2TBpEZoadAyCVGndttCFNxqLXORK7x3y704NlzXzfs8d89znuf0R/fKk03xHvjCOZxzPpzzO4cI IZBuC6nsGYmRrwFMWVw0hxV+PDVH0gVDKvNSRgZfrm5+QCISOi58pY1MXhm1uHg+rPDfabqnoxJp KQ2snf/gwgKY3ut4pfodX/lTGwokRt4AgLTAkMoK3cz7enVJg/fyTCdGE/3gwsTo+LBu2+J82qDE 6IEXyrd7YvYwbpgjyPOtQHTikvhz+NKgsNGWFhhSWU3uwqWPBx9aRwfjPTCFgXx5JY50tumWKbaF FS7uGQypLINKZH/tukb/kN6DSSOCFfO3oqu/3biZiH0ZVvjF1Np7AiVG31sdXO/P8GfhqtaLbE8B qOlBZ++xuMXFbudaljxBDnNJHbZlFwF407bFh6krhFRW7Jcztlc9Uee5HD+DaWsCTy/YgbaOvZpl 2Y1hhU87QVLxvpQpMfpzfeXuZfmLA/Rw1wdaZOS3Pm7aNQDGJUZ/qatqKs5etIj03TiKQv8aaFOW OHRm30+nm4zS229DmVs6Ulm6OW/50iD9G1Hsqnrbt2lNwyoXYwMAPnk4N1D4aO4qEtW6wagHeZ4S fNP1mW6Zdt1c5WEE8Lll5qKCQbdiGIh/h+JlK6WixcHM4z2fb9tUtkOO6hdw3Yzi2axdON33xaxu zLSGFf7HXCA1Dav+5Nn2Kyd7DyYK5bXw0QWIJM4j7rqGmvKd8gwZw5D+I3K8jyGhmzj366lpi4uW Oz0gEUIgpDKPxGjr/VlLanZubJknXLMYiH8PjccwK26C27Oouu8tfHysWbs6HnkxrPATdwVTLaSy zW63+8BLzzX6H1lSSrtjBzFpRPBkZi0mrk3Z7Z2tP5xqMiruhP0PTKL5EqMnSgKr87eUvSqPGf3I psux53CDpie0QFjhf90NhBDiVlJ1LaqmcqXq2l/7aU7826E94rWjQb3iXbYXgAzAC8ADwI1//zF1 OkQIAUIIBSAlc6tfpkjr52XTj4SFi937eP3MmDAB2I5YyaT63AmyuVDHmAAQt0FOzARg/aeGhBCS 3EjnCBygMwKAnXL+AdDkiZ/xYgR3AAAAAElFTkSuQmCC''')), 'file2': (u'gtk-no.png', 'image/png', b64decode(b''' iVBORw0KGgoAAAANSUhEUgAAABQAAAAUCAYAAACNiR0NAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz AAAN1wAADdcBQiibeAAAABl0RVh0U29mdHdhcmUAd3d3Lmlua3NjYXBlLm9yZ5vuPBoAAAM1SURB VDiNrZTPaxtHFMc/M7uzs7KobckyDjQxtoPbnlJyMBRKgsGFHkwvvfbuU/+BQCilBHrof+BT7zmV 4lsNxqUUXBo1tIdSsCy3uI7jrKSVJUveX68H7QpHSW4d+O6PeW8/vJn3nVUiwv853DcFHip1X8OG 
A+sp3AVwoJ7CXga7j0T2X/edmqzwgVIVC9teqbS5tLxsZqtV9625OQAugoBOu500G404Ggx2rmDr a5H2G4FfKPUh8P3KysrUe3fuWA14c3NoYxARsjgmbrXIlOLPp0+vGo3GJfDJVyI/vQJ8oFTFwOHa 2lqlVqmgSyXM9PQoSwQRgVxxGCJpStBuc3Bw0I7hdlGpLsgWtpdu3ZqqlssMTk6QJCFqtYiC4BVl cczl0RGVcpmlxcUpC9sFRxcNMNZurq6u2rBex7GWtNUiDQIkCMiKe6tF2mqRBAHKdQmfPOGd1VVr rN18qNT9cZc1bLy9sGCS01MIQ3QYohwHnS9RFfssQiaCZBk6SZAwJDo95eaNG+bw+HgD2HdzO6zP lkqu6nSwgDo7Q/s+OstQ14CZCGTZiB1Fo9xOh5npadeB9XGFGdyd9X2ckxNcwO31cPp9HBEUoPL9 kVEuaT6ngKzbZWZhgcKrY2PrJMGIoPNJVwQn3+TrwDRXkscyEaI4Hufo/FLvXVzg+T7eqONYwM9V uvbs5zEPMIDn+/S6XTTUx8AU9tq9Xmw8D7dIfA20ABUwAxhr6fT7cQp7Y2AGu81uN8FaXGtx8mWb CWgBdAv5PuL7NMMwyWB3DHwksh8nyc6v5+dDU62ilcIZdR93AqQLKYVbrfLb8+fDOEl2ip/F+KRc wdZRvz9oDgboWg2MQSYakeXvGIOen+d4MKDR6w2vYKvgjM+yUmrqM/h4Bb69ba1/r1z2SlGEmyQj EwPiusTGMPA8fuz348PhcPg7fP4d/AC8EJHoOnAGqC3C8qfwZVWptXc9z1nQ2qm5I3cFacppmmZ/ RVF6LlJ/DN88gybwAvj3JWAOnQZqQOUjuHcTPpiH931YAhjC38/gj3/glz34GQiBc+BMRJKXlnx9 KKWc3CmFLCN/JyMuQ+ASuJIJwH8C25TmkHULEgAAAABJRU5ErkJggg=='''))}, 'forms': {'text': u'this is another text with ümläüts'}} browser_test_cases['ie6-2png1txt'] = {'data': b64decode(b''' LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS03ZDkxYjAzYTIwMTI4DQpDb250ZW50LURpc3Bv c2l0aW9uOiBmb3JtLWRhdGE7IG5hbWU9ImZpbGUxIjsgZmlsZW5hbWU9IkM6XFB5dGhvbjI1XHd6 dGVzdFx3ZXJremV1Zy1tYWluXHRlc3RzXG11bHRpcGFydFxmaXJlZm94My0ycG5nMXR4dFxmaWxl MS5wbmciDQpDb250ZW50LVR5cGU6IGltYWdlL3gtcG5nDQoNColQTkcNChoKAAAADUlIRFIAAAAQ AAAAEAgGAAAAH/P/YQAAAARnQU1BAACvyDcFiukAAAAZdEVYdFNvZnR3YXJlAEFkb2JlIEltYWdl UmVhZHlxyWU8AAABnUlEQVQ4y6VTMWvCQBS+qwEFB10KGaS1P6FDpw7SrVvzAwRRx04VCk4K6iAo DhLXdhFcW9qhZCk4FQoW0gp2U4lQRDAUS4hJmn5Xgg2lsQ198PHu3b3vu5d3L9S2bfIf47wOer1e wzTNtGEYBP48kUjkfsrb8BIAMb1cLovwRfi07wrYzcCr4/1/Am4FzzhzBGZeefR7E7vd7j0Iu4wY jUYDBMfD0dBiMUQfstns3toKkHgF6EgmqqruW6bFiHcsxr70awVu63Q6NiOmUinquwfMdF1f28CV gCRJx0jMAQ1BEFquRn7CbYVCYZVbr9dbnJMohoIh9kViu90WEW9nMpmxu4JyubyF/VEsFiNcgCPy oyxiu7XhCPBzdU4s652VnUccbDabPLyN2C6VSmwdhFgel5DB84AJb64mEUlvmqadTKcv40gkUkUs 
g1DjeZ7iRsrWgByP71T7/afxYrHIYry/eoBD9mxsaK4VRamFw2EBQknMAWGvRClNTpQJAfkCxFNg Bmiez1ipVA4hdgQcOD/TLfylKIo3vubgL/YBnIw+ioOMLtwAAAAASUVORK5CYIINCi0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tN2Q5MWIwM2EyMDEyOA0KQ29udGVudC1EaXNwb3NpdGlvbjog Zm9ybS1kYXRhOyBuYW1lPSJmaWxlMiI7IGZpbGVuYW1lPSJDOlxQeXRob24yNVx3enRlc3Rcd2Vy a3pldWctbWFpblx0ZXN0c1xtdWx0aXBhcnRcZmlyZWZveDMtMnBuZzF0eHRcZmlsZTIucG5nIg0K Q29udGVudC1UeXBlOiBpbWFnZS94LXBuZw0KDQqJUE5HDQoaCgAAAA1JSERSAAAAEAAAABAIBgAA AB/z/2EAAAAEZ0FNQQAAr8g3BYrpAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5ccll PAAAAlFJREFUGBmlwd1rzXEcwPH353d+51jO5jDZFpnnJNaiUSK5mkJKMYkUSS1RSvwDpnFBkYeY O2p2sZRceCh5mpKnUXZssVaO2Q4Hw9nO+Z3v52O/ldoFF8vrJWbG/5CNB67uXbm65lgma3EzQBVT xanD1FBTzDnUDHMOp8qEWPCroyN1uPVE3Rm/ZkXNqWhR3CsvYiziv7LuFHDGzwbmZTM/GavBwDyG +eaMhm1zGavdjT2EfDMllC84DDA1nIJiqBpOFVcwXMEIPt8l+/wykeIq9pXd49XZ/Tt8zAiJJ4gZ 5gkmhqjgeYKIh4hDM9eJ9j6lomo7iVmL+dY9n+StpuO+U0fIA0wEBCIGKqBqRAwK6dvEcm+Iz1tB 5l0HMclTMqGC4smVCd/UGCECZniAiYCACOT77yM/npCYvYZcbzOx8ULPyyQDWZBcptpTdfwhIiBC yANy6fsUvtwmMWctQx8vItGvRItLiFuGK6nlLN3X2ukVgoARIogIIRGhL3md7IebJOZuYCh1Di8a kB+YSfphO1NqG/g4OJGQZ04JRQABRIT+5A1+pNooW7iO/KcmIjEjNzCD9KMXVGw6T1H5AkyVkK+q /CFAV1szhe+vKchUel+fZlJZjKHMdL49S1K55QLRxDRCakbIT3X3tNSfDOrUOdQptdLE5vpLvG0+ SOeDNsZVVvO9L8WNoa30NTzGVFEl1MIwMTNGO7JnUXBoV72P53h55xo93V0/E1NKV9YebW/nL8TM GK1uVengktnl/rIFs7Borm2wP71zfeOr9/zDb6ZFKM6WU+GQAAAAAElFTkSuQmCCDQotLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLTdkOTFiMDNhMjAxMjgNCkNvbnRlbnQtRGlzcG9zaXRpb246 IGZvcm0tZGF0YTsgbmFtZT0idGV4dCINCg0KaWU2IHN1Y2tzIDotLw0KLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS03ZDkxYjAzYTIwMTI4LS0NCg=='''), 'boundary':'---------------------------7d91b03a20128', 'files': {'file1': (u'file1.png', 'image/x-png', b64decode(b''' iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAABGdBTUEAAK/INwWK6QAAABl0RVh0 U29mdHdhcmUAQWRvYmUgSW1hZ2VSZWFkeXHJZTwAAAGdSURBVDjLpVMxa8JAFL6rAQUHXQoZpLU/ oUOnDtKtW/MDBFHHThUKTgrqICgOEtd2EVxb2qFkKTgVChbSCnZTiVBEMBRLiEmafleCDaWxDX3w 
8e7dve+7l3cv1LZt8h/jvA56vV7DNM20YRgE/jyRSOR+ytvwEgAxvVwui/BF+LTvCtjNwKvj/X8C bgXPOHMEZl559HsTu93uPQi7jBiNRgMEx8PR0GIxRB+y2eze2gqQeAXoSCaqqu5bpsWIdyzGvvRr BW7rdDo2I6ZSKeq7B8x0XV/bwJWAJEnHSMwBDUEQWq5GfsJthUJhlVuv11uckyiGgiH2RWK73RYR b2cymbG7gnK5vIX9USwWI1yAI/KjLGK7teEI8HN1TizrnZWdRxxsNps8vI3YLpVKbB2EWB6XkMHz gAlvriYRSW+app1Mpy/jSCRSRSyDUON5nuJGytaAHI/vVPv9p/FischivL96gEP2bGxorhVFqYXD YQFCScwBYa9EKU1OlAkB+QLEU2AGaJ7PWKlUDiF2BBw4P9Mt/KUoije+5uAv9gGcjD6Kg4wu3AAA AABJRU5ErkJggg==''')), 'file2': (u'file2.png', 'image/x-png', b64decode(b''' iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAABGdBTUEAAK/INwWK6QAAABl0RVh0 U29mdHdhcmUAQWRvYmUgSW1hZ2VSZWFkeXHJZTwAAAJRSURBVBgZpcHda81xHMDx9+d3fudYzuYw 2RaZ5yTWolEiuZpCSjGJFEktUUr8A6ZxQZGHmDtqdrGUXHgoeZqSp1F2bLFWjtkOB8PZzvmd7+dj v5XaBRfL6yVmxv+QjQeu7l25uuZYJmtxM0AVU8Wpw9RQU8w51AxzDqfKhFjwq6Mjdbj1RN0Zv2ZF zaloUdwrL2Is4r+y7hRwxs8G5mUzPxmrwcA8hvnmjIZtcxmr3Y09hHwzJZQvOAwwNZyCYqgaThVX MFzBCD7fJfv8MpHiKvaV3ePV2f07fMwIiSeIGeYJJoao4HmCiIeIQzPXifY+paJqO4lZi/nWPZ/k rabjvlNHyANMBAQiBiqgakQMCunbxHJviM9bQeZdBzHJUzKhguLJlQnf1BghAmZ4gImAgAjk++8j P56QmL2GXG8zsfFCz8skA1mQXKbaU3X8ISIgQsgDcun7FL7cJjFnLUMfLyLRr0SLS4hbhiup5Szd 19rpFYKAESKICCERoS95neyHmyTmbmAodQ4vGpAfmEn6YTtTahv4ODiRkGdOCUUAAUSE/uQNfqTa KFu4jvynJiIxIzcwg/SjF1RsOk9R+QJMlZCvqvwhQFdbM4XvrynIVHpfn2ZSWYyhzHS+PUtSueUC 0cQ0QmpGyE9197TUnwzq1DnUKbXSxOb6S7xtPkjngzbGVVbzvS/FjaGt9DU8xlRRJdTCMDEzRjuy Z1FwaFe9j+d4eecaPd1dPxNTSlfWHm1v5y/EzBitblXp4JLZ5f6yBbOwaK5tsD+9c33jq/f8w2+m RSjOllPhkAAAAABJRU5ErkJggg=='''))}, 'forms': {'text': u'ie6 sucks :-/'}} class TestWerkzeugExamples(PushTestBase): def test_werkzeug_examples(self): """Tests multipart parsing against data collected from webbrowsers""" for name in browser_test_cases: self.reset( boundary=browser_test_cases[name]['boundary'], strict=True, header_charset='utf8' ) files = browser_test_cases[name]['files'] forms = browser_test_cases[name]['forms'] self.parse(browser_test_cases[name]['data']) for field in files: segment, body = self.get_segment(field) 
                self.assertTrue(segment.complete)
                self.assertEqual(segment.name, field)
                self.assertEqual(segment.filename, files[field][0])
                self.assertEqual(segment.content_type, files[field][1])
                self.assertEqual(body, files[field][2])
            for field in forms:
                segment, body = self.get_segment(field)
                self.assertEqual(segment.name, field)
                self.assertEqual(segment.filename, None)
                self.assertEqual(segment.content_type, None)
                self.assertEqual(body.decode(segment.charset or 'utf8'), forms[field])


class TestRealWorldExamples(PushTestBase):
    def test_special_characters(self):
        """ Test the ultimate segment name/filename from hell. """
        teststring = 'test \\ \\\\ ; ö " = ;'
        firefox_131 = ['---------------------------3697486332756351920303607403',
            b'-----------------------------3697486332756351920303607403\r\nContent-Disposition: form-data; name="test \\ \\\\ ; \xc3\xb6 %22 = ;"; filename="test \\ \\\\ ; \xc3\xb6 %22 = ;"\r\nContent-Type: application/octet-stream\r\n\r\ntest \\ \\\\ ; \xc3\xb6 " = ;\r\n-----------------------------3697486332756351920303607403--\r\n']
        chrome_129 = ["----WebKitFormBoundary9duA54BXJUGUymtb",
            b'------WebKitFormBoundary9duA54BXJUGUymtb\r\nContent-Disposition: form-data; name="test \\ \\\\ ; \xc3\xb6 %22 = ;"; filename="test \\ \\\\ ; \xc3\xb6 %22 = ;"\r\nContent-Type: application/octet-stream\r\n\r\ntest \\ \\\\ ; \xc3\xb6 " = ;\r\n------WebKitFormBoundary9duA54BXJUGUymtb--\r\n']
        for boundary, body in [firefox_131, chrome_129]:
            print(repr(boundary))
            print(repr(body))
            self.reset(boundary=boundary, strict=True, header_charset='utf8')
            self.parse(body)
            segment, body = self.get_segment(teststring)
            self.assertEqual(segment.name, teststring)
            self.assertEqual(segment.filename, teststring)
            self.assertEqual(body, teststring.encode("utf8"))

# File: multipart-1.2.1/test/test_wsgi_parser.py
# -*- coding: utf-8 -*-
from .utils import BaseParserTest
import multipart


class TestFormParser(BaseParserTest):

    def test_is_form_request(self):
        self.assertTrue(multipart.is_form_request({"CONTENT_TYPE": "multipart/form-data"}))
        self.assertTrue(multipart.is_form_request({"CONTENT_TYPE": "Multipart/Form-Data; foo=bar; baz=\"a b c\""}))
        self.assertTrue(multipart.is_form_request({"CONTENT_TYPE": "application/x-www-form-urlencoded"}))
        self.assertTrue(multipart.is_form_request({"CONTENT_TYPE": "application/x-url-encoded"}))
        self.assertFalse(multipart.is_form_request({"CONTENT_TYPE": "application/x-form"}))
        self.assertFalse(multipart.is_form_request({}))

    def test_multipart(self):
        self.write_field("file1", "abc", filename="random.png", content_type="image/png")
        self.write_field("text1", "abc",)
        self.write_end()
        forms, files = self.parse_form_data()
        self.assertEqual(forms['text1'], 'abc')
        self.assertEqual(files['file1'].file.read(), b'abc')
        self.assertEqual(files['file1'].filename, 'random.png')
        self.assertEqual(files['file1'].name, 'file1')
        self.assertEqual(files['file1'].content_type, 'image/png')

    def test_empty(self):
        self.write_end()
        forms, files = self.parse_form_data()
        self.assertEqual(0, len(forms))
        self.assertEqual(0, len(files))

    def test_urlencoded(self):
        for ctype in ('application/x-www-form-urlencoded', 'application/x-url-encoded'):
            self.reset().write('a=b&c=d')
            self.environ['CONTENT_TYPE'] = ctype
            forms, files = self.parse_form_data()
            self.assertEqual(forms['a'], 'b')
            self.assertEqual(forms['c'], 'd')

    def test_urlencoded_latin1(self):
        for ctype in ('application/x-www-form-urlencoded', 'application/x-url-encoded'):
            self.reset().write(b'a=\xe0\xe1&e=%E8%E9')
            self.environ['CONTENT_TYPE'] = ctype
            forms, files = self.parse_form_data(charset='iso-8859-1')
            self.assertEqual(forms['a'], 'àá')
            self.assertEqual(forms['e'], 'èé')

    def test_urlencoded_utf8(self):
        for ctype in ('application/x-www-form-urlencoded', 'application/x-url-encoded'):
            self.reset().write(b'a=\xc6\x80\xe2\x99\xad&e=%E1%B8%9F%E2%99%AE')
            self.environ['CONTENT_TYPE'] = ctype
            forms, files = self.parse_form_data()
            self.assertEqual(forms['a'], 'ƀ♭')
            self.assertEqual(forms['e'], 'ḟ♮')

    def test_empty_strict(self):
        # Distinct name so this test does not shadow test_empty in the same class.
        with self.assertRaises(multipart.MultipartError):
            self.parse_form_data(strict=True)

    def test_wrong_method(self):
        self.environ['REQUEST_METHOD'] = 'GET'
        with self.assertRaises(multipart.MultipartError):
            self.parse_form_data(strict=True)

    def test_missing_content_type(self):
        self.environ['CONTENT_TYPE'] = None
        self.parse_form_data(strict=False)
        with self.assertRaises(multipart.MultipartError):
            self.parse_form_data(strict=True)

    def test_unsupported_content_type(self):
        self.environ['CONTENT_TYPE'] = 'multipart/fantasy'
        self.parse_form_data(strict=False)
        with self.assertRaises(multipart.MultipartError):
            self.parse_form_data(strict=True)

    def test_missing_boundary(self):
        self.environ['CONTENT_TYPE'] = 'multipart/form-data'
        with self.assertRaises(multipart.MultipartError):
            self.parse_form_data(strict=True)

    def test_invalid_content_length(self):
        self.environ['CONTENT_LENGTH'] = ''
        with self.assertRaises(multipart.MultipartError):
            self.parse_form_data(strict=True)
        self.environ['CONTENT_LENGTH'] = 'notanumber'
        with self.assertRaises(multipart.MultipartError):
            self.parse_form_data(strict=True)

    def test_invalid_environ(self):
        self.environ['wsgi.input'] = None
        self.parse_form_data(strict=False)
        with self.assertRaises(multipart.MultipartError):
            self.parse_form_data(strict=True)

    def test_big_urlencoded_detect_early(self):
        self.environ['CONTENT_TYPE'] = 'application/x-www-form-urlencoded'
        self.environ['CONTENT_LENGTH'] = 1024+1
        self.write('a=b')
        with self.assertRaises(multipart.MultipartError):
            self.parse_form_data(mem_limit=1024, strict=True)

    def test_big_urlencoded_detect_late(self):
        self.environ['CONTENT_TYPE'] = 'application/x-www-form-urlencoded'
        self.write('a='+'b'*1024)
        with self.assertRaises(multipart.MultipartError):
            self.parse_form_data(mem_limit=1024, strict=True)

    def test_content_length(self):
        self.write('a=b&c=ddd')
        self.environ['CONTENT_TYPE'] = 'application/x-www-form-urlencoded'
        self.environ['CONTENT_LENGTH'] = '7'

        # Obey Content-Length, do not overread
        forms, files = self.parse_form_data()
        self.assertEqual(forms["c"], "d")

        # Detect short inputs
        with self.assertMultipartError("Unexpected end of data stream"):
            self.environ['CONTENT_LENGTH'] = '10'
            self.parse_form_data(strict=True)

    def test_close_on_error(self):
        self.write_field("file1", 'x'*1024, filename="foo.bin")
        self.write_field("file2", 'x'*1025, filename="foo.bin")
        # self.write_end() <-- bad multipart
        # In case of an error, all parts parsed up until then should be closed
        # Can't really be tested here, but will show up in coverage
        with self.assertMultipartError("Unexpected end of multipart stream"):
            self.parse_form_data(strict=True)

    def test_ignore_errors(self):
        self.write_field("file1", 'x'*1024, filename="foo.bin")
        # self.write_end() <-- bad multipart

        # Strict mode: throw by default
        with self.assertMultipartError("Unexpected end of multipart stream"):
            self.parse_form_data(strict=True)
        self.parse_form_data(strict=True, ignore_errors=True)

        # Non-Strict mode: Ignore by default
        self.parse_form_data(strict=False)
        with self.assertMultipartError("Unexpected end of multipart stream"):
            self.parse_form_data(strict=False, ignore_errors=False)


# multipart-1.2.1/test/utils.py
from contextlib import contextmanager
import unittest
from io import BytesIO

import multipart as multipart
from multipart import to_bytes


class BaseParserTest(unittest.TestCase):
    def setUp(self):
        self.data = BytesIO()
        self.boundary = 'foo'
        self.environ = {
            'REQUEST_METHOD':'POST',
            'CONTENT_TYPE':'multipart/form-data; boundary=%s' % self.boundary
        }
        self.to_close = []

    def tearDown(self):
        for part in self.to_close:
            if hasattr(part, 'close'):
                part.close()

    def reset(self):
        self.data.seek(0)
        self.data.truncate()
        return self

    def write(self, *chunks):
        for chunk in chunks:
            self.data.write(to_bytes(chunk))
        return self

    def write_boundary(self):
        if self.data.tell() > 0:
            self.write(b'\r\n')
        self.write(b'--', to_bytes(self.boundary), b'\r\n')

    def write_end(self, force=False):
        end = b'--' + to_bytes(self.boundary) + b'--'
        if not force and self.data.getvalue().endswith(end):
            return
        if self.data.tell() > 0:
            self.write(b'\r\n')
        self.write(end)

    def write_header(self, header, value, **opts):
        line = to_bytes(header) + b': ' + to_bytes(value)
        for opt, val in opts.items():
            if val is not None:
                line += b"; " + to_bytes(opt) + b'=' + to_bytes(multipart.content_disposition_quote(val))
        self.write(line + b'\r\n')

    def write_field(self, name, data, filename=None, content_type=None):
        self.write_boundary()
        self.write_header("Content-Disposition", "form-data", name=name, filename=filename)
        if content_type:
            self.write_header("Content-Type", content_type)
        self.write(b"\r\n")
        self.write(data)

    def get_buffer_copy(self):
        return BytesIO(self.data.getvalue())

    def parser(self, *lines, **kwargs):
        if lines:
            self.reset()
            self.write(*lines)
        self.data.seek(0)

        kwargs.setdefault("boundary", self.boundary)
        p = multipart.MultipartParser(self.data, **kwargs)
        for part in p:
            self.to_close.append(part)
        return p

    def parse_form_data(self, *lines, **kwargs):
        if lines:
            self.reset()
            self.write(*lines)

        environ = kwargs.setdefault('environ', self.environ.copy())
        environ.setdefault('wsgi.input', self.get_buffer_copy())
        for key, value in list(environ.items()):
            if value is None:
                del environ[key]

        forms, files = multipart.parse_form_data(**kwargs)
        self.to_close.extend(part for _, part in files.iterallitems())
        return forms, files

    def assertParserFails(self, *a, **ka):
        self.assertRaises(multipart.MultipartError, self.parser, *a, **ka)

    @contextmanager
    def assertMultipartError(self, message: str = None):
        with self.assertRaises(multipart.MultipartError) as ex:
            yield
        if message:
            self.assertIn(message, str(ex.exception))


# multipart-1.2.1/PKG-INFO

Metadata-Version: 2.3
Name: multipart
Version: 1.2.1
Summary: Parser for multipart/form-data
Author-email: Marcel Hellkamp
Requires-Python: >=3.8
Description-Content-Type: text/x-rst
Classifier: Development Status :: 6 - Mature
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Internet :: WWW/HTTP
Classifier: Topic :: Internet :: WWW/HTTP :: Dynamic Content
Classifier: Topic :: Internet :: WWW/HTTP :: Dynamic Content :: CGI Tools/Libraries
Classifier: Topic :: Internet :: WWW/HTTP :: WSGI
Classifier: Programming Language :: Python :: 3
Requires-Dist: pytest ; extra == "dev"
Requires-Dist: pytest-cov ; extra == "dev"
Requires-Dist: build ; extra == "dev"
Requires-Dist: twine ; extra == "dev"
Requires-Dist: sphinx>=8,<9 ; extra == "docs"
Requires-Dist: sphinx-autobuild ; extra == "docs"
Project-URL: Changelog, https://multipart.readthedocs.io/en/latest/changelog.html
Project-URL: Documentation, https://multipart.readthedocs.io/
Project-URL: Homepage, https://multipart.readthedocs.io/
Project-URL: Issues, https://github.com/defnull/multipart/issues
Project-URL: PyPI, https://pypi.org/project/multipart/
Project-URL: Source, https://github.com/defnull/multipart
Provides-Extra: dev
Provides-Extra: docs

=================================
Python multipart/form-data parser
=================================

.. image:: https://github.com/defnull/multipart/actions/workflows/test.yaml/badge.svg
    :target: https://github.com/defnull/multipart/actions/workflows/test.yaml
    :alt: Tests Status

.. image:: https://img.shields.io/pypi/v/multipart.svg
    :target: https://pypi.python.org/pypi/multipart/
    :alt: Latest Version

.. image:: https://img.shields.io/pypi/l/multipart.svg
    :target: https://pypi.python.org/pypi/multipart/
    :alt: License

.. _HTML5: https://html.spec.whatwg.org/multipage/form-control-infrastructure.html#multipart-form-data
.. _RFC7578: https://www.rfc-editor.org/rfc/rfc7578
.. _WSGI: https://peps.python.org/pep-3333
.. _ASGI: https://asgi.readthedocs.io/en/latest/
.. _SansIO: https://sans-io.readthedocs.io/
.. _asyncio: https://docs.python.org/3/library/asyncio.html

This module provides a fast incremental non-blocking parser for
``multipart/form-data`` [HTML5_, RFC7578_], as well as blocking alternatives for
easier use in WSGI_ or CGI applications:

* **PushMultipartParser**: Fast SansIO_ (incremental, non-blocking) parser suitable
  for ASGI_, asyncio_ and other IO, time or memory constrained environments.
* **MultipartParser**: Streaming parser that reads from a byte stream and yields
  memory- or disk-buffered `MultipartPart` instances.
* **WSGI Helper**: High-level functions and containers for WSGI_ or CGI applications
  with support for both `multipart` and `urlencoded` form submissions.

Features
========

* Pure Python single file module with no dependencies.
* Optimized for both blocking and non-blocking applications.
* 100% test coverage with test data from actual browsers and HTTP clients.
* High throughput and low latency (see `benchmarks `_).
* Predictable memory and disk resource consumption via fine-grained limits.
* Strict mode: Spend less time parsing malicious or broken inputs.

Scope and compatibility
=======================

All parsers in this module implement ``multipart/form-data`` as defined by HTML5_
and RFC7578_, supporting all modern browsers or HTTP clients in use today.
Legacy browsers (e.g. IE6) are supported to some degree, but only if the
required workarounds do not impact performance or security. In detail this means:

* Just ``multipart/form-data``, not suitable for email parsing.
* No ``multipart/mixed`` support (deprecated in RFC7578_).
* No ``base64`` or ``quoted-printable`` transfer encoding (deprecated in RFC7578_).
* No ``encoded-word`` or ``name=_charset_`` encoding markers (deprecated in HTML5_).
* No support for clearly broken clients (e.g. invalid line breaks or headers).

Installation
============

``pip install multipart``

Documentation
=============

Examples and API documentation can be found at: https://multipart.readthedocs.io/

License
=======

.. __: https://github.com/defnull/multipart/raw/master/LICENSE

Code and documentation are available under MIT License (see LICENSE__).
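
As a rough illustration of the ``multipart/form-data`` wire format that the parsers
described above consume, the following standard-library-only sketch serializes a few
fields by hand. The helper ``build_multipart_body`` is hypothetical and not part of
this module. It assumes that field names and filenames contain no quotes, backslashes
or line breaks, so it applies no quoting at all.

```python
def build_multipart_body(boundary, fields):
    """Hypothetical helper (not part of the multipart module).

    `fields` is a list of (name, data, filename, content_type) tuples, where
    `data` is bytes and `filename` / `content_type` may be None.
    Assumption: names and filenames need no quoting.
    """
    delimiter = b"--" + boundary.encode("ascii")
    out = bytearray()
    for name, data, filename, content_type in fields:
        if out:
            out += b"\r\n"  # line break that ends the previous part's payload
        out += delimiter + b"\r\n"
        disposition = 'form-data; name="%s"' % name
        if filename is not None:
            disposition += '; filename="%s"' % filename
        out += ("Content-Disposition: " + disposition + "\r\n").encode("utf8")
        if content_type:
            out += ("Content-Type: " + content_type + "\r\n").encode("utf8")
        out += b"\r\n"  # empty line separates part headers from the payload
        out += data
    if out:
        out += b"\r\n"
    out += delimiter + b"--"  # closing delimiter
    return bytes(out)


body = build_multipart_body("foo", [
    ("file1", b"abc", "random.png", "image/png"),
    ("text1", b"abc", None, None),
])
print(body.decode("ascii"))
```

A request carrying these bytes would declare ``multipart/form-data; boundary=foo`` as
its content type. Real clients also have to quote special characters in names and
filenames, which this sketch deliberately leaves out.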