pax_global_header00006660000000000000000000000064134735411520014517gustar00rootroot0000000000000052 comment=6807101917261221625dfb647f7f91a98e9e80e8 headerparser-0.4.0/000077500000000000000000000000001347354115200141655ustar00rootroot00000000000000headerparser-0.4.0/.gitignore000066400000000000000000000002031347354115200161500ustar00rootroot00000000000000*.egg *.egg-info/ *.pyc .cache/ .coverage .eggs/ .pytest_cache/ .tox/ __pycache__/ build/ dist/ docs/.doctrees/ docs/_build/ venv/ headerparser-0.4.0/.travis.yml000066400000000000000000000005241347354115200162770ustar00rootroot00000000000000language: python cache: pip jobs: include: - python: 2.7 - python: 3.4 - python: 3.5 - python: 3.6 - python: 3.7 dist: xenial sudo: true - python: pypy - python: pypy3 install: - pip install codecov tox script: - tox -e py after_success: - codecov headerparser-0.4.0/CHANGELOG.md000066400000000000000000000031701347354115200157770ustar00rootroot00000000000000v0.4.0 (2019-05-29) ------------------- - Added a `scan()` function combining the behavior of `scan_file()` and `scan_lines()`, which are now deprecated - Gave `HeaderParser` a `parse()` method combining the behavior of `parse_file()` and `parse_lines()`, which are now deprecated - Added `scan_next_stanza()` and `scan_next_stanza_string()` functions for scanning & consuming input only up to the end of the first header section - Added `scan_stanzas()` and `scan_stanzas_string()` functions for scanning input composed entirely of multiple stanzas/header sections - Gave `HeaderParser` `parse_next_stanza()` and `parse_next_stanza_string()` methods for parsing & consuming input only up to the end of the first header section - Gave `HeaderParser` `parse_stanzas()` and `parse_stanzas_string()` methods for parsing input composed entirely of multiple stanzas/header sections v0.3.0 (2018-10-12) ------------------- - Drop support for Python 3.3 - Gave `HeaderParser` and the scanner functions options for configuring scanning behavior: - 
`separator_regex` - `skip_leading_newlines` - Fixed a `DeprecationWarning` in Python 3.7 v0.2.0 (2018-02-14) ------------------- - `NormalizedDict`'s default normalizer (exposed as the `lower()` function) now passes non-strings through unchanged - `HeaderParser` instances can now be compared for non-identity equality - `HeaderParser.add_field()` and `HeaderParser.add_additional()` now take an optional `action` argument for customizing the parser's behavior when a field is encountered - Made the `unfold()` function public v0.1.0 (2017-03-17) ------------------- Initial release headerparser-0.4.0/LICENSE000066400000000000000000000021071347354115200151720ustar00rootroot00000000000000The MIT License (MIT) Copyright (c) 2017-2019 John Thorvald Wodder II Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
headerparser-0.4.0/MANIFEST.in000066400000000000000000000001651347354115200157250ustar00rootroot00000000000000include CHANGELOG.* CONTRIBUTORS.* LICENSE tox.ini graft docs prune docs/_build global-exclude *.py[cod] __pycache__ headerparser-0.4.0/NOTES.md000066400000000000000000000041131347354115200153760ustar00rootroot00000000000000Relevant extracts from : §2.1: Note: Common parlance and earlier versions of this specification use the term "header" to either refer to the entire header section or to refer to an individual header field. To avoid ambiguity, this document does not use the terms "header" or "headers" in isolation, but instead always uses "header field" to refer to the individual field and "header section" to refer to the entire collection. §2.2: Header fields are lines beginning with a field name, followed by a colon (":"), followed by a field body, and terminated by CRLF. A field name MUST be composed of printable US-ASCII characters (i.e., characters that have values between 33 and 126, inclusive), except colon. A field body may be composed of printable US-ASCII characters as well as the space (SP, ASCII value 32) and horizontal tab (HTAB, ASCII value 9) characters (together known as the white space characters, WSP). A field body MUST NOT include CR and LF except when used in "folding" and "unfolding", as described in section 2.2.3. All field bodies MUST conform to the syntax described in sections 3 and 4 of this specification. 
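
The character rules in the §2.2 extract above can be sketched as a pair of regular expressions. This is only an illustration of the quoted grammar, not code from `headerparser` itself; the names `FIELD_NAME_RE`, `FIELD_BODY_RE`, `is_valid_field_name`, and `is_valid_unfolded_body` are hypothetical, and folding is deliberately left out. It assumes Python 3.4 or later for `re.fullmatch`-style matching.

```python
import re

# Field name: one or more printable US-ASCII characters (values 33-126),
# excluding the colon (value 58).  "!" is 33, "9" is 57, ";" is 59, "~" is 126.
FIELD_NAME_RE = re.compile(r'[!-9;-~]+')

# Unfolded field body: printable US-ASCII plus space (32) and horizontal
# tab (9).  CR and LF are rejected because folding is not modelled here.
FIELD_BODY_RE = re.compile(r'[\t -~]*')


def is_valid_field_name(name):
    """Return True if `name` satisfies the quoted field-name rule."""
    return FIELD_NAME_RE.fullmatch(name) is not None


def is_valid_unfolded_body(body):
    """Return True if `body` contains only printable ASCII, SP, and HTAB."""
    return FIELD_BODY_RE.fullmatch(body) is not None
```

Note that this is stricter than what the package's scanner accepts: the README examples show field names containing spaces (such as `Unknown headers`), which the quoted rule forbids.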
-------------------------------------------------------------------------------- Additional relevant RFCs: - On internationalization: - — MIME Part Three: Message Header Extensions for Non-ASCII Text - — MIME Parameter Value and Encoded Word Extensions: Character Sets, Languages, and Continuations - On header fields with "parameterized" syntax: - , §5.1 — Syntax of the Content-Type Header Field - , §5 — The Link Header Field - , §4 — Forwarded HTTP Header Field - — The Content-Disposition Header Field See also: headerparser-0.4.0/README.rst000066400000000000000000000125311347354115200156560ustar00rootroot00000000000000.. image:: http://www.repostatus.org/badges/latest/active.svg :target: http://www.repostatus.org/#active :alt: Project Status: Active — The project has reached a stable, usable state and is being actively developed. .. image:: https://travis-ci.org/jwodder/headerparser.svg?branch=master :target: https://travis-ci.org/jwodder/headerparser .. image:: https://codecov.io/gh/jwodder/headerparser/branch/master/graph/badge.svg :target: https://codecov.io/gh/jwodder/headerparser .. image:: https://img.shields.io/pypi/pyversions/headerparser.svg :target: https://pypi.org/project/headerparser .. image:: https://img.shields.io/github/license/jwodder/headerparser.svg :target: https://opensource.org/licenses/MIT :alt: MIT License .. image:: https://img.shields.io/badge/Say%20Thanks-!-1EAEDB.svg :target: https://saythanks.io/to/jwodder `GitHub `_ | `PyPI `_ | `Documentation `_ | `Issues `_ | `Changelog `_ ``headerparser`` parses key-value pairs in the style of RFC 822 (e-mail) headers and converts them into case-insensitive dictionaries with the trailing message body (if any) attached. Fields can be converted to other types, marked required, or given default values using an API based on the standard library's ``argparse`` module. (Everyone loves ``argparse``, right?) 
Low-level functions for just scanning header fields (breaking them into sequences of key-value pairs without any further processing) are also included. The Format ========== RFC 822-style headers are header fields that follow the general format of e-mail headers as specified by RFC 822 and friends: each field is a line of the form "``Name: Value``", with long values continued onto multiple lines ("folded") by indenting the extra lines. A blank line marks the end of the header section and the beginning of the message body. This basic grammar has been used by numerous textual formats besides e-mail, including but not limited to: - HTTP request & response headers - Usenet messages - most Python packaging metadata files - Debian packaging control files - ``META-INF/MANIFEST.MF`` files in Java JARs - a subset of the `YAML `_ serialization format — all of which this package can parse. Installation ============ Just use `pip `_ (You have pip, right?) to install ``headerparser`` and its dependencies:: pip install headerparser Examples ======== Define a parser:: >>> import headerparser >>> parser = headerparser.HeaderParser() >>> parser.add_field('Name', required=True) >>> parser.add_field('Type', choices=['example', 'demonstration', 'prototype'], default='example') >>> parser.add_field('Public', type=headerparser.BOOL, default=False) >>> parser.add_field('Tag', multiple=True) >>> parser.add_field('Data') Parse some headers and inspect the results:: >>> msg = parser.parse_string('''\ ... Name: Sample Input ... Public: yes ... tag: doctest, examples, ... whatever ... TAG: README ... ... Wait, why I am using a body instead of the "Data" field? ... ''') >>> sorted(msg.keys()) ['Name', 'Public', 'Tag', 'Type'] >>> msg['Name'] 'Sample Input' >>> msg['Public'] True >>> msg['Tag'] ['doctest, examples,\n whatever', 'README'] >>> msg['TYPE'] 'example' >>> msg['Data'] Traceback (most recent call last): ... 
KeyError: 'data' >>> msg.body 'Wait, why I am using a body instead of the "Data" field?\n' Fail to parse headers that don't meet your requirements:: >>> parser.parse_string('Type: demonstration') Traceback (most recent call last): ... headerparser.errors.MissingFieldError: Required header field 'Name' is not present >>> parser.parse_string('Name: Bad type\nType: other') Traceback (most recent call last): ... headerparser.errors.InvalidChoiceError: 'other' is not a valid choice for 'Type' >>> parser.parse_string('Name: unknown field\nField: Value') Traceback (most recent call last): ... headerparser.errors.UnknownFieldError: Unknown header field 'Field' Allow fields you didn't even think of:: >>> parser.add_additional() >>> msg = parser.parse_string('Name: unknown field\nField: Value') >>> msg['Field'] 'Value' Just split some headers into names & values and worry about validity later:: >>> for field in headerparser.scan_string('''\ ... Name: Scanner Sample ... Unknown headers: no problem ... Unparsed-Boolean: yes ... CaSe-SeNsItIvE-rEsUlTs: true ... Whitespace around colons:optional ... Whitespace around colons : I already said it's optional. ... That means you have the _option_ to use as much as you want! ... ... And there's a body, too, I guess. ... '''): print(field) ('Name', 'Scanner Sample') ('Unknown headers', 'no problem') ('Unparsed-Boolean', 'yes') ('CaSe-SeNsItIvE-rEsUlTs', 'true') ('Whitespace around colons', 'optional') ('Whitespace around colons', "I already said it's optional.\n That means you have the _option_ to use as much as you want!") (None, "And there's a body, too, I guess.\n") headerparser-0.4.0/TODO.md000066400000000000000000000164411347354115200152620ustar00rootroot00000000000000- Should string `default` values be passed through `type` etc. like in argparse? - Rethink how the original exception data is attached to `FieldTypeError`s - Include everything from `sys.exc_info()`? 
- Rename `NormalizedDict.normalized_dict()` to something that doesn't imply it returns a `NormalizedDict`? - Add docstrings to private classes and attributes - Write more tests - different header name normalizers (identity, hyphens=underscores, titlecase?, etc.) - `add_additional` - calling `add_additional` multiple times (sometimes with `allow=False`) - `add_additional(False, extra arguments ...)` - `add_additional` when a header has a `dest` that's just a normalized form of one of its names - calling `add_field`/`add_additional` on a `HeaderParser` after a previous call raised an error - scanning & parsing Unicode - normalizer that returns a non-string - non-string keys in `NormalizedDict` with the default normalizer - equality of `HeaderParser` objects - Test that `HeaderParser.parse_stream()` won't choke on non-string inputs - passing scanner options to `HeaderParser` - scanning files not opened in universal newlines mode - Improve documentation & examples - Contrast handling of multi-occurrence fields with that of the standard library - Draw attention to the case-insensitivity of field names when parsing and when retrieving from the dict - Give examples of custom normalization (or at least explain what it is and why it's worth having) - Add `action` examples - Add example recipes to the documentation of `HeaderParser`s for common mail-like formats - Write more user-friendly documentation that goes through `HeaderParser` feature by feature like `attrs`' documentation Features ======== - Add some sort of handling for "From " lines - Give `NormalizedDict` a `from_line` attribute - Give the scanner a `from_line_regex` parameter; if the first line of a stanza matches the regex, it is assumed to be a "From" line - Create a "`SpecialHeader`" enum with `FromLine` and `Body` values for use as the first element of `(header, value)` pairs yielded by the scanner representing "From " lines and bodies - Use the enum values as keys in `NormalizedDict`s instead of having 
dedicated `from_line` and `body` attributes? - Give the parser an option for requiring a "From " line - Export premade regexes for matching Unix mail "From " lines, HTTP request lines, and HTTP response status lines - Write an entry point for converting RFC822-style files/headers to JSON - name: `mail2json`? `headers2json`? - include options for: - parsing multiple stanzas into an array of JSON objects - setting the key name for the "message body" - handling of multiple occurrences of the same header in a single stanza; choices: - raise an error - combine multi-occurrence headers into an array of values - use an array of values for all headers regardless of multiplicity (default?) - output an array of `{"header": ..., "value": ...}` objects - handling of non-ASCII characters and the various ways in which they can be escaped - handling of "From " lines (and/or other non-header headers like the first line of an HTTP request or response?) - handling of header lettercases? Scanning -------- - Give the scanner options for: - definition of "whitespace" for purposes of folding (standard: 0x20 and TAB) - line separator/terminator (default: CR, LF, and CRLF; standard: only CRLF, with lone CR and LF being obsolete) - using Unicode definitions of line endings and horizontal whitespace - stripping leading whitespace from folded lines? (standard: no) - handling "From " lines and the like - ignoring all blank lines? - comments? (cf. robots.txt) - internationalization of header names - treating `---` as a blank line? - Error handling: - header lines without a colon or indentation (options: error, header with empty value, or start of body) - empty header name (options: error, header with empty name, look for next colon, or start of body) - all-whitespace line (considered obsolete by RFC 5322) Parsing ------- - Include utility callables for header types: - RFC822 dates, addresses, etc. - Content-Type-style "parameterized" headers - Include an `object_pairs_hook` for the parameters? 
- cf. `cgi.parse_header()` - internationalized strings - converting lines with just '.' to blank lines - Somehow support the types in `email.headerregistry` - Provide a `Normalizer` class with options for casing, trimming whitespace, squashing whitespace, converting hyphens and underscores to the same character, squashing hyphens & underscores, etc. - unfolding if & only if the first line of the value contains any non-whitespace? (cf. most multiline fields in Debian control files) - DKIM headers? - removing RFC 822 comments? - comma-and-space-separated lists? - cf. `urllib.request.parse_http_list()`? - New `add_field` and `add_additional` options to add: - `default_action=callable` for defining what to do when a header is absent - `multiple_type` and `multiple_action` — like `type` and `action`, but called on a list of all values encountered for a `multiple` field - `i18n=bool` — turns on decoding of internationalized mail headers before passing to `type` (Do this via a custom type instead?) - Give `add_additional` an option for controlling whether to normalize additional header names before adding them to the dict? - Requiring/forbidding nonempty/non-whitespace bodies - Add public methods for removing, inspecting, & modifying header definitions - Make the `body`, `scanner_opts`, etc. attributes public - Support constructing a complete `HeaderParser` in a single expression from a `dict` rather than having to make multiple calls to `add_field` - Support converting a `HeaderParser` instance to such a `dict` - Support modifying a `HeaderParser`'s field definitions after they're defined? - Allow two different named fields to have the same `dest` if they both have `multiple=True`? (or both `multiple=False`?) - Give `add_additional` an argument for putting all additional fields in a given subdict (or a presupplied arbitrary mapping object?) so that named fields can still use custom dests? 
- Give parsers a way to store parsed fields in a presupplied arbitrary mapping object (or one created from a `dict_factory`/`dict_cls` callable?) instead of creating a new NormalizedDict? - Give `HeaderParser` an option (`body_key`?) for storing the body in a given `dict` key - Create a `BODY` token to use as a `dict` key for storing bodies instead of storing them as an attribute? - Add an option/method for ignoring & discarding any unknown/"additional" fields - Add handling for fields that can either occur in the header or be the body (e.g., "Description" in Python packaging METADATA) - Require scanner options to be passed to `HeaderParser`'s constructor in a `scanner_opts={}` `dict` instead of as `**kwargs` headerparser-0.4.0/docs/000077500000000000000000000000001347354115200151155ustar00rootroot00000000000000headerparser-0.4.0/docs/changelog.rst000066400000000000000000000033361347354115200176030ustar00rootroot00000000000000.. currentmodule:: headerparser Changelog ========= v0.4.0 (2019-05-29) ------------------- - Added a `scan()` function combining the behavior of `scan_file()` and `scan_lines()`, which are now deprecated - Gave `HeaderParser` a `~HeaderParser.parse()` method combining the behavior of `~HeaderParser.parse_file()` and `~HeaderParser.parse_lines()`, which are now deprecated - Added `scan_next_stanza()` and `scan_next_stanza_string()` functions for scanning & consuming input only up to the end of the first header section - Added `scan_stanzas()` and `scan_stanzas_string()` functions for scanning input composed entirely of multiple stanzas/header sections - Gave `HeaderParser` `parse_next_stanza()` and `parse_next_stanza_string()` methods for parsing & consuming input only up to the end of the first header section - Gave `HeaderParser` `parse_stanzas()` and `parse_stanzas_string()` methods for parsing input composed entirely of multiple stanzas/header sections v0.3.0 (2018-10-12) ------------------- - Drop support for Python 3.3 - Gave 
`HeaderParser` and the scanner functions options for configuring scanning behavior: - ``separator_regex`` - ``skip_leading_newlines`` - Fixed a `DeprecationWarning` in Python 3.7 v0.2.0 (2018-02-14) ------------------- - `NormalizedDict`'s default normalizer (exposed as the `lower()` function) now passes non-strings through unchanged - `HeaderParser` instances can now be compared for non-identity equality - `HeaderParser.add_field()` and `HeaderParser.add_additional()` now take an optional ``action`` argument for customizing the parser's behavior when a field is encountered - Made the `unfold()` function public v0.1.0 (2017-03-17) ------------------- Initial release headerparser-0.4.0/docs/conf.py000066400000000000000000000015521347354115200164170ustar00rootroot00000000000000from headerparser import __version__ project = 'headerparser' author = 'John T. Wodder II' copyright = '2017-2019 John T. Wodder II' extensions = [ 'sphinx.ext.autodoc', 'sphinx.ext.intersphinx', 'sphinx.ext.todo', 'sphinx.ext.viewcode', ] autodoc_default_options = { 'members': None, 'undoc-members': None, } intersphinx_mapping = { "python": ("https://docs.python.org/3", None), } exclude_patterns = ['_build'] source_suffix = '.rst' source_encoding = 'utf-8-sig' master_doc = 'index' version = __version__ release = __version__ today_fmt = '%Y %b %d' default_role = 'py:obj' pygments_style = 'sphinx' todo_include_todos = True html_theme = 'sphinx_rtd_theme' html_theme_options = { "collapse_navigation": False, } html_last_updated_fmt = '%Y %b %d' html_show_sourcelink = True html_show_sphinx = True html_show_copyright = True headerparser-0.4.0/docs/errors.rst000066400000000000000000000020451347354115200171640ustar00rootroot00000000000000.. currentmodule:: headerparser Exceptions ========== .. autoexception:: headerparser.errors.Error :show-inheritance: Parser Errors ------------- .. autoexception:: headerparser.errors.ParserError :show-inheritance: .. 
autoexception:: headerparser.errors.BodyNotAllowedError :show-inheritance: .. autoexception:: headerparser.errors.DuplicateFieldError :show-inheritance: .. autoexception:: headerparser.errors.FieldTypeError :show-inheritance: .. autoexception:: headerparser.errors.InvalidChoiceError :show-inheritance: .. autoexception:: headerparser.errors.MissingBodyError :show-inheritance: .. autoexception:: headerparser.errors.MissingFieldError :show-inheritance: .. autoexception:: headerparser.errors.UnknownFieldError :show-inheritance: Scanner Errors -------------- .. autoexception:: headerparser.errors.ScannerError :show-inheritance: .. autoexception:: headerparser.errors.MalformedHeaderError :show-inheritance: .. autoexception:: headerparser.errors.UnexpectedFoldingError :show-inheritance: headerparser-0.4.0/docs/format.rst000066400000000000000000000062361347354115200171460ustar00rootroot00000000000000Input Format ============ `headerparser` accepts a syntax that is intended to be a simplified superset of the Internet Message (e-mail) Format specified in :rfc:`822`, :rfc:`2822`, and :rfc:`5322`. Specifically: - Everything in the input up to (but not including) the first blank line (i.e., a line containing only a line ending) constitutes a :dfn:`stanza` or :dfn:`header section`. Everything after the first blank line is a free-form :dfn:`message body`. If there are no blank lines, the entire input is used as the header section, and there is no body. .. note:: By default, blank lines at the beginning of a document are interpreted as the ending of a zero-length stanza. Such blank lines can instead be ignored by setting the ``skip_leading_newlines`` :ref:`scanner option ` to true. - A stanza or header section is composed of zero or more :dfn:`header fields`. A header field is composed of one or more lines, with all lines after the first beginning with a space or tab. 
Additionally, the first line must contain a colon (optionally surrounded by whitespace); everything before the colon is the :dfn:`header field name`, while everything after (including subsequent lines) is the :dfn:`header field value`. .. note:: Name-value separators other than a colon can be used by setting the ``separator_regex`` :ref:`scanner option ` appropriately. .. note:: This format only recognizes CR, LF, and CR LF sequences as line endings. An example:: Key: Value Foo: Bar Bar:Whitespace around the colon is optional Baz : Very optional Long-Field: This field has a very long value, so I'm going to split it across multiple lines. The above line is all whitespace. This counts as line folding, and so we're still in the "Long Field" value, but the RFCs consider such lines obsolete, so you should avoid using them. . One alternative to an all-whitespace line is a line with just indentation and a period. Debian package description fields use this. Foo: Wait, I already defined a value for this key. What happens now? What happens now: It depends on whether the `multiple` option for the "Foo" field was set in the HeaderParser. If multiple=True: The "Foo" key in the dictionary returned by HeaderParser.parse_string() would map to a list of all of Foo's values If multiple=False: A ParserError is raised If multiple=False but there's only one "Foo" anyway: The "Foo" key in the result dictionary would map to just a single string. Compare this to: the standard library's `email` package, which accepts multi-occurrence fields, but *which* occurrence Message.__getitem__ returns is unspecified! Are we still in the header: no There was a blank line above, so we're now in the body, which isn't processed for headers. Good thing, too, because this isn't a valid header line. On the other hand, this is not a valid RFC 822-style document:: An indented first line — without a "Name:" line before it! A header line without a colon isn't good, either. 
Does this make up for the above: no headerparser-0.4.0/docs/index.rst000066400000000000000000000100011347354115200167460ustar00rootroot00000000000000.. module:: headerparser ============================================== headerparser — argparse for mail-style headers ============================================== `GitHub `_ | `PyPI `_ | `Documentation `_ | `Issues `_ | :doc:`Changelog ` .. toctree:: :hidden: format parser scanner util errors changelog `headerparser` parses key-value pairs in the style of :rfc:`822` (e-mail) headers and converts them into case-insensitive dictionaries with the trailing message body (if any) attached. Fields can be converted to other types, marked required, or given default values using an API based on the standard library's `argparse` module. (Everyone loves `argparse`, right?) Low-level functions for just scanning header fields (breaking them into sequences of key-value pairs without any further processing) are also included. Installation ============ Just use `pip `_ (You have pip, right?) to install ``headerparser`` and its dependencies:: pip install headerparser Examples ======== Define a parser:: >>> import headerparser >>> parser = headerparser.HeaderParser() >>> parser.add_field('Name', required=True) >>> parser.add_field('Type', choices=['example', 'demonstration', 'prototype'], default='example') >>> parser.add_field('Public', type=headerparser.BOOL, default=False) >>> parser.add_field('Tag', multiple=True) >>> parser.add_field('Data') Parse some headers and inspect the results:: >>> msg = parser.parse_string('''\ ... Name: Sample Input ... Public: yes ... tag: doctest, examples, ... whatever ... TAG: README ... ... Wait, why I am using a body instead of the "Data" field? ... 
''') >>> sorted(msg.keys()) ['Name', 'Public', 'Tag', 'Type'] >>> msg['Name'] 'Sample Input' >>> msg['Public'] True >>> msg['Tag'] ['doctest, examples,\n whatever', 'README'] >>> msg['TYPE'] 'example' >>> msg['Data'] Traceback (most recent call last): ... KeyError: 'data' >>> msg.body 'Wait, why I am using a body instead of the "Data" field?\n' Fail to parse headers that don't meet your requirements:: >>> parser.parse_string('Type: demonstration') Traceback (most recent call last): ... headerparser.errors.MissingFieldError: Required header field 'Name' is not present >>> parser.parse_string('Name: Bad type\nType: other') Traceback (most recent call last): ... headerparser.errors.InvalidChoiceError: 'other' is not a valid choice for 'Type' >>> parser.parse_string('Name: unknown field\nField: Value') Traceback (most recent call last): ... headerparser.errors.UnknownFieldError: Unknown header field 'Field' Allow fields you didn't even think of:: >>> parser.add_additional() >>> msg = parser.parse_string('Name: unknown field\nField: Value') >>> msg['Field'] 'Value' Just split some headers into names & values and worry about validity later:: >>> for field in headerparser.scan_string('''\ ... Name: Scanner Sample ... Unknown headers: no problem ... Unparsed-Boolean: yes ... CaSe-SeNsItIvE-rEsUlTs: true ... Whitespace around colons:optional ... Whitespace around colons : I already said it's optional. ... That means you have the _option_ to use as much as you want! ... ... And there's a body, too, I guess. ... 
'''): print(field) ('Name', 'Scanner Sample') ('Unknown headers', 'no problem') ('Unparsed-Boolean', 'yes') ('CaSe-SeNsItIvE-rEsUlTs', 'true') ('Whitespace around colons', 'optional') ('Whitespace around colons', "I already said it's optional.\n That means you have the _option_ to use as much as you want!") (None, "And there's a body, too, I guess.\n") Indices and tables ================== * :ref:`genindex` * :ref:`search` headerparser-0.4.0/docs/parser.rst000066400000000000000000000001141347354115200171370ustar00rootroot00000000000000.. currentmodule:: headerparser Parser ====== .. autoclass:: HeaderParser headerparser-0.4.0/docs/requirements.txt000066400000000000000000000001571347354115200204040ustar00rootroot00000000000000Sphinx~=1.8 sphinx_rtd_theme~=0.3.0 # Force Read the Docs to use an up-to-date setuptools: setuptools>=34.4.0 headerparser-0.4.0/docs/scanner.rst000066400000000000000000000066061347354115200173100ustar00rootroot00000000000000.. currentmodule:: headerparser Scanner ======= Scanner functions perform basic parsing of RFC 822-style header fields, splitting them up into sequences of ``(name, value)`` pairs without any further validation or transformation. In each pair, the first element (the header field name) is the substring up to but not including the first whitespace-padded colon (or other delimiter specified by ``separator_regex``) in the first source line of the header field. The second element (the header field value) is a single string, the concatenation of one or more lines, starting with the substring after the first colon in the first source line, with leading whitespace on lines after the first preserved; the ending of each line is converted to ``'\n'`` (added if there is no line ending in the actual input), and the last line of the field value has its trailing line ending (if any) removed. .. note:: "Line ending" here means a CR, LF, or CR LF sequence. 
Unicode line separators are not treated as line endings and are not trimmed or converted to ``'\n'``. The various functions differ in how they behave once the end of the header section is encountered: - `scan()` and `scan_string()` gather up everything after the header section and (if there is anything) yield it as a ``(None, body)`` pair - `scan_next_stanza()` and `scan_next_stanza_string()` stop processing input at the end of the header section; `scan_next_stanza()` leaves the unprocessed input in the iterator, while `scan_next_stanza_string()` returns the rest of the input alongside the header fields - `scan_stanzas()` and `scan_stanzas_string()` expect their input to consist entirely of multiple blank-line-terminated header sections, all of which are processed The `scan()`, `scan_next_stanza()`, and `scan_stanzas()` functions take as input an iterable of strings (e.g., a text file object) and treat each string as a single line, regardless of whether it ends with a line ending or not (or even whether it contains a line ending in the middle of the string). The `scan_string()`, `scan_next_stanza_string()`, and `scan_stanzas_string()` functions take as input a single string which is then broken into lines on CR, LF, and CR LF boundaries and then processed as a list of strings. .. autofunction:: scan .. autofunction:: scan_string .. autofunction:: scan_next_stanza .. autofunction:: scan_next_stanza_string .. autofunction:: scan_stanzas .. autofunction:: scan_stanzas_string Deprecated Functions -------------------- .. autofunction:: scan_file .. autofunction:: scan_lines .. _scan_opts: Scanner Options --------------- The following keyword arguments can be passed to `HeaderParser` and the scanner functions in order to configure scanning behavior: ``separator_regex=r'[ \t]*:[ \t]*'`` A regex (as a `str` or compiled regex object) defining the name-value separator. 
When the regex matches a line, everything before the matched substring becomes the field name, and everything after becomes the first line of the field value. Note that the regex must match any surrounding whitespace in order for it to be trimmed from the key & value. ``skip_leading_newlines=False`` If `True`, blank lines at the beginning of the input will be discarded. If `False`, a blank line at the beginning of the input marks the end of an empty header section. .. versionadded:: 0.3.0 ``separator_regex``, ``skip_leading_newlines`` headerparser-0.4.0/docs/util.rst000066400000000000000000000002331347354115200166220ustar00rootroot00000000000000.. currentmodule:: headerparser Utilities ========= .. autoclass:: NormalizedDict .. autofunction:: BOOL .. autofunction:: lower .. autofunction:: unfold headerparser-0.4.0/headerparser/000077500000000000000000000000001347354115200166325ustar00rootroot00000000000000headerparser-0.4.0/headerparser/__init__.py000066400000000000000000000044731347354115200207530ustar00rootroot00000000000000""" argparse for mail-style headers ``headerparser`` parses key-value pairs in the style of RFC 822 (e-mail) headers and converts them into case-insensitive dictionaries with the trailing message body (if any) attached. Fields can be converted to other types, marked required, or given default values using an API based on the standard library's ``argparse`` module. (Everyone loves ``argparse``, right?) Low-level functions for just scanning header fields (breaking them into sequences of key-value pairs without any further processing) are also included. Visit or for more information. 
""" from .errors import ( Error, ParserError, DuplicateFieldError, FieldTypeError, InvalidChoiceError, MissingFieldError, UnknownFieldError, MissingBodyError, BodyNotAllowedError, ScannerError, MalformedHeaderError, UnexpectedFoldingError, ) from .normdict import NormalizedDict from .parser import HeaderParser from .scanner import scan, scan_file, scan_lines, scan_string, scan_stanzas, \ scan_stanzas_string, scan_next_stanza, \ scan_next_stanza_string from .types import BOOL, lower, unfold __version__ = '0.4.0' __author__ = 'John Thorvald Wodder II' __author_email__ = 'headerparser@varonathe.org' __license__ = 'MIT' __url__ = 'https://github.com/jwodder/headerparser' __all__ = [ 'BOOL', 'BodyNotAllowedError', 'DuplicateFieldError', 'Error', 'HeaderParser', 'FieldTypeError', 'InvalidChoiceError', 'MalformedHeaderError', 'MissingBodyError', 'MissingFieldError', 'NormalizedDict', 'ParserError', 'ScannerError', 'UnexpectedFoldingError', 'UnknownFieldError', 'lower', 'scan', 'scan_file', 'scan_lines', 'scan_next_stanza', 'scan_next_stanza_string', 'scan_stanzas', 'scan_stanzas_string', 'scan_string', 'unfold', ] headerparser-0.4.0/headerparser/errors.py000066400000000000000000000077361347354115200205350ustar00rootroot00000000000000class Error(Exception): """ Superclass for all custom exceptions raised by the package """ pass class ParserError(Error, ValueError): """ Superclass for all custom exceptions related to errors in parsing """ pass class MissingFieldError(ParserError): """ Raised when a header field marked as required is not present in the input """ def __init__(self, name): #: The name of the missing header field self.name = name super(MissingFieldError, self).__init__(name) def __str__(self): return 'Required header field {0.name!r} is not present'.format(self) class UnknownFieldError(ParserError): """ Raised when an unknown header field is encountered and additional header fields are not enabled """ def __init__(self, name): #: The name of the unknown header 
field self.name = name super(UnknownFieldError, self).__init__(name) def __str__(self): return 'Unknown header field {0.name!r}'.format(self) class DuplicateFieldError(ParserError): """ Raised when a header field not marked as multiple occurs two or more times in the input """ def __init__(self, name): #: The name of the duplicated header field self.name = name super(DuplicateFieldError, self).__init__(name) def __str__(self): return 'Header field {0.name!r} occurs more than once'.format(self) class FieldTypeError(ParserError): """ Raised when a ``type`` callable raises an exception """ def __init__(self, name, value, exc_value): #: The name of the header field for which the ``type`` callable was #: called self.name = name #: The value on which the ``type`` callable was called self.value = value #: The exception raised by the ``type`` callable self.exc_value = exc_value super(FieldTypeError, self).__init__(name, value, exc_value) def __str__(self): return 'Error while parsing {0.name!r}: {0.value!r}:'\ ' {0.exc_value.__class__.__name__}: {0.exc_value}'.format(self) class InvalidChoiceError(ParserError): """ Raised when a header field is given a value that is not one of its allowed choices """ def __init__(self, name, value): #: The name of the header field self.name = name #: The invalid value self.value = value super(InvalidChoiceError, self).__init__(name, value) def __str__(self): return '{0.value!r} is not a valid choice for {0.name!r}'.format(self) class MissingBodyError(ParserError): """ Raised when ``body=True`` but there is no message body in the input """ def __str__(self): return 'Message body is required but missing' class BodyNotAllowedError(ParserError): """ Raised when ``body=False`` and the parser encounters a message body """ def __str__(self): return 'Message body is present but not allowed' class ScannerError(Error, ValueError): """ Superclass for all custom exceptions related to errors in scanning """ pass class 
MalformedHeaderError(ScannerError): """ Raised when the scanner encounters an invalid header line, i.e., a line without either a colon or leading whitespace """ def __init__(self, line): #: The invalid header line self.line = line super(MalformedHeaderError, self).__init__(line) def __str__(self): return 'Invalid header line encountered: {0.line!r}'.format(self) class UnexpectedFoldingError(ScannerError): """ Raised when the scanner encounters a folded (indented) line that is not preceded by a valid header line """ def __init__(self, line): #: The line containing the unexpected folding (indentation) self.line = line super(UnexpectedFoldingError, self).__init__(line) def __str__(self): return 'Indented line without preceding header line encountered:'\ ' {0.line!r}'.format(self) headerparser-0.4.0/headerparser/normdict.py000066400000000000000000000107441347354115200210310ustar00rootroot00000000000000from six import PY2, iteritems, itervalues from .types import lower if PY2: from collections import Mapping, MutableMapping else: from collections.abc import Mapping, MutableMapping class NormalizedDict(MutableMapping): """ A generalization of a case-insensitive dictionary. `NormalizedDict` takes a callable (the "normalizer") that is applied to any key passed to its `~object.__getitem__`, `~object.__setitem__`, or `~object.__delitem__` method, and the result of the call is then used for the actual lookup. When iterating over a `NormalizedDict`, each key is returned as the "pre-normalized" form passed to `~object.__setitem__` the last time the key was set (but see `normalized()` below). Aside from this, `NormalizedDict` behaves like a normal `~collections.abc.MutableMapping` class. If a normalizer is not specified upon instantiation, a default will be used that converts strings to lowercase and leaves everything else unchanged, so `NormalizedDict` defaults to yet another case-insensitive dictionary. 
Two `NormalizedDict` instances compare equal iff their normalizers, bodies, and `normalized_dict()` return values are equal. When comparing a `NormalizedDict` to any other type of mapping, the other mapping is first converted to a `NormalizedDict` using the same normalizer. :param mapping data: a mapping or iterable of ``(key, value)`` pairs with which to initialize the instance :param callable normalizer: A callable to apply to keys before looking them up; defaults to `lower`. The callable MUST be idempotent (i.e., ``normalizer(x)`` must equal ``normalizer(normalizer(x))`` for all inputs) or else bad things will happen to your dictionary. :param body: initial value for the `body` attribute :type body: string or `None` """ def __init__(self, data=None, normalizer=None, body=None): self._data = {} self.normalizer = normalizer or lower #: This is where `HeaderParser` stores the message body (if any) #: accompanying the header section represented by the mapping self.body = body if data is not None: # Don't call `update` until after `normalizer` is set. 
self.update(data) def __getitem__(self, key): return self._data[self.normalizer(key)][1] def __setitem__(self, key, value): self._data[self.normalizer(key)] = (key, value) def __delitem__(self, key): del self._data[self.normalizer(key)] def __iter__(self): return (key for key, value in itervalues(self._data)) def __len__(self): return len(self._data) def __eq__(self, other): if isinstance(other, NormalizedDict): if self.normalizer != other.normalizer or self.body != other.body: return False elif isinstance(other, Mapping): if self.body is not None: return False other = NormalizedDict(other, normalizer=self.normalizer) else: return NotImplemented return self.normalized_dict() == other.normalized_dict() def __ne__(self, other): return not (self == other) def __repr__(self): return '{0.__module__}.{0.__name__}'\ '({2!r}, normalizer={1.normalizer!r}, body={1.body!r})'\ .format(type(self), self, dict(self)) def normalized(self): """ Return a copy of the instance such that iterating over it will return normalized keys instead of the keys passed to `~object.__setitem__` >>> normdict = NormalizedDict() >>> normdict['Foo'] = 23 >>> normdict['bar'] = 42 >>> sorted(normdict) ['Foo', 'bar'] >>> sorted(normdict.normalized()) ['bar', 'foo'] :rtype: NormalizedDict """ return NormalizedDict( self.normalized_dict(), normalizer=self.normalizer, body=self.body, ) def normalized_dict(self): """ Convert to a `dict` with all keys normalized. (A `dict` with non-normalized keys can be obtained with ``dict(normdict)``.) :rtype: dict """ return {key: value for key, (_, value) in iteritems(self._data)} def copy(self): """ Create a shallow copy of the mapping """ dup = type(self)() dup._data = self._data.copy() dup.normalizer = self.normalizer dup.body = self.body return dup headerparser-0.4.0/headerparser/parser.py000066400000000000000000000545721347354115200205150ustar00rootroot00000000000000from warnings import warn from six import itervalues, string_types from . 
import errors from .normdict import NormalizedDict from .scanner import scan, scan_string, scan_stanzas, scan_stanzas_string, \ scan_next_stanza, scan_next_stanza_string from .types import lower, unfold class HeaderParser(object): """ A parser for RFC 822-style header sections. Define the fields the parser should recognize with the `add_field()` method, configure handling of unrecognized fields with `add_additional()`, and then parse input with `parse()` or another `!parse_*()` method. :param callable normalizer: By default, the parser will consider two field names to be equal iff their lowercased forms are equal. This can be overridden by setting ``normalizer`` to a custom callable that takes a field name and returns a "normalized" name for use in equality testing. The normalizer will also be used when looking up keys in the `NormalizedDict` instances returned by the parser's `!parse_*()` methods. :param bool body: whether the parser should allow or forbid a body after the header section; `True` means a body is required, `False` means a body is prohibited, and `None` (the default) means a body is optional :param kwargs: :ref:`scanner options <scan_opts>` """ def __init__(self, normalizer=None, body=None, **kwargs): #: The ``normalizer`` argument passed to the constructor, or `lower` if #: no normalizer was supplied self._normalizer = normalizer or lower #: The ``body`` argument passed to the constructor self._body = body #: Scanner options self._scan_opts = kwargs #: A mapping from normalized field names to `NamedField` instances self._fielddefs = dict() #: The set of all normalized ``dest`` values for all named fields #: defined so far self._dests = set() #: If additional fields are enabled, this is the `FieldDef` instance #: used to process them; otherwise, it is `None`.
self._additional = None #: Whether any fields with custom ``dest`` values have been defined, #: thereby precluding `add_additional()` self._custom_dests = False def __eq__(self, other): if type(self) is type(other): return vars(self) == vars(other) else: return NotImplemented def __ne__(self, other): return not (self == other) def add_field(self, name, *altnames, **kwargs): """ Define a header field for the parser to parse. During parsing, if a field is encountered whose name (*modulo* normalization) equals either ``name`` or one of the ``altnames``, the field's value will be processed according to the options in ``**kwargs``. (If no options are specified, the value will just be stored in the result dictionary.) .. versionadded:: 0.2.0 ``action`` argument added :param string name: the primary name for the field, used in error messages and as the default value of ``dest`` :param strings altnames: field name synonyms :param dest: The key in the result dictionary in which the field's value(s) will be stored; defaults to ``name``. When additional headers are enabled (see `add_additional`), ``dest`` must equal (after normalization) one of the field's names. :param bool required: If `True` (default `False`), the ``parse_*`` methods will raise a `~headerparser.errors.MissingFieldError` if the field is not present in the input :param default: The value to associate with the field if it is not present in the input. If no default value is specified, the field will be omitted from the result dictionary if it is not present in the input. ``default`` cannot be set when the field is required. ``type``, ``unfold``, and ``action`` will not be applied to the default value, and the default value need not belong to ``choices``. :param bool multiple: If `True`, the header field will be allowed to occur more than once in the input, and all of the field's values will be stored in a list. 
If `False` (the default), a `~headerparser.errors.DuplicateFieldError` will be raised if the field occurs more than once in the input. :param bool unfold: If `True` (default `False`), the field value will be "unfolded" (i.e., line breaks will be removed and whitespace around line breaks will be converted to a single space) before applying ``type`` :param callable type: a callable to apply to the field value before storing it in the result dictionary :param iterable choices: A sequence of values which the field is allowed to have. If ``choices`` is defined, all occurrences of the field in the input must have one of the given values (after applying ``type``) or else an `~headerparser.errors.InvalidChoiceError` is raised. :param callable action: A callable to invoke whenever the field is encountered in the input. The callable will be passed the current dictionary of header fields, the field's ``name``, and the field's value (after processing with ``type`` and ``unfold`` and checking against ``choices``). The callable replaces the default behavior of storing the field's values in the result dictionary, and so the callable must explicitly store the values if desired. When ``action`` is defined for a field, ``dest`` cannot be. 
:return: `None` :raises ValueError: - if another field with the same name or ``dest`` was already defined - if ``dest`` is not one of the field's names and `add_additional` is enabled - if ``default`` is defined and ``required`` is true - if ``choices`` is an empty sequence - if both ``dest`` and ``action`` are defined :raises TypeError: if ``name`` or one of the ``altnames`` is not a string """ if 'action' in kwargs and 'dest' in kwargs: raise ValueError('`action` and `dest` are mutually exclusive') kwargs.setdefault('dest', name) hd = NamedField(name=name, **kwargs) normed = set(map(self._normalizer, (name,) + altnames)) # Error before modifying anything: redefs = [n for n in self._fielddefs if n in normed] if redefs: raise ValueError('field defined more than once: ' + repr(redefs[0])) if self._normalizer(hd.dest) in self._dests: raise ValueError('destination defined more than once: ' + repr(hd.dest)) if self._normalizer(hd.dest) not in normed: if self._additional is not None: raise ValueError('add_additional and `dest` are mutually exclusive') self._custom_dests = True for n in normed: self._fielddefs[n] = hd self._dests.add(self._normalizer(hd.dest)) def add_additional(self, enable=True, **kwargs): """ Specify how the parser should handle fields in the input that were not previously registered with `add_field`. By default, unknown fields will cause the ``parse_*`` methods to raise an `~headerparser.errors.UnknownFieldError`, but calling this method with ``enable=True`` (the default) will change the parser's behavior so that all unregistered fields are processed according to the options in ``**kwargs``. (If no options are specified, the additional values will just be stored in the result dictionary.) If this method is called more than once, only the settings from the last call will be used. 
Note that additional field values are always stored in the result dictionary using their field name as the key, and two fields are considered the same (for the purposes of ``multiple``) iff their names are the same after normalization. Customization of the dictionary key and field name can only be done through `add_field`. .. versionadded:: 0.2.0 ``action`` argument added :param bool enable: whether the parser should accept input fields that were not registered with `add_field`; setting this to `False` disables additional fields and restores the parser's default behavior :param bool multiple: If `True`, each additional header field will be allowed to occur more than once in the input, and each field's values will be stored in a list. If `False` (the default), a `~headerparser.errors.DuplicateFieldError` will be raised if an additional field occurs more than once in the input. :param bool unfold: If `True` (default `False`), additional field values will be "unfolded" (i.e., line breaks will be removed and whitespace around line breaks will be converted to a single space) before applying ``type`` :param callable type: a callable to apply to additional field values before storing them in the result dictionary :param iterable choices: A sequence of values which additional fields are allowed to have. If ``choices`` is defined, all additional field values in the input must have one of the given values (after applying ``type``) or else an `~headerparser.errors.InvalidChoiceError` is raised. :param callable action: A callable to invoke whenever the field is encountered in the input. The callable will be passed the current dictionary of header fields, the field's name, and the field's value (after processing with ``type`` and ``unfold`` and checking against ``choices``). The callable replaces the default behavior of storing the field's values in the result dictionary, and so the callable must explicitly store the values if desired. 
:return: `None` :raises ValueError: - if ``enable`` is true and a previous call to `add_field` used a custom ``dest`` - if ``choices`` is an empty sequence """ if enable: if self._custom_dests: raise ValueError('add_additional and `dest` are mutually exclusive') self._additional = FieldDef(**kwargs) else: self._additional = None def parse_stream(self, fields): """ Process a sequence of ``(name, value)`` pairs as returned by `scan()` or `scan_string()` and return a dictionary of header fields (possibly with body attached). This is a low-level method that you will usually not need to call. :param fields: a sequence of ``(name, value)`` pairs representing the input fields :type fields: iterable of pairs of strings :rtype: NormalizedDict :raises ParserError: if the input fields do not conform to the field definitions declared with `add_field` and `add_additional` :raises ValueError: if the input contains more than one body pair """ data = NormalizedDict(normalizer=self._normalizer) fields_seen = set() body_seen = False for k,v in fields: if k is None: if body_seen: raise ValueError('Body appears twice in input') if self._body is not None and not self._body: raise errors.BodyNotAllowedError() data.body = v body_seen = True else: try: hd = self._fielddefs[self._normalizer(k)] except KeyError: if self._additional is not None: hd = self._additional else: raise errors.UnknownFieldError(k) else: fields_seen.add(hd.name) hd.process(data, k, v) for hd in itervalues(self._fielddefs): if hd.name not in fields_seen: if hd.required: raise errors.MissingFieldError(hd.name) elif hasattr(hd, 'default'): data[hd.dest] = hd.default if self._body and not body_seen: raise errors.MissingBodyError() return data def parse(self, iterable): """ .. versionadded:: 0.4.0 Parse an RFC 822-style header field section (possibly followed by a message body) from the contents of the given filehandle or sequence of lines and return a dictionary of the header fields (possibly with body attached). 
If ``iterable`` is an iterable of `str`, newlines will be appended to lines in multiline header fields where not already present but will not be inserted where missing inside the body. :param iterable: a text-file-like object or iterable of lines to parse :rtype: NormalizedDict :raises ParserError: if the input fields do not conform to the field definitions declared with `add_field` and `add_additional` :raises ScannerError: if the header section is malformed """ return self.parse_stream(scan(iterable, **self._scan_opts)) def parse_file(self, fp): """ Parse an RFC 822-style header field section (possibly followed by a message body) from the contents of the given filehandle and return a dictionary of the header fields (possibly with body attached) .. deprecated:: 0.4.0 Use `parse()` instead. :param fp: the file to parse :type fp: file-like object :rtype: NormalizedDict :raises ParserError: if the input fields do not conform to the field definitions declared with `add_field` and `add_additional` :raises ScannerError: if the header section is malformed """ warn( 'HeaderParser.parse_file() is deprecated.' ' Use the parse() method instead.', DeprecationWarning, ) return self.parse_stream(scan(fp, **self._scan_opts)) def parse_lines(self, iterable): """ Parse an RFC 822-style header field section (possibly followed by a message body) from the given sequence of lines and return a dictionary of the header fields (possibly with body attached). Newlines will be inserted where not already present in multiline header fields but will not be inserted inside the body. .. deprecated:: 0.4.0 Use `parse()` instead. :param iterable: a sequence of lines comprising the text to parse :type iterable: iterable of strings :rtype: NormalizedDict :raises ParserError: if the input fields do not conform to the field definitions declared with `add_field` and `add_additional` :raises ScannerError: if the header section is malformed """ warn( 'HeaderParser.parse_lines() is deprecated.' 
' Use the parse() method instead.', DeprecationWarning, ) return self.parse_stream(scan(iterable, **self._scan_opts)) def parse_string(self, s): """ Parse an RFC 822-style header field section (possibly followed by a message body) from the given string and return a dictionary of the header fields (possibly with body attached) :param string s: the text to parse :rtype: NormalizedDict :raises ParserError: if the input fields do not conform to the field definitions declared with `add_field` and `add_additional` :raises ScannerError: if the header section is malformed """ return self.parse_stream(scan_string(s, **self._scan_opts)) def parse_stanzas(self, iterable): """ .. versionadded:: 0.4.0 Parse zero or more stanzas of RFC 822-style header fields from the given filehandle or sequence of lines and return a generator of dictionaries of header fields. All of the input is treated as header sections, not message bodies; as a result, calling this method when ``body`` is true will produce a `MissingBodyError`. :param iterable: a text-file-like object or iterable of lines to parse :rtype: generator of `NormalizedDict` :raises ParserError: if the input fields do not conform to the field definitions declared with `add_field` and `add_additional` :raises ScannerError: if a header section is malformed """ return self.parse_stanzas_stream( scan_stanzas(iterable, **self._scan_opts) ) def parse_stanzas_string(self, s): """ .. versionadded:: 0.4.0 Parse zero or more stanzas of RFC 822-style header fields from the given string and return a generator of dictionaries of header fields. All of the input is treated as header sections, not message bodies; as a result, calling this method when ``body`` is true will produce a `MissingBodyError`. 
:param string s: the text to parse :rtype: generator of `NormalizedDict` :raises ParserError: if the input fields do not conform to the field definitions declared with `add_field` and `add_additional` :raises ScannerError: if a header section is malformed """ return self.parse_stanzas_stream( scan_stanzas_string(s, **self._scan_opts) ) def parse_stanzas_stream(self, fields): """ .. versionadded:: 0.4.0 Parse an iterable of iterables of ``(name, value)`` pairs as returned by `scan_stanzas()` or `scan_stanzas_string()` and return a generator of dictionaries of header fields. This is a low-level method that you will usually not need to call. :param fields: an iterable of iterables of pairs of strings :rtype: generator of `NormalizedDict` :raises ParserError: if the input fields do not conform to the field definitions declared with `add_field` and `add_additional` :raises ScannerError: if a header section is malformed """ for stanza in fields: yield self.parse_stream(stanza) def parse_next_stanza(self, iterator): """ .. versionadded:: 0.4.0 Parse an RFC 822-style header field section from the contents of the given filehandle or iterator of lines and return a dictionary of the header fields. Input processing stops at the end of the header section, leaving the rest of the iterator unconsumed. As a message body is not consumed, calling this method when ``body`` is true will produce a `MissingBodyError`. :param iterator: a text-file-like object or iterator of lines to parse :rtype: NormalizedDict :raises ParserError: if the input fields do not conform to the field definitions declared with `add_field` and `add_additional` :raises ScannerError: if a header section is malformed """ return self.parse_stream(scan_next_stanza(iterator, **self._scan_opts)) def parse_next_stanza_string(self, s): """ .. versionadded:: 0.4.0 Parse an RFC 822-style header field section from the given string and return a pair of a dictionary of the header fields and the rest of the string.
As a message body is not consumed, calling this method when ``body`` is true will produce a `MissingBodyError`. :param string s: the text to parse :rtype: pair of `NormalizedDict` and a string :raises ParserError: if the input fields do not conform to the field definitions declared with `add_field` and `add_additional` :raises ScannerError: if a header section is malformed """ fields, extra = scan_next_stanza_string(s, **self._scan_opts) return (self.parse_stream(fields), extra) class FieldDef(object): def __init__(self, type=None, multiple=False, unfold=False, choices=None, action=None): self.type = type self.multiple = bool(multiple) self.unfold = bool(unfold) if choices is not None: choices = list(choices) if not choices: raise ValueError('empty list supplied for choices') self.choices = choices self.action = action def __eq__(self, other): if type(self) is type(other): return vars(self) == vars(other) else: return NotImplemented def __ne__(self, other): return not (self == other) def _process(self, data, name, dest, value): if self.unfold: value = unfold(value) if self.type is not None: try: value = self.type(value) except errors.FieldTypeError: raise except Exception as e: raise errors.FieldTypeError(name, value, e) if self.choices is not None and value not in self.choices: raise errors.InvalidChoiceError(name, value) if self.action is not None: self.action(data, name, value) elif self.multiple: data.setdefault(dest, []).append(value) elif dest in data: raise errors.DuplicateFieldError(name) else: data[dest] = value def process(self, data, name, value): self._process(data, name, name, value) class NamedField(FieldDef): def __init__(self, name, dest, required=False, **kwargs): if not isinstance(name, string_types): raise TypeError('field names must be strings') self.name = name self.dest = dest self.required = bool(required) if 'default' in kwargs: if self.required: raise ValueError('required and default are mutually exclusive') self.default = 
kwargs.pop('default') super(NamedField, self).__init__(**kwargs) def process(self, data, _, value): self._process(data, self.name, self.dest, value) headerparser-0.4.0/headerparser/scanner.py000066400000000000000000000220201347354115200206310ustar00rootroot00000000000000import re from warnings import warn from .errors import MalformedHeaderError, UnexpectedFoldingError from .util import ascii_splitlines def scan_string(s, **kwargs): """ Scan a string for RFC 822-style header fields and return a generator of ``(name, value)`` pairs for each header field in the input, plus a ``(None, body)`` pair representing the body (if any) after the header section. See `scan()` for more information on the exact behavior of the scanner. :param s: a string which will be broken into lines on CR, LF, and CR LF boundaries and passed to `scan()` :param kwargs: :ref:`scanner options <scan_opts>` :rtype: generator of pairs of strings :raises ScannerError: if the header section is malformed """ return scan(ascii_splitlines(s), **kwargs) def scan_file(fp, **kwargs): """ Scan a file for RFC 822-style header fields and return a generator of ``(name, value)`` pairs for each header field in the input, plus a ``(None, body)`` pair representing the body (if any) after the header section. See `scan()` for more information on the exact behavior of the scanner. .. deprecated:: 0.4.0 Use `scan()` instead. :param fp: A file-like object that can be iterated over to produce lines to pass to `scan()`. Opening the file in universal newlines mode is recommended. :param kwargs: :ref:`scanner options <scan_opts>` :rtype: generator of pairs of strings :raises ScannerError: if the header section is malformed """ warn('scan_file() is deprecated.
Use scan() instead.', DeprecationWarning) return scan(fp, **kwargs) def scan_lines(fp, **kwargs): """ Scan an iterable of lines for RFC 822-style header fields and return a generator of ``(name, value)`` pairs for each header field in the input, plus a ``(None, body)`` pair representing the body (if any) after the header section. See `scan()` for more information on the exact behavior of the scanner. .. deprecated:: 0.4.0 Use `scan()` instead. :param fp: an iterable of strings representing lines of input :param kwargs: :ref:`scanner options <scan_opts>` :rtype: generator of pairs of strings :raises ScannerError: if the header section is malformed """ warn('scan_lines() is deprecated. Use scan() instead.', DeprecationWarning) return scan(fp, **kwargs) def scan(iterable, **kwargs): """ .. versionadded:: 0.4.0 Scan a text-file-like object or iterable of lines for RFC 822-style header fields and return a generator of ``(name, value)`` pairs for each header field in the input, plus a ``(None, body)`` pair representing the body (if any) after the header section. All lines after the first blank line are concatenated & yielded as-is in a ``(None, body)`` pair. (Note that body lines which do not end with a line terminator will not have one appended.) If there is no empty line in ``iterable``, then no body pair is yielded. If the empty line is the last line in ``iterable``, the body will be the empty string. If the empty line is the *first* line in ``iterable`` and the ``skip_leading_newlines`` option is false (the default), then all other lines will be treated as part of the body and will not be scanned for header fields.
:param iterable: a text-file-like object or iterable of strings representing lines of input :param kwargs: :ref:`scanner options <scan_opts>` :rtype: generator of pairs of strings :raises ScannerError: if the header section is malformed """ lineiter = iter(iterable) for name, value in _scan_next_stanza(lineiter, **kwargs): if name is not None: yield (name, value) elif value: yield (None, ''.join(lineiter)) def scan_next_stanza(iterator, **kwargs): """ .. versionadded:: 0.4.0 Scan a text-file-like object or iterator of lines for RFC 822-style header fields and return a generator of ``(name, value)`` pairs for each header field in the input. Input processing stops as soon as a blank line is encountered, leaving the rest of the iterator unconsumed. (If ``skip_leading_newlines`` is true, the function only stops on a blank line after a non-blank line.) :param iterator: a text-file-like object or iterator of strings representing lines of input :param kwargs: :ref:`scanner options <scan_opts>` :rtype: generator of pairs of strings :raises ScannerError: if the header section is malformed """ for name, value in _scan_next_stanza(iterator, **kwargs): if name is not None: yield (name, value) def _scan_next_stanza( iterator, separator_regex = re.compile(r'[ \t]*:[ \t]*'), # noqa: B008 skip_leading_newlines = False, ): """ .. versionadded:: 0.4.0 Like `scan_next_stanza()`, except it additionally yields as its last item a ``(None, flag)`` pair where ``flag`` is `True` iff the stanza was terminated by a blank line (thereby suggesting there is more input left to process), `False` iff the stanza was terminated by EOF. This is the core function that all other scanners ultimately call.
""" name = None value = '' begun = False more_left = False if not hasattr(separator_regex, 'match'): separator_regex = re.compile(separator_regex) for line in iterator: line = line.rstrip('\r\n') if line.startswith((' ', '\t')): begun = True if name is not None: value += '\n' + line else: raise UnexpectedFoldingError(line) else: m = separator_regex.search(line) if m: begun = True if name is not None: yield (name, value) name = line[:m.start()] value = line[m.end():] elif line == '': if skip_leading_newlines and not begun: continue else: more_left = True break else: raise MalformedHeaderError(line) if name is not None: yield (name, value) yield (None, more_left) def scan_next_stanza_string(s, **kwargs): """ .. versionadded:: 0.4.0 Scan a string for RFC 822-style header fields and return a pair ``(fields, extra)`` where ``fields`` is a list of ``(name, value)`` pairs for each header field in the input up to the first blank line and ``extra`` is everything after the first blank line (If ``skip_leading_newlines`` is true, the dividing point is instead the first blank line after a non-blank line); if there is no appropriate blank line in the input, ``extra`` is the empty string. :param s: a string to scan :param kwargs: :ref:`scanner options ` :rtype: pair of a list of pairs of strings and a string :raises ScannerError: if the header section is malformed """ lineiter = iter(ascii_splitlines(s)) fields = list(scan_next_stanza(lineiter, **kwargs)) body = ''.join(lineiter) return (fields, body) def scan_stanzas(iterable, **kwargs): """ .. versionadded:: 0.4.0 Scan a text-file-like object or iterable of lines for zero or more stanzas of RFC 822-style header fields and return a generator of lists of ``(name, value)`` pairs, where each list represents a stanza of header fields in the input. The stanzas are terminated by blank lines. Consecutive blank lines between stanzas are treated as a single blank line. 
    Blank lines at the end of the input are discarded without creating a new
    stanza.

    :param iterable: a text-file-like object or iterable of strings
        representing lines of input
    :param kwargs: :ref:`scanner options `
    :rtype: generator of lists of pairs of strings
    :raises ScannerError: if the header section is malformed
    """
    lineiter = iter(iterable)
    while True:
        fields = list(_scan_next_stanza(lineiter, **kwargs))
        more_left = fields.pop()[1]
        if fields or more_left:
            yield fields
        else:
            break
        kwargs["skip_leading_newlines"] = True

def scan_stanzas_string(s, **kwargs):
    """
    .. versionadded:: 0.4.0

    Scan a string for zero or more stanzas of RFC 822-style header fields and
    return a generator of lists of ``(name, value)`` pairs, where each list
    represents a stanza of header fields in the input.

    The stanzas are terminated by blank lines. Consecutive blank lines between
    stanzas are treated as a single blank line. Blank lines at the end of the
    input are discarded without creating a new stanza.

    :param s: a string which will be broken into lines on CR, LF, and CR LF
        boundaries and passed to `scan_stanzas()`
    :param kwargs: :ref:`scanner options `
    :rtype: generator of lists of pairs of strings
    :raises ScannerError: if the header section is malformed
    """
    return scan_stanzas(ascii_splitlines(s), **kwargs)
headerparser-0.4.0/headerparser/types.py000066400000000000000000000031001347354115200203420ustar00rootroot00000000000000
import re

TRUTHY = {'yes', 'y', 'on', 'true', '1'}
FALSEY = {'no', 'n', 'off', 'false', '0'}

def BOOL(s):
    """
    Convert boolean-like strings to `bool` values. The strings ``'yes'``,
    ``'y'``, ``'on'``, ``'true'``, and ``'1'`` are converted to `True`, and the
    strings ``'no'``, ``'n'``, ``'off'``, ``'false'``, and ``'0'`` are
    converted to `False`. The conversion is case-insensitive and ignores
    leading & trailing whitespace. Any value that cannot be converted to a
    `bool` results in a `ValueError`.
    :param string s: a boolean-like string to convert to a `bool`
    :rtype: bool
    :raises ValueError: if ``s`` is not one of the values listed above
    """
    b = s.strip().lower()
    if b in TRUTHY:
        return True
    elif b in FALSEY:
        return False
    else:
        raise ValueError('invalid boolean: ' + repr(s))

def lower(s):
    """
    .. versionadded:: 0.2.0

    Convert ``s`` to lowercase by calling its :meth:`~str.lower()` method if it
    has one; otherwise, return ``s`` unchanged.
    """
    try:
        return s.lower()
    except (TypeError, AttributeError):
        return s

def unfold(s):
    r"""
    .. versionadded:: 0.2.0

    Remove folding whitespace from a string by converting line breaks (and any
    whitespace adjacent to line breaks) to a single space and removing leading
    & trailing whitespace.

    >>> unfold('This is a \n folded string.\n')
    'This is a folded string.'

    :param string s: a string to unfold
    :rtype: string
    """
    return re.sub(r'[ \t]*[\r\n][ \t\r\n]*', ' ', s).strip(' ')
headerparser-0.4.0/headerparser/util.py000066400000000000000000000003741347354115200201650ustar00rootroot00000000000000
import re

def ascii_splitlines(s):
    # Split on CR, LF, and CR LF only (unlike str.splitlines(), which also
    # breaks on other separators), keeping each line's terminator
    lines = []
    lastend = 0
    for m in re.finditer(r'\r\n?|\n', s):
        lines.append(s[lastend:m.end()])
        lastend = m.end()
    # Keep any trailing text that has no line terminator
    if lastend < len(s):
        lines.append(s[lastend:])
    return lines
headerparser-0.4.0/setup.cfg000066400000000000000000000031121347354115200160030ustar00rootroot00000000000000[aliases] make=sdist bdist_wheel [bdist_wheel] universal=1 [metadata] name = headerparser #version = # Set in setup.py description = argparse for mail-style headers long_description = file:README.rst long_description_content_type = text/x-rst author = John Thorvald Wodder II author_email = headerparser@varonathe.org license = MIT license_file = LICENSE url = https://github.com/jwodder/headerparser keywords = e-mail email mail rfc822 headers rfc2822 rfc5322 parser classifiers = Development Status :: 4 - Beta #Development Status :: 5 - Production/Stable Programming Language :: Python :: 2 Programming Language :: Python :: 2.7 Programming
Language :: Python :: 3 Programming Language :: Python :: 3.4 Programming Language :: Python :: 3.5 Programming Language :: Python :: 3.6 Programming Language :: Python :: 3.7 Programming Language :: Python :: Implementation :: CPython Programming Language :: Python :: Implementation :: PyPy License :: OSI Approved :: MIT License Intended Audience :: Developers Topic :: Communications :: Email Topic :: Communications :: Usenet News Topic :: Internet :: WWW/HTTP Topic :: Text Processing project_urls = Source Code = https://github.com/jwodder/headerparser Bug Tracker = https://github.com/jwodder/headerparser/issues Documentation = https://headerparser.readthedocs.io Say Thanks! = https://saythanks.io/to/jwodder [options] packages = find: python_requires = >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4 install_requires = six ~= 1.1 headerparser-0.4.0/setup.py000066400000000000000000000006421347354115200157010ustar00rootroot00000000000000from os.path import dirname, join import re from setuptools import setup with open(join(dirname(__file__), 'headerparser', '__init__.py')) as fp: for line in fp: m = re.search(r'^\s*__version__\s*=\s*([\'"])([^\'"]+)\1\s*$', line) if m: version = m.group(2) break else: raise RuntimeError('Unable to find own __version__ string') setup(version=version) headerparser-0.4.0/test/000077500000000000000000000000001347354115200151445ustar00rootroot00000000000000headerparser-0.4.0/test/test_normdict.py000066400000000000000000000127221347354115200204000ustar00rootroot00000000000000import re import pytest from headerparser import NormalizedDict, lower def test_empty(): nd = NormalizedDict() assert dict(nd) == {} assert nd.body is None assert len(nd) == 0 assert not bool(nd) assert nd.normalizer is lower def test_one(): nd = NormalizedDict({"Foo": "bar"}) assert dict(nd) == {"Foo": "bar"} assert nd.body is None assert len(nd) == 1 assert bool(nd) assert nd.normalizer is lower def test_get_cases(): nd = NormalizedDict({"Foo": "bar"}) assert nd["Foo"] 
== "bar" assert nd["Foo"] == nd["foo"] == nd["FOO"] == nd["fOO"] def test_set(): nd = NormalizedDict() assert dict(nd) == {} nd["Foo"] = "bar" assert dict(nd) == {"Foo": "bar"} assert nd["Foo"] == "bar" assert nd["Foo"] == nd["foo"] == nd["FOO"] == nd["fOO"] nd["fOO"] = "quux" assert dict(nd) == {"fOO": "quux"} assert nd["Foo"] == "quux" assert nd["Foo"] == nd["foo"] == nd["FOO"] == nd["fOO"] def test_del(): nd = NormalizedDict({"Foo": "bar", "Bar": "FOO"}) del nd["Foo"] assert dict(nd) == {"Bar": "FOO"} del nd["BAR"] assert dict(nd) == {} def test_del_nexists(): nd = NormalizedDict({"Foo": "bar", "Bar": "FOO"}) with pytest.raises(KeyError): del nd["Baz"] def test_eq_empty(): nd = NormalizedDict() nd2 = NormalizedDict() assert nd == nd2 def test_eq_nonempty(): nd = NormalizedDict({"Foo": "bar"}) nd2 = NormalizedDict({"Foo": "bar"}) assert nd == nd2 def test_eq_cases(): nd = NormalizedDict({"Foo": "bar"}) nd2 = NormalizedDict({"fOO": "bar"}) assert nd == nd2 def test_neq(): assert NormalizedDict({"Foo": "bar"}) != NormalizedDict({"Foo": "BAR"}) def test_normalized(): nd = NormalizedDict({"Foo": "BAR"}) nd2 = nd.normalized() assert isinstance(nd2, NormalizedDict) assert dict(nd2) == {"foo": "BAR"} assert nd2.body is None assert nd == nd2 def test_normalized_with_body(): nd = NormalizedDict({"Foo": "BAR"}, body='Glarch.') nd2 = nd.normalized() assert isinstance(nd2, NormalizedDict) assert dict(nd2) == {"foo": "BAR"} assert nd2.body == 'Glarch.' 
assert nd == nd2 def test_normalized_dict(): nd = NormalizedDict({"Foo": "BAR"}) nd2 = nd.normalized_dict() assert isinstance(nd2, dict) assert nd2 == {"foo": "BAR"} def test_eq_dict(): nd = NormalizedDict({"Foo": "BAR"}) assert nd == {"Foo": "BAR"} assert {"Foo": "BAR"} == nd assert nd == {"FOO": "BAR"} assert {"FOO": "BAR"} == nd assert nd == {"foo": "BAR"} assert {"foo": "BAR"} == nd assert nd != {"Foo": "bar"} assert {"Foo": "bar"} != nd def test_body_neq_dict(): nd = NormalizedDict({"Foo": "BAR"}, body='') assert nd != {"Foo": "BAR"} assert {"Foo": "BAR"} != nd def test_eq_body(): nd = NormalizedDict({"Foo": "bar"}, body='') nd2 = NormalizedDict({"fOO": "bar"}, body='') assert nd == nd2 def test_neq_body(): nd = NormalizedDict({"Foo": "bar"}, body='yes') nd2 = NormalizedDict({"fOO": "bar"}, body='no') assert nd != nd2 def test_neq_none(): assert NormalizedDict() != None # noqa: E711 assert None != NormalizedDict() # noqa: E711 def test_neq_bool(): assert NormalizedDict() != False # noqa: E712 assert False != NormalizedDict() # noqa: E712 def test_neq_int(): assert NormalizedDict() != 42 assert 42 != NormalizedDict() def test_init_list(): nd = NormalizedDict([("Foo", "bar"), ("Bar", "baz"), ("FOO", "quux")]) assert dict(nd) == {"FOO": "quux", "Bar": "baz"} def test_copy(): nd = NormalizedDict({"Foo": "bar"}) nd2 = nd.copy() assert nd is not nd2 assert isinstance(nd2, NormalizedDict) assert dict(nd2) == {"Foo": "bar"} assert nd2.body is None assert nd == nd2 nd2["Foo"] = "gnusto" assert dict(nd) == {"Foo": "bar"} assert dict(nd2) == {"Foo": "gnusto"} assert nd != nd2 nd2["fOO"] = "quux" assert dict(nd) == {"Foo": "bar"} assert dict(nd2) == {"fOO": "quux"} assert nd != nd2 nd2["Glarch"] = "baz" assert dict(nd) == {"Foo": "bar"} assert dict(nd2) == {"fOO": "quux", "Glarch": "baz"} assert nd != nd2 def test_copy_with_body(): nd = NormalizedDict({"Foo": "bar"}, body='Glarch.') nd2 = nd.copy() assert nd is not nd2 assert isinstance(nd2, NormalizedDict) assert 
dict(nd2) == {"Foo": "bar"} assert nd2.body == 'Glarch.' assert nd == nd2 nd2.body = 'quux' assert nd.body == 'Glarch.' assert nd2.body == 'quux' assert nd != nd2 def test_neq_normalizers_empty(): nd = NormalizedDict() nd2 = NormalizedDict(normalizer=lambda x: x) assert dict(nd) == dict(nd2) == {} assert nd != nd2 def test_neq_normalizers_nonempty(): nd = NormalizedDict({"Foo": "bar"}) nd2 = NormalizedDict({"Foo": "bar"}, normalizer=lambda x: x) assert dict(nd) == dict(nd2) == {"Foo": "bar"} assert nd != nd2 def normdash(s): return re.sub(r'[-_\s]+', '-', s.lower()) def identity(s): return s @pytest.mark.parametrize('data', [ {}, {'Foo': 'Bar'}, {'foo': 'Bar'}, {'FOO_BAR': 'BAZ'}, ]) @pytest.mark.parametrize('normalizer', [None, lower, normdash, identity]) @pytest.mark.parametrize('body', [None, 'Glarch.']) def test_repr(data, normalizer, body): nd = NormalizedDict(data, body=body, normalizer=normalizer) assert repr(nd) == 'headerparser.normdict.NormalizedDict'\ '({!r}, normalizer={!r}, body={!r})'\ .format(data, normalizer or lower, body) headerparser-0.4.0/test/test_normdict_custom.py000066400000000000000000000121721347354115200217710ustar00rootroot00000000000000import re import pytest from headerparser import NormalizedDict def normdash(s): return re.sub(r'[-_\s]+', '-', s.lower()) def test_empty(): nd = NormalizedDict(normalizer=normdash) assert dict(nd) == {} assert nd.body is None assert len(nd) == 0 assert not bool(nd) assert nd.normalizer is normdash def test_one(): nd = NormalizedDict({"A Key": "bar"}, normalizer=normdash) assert dict(nd) == {"A Key": "bar"} assert nd.body is None assert len(nd) == 1 assert bool(nd) assert nd.normalizer is normdash def test_get_cases(): nd = NormalizedDict({"A Key": "bar"}, normalizer=normdash) assert nd["A Key"] == "bar" assert nd["A Key"] == nd["a_key"] == nd["A-KEY"] == nd["A - key"] def test_set(): nd = NormalizedDict(normalizer=normdash) assert dict(nd) == {} nd["A Key"] = "bar" assert dict(nd) == {"A Key": "bar"} 
assert nd["A Key"] == "bar" assert nd["A Key"] == nd["a_key"] == nd["A-KEY"] == nd["A - key"] nd["A-Key"] = "quux" assert dict(nd) == {"A-Key": "quux"} assert nd["A Key"] == "quux" assert nd["A Key"] == nd["a_key"] == nd["A-KEY"] == nd["A - key"] def test_del(): nd = NormalizedDict( {"A Key": "bar", "Another-Key": "FOO"}, normalizer=normdash, ) del nd["A Key"] assert dict(nd) == {"Another-Key": "FOO"} del nd["ANOTHER_KEY"] assert dict(nd) == {} def test_del_nexists(): nd = NormalizedDict( {"A Key": "bar", "Another-Key": "FOO"}, normalizer=normdash, ) with pytest.raises(KeyError): del nd["AKey"] def test_eq_empty(): nd = NormalizedDict(normalizer=normdash) nd2 = NormalizedDict(normalizer=normdash) assert nd == nd2 def test_eq_nonempty(): nd = NormalizedDict({"Foo": "bar"}, normalizer=normdash) nd2 = NormalizedDict({"Foo": "bar"}, normalizer=normdash) assert nd == nd2 def test_eq_cases(): nd = NormalizedDict({"A Key": "bar"}, normalizer=normdash) nd2 = NormalizedDict({"a_key": "bar"}, normalizer=normdash) assert nd == nd2 def test_neq(): assert NormalizedDict({"A Key": "A Value"}, normalizer=normdash) \ != NormalizedDict({"A Key": "a_value"}, normalizer=normdash) def test_normalized(): nd = NormalizedDict({"A Key": "BAR"}, normalizer=normdash) nd2 = nd.normalized() assert isinstance(nd2, NormalizedDict) assert dict(nd2) == {"a-key": "BAR"} assert nd2.body is None assert nd2.normalizer is normdash assert nd == nd2 def test_normalized_with_body(): nd = NormalizedDict({"A Key": "BAR"}, body='Foo Baz', normalizer=normdash) nd2 = nd.normalized() assert isinstance(nd2, NormalizedDict) assert dict(nd2) == {"a-key": "BAR"} assert nd2.body == 'Foo Baz' assert nd2.normalizer is normdash assert nd == nd2 def test_normalized_dict(): nd = NormalizedDict({"A Key": "BAR"}, normalizer=normdash) nd2 = nd.normalized_dict() assert isinstance(nd2, dict) assert nd2 == {"a-key": "BAR"} def test_eq_dict(): nd = NormalizedDict({"A Key": "BAR"}, normalizer=normdash) assert nd == {"A Key": 
"BAR"} assert {"A Key": "BAR"} == nd assert nd == {"A_KEY": "BAR"} assert {"A_KEY": "BAR"} == nd assert nd == {"a-key": "BAR"} assert {"a-key": "BAR"} == nd assert nd != {"A Key": "bar"} assert {"A Key": "bar"} != nd def test_body_neq_dict(): nd = NormalizedDict({"A Key": "BAR"}, body='', normalizer=normdash) assert nd != {"A Key": "BAR"} assert {"A Key": "BAR"} != nd def test_eq_body(): nd = NormalizedDict({"A Key": "bar"}, body='', normalizer=normdash) nd2 = NormalizedDict({"a_KEY": "bar"}, body='', normalizer=normdash) assert nd == nd2 def test_neq_body(): nd = NormalizedDict({"A Key": "bar"}, body='yes', normalizer=normdash) nd2 = NormalizedDict({"a_KEY": "bar"}, body='no', normalizer=normdash) assert nd != nd2 def test_init_list(): nd = NormalizedDict( [("A Key", "bar"), ("Another-Key", "baz"), ("A_KEY", "quux")], normalizer=normdash, ) assert dict(nd) == {"A_KEY": "quux", "Another-Key": "baz"} def test_copy(): nd = NormalizedDict({"A Key": "bar"}, normalizer=normdash) nd2 = nd.copy() assert isinstance(nd2, NormalizedDict) assert dict(nd2) == {"A Key": "bar"} assert nd2.body is None assert nd2.normalizer is normdash assert nd == nd2 nd2["A Key"] = "gnusto" assert dict(nd) == {"A Key": "bar"} assert dict(nd2) == {"A Key": "gnusto"} assert nd != nd2 nd2["a-key"] = "quux" assert dict(nd) == {"A Key": "bar"} assert dict(nd2) == {"a-key": "quux"} assert nd != nd2 nd2["Another_Key"] = "baz" assert dict(nd) == {"A Key": "bar"} assert dict(nd2) == {"a-key": "quux", "Another_Key": "baz"} assert nd != nd2 def test_copy_with_body(): nd = NormalizedDict({"A Key": "bar"}, body='Glarch.', normalizer=normdash) nd2 = nd.copy() assert isinstance(nd2, NormalizedDict) assert dict(nd2) == {"A Key": "bar"} assert nd2.body == 'Glarch.' assert nd2.normalizer is normdash assert nd == nd2 nd2.body = 'quux' assert nd.body == 'Glarch.' 
assert nd2.body == 'quux' assert nd != nd2 headerparser-0.4.0/test/test_normdict_identity.py000066400000000000000000000113401347354115200223040ustar00rootroot00000000000000import pytest from headerparser import NormalizedDict def identity(s): return s def test_empty(): nd = NormalizedDict(normalizer=identity) assert dict(nd) == {} assert nd.body is None assert len(nd) == 0 assert not bool(nd) assert nd.normalizer is identity def test_one(): nd = NormalizedDict({"Foo": "bar"}, normalizer=identity) assert dict(nd) == {"Foo": "bar"} assert nd.body is None assert len(nd) == 1 assert bool(nd) assert nd.normalizer is identity def test_get_cases(): nd = NormalizedDict({"Foo": "bar"}, normalizer=identity) assert nd["Foo"] == "bar" assert "foo" not in nd assert "FOO" not in nd assert "fOO" not in nd def test_set(): nd = NormalizedDict(normalizer=identity) assert dict(nd) == {} nd["Foo"] = "bar" assert dict(nd) == {"Foo": "bar"} assert len(nd) == 1 assert nd["Foo"] == "bar" nd["fOO"] = "quux" assert dict(nd) == {"Foo": "bar", "fOO": "quux"} assert len(nd) == 2 assert nd["Foo"] == "bar" assert nd["fOO"] == "quux" def test_del(): nd = NormalizedDict({"Foo": "bar", "fOO": "BAR"}, normalizer=identity) del nd["Foo"] assert dict(nd) == {"fOO": "BAR"} del nd["fOO"] assert dict(nd) == {} def test_del_nexists(): nd = NormalizedDict({"Foo": "bar", "Bar": "FOO"}, normalizer=identity) with pytest.raises(KeyError): del nd["fOO"] def test_eq_empty(): nd = NormalizedDict(normalizer=identity) nd2 = NormalizedDict(normalizer=identity) assert nd == nd2 def test_eq_nonempty(): nd = NormalizedDict({"Foo": "bar"}, normalizer=identity) nd2 = NormalizedDict({"Foo": "bar"}, normalizer=identity) assert nd == nd2 def test_neq_cases(): nd = NormalizedDict({"Foo": "bar"}, normalizer=identity) nd2 = NormalizedDict({"fOO": "bar"}, normalizer=identity) assert nd != nd2 def test_neq(): assert NormalizedDict({"Foo": "bar"}, normalizer=identity) \ != NormalizedDict({"Foo": "BAR"}, normalizer=identity) def 
test_normalized(): nd = NormalizedDict({"Foo": "BAR"}, normalizer=identity) nd2 = nd.normalized() assert isinstance(nd2, NormalizedDict) assert dict(nd2) == {"Foo": "BAR"} assert nd2.body is None assert nd2.normalizer is identity assert nd == nd2 def test_normalized_with_body(): nd = NormalizedDict({"Foo": "BAR"}, body='Glarch.', normalizer=identity) nd2 = nd.normalized() assert isinstance(nd2, NormalizedDict) assert dict(nd2) == {"Foo": "BAR"} assert nd2.body == 'Glarch.' assert nd2.normalizer is identity assert nd == nd2 def test_normalized_dict(): nd = NormalizedDict({"Foo": "BAR"}, normalizer=identity) nd2 = nd.normalized_dict() assert isinstance(nd2, dict) assert nd2 == {"Foo": "BAR"} def test_eq_dict(): nd = NormalizedDict({"Foo": "BAR"}, normalizer=identity) assert nd == {"Foo": "BAR"} assert {"Foo": "BAR"} == nd assert nd != {"FOO": "BAR"} assert {"FOO": "BAR"} != nd assert nd != {"foo": "BAR"} assert {"foo": "BAR"} != nd assert nd != {"Foo": "bar"} assert {"Foo": "bar"} != nd def test_body_neq_dict(): nd = NormalizedDict({"Foo": "BAR"}, normalizer=identity, body='') assert nd != {"Foo": "BAR"} assert {"Foo": "BAR"} != nd def test_eq_body(): nd = NormalizedDict({"Foo": "bar"}, normalizer=identity, body='') nd2 = NormalizedDict({"Foo": "bar"}, normalizer=identity, body='') assert nd == nd2 def test_neq_body(): nd = NormalizedDict({"Foo": "bar"}, normalizer=identity, body='yes') nd2 = NormalizedDict({"Foo": "bar"}, normalizer=identity, body='no') assert nd != nd2 def test_init_list(): nd = NormalizedDict([("Foo", "bar"), ("Bar", "baz"), ("FOO", "quux")], normalizer=identity) assert dict(nd) == {"Foo": "bar", "FOO": "quux", "Bar": "baz"} def test_copy(): nd = NormalizedDict({"Foo": "bar"}, normalizer=identity) nd2 = nd.copy() assert isinstance(nd2, NormalizedDict) assert dict(nd2) == {"Foo": "bar"} assert nd2.body is None assert nd2.normalizer is identity assert nd == nd2 nd2["Foo"] = "gnusto" assert dict(nd) == {"Foo": "bar"} assert dict(nd2) == {"Foo": 
"gnusto"} assert nd != nd2 nd2["fOO"] = "quux" assert dict(nd) == {"Foo": "bar"} assert dict(nd2) == {"Foo": "gnusto", "fOO": "quux"} assert nd != nd2 def test_copy_with_body(): nd = NormalizedDict({"Foo": "bar"}, body='Glarch.', normalizer=identity) nd2 = nd.copy() assert isinstance(nd2, NormalizedDict) assert dict(nd2) == {"Foo": "bar"} assert nd2.body == 'Glarch.' assert nd2.normalizer is identity assert nd == nd2 nd2.body = 'quux' assert nd.body == 'Glarch.' assert nd2.body == 'quux' assert nd != nd2 headerparser-0.4.0/test/test_parser.py000066400000000000000000000261641347354115200200620ustar00rootroot00000000000000import pytest from six import StringIO import headerparser from headerparser import HeaderParser def test_simple(): parser = HeaderParser() parser.add_field('Foo') parser.add_field('Bar') parser.add_field('Baz') msg = parser.parse_string('Foo: red\nBar: green\nBaz: blue\n') assert dict(msg) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'} assert msg.body is None def test_out_of_order(): parser = HeaderParser() parser.add_field('Foo') parser.add_field('Bar') parser.add_field('Baz') msg = parser.parse_string('Foo: red\nBaz: blue\nBar: green\n') assert dict(msg) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'} assert msg.body is None def test_different_cases(): parser = HeaderParser() parser.add_field('Foo') parser.add_field('Bar') parser.add_field('Baz') msg = parser.parse_string('Foo: red\nBAR: green\nbaz: blue\n') assert dict(msg) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'} assert msg.body is None def test_empty_body(): parser = HeaderParser() parser.add_field('Foo') parser.add_field('Bar') parser.add_field('Baz') msg = parser.parse_string('Foo: red\nBar: green\nBaz: blue\n\n') assert dict(msg) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'} assert msg.body == '' def test_blank_body(): parser = HeaderParser() parser.add_field('Foo') parser.add_field('Bar') parser.add_field('Baz') msg = parser.parse_string('Foo: red\nBar: green\nBaz: 
blue\n\n\n') assert dict(msg) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'} assert msg.body == '\n' def test_body(): parser = HeaderParser() parser.add_field('Foo') parser.add_field('Bar') parser.add_field('Baz') msg = parser.parse_string('Foo: red\nBar: green\nBaz: blue\n\nThis is a test.') assert dict(msg) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'} assert msg.body == 'This is a test.' def test_headerlike_body(): parser = HeaderParser() parser.add_field('Foo') parser.add_field('Bar') parser.add_field('Baz') msg = parser.parse_string('''\ Foo: red Bar: green Baz: blue Foo: quux Bar: glarch Baz: cleesh ''') assert dict(msg) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'} assert msg.body == 'Foo: quux\nBar: glarch\nBaz: cleesh\n' def test_missing(): parser = HeaderParser() parser.add_field('Foo') parser.add_field('Bar') parser.add_field('Baz') msg = parser.parse_string('Foo: red\nBar: green\n') assert dict(msg) == {'Foo': 'red', 'Bar': 'green'} assert msg.body is None def test_required(): parser = HeaderParser() parser.add_field('Foo') parser.add_field('Bar') parser.add_field('Baz', required=True) msg = parser.parse_string('Foo: red\nBar: green\nBaz: blue\n') assert dict(msg) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'} assert msg.body is None def test_required_default(): parser = HeaderParser() with pytest.raises(ValueError) as excinfo: parser.add_field('Foo', required=True, default='Why?') assert 'required and default are mutually exclusive' in str(excinfo.value) def test_required_none(): parser = HeaderParser() parser.add_field('None', required=True, type=lambda _: None) msg = parser.parse_string('None: whatever') assert dict(msg) == {'None': None} assert msg.body is None def test_missing_required(): parser = HeaderParser() parser.add_field('Foo') parser.add_field('Bar') parser.add_field('Baz', required=True) with pytest.raises(headerparser.MissingFieldError) as excinfo: parser.parse_string('Foo: red\nBar: green\n') assert str(excinfo.value) == 
"Required header field 'Baz' is not present" assert excinfo.value.name == 'Baz' def test_present_default(): parser = HeaderParser() parser.add_field('Foo') parser.add_field('Bar') parser.add_field('Baz', default=42) msg = parser.parse_string('Foo: red\nBar: green\nBaz: blue\n') assert dict(msg) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'} assert msg.body is None def test_missing_default(): parser = HeaderParser() parser.add_field('Foo') parser.add_field('Bar') parser.add_field('Baz', default=42) msg = parser.parse_string('Foo: red\nBar: green\n') assert dict(msg) == {'Foo': 'red', 'Bar': 'green', 'Baz': 42} assert msg.body is None def test_missing_None_default(): parser = HeaderParser() parser.add_field('Foo') parser.add_field('Bar') parser.add_field('Baz', default=None) msg = parser.parse_string('Foo: red\nBar: green\n') assert dict(msg) == {'Foo': 'red', 'Bar': 'green', 'Baz': None} assert msg.body is None def test_multiple(): parser = HeaderParser() parser.add_field('Foo', multiple=True) parser.add_field('Bar') msg = parser.parse_string('Foo: red\nFOO: magenta\nBar: green\nfoo : crimson\n') assert dict(msg) == {'Foo': ['red', 'magenta', 'crimson'], 'Bar': 'green'} assert msg.body is None def test_one_multiple(): parser = HeaderParser() parser.add_field('Foo', multiple=True) parser.add_field('Bar') msg = parser.parse_string('Foo: red\nBar: green\n') assert dict(msg) == {'Foo': ['red'], 'Bar': 'green'} assert msg.body is None def test_no_multiple(): parser = HeaderParser() parser.add_field('Foo', multiple=True) parser.add_field('Bar') msg = parser.parse_string('Bar: green\n') assert dict(msg) == {'Bar': 'green'} assert msg.body is None def test_bad_multiple(): parser = HeaderParser() parser.add_field('Foo', multiple=True) parser.add_field('Bar') with pytest.raises(headerparser.DuplicateFieldError) as excinfo: parser.parse_string('Foo: red\nFOO: magenta\nBar: green\nBar: lime\n') assert str(excinfo.value) == "Header field 'Bar' occurs more than once" assert 
excinfo.value.name == 'Bar' def test_default_multiple(): parser = HeaderParser() parser.add_field('Foo', multiple=True, default=42) parser.add_field('Bar') msg = parser.parse_string('Bar: green\n') assert dict(msg) == {'Foo': 42, 'Bar': 'green'} assert msg.body is None def test_present_default_multiple(): parser = HeaderParser() parser.add_field('Foo', multiple=True, default=42) parser.add_field('Bar') msg = parser.parse_string('Foo: red\nBar: green\n') assert dict(msg) == {'Foo': ['red'], 'Bar': 'green'} assert msg.body is None def test_present_default_many_multiple(): parser = HeaderParser() parser.add_field('Foo', multiple=True, default=42) parser.add_field('Bar') msg = parser.parse_string('Foo: red\nFOO: magenta\nBar: green\n') assert dict(msg) == {'Foo': ['red', 'magenta'], 'Bar': 'green'} assert msg.body is None def test_required_multiple(): parser = HeaderParser() parser.add_field('Foo', multiple=True, required=True) parser.add_field('Bar') msg = parser.parse_string('Foo: red\nBar: green\n') assert dict(msg) == {'Foo': ['red'], 'Bar': 'green'} assert msg.body is None def test_required_many_multiple(): parser = HeaderParser() parser.add_field('Foo', multiple=True, required=True) parser.add_field('Bar') msg = parser.parse_string('Foo: red\nFOO: magenta\nBar: green\n') assert dict(msg) == {'Foo': ['red', 'magenta'], 'Bar': 'green'} assert msg.body is None def test_missing_required_multiple(): parser = HeaderParser() parser.add_field('Foo', multiple=True, required=True) parser.add_field('Bar') with pytest.raises(headerparser.MissingFieldError) as excinfo: parser.parse_string('Bar: green\n') assert str(excinfo.value) == "Required header field 'Foo' is not present" assert excinfo.value.name == 'Foo' def test_unknown(): parser = HeaderParser() parser.add_field('Foo') parser.add_field('Bar') parser.add_field('Baz') with pytest.raises(headerparser.UnknownFieldError) as excinfo: parser.parse_string('Foo: red\nBar: green\nQuux: blue\n') assert str(excinfo.value) == 
"Unknown header field 'Quux'" assert excinfo.value.name == 'Quux' def test_empty_input(): parser = HeaderParser() parser.add_field('Foo') parser.add_field('Bar') parser.add_field('Baz') msg = parser.parse_string('') assert dict(msg) == {} assert msg.body is None def test_trailing_whitespace(): parser = HeaderParser() parser.add_field('Foo') parser.add_field('Bar') parser.add_field('Baz') msg = parser.parse_string('Foo: red \nBar: green\n (ish) \nBaz: blue\n ') assert dict(msg) == { 'Foo': 'red ', 'Bar': 'green\n (ish) ', 'Baz': 'blue\n ', } assert msg.body is None def test_redefinition(): parser = HeaderParser() parser.add_field('Foo') with pytest.raises(ValueError) as excinfo: parser.add_field('FOO') assert 'field defined more than once' in str(excinfo.value) def test_many_missing_required(): parser = HeaderParser() parser.add_field('Foo', required=True) parser.add_field('Bar', required=True) parser.add_field('Baz', required=True) with pytest.raises(headerparser.MissingFieldError) as excinfo: parser.parse_string('') assert excinfo.value.name in ('Foo', 'Bar', 'Baz') def test_unfold(): parser = HeaderParser() parser.add_field('Folded') parser.add_field('Unfolded', unfold=True) msg = parser.parse_string( 'Folded: This is\n' ' test\n' '\ttext.\n' 'UnFolded: This is\n' ' test\n' '\ttext.\n' ) assert dict(msg) == { "Folded": "This is\n test\n\ttext.", "Unfolded": "This is test text.", } assert msg.body is None def test_space_in_name(): parser = HeaderParser() parser.add_field('Key Name') parser.add_field('Bar') parser.add_field('Baz') msg = parser.parse_string('key name: red\nBar: green\nBaz: blue\n') assert dict(msg) == {'Key Name': 'red', 'Bar': 'green', 'Baz': 'blue'} assert msg.body is None def test_scan_opts_passed(mocker): import headerparser.parser mocker.patch( 'headerparser.parser.scan_string', wraps=headerparser.parser.scan_string, ) parser = HeaderParser( separator_regex=r'\s*:\s*', skip_leading_newlines=True, ) parser.add_field('Foo') 
    parser.add_field('Bar')
    parser.add_field('Baz')
    parser.parse_string('Foo: red\nBar: green\nBaz: blue\n')
    headerparser.parser.scan_string.assert_called_with(
        'Foo: red\nBar: green\nBaz: blue\n',
        separator_regex=r'\s*:\s*',
        skip_leading_newlines=True,
    )

def test_deprecated_parse_lines():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    INPUT = 'Foo: red\nBar: green\nBaz: blue\n'.splitlines(True)
    with pytest.warns(DeprecationWarning):
        msg = parser.parse_lines(INPUT)
    assert msg == parser.parse(INPUT)

def test_deprecated_parse_file():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    INPUT = StringIO('Foo: red\nBar: green\nBaz: blue\n')
    with pytest.warns(DeprecationWarning):
        msg = parser.parse_file(INPUT)
    INPUT.seek(0)
    assert msg == parser.parse(INPUT)

headerparser-0.4.0/test/test_parser_action.py000066400000000000000000000216531347354115200214150ustar00rootroot00000000000000
import pytest
import headerparser
from headerparser import BOOL, HeaderParser

@pytest.fixture
def use_as_body(mocker):
    def _use(nd, name, value):
        nd.body = value
    return mocker.Mock(side_effect=_use)

def test_action(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field('Foo', action=stub)
    parser.add_field('Bar')
    parser.add_field('Baz')
    msg = parser.parse_string('Foo: red\nBar: green\nBaz: blue\n')
    assert dict(msg) == {'Bar': 'green', 'Baz': 'blue'}
    assert msg.body is None
    stub.assert_called_once_with(msg, 'Foo', 'red')

def test_action_missing(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field('Foo', action=stub)
    parser.add_field('Bar')
    parser.add_field('Baz')
    msg = parser.parse_string('Bar: green\nBaz: blue\n')
    assert dict(msg) == {'Bar': 'green', 'Baz': 'blue'}
    assert msg.body is None
    assert not stub.called

def test_action_type(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field('Foo', action=stub, type=BOOL)
    parser.add_field('Bar')
    msg = parser.parse_string('Foo: yes\nBar: green\n')
    assert dict(msg) == {'Bar': 'green'}
    assert msg.body is None
    stub.assert_called_once_with(msg, 'Foo', True)

def test_action_type_error(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field('Foo', action=stub, type=BOOL)
    parser.add_field('Bar')
    with pytest.raises(headerparser.FieldTypeError):
        parser.parse_string('Foo: maybe\nBar: green\n')
    assert not stub.called

def test_action_required(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field('Foo', action=stub, required=True)
    parser.add_field('Bar')
    parser.add_field('Baz')
    msg = parser.parse_string('Foo: red\nBar: green\nBaz: blue\n')
    assert dict(msg) == {'Bar': 'green', 'Baz': 'blue'}
    assert msg.body is None
    stub.assert_called_once_with(msg, 'Foo', 'red')

def test_action_required_missing(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field('Foo', action=stub, required=True)
    parser.add_field('Bar')
    parser.add_field('Baz')
    with pytest.raises(headerparser.MissingFieldError):
        parser.parse_string('Bar: green\nBaz: blue\n')
    assert not stub.called

def test_action_choices(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field('Foo', action=stub, choices=['red', 'green', 'blue'])
    parser.add_field('Bar')
    msg = parser.parse_string('Foo: red\nBar: green\n')
    assert dict(msg) == {'Bar': 'green'}
    assert msg.body is None
    stub.assert_called_once_with(msg, 'Foo', 'red')

def test_action_bad_choice(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field('Foo', action=stub, choices=['red', 'green', 'blue'])
    parser.add_field('Bar')
    with pytest.raises(headerparser.InvalidChoiceError):
        parser.parse_string('Foo: taupe\nBar: green\n')
    assert not stub.called

def test_action_unfold(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field('Foo', action=stub, unfold=True)
    parser.add_field('Bar')
    msg = parser.parse_string('Foo: folded\n text \nBar: green\n')
    assert dict(msg) == {'Bar': 'green'}
    assert msg.body is None
    stub.assert_called_once_with(msg, 'Foo', 'folded text')

def test_action_no_unfold(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field('Foo', action=stub)
    parser.add_field('Bar')
    msg = parser.parse_string('Foo: folded\n text \nBar: green\n')
    assert dict(msg) == {'Bar': 'green'}
    assert msg.body is None
    stub.assert_called_once_with(msg, 'Foo', 'folded\n text ')

def test_action_default(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field('Foo', action=stub, default='orange')
    parser.add_field('Bar')
    parser.add_field('Baz')
    msg = parser.parse_string('Foo: red\nBar: green\nBaz: blue\n')
    assert dict(msg) == {'Bar': 'green', 'Baz': 'blue'}
    assert msg.body is None
    stub.assert_called_once_with(msg, 'Foo', 'red')

def test_action_default_missing(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field('Foo', action=stub, default='orange')
    parser.add_field('Bar')
    parser.add_field('Baz')
    msg = parser.parse_string('Bar: green\nBaz: blue\n')
    assert dict(msg) == {'Foo': 'orange', 'Bar': 'green', 'Baz': 'blue'}
    assert msg.body is None
    assert not stub.called

def test_action_different_case(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field('Foo', action=stub)
    parser.add_field('Bar')
    parser.add_field('Baz')
    msg = parser.parse_string('FOO: red\nBar: green\nBaz: blue\n')
    assert dict(msg) == {'Bar': 'green', 'Baz': 'blue'}
    assert msg.body is None
    stub.assert_called_once_with(msg, 'Foo', 'red')

def test_action_multiname(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field('Foo', 'Quux', action=stub)
    parser.add_field('Bar')
    parser.add_field('Baz')
    msg = parser.parse_string('quux: red\nBar: green\nBaz: blue\n')
    assert dict(msg) == {'Bar': 'green', 'Baz': 'blue'}
    assert msg.body is None
    stub.assert_called_once_with(msg, 'Foo', 'red')

def test_action_multiple(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field('Foo', action=stub, multiple=True)
    parser.add_field('Bar')
    parser.add_field('Baz')
    msg = parser.parse_string(
        'Foo: red\n'
        'Bar: green\n'
        'FOO: purple\n'
        'Baz: blue\n'
        'foo: orange\n'
    )
    assert dict(msg) == {'Bar': 'green', 'Baz': 'blue'}
    assert msg.body is None
    assert stub.call_args_list == [
        mocker.call(msg, 'Foo', 'red'),
        mocker.call(msg, 'Foo', 'purple'),
        mocker.call(msg, 'Foo', 'orange'),
    ]

def test_action_dest(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    with pytest.raises(ValueError) as excinfo:
        parser.add_field('Foo', action=stub, dest='bar')
    assert '`action` and `dest` are mutually exclusive' in str(excinfo.value)
    assert not stub.called

def test_action_normalized_dest(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    with pytest.raises(ValueError) as excinfo:
        parser.add_field('Foo', action=stub, dest='foo')
    assert '`action` and `dest` are mutually exclusive' in str(excinfo.value)
    assert not stub.called

def test_action_additional(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_additional(action=stub)
    msg = parser.parse_string('Bar: green\nFoo: red\nBaz: blue\n')
    assert dict(msg) == {'Foo': 'red'}
    assert msg.body is None
    assert stub.call_args_list == [
        mocker.call(msg, 'Bar', 'green'),
        mocker.call(msg, 'Baz', 'blue'),
    ]

def test_action_multiple_additional(mocker):
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_additional(action=stub, multiple=True)
    msg = parser.parse_string(
        'Bar: green\n'
        'Foo: red\n'
        'Baz: blue\n'
        'baz: mauve\n'
        'BAR: taupe\n'
    )
    assert dict(msg) == {'Foo': 'red'}
    assert msg.body is None
    assert stub.call_args_list == [
        mocker.call(msg, 'Bar', 'green'),
        mocker.call(msg, 'Baz', 'blue'),
        mocker.call(msg, 'baz', 'mauve'),
        mocker.call(msg, 'BAR', 'taupe'),
    ]

@pytest.mark.parametrize('body', [True, None])
def test_action_set_body_overwritten(body, use_as_body):
    parser = HeaderParser(body=body)
    parser.add_field('Foo', action=use_as_body)
    parser.add_field('Bar')
    msg = parser.parse_string('Foo: red\nBar: green\n\nThis is the body.\n')
    assert dict(msg) == {'Bar': 'green'}
    assert msg.body == 'This is the body.\n'
    use_as_body.assert_called_once_with(msg, 'Foo', 'red')

def test_action_set_body_forbidden(use_as_body, mocker):
    parser = HeaderParser(body=False)
    parser.add_field('Foo', action=use_as_body)
    parser.add_field('Bar')
    with pytest.raises(headerparser.BodyNotAllowedError):
        parser.parse_string('Foo: red\nBar: green\n\nThis is the body.\n')
    use_as_body.assert_called_once_with(mocker.ANY, 'Foo', 'red')

@pytest.mark.parametrize('body', [False, None])
def test_action_set_body(body, use_as_body):
    parser = HeaderParser(body=body)
    parser.add_field('Foo', action=use_as_body)
    parser.add_field('Bar')
    msg = parser.parse_string('Foo: red\nBar: green\n')
    assert dict(msg) == {'Bar': 'green'}
    assert msg.body == 'red'
    use_as_body.assert_called_once_with(msg, 'Foo', 'red')

def test_action_set_body_missing(use_as_body, mocker):
    parser = HeaderParser(body=True)
    parser.add_field('Foo', action=use_as_body)
    parser.add_field('Bar')
    with pytest.raises(headerparser.MissingBodyError):
        parser.parse_string('Foo: red\nBar: green\n')
    use_as_body.assert_called_once_with(mocker.ANY, 'Foo', 'red')

headerparser-0.4.0/test/test_parser_additional.py000066400000000000000000000215161347354115200222460ustar00rootroot00000000000000
import pytest
import headerparser
from headerparser import HeaderParser

def test_additional():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_additional()
    msg = parser.parse_string('Foo: red\nBar: green\nBaz: blue\n')
    assert dict(msg) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'}
    assert msg.body is None

def test_many_additional():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_additional()
    msg = parser.parse_string(
        'Foo: red\nBar: green\nBaz: blue\nQUUX: purple\nglarch: orange\n'
    )
    assert dict(msg) == {
        'Foo': 'red',
        'Bar': 'green',
        'Baz': 'blue',
        'QUUX': 'purple',
        'glarch': 'orange',
    }
    assert msg.body is None

def test_intermixed_additional():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_additional()
    msg = parser.parse_string(
        'QUUX: purple\nBar: green\nglarch: orange\nFoo: red\nBaz: blue\n'
    )
    assert dict(msg) == {
        'Foo': 'red',
        'Bar': 'green',
        'Baz': 'blue',
        'QUUX': 'purple',
        'glarch': 'orange',
    }
    assert msg.body is None

def test_additional_only():
    parser = HeaderParser()
    parser.add_additional()
    msg = parser.parse_string('Foo: red\nBar: green\nBaz: blue\n')
    assert dict(msg) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'}
    assert msg.body is None

def test_dest_additional():
    parser = HeaderParser()
    parser.add_field('Foo', dest='dest')
    parser.add_field('Bar')
    with pytest.raises(ValueError) as excinfo:
        parser.add_additional()
    assert 'add_additional and `dest` are mutually exclusive' in str(excinfo.value)

def test_additional_dest():
    parser = HeaderParser()
    parser.add_additional()
    parser.add_field('Foo')
    with pytest.raises(ValueError) as excinfo:
        parser.add_field('Bar', dest='dest')
    assert 'add_additional and `dest` are mutually exclusive' in str(excinfo.value)

def test_additional_bad_named_multiple():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_additional()
    with pytest.raises(headerparser.DuplicateFieldError) as excinfo:
        parser.parse_string('Foo: red\nFOO: magenta\nBar: green\n')
    assert str(excinfo.value) == "Header field 'Foo' occurs more than once"
    assert excinfo.value.name == 'Foo'

def test_additional_named_multiple():
    parser = HeaderParser()
    parser.add_field('Foo', multiple=True)
    parser.add_additional()
    msg = parser.parse_string('Foo: red\nFOO: magenta\nBar: green\n')
    assert dict(msg) == {'Foo': ['red', 'magenta'], 'Bar': 'green'}
    assert msg.body is None

def test_additional_bad_multiple():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_additional()
    with pytest.raises(headerparser.DuplicateFieldError) as excinfo:
        parser.parse_string('Foo: red\nBar: green\nBar: lime\n')
    assert str(excinfo.value) == "Header field 'Bar' occurs more than once"
    assert excinfo.value.name == 'Bar'

def test_additional_bad_multiple_cases():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_additional()
    with pytest.raises(headerparser.DuplicateFieldError) as excinfo:
        parser.parse_string('Foo: red\nBar: green\nBAR: lime\n')
    assert str(excinfo.value) == "Header field 'BAR' occurs more than once"
    assert excinfo.value.name == 'BAR'

def test_multiple_additional():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_additional(multiple=True)
    msg = parser.parse_string('Foo: red\nBar: green\nBAR: lime\n')
    assert dict(msg) == {'Foo': 'red', 'Bar': ['green', 'lime']}
    assert msg.body is None

def test_one_multiple_additional():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_additional(multiple=True)
    msg = parser.parse_string('Foo: red\nBAR: lime\n')
    assert dict(msg) == {'Foo': 'red', 'BAR': ['lime']}
    assert msg.body is None

def test_multiple_additional_bad_named_multiple():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_additional(multiple=True)
    with pytest.raises(headerparser.DuplicateFieldError) as excinfo:
        parser.parse_string('Foo: red\nBar: green\nBaz: blue\nFOO: magenta\n')
    assert str(excinfo.value) == "Header field 'Foo' occurs more than once"
    assert excinfo.value.name == 'Foo'

def test_additional_missing_named():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_additional()
    msg = parser.parse_string('Baz: blue\nQUUX: purple\nglarch: orange\n')
    assert dict(msg) == {'Baz': 'blue', 'QUUX': 'purple', 'glarch': 'orange'}
    assert msg.body is None

def test_additional_missing_required_named():
    parser = HeaderParser()
    parser.add_field('Foo', required=True)
    parser.add_field('Bar')
    parser.add_additional()
    with pytest.raises(headerparser.MissingFieldError) as excinfo:
        parser.parse_string('Baz: blue\nQUUX: purple\nglarch: orange\n')
    assert str(excinfo.value) == "Required header field 'Foo' is not present"
    assert excinfo.value.name == 'Foo'

def test_missing_additional():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_additional()
    msg = parser.parse_string('Foo: red\nBar: green\n')
    assert dict(msg) == {'Foo': 'red', 'Bar': 'green'}
    assert msg.body is None

def test_additional_type():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_additional(type=int)
    msg = parser.parse_string('Foo: 1\nBar: 2\nBaz: 3\n')
    assert dict(msg) == {'Foo': '1', 'Bar': '2', 'Baz': 3}
    assert msg.body is None

def test_additional_bad_type():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_additional(type=int)
    with pytest.raises(headerparser.FieldTypeError) as excinfo:
        parser.parse_string('Foo: 1\nBar: 2\nBaz: three\n')
    assert str(excinfo.value) == (
        "Error while parsing 'Baz': 'three': ValueError: "
        + str(excinfo.value.exc_value)
    )
    assert excinfo.value.name == 'Baz'
    assert excinfo.value.value == 'three'
    assert isinstance(excinfo.value.exc_value, ValueError)

def test_additional_choices():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_additional(choices=['red', 'green', 'blue'])
    msg = parser.parse_string('Foo: mauve\nBar: red\nBaz: green\nQuux: blue\n')
    assert dict(msg) == {
        'Foo': 'mauve',
        'Bar': 'red',
        'Baz': 'green',
        'Quux': 'blue',
    }
    assert msg.body is None

def test_additional_bad_choices():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_additional(choices=['red', 'green', 'blue'])
    with pytest.raises(headerparser.InvalidChoiceError) as excinfo:
        parser.parse_string('Foo: mauve\nBar: red\nBaz: green\nQuux: taupe\n')
    assert str(excinfo.value) == "'taupe' is not a valid choice for 'Quux'"
    assert excinfo.value.name == 'Quux'
    assert excinfo.value.value == 'taupe'

def test_additional_unfold():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_additional(unfold=True)
    msg = parser.parse_string(
        'Foo: This is\n'
        ' test\n'
        ' text.\n'
        'Bar: This is\n'
        ' test\n'
        ' text.\n'
    )
    assert dict(msg) == {
        "Foo": "This is\n test\n text.",
        "Bar": "This is test text.",
    }
    assert msg.body is None

def test_bad_additional_dest():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    with pytest.raises(TypeError):
        parser.add_additional(dest='somewhere')

def test_bad_additional_required():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    with pytest.raises(TypeError):
        parser.add_additional(required=True)

def test_bad_additional_default():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    with pytest.raises(TypeError):
        parser.add_additional(default='')

def test_additional_multiname():
    parser = HeaderParser()
    parser.add_field('Foo', 'Oof')
    parser.add_field('Bar', 'Baz')
    parser.add_additional()
    msg = parser.parse_string('Oof: red\nBar: green\nQuux: blue\n')
    assert dict(msg) == {'Foo': 'red', 'Bar': 'green', 'Quux': 'blue'}
    assert msg.body is None

def test_additional_off():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_additional(False)
    with pytest.raises(headerparser.UnknownFieldError) as excinfo:
        parser.parse_string('Foo: red\nBar: green\nBaz: blue\n')
    assert str(excinfo.value) == "Unknown header field 'Baz'"
    assert excinfo.value.name == 'Baz'

headerparser-0.4.0/test/test_parser_body.py000066400000000000000000000076471347354115200211040ustar00rootroot00000000000000
import pytest
import headerparser
from headerparser import HeaderParser

def test_require_body():
    parser = HeaderParser(body=True)
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    msg = parser.parse_string(
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        'This space intentionally left nonblank.\n'
    )
    assert dict(msg) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'}
    assert msg.body == 'This space intentionally left nonblank.\n'

def test_empty_required_body():
    parser = HeaderParser(body=True)
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    msg = parser.parse_string('Foo: red\nBar: green\nBaz: blue\n\n')
    assert dict(msg) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'}
    assert msg.body == ''

def test_missing_required_body():
    parser = HeaderParser(body=True)
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    with pytest.raises(headerparser.MissingBodyError) as excinfo:
        parser.parse_string('Foo: red\nBar: green\nBaz: blue\n')
    assert str(excinfo.value) == "Message body is required but missing"

def test_forbid_body():
    parser = HeaderParser(body=False)
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    msg = parser.parse_string('Foo: red\nBar: green\nBaz: blue\n')
    assert dict(msg) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'}
    assert msg.body is None

def test_empty_forbidden_body():
    parser = HeaderParser(body=False)
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    with pytest.raises(headerparser.BodyNotAllowedError) as excinfo:
        parser.parse_string('Foo: red\nBar: green\nBaz: blue\n\n')
    assert str(excinfo.value) == "Message body is present but not allowed"

def test_present_forbidden_body():
    parser = HeaderParser(body=False)
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    with pytest.raises(headerparser.BodyNotAllowedError) as excinfo:
        parser.parse_string(
            'Foo: red\n'
            'Bar: green\n'
            'Baz: blue\n'
            '\n'
            'This space intentionally left nonblank.\n'
        )
    assert str(excinfo.value) == "Message body is present but not allowed"

def test_headers_as_required_body():
    parser = HeaderParser(body=True)
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    msg = parser.parse_string('\nFoo: red\nBar: green\nBaz: blue\n')
    assert dict(msg) == {}
    assert msg.body == 'Foo: red\nBar: green\nBaz: blue\n'

def test_headers_as_forbidden_body():
    parser = HeaderParser(body=False)
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    with pytest.raises(headerparser.BodyNotAllowedError) as excinfo:
        parser.parse_string('\nFoo: red\nBar: green\nBaz: blue\n')
    assert str(excinfo.value) == "Message body is present but not allowed"

def test_required_body_only():
    parser = HeaderParser(body=True)
    msg = parser.parse_string('\nFoo: red\nBar: green\nBaz: blue\n')
    assert dict(msg) == {}
    assert msg.body == 'Foo: red\nBar: green\nBaz: blue\n'

def test_body_as_unknown_headers():
    parser = HeaderParser(body=True)
    with pytest.raises(headerparser.UnknownFieldError) as excinfo:
        parser.parse_string('Foo: red\nBar: green\nBaz: blue\n')
    assert str(excinfo.value) == "Unknown header field 'Foo'"
    assert excinfo.value.name == 'Foo'

def test_require_body_all_empty():
    parser = HeaderParser(body=True)
    msg = parser.parse_string('\n')
    assert dict(msg) == {}
    assert msg.body == ''

def test_forbid_body_all_empty():
    parser = HeaderParser(body=False)
    with pytest.raises(headerparser.BodyNotAllowedError) as excinfo:
        parser.parse_string('\n\n')
    assert str(excinfo.value) == "Message body is present but not allowed"

headerparser-0.4.0/test/test_parser_choices.py000066400000000000000000000064701347354115200215550ustar00rootroot00000000000000
import pytest
from headerparser import BOOL, HeaderParser, InvalidChoiceError

def test_choices():
    parser = HeaderParser()
    parser.add_field('Color', choices=['red', 'green', 'blue'])
    msg = parser.parse_string('Color: green')
    assert dict(msg) == {'Color': 'green'}
    assert msg.body is None

def test_invalid_choice():
    parser = HeaderParser()
    parser.add_field('Color', choices=['red', 'green', 'blue'])
    with pytest.raises(InvalidChoiceError) as excinfo:
        parser.parse_string('Color: taupe')
    assert str(excinfo.value) == "'taupe' is not a valid choice for 'Color'"
    assert excinfo.value.name == 'Color'
    assert excinfo.value.value == 'taupe'

def test_no_choice():
    parser = HeaderParser()
    with pytest.raises(ValueError) as excinfo:
        parser.add_field('Unicorn', choices=[])
    assert str(excinfo.value) == 'empty list supplied for choices'

def test_default_choice():
    parser = HeaderParser()
    parser.add_field('Color', choices=['red','green','blue'], default='beige')
    msg = parser.parse_string('Color: blue')
    assert dict(msg) == {'Color': 'blue'}
    assert msg.body is None

def test_missing_default_choice():
    parser = HeaderParser()
    parser.add_field('Color', choices=['red','green','blue'], default='beige')
    msg = parser.parse_string('')
    assert dict(msg) == {'Color': 'beige'}
    assert msg.body is None

def test_unfold_multiple_choices():
    parser = HeaderParser()
    parser.add_field('Corner', choices=[
        'upper left', 'upper right', 'lower left', 'lower right'
    ], unfold=True, multiple=True)
    msg = parser.parse_string('Corner: lower right\nCorner: upper\n left\n')
    assert dict(msg) == {'Corner': ['lower right', 'upper left']}
    assert msg.body is None

def test_unfold_indented_choices():
    parser = HeaderParser()
    parser.add_field('Corner', choices=[
        'upper left', 'upper right', 'lower left', 'lower right'
    ], unfold=True)
    msg = parser.parse_string('Corner: upper\n right')
    assert dict(msg) == {'Corner': 'upper right'}
    assert msg.body is None

def test_lower_choices():
    parser = HeaderParser()
    parser.add_field('Color', choices=['red', 'green', 'blue'], type=str.lower)
    msg = parser.parse_string('Color: RED')
    assert dict(msg) == {'Color': 'red'}
    assert msg.body is None

def test_lower_invalid_choice():
    parser = HeaderParser()
    parser.add_field('Color', choices=['red', 'green', 'blue'], type=str.lower)
    with pytest.raises(InvalidChoiceError) as excinfo:
        parser.parse_string('Color: MAUVE')
    assert str(excinfo.value) == "'mauve' is not a valid choice for 'Color'"
    assert excinfo.value.name == 'Color'
    assert excinfo.value.value == 'mauve'

def test_bool_choices():
    parser = HeaderParser()
    parser.add_field('Boolean', type=BOOL, choices=(False, 'foo'))
    msg = parser.parse_string('Boolean: N\n')
    assert dict(msg) == {'Boolean': False}
    assert msg.body is None

def test_bool_choices_invalid_choice():
    parser = HeaderParser()
    parser.add_field('Boolean', type=BOOL, choices=(False, 'foo'))
    with pytest.raises(InvalidChoiceError) as excinfo:
        parser.parse_string('BOOLEAN: Y\n')
    assert str(excinfo.value) == "True is not a valid choice for 'Boolean'"
    assert excinfo.value.name == 'Boolean'
    assert excinfo.value.value is True

headerparser-0.4.0/test/test_parser_dest.py000066400000000000000000000074671347354115200211060ustar00rootroot00000000000000
import pytest
import headerparser
from headerparser import HeaderParser

def test_dest():
    parser = HeaderParser()
    parser.add_field('Foo', dest='notfoo')
    parser.add_field('Bar', dest='notbar')
    parser.add_field('Baz')
    msg = parser.parse_string('Foo: red\nBar: green\nBaz: blue\n')
    assert dict(msg) == {'notfoo': 'red', 'notbar': 'green', 'Baz': 'blue'}
    assert msg.body is None

def test_dest_conflict():
    parser = HeaderParser()
    parser.add_field('Foo', dest='quux')
    with pytest.raises(ValueError) as excinfo:
        parser.add_field('Bar', dest='QUUX')
    assert 'destination defined more than once' in str(excinfo.value)

def test_header_vs_eq_dest():
    parser = HeaderParser()
    parser.add_field('Foo')
    with pytest.raises(ValueError) as excinfo:
        parser.add_field('Bar', dest='Foo')
    assert 'destination defined more than once' in str(excinfo.value)

def test_header_vs_like_dest():
    parser = HeaderParser()
    parser.add_field('Foo')
    with pytest.raises(ValueError) as excinfo:
        parser.add_field('Bar', dest='foo')
    assert 'destination defined more than once' in str(excinfo.value)

def test_dest_vs_eq_header():
    parser = HeaderParser()
    parser.add_field('Bar', dest='Foo')
    with pytest.raises(ValueError) as excinfo:
        parser.add_field('Foo')
    assert 'destination defined more than once' in str(excinfo.value)

def test_header_eq_dest():
    parser = HeaderParser()
    parser.add_field('Foo', dest='Foo')
    msg = parser.parse_string('foo: red')
    assert dict(msg) == {'Foo': 'red'}
    assert msg.body is None

def test_header_like_dest():
    parser = HeaderParser()
    parser.add_field('Foo', dest='FOO')
    msg = parser.parse_string('foo: red')
    assert dict(msg) == {'FOO': 'red'}
    assert msg.body is None

def test_header_missing_default_dest():
    parser = HeaderParser()
    parser.add_field('Foo', dest='FOO', default=42)
    msg = parser.parse_string('')
    assert dict(msg) == {'FOO': 42}
    assert msg.body is None

def test_switched_dest():
    parser = HeaderParser()
    parser.add_field('Foo', dest='Bar')
    parser.add_field('Bar', dest='Foo')
    msg = parser.parse_string('Foo: foo\nBar: bar\n')
    assert dict(msg) == {'Bar': 'foo', 'Foo': 'bar'}
    assert msg.body is None

def test_one_missing_required_switched_dest():
    parser = HeaderParser()
    parser.add_field('Foo', dest='Bar', required=True)
    parser.add_field('Bar', dest='Foo', required=True)
    with pytest.raises(headerparser.MissingFieldError) as excinfo:
        parser.parse_string('Foo: foo\n')
    assert str(excinfo.value) == "Required header field 'Bar' is not present"
    assert excinfo.value.name == 'Bar'

def test_missing_default_switched_dest():
    parser = HeaderParser()
    parser.add_field('Foo', dest='Bar', default=42)
    parser.add_field('Bar', dest='Foo', default='17')
    msg = parser.parse_string('')
    assert dict(msg) == {'Bar': 42, 'Foo': '17'}
    assert msg.body is None

def test_one_missing_default_switched_dest():
    parser = HeaderParser()
    parser.add_field('Foo', dest='Bar', default=42)
    parser.add_field('Bar', dest='Foo', default='17')
    msg = parser.parse_string('Foo: 42')
    assert dict(msg) == {'Bar': '42', 'Foo': '17'}
    assert msg.body is None

def test_dest_multiple():
    parser = HeaderParser()
    parser.add_field('Foo', dest='list', multiple=True)
    msg = parser.parse_string('Foo: red\nFoo: green\nFoo: blue')
    assert dict(msg) == {'list': ['red', 'green', 'blue']}
    assert msg.body is None

def test_dest_as_unknown_header():
    parser = HeaderParser()
    parser.add_field('Foo', dest='Bar')
    with pytest.raises(headerparser.UnknownFieldError) as excinfo:
        parser.parse_string('Bar: not a header')
    assert str(excinfo.value) == "Unknown header field 'Bar'"
    assert excinfo.value.name == 'Bar'
headerparser-0.4.0/test/test_parser_eq.py000066400000000000000000000016411347354115200205400ustar00rootroot00000000000000
from headerparser import HeaderParser

def test_eq_empty():
    p1 = HeaderParser()
    p2 = HeaderParser()
    assert p1 == p2

def test_eq_one_field():
    p1 = HeaderParser()
    p1.add_field('Foo')
    p2 = HeaderParser()
    p2.add_field('Foo')
    assert p1 == p2

def test_neq_empty_one_field():
    p1 = HeaderParser()
    p2 = HeaderParser()
    p2.add_field('Foo')
    assert p1 != p2

def test_eq_two_fields():
    p1 = HeaderParser()
    p1.add_field('Foo')
    p1.add_field('Bar')
    p2 = HeaderParser()
    p2.add_field('Foo')
    p2.add_field('Bar')
    assert p1 == p2

def test_eq_out_of_order():
    p1 = HeaderParser()
    p1.add_field('Foo')
    p1.add_field('Bar')
    p2 = HeaderParser()
    p2.add_field('Bar')
    p2.add_field('Foo')
    assert p1 == p2

# multiple, type, action, default, required, custom dest, additional,
# normalizer, body, altnames, altnames with different cases, unfold, choices

headerparser-0.4.0/test/test_parser_multiname.py000066400000000000000000000031371347354115200221300ustar00rootroot00000000000000
import pytest
import headerparser
from headerparser import HeaderParser

def test_multiname_use_first():
    parser = HeaderParser()
    parser.add_field('Foo', 'Bar')
    msg = parser.parse_string('Foo: red')
    assert dict(msg) == {'Foo': 'red'}
    assert msg.body is None

def test_multiname_use_second():
    parser = HeaderParser()
    parser.add_field('Foo', 'Bar')
    msg = parser.parse_string('Bar: red')
    assert dict(msg) == {'Foo': 'red'}
    assert msg.body is None

def test_multiname_multiple():
    parser = HeaderParser()
    parser.add_field('Foo', 'Bar', multiple=True)
    parser.add_field('Baz')
    msg = parser.parse_string('Foo: red\nBar: green\nBaz: blue\n')
    assert dict(msg) == {'Foo': ['red', 'green'], 'Baz': 'blue'}
    assert msg.body is None

def test_multiname_bad_multiple():
    parser = HeaderParser()
    parser.add_field('Foo', 'Bar')
    parser.add_field('Baz')
    with pytest.raises(headerparser.DuplicateFieldError) as excinfo:
        parser.parse_string('Foo: red\nBar: green\nBaz: blue\n')
    assert str(excinfo.value) == "Header field 'Foo' occurs more than once"
    assert excinfo.value.name == 'Foo'

def test_multiname_conflict():
    parser = HeaderParser()
    parser.add_field('Foo', 'Bar', multiple=True)
    with pytest.raises(ValueError) as excinfo:
        parser.add_field('Baz', 'BAR')
    assert 'field defined more than once' in str(excinfo.value)

def test_multiname_dest():
    parser = HeaderParser()
    parser.add_field('Foo', 'Bar', dest='Baz')
    msg = parser.parse_string('Bar: red')
    assert dict(msg) == {'Baz': 'red'}
    assert msg.body is None

headerparser-0.4.0/test/test_parser_next_stanza.py000066400000000000000000000033671347354115200225000ustar00rootroot00000000000000
import pytest
from six import StringIO
from headerparser import HeaderParser, MissingBodyError

def test_simple():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    fp = StringIO(
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        'This body is not consumed.\n'
    )
    msg = parser.parse_next_stanza(fp)
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"}
    assert msg.body is None
    assert fp.read() == 'This body is not consumed.\n'

def test_simple_string():
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    msg, rest = parser.parse_next_stanza_string(
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        'This body is not consumed.\n'
    )
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"}
    assert msg.body is None
    assert rest == 'This body is not consumed.\n'

def test_body_true():
    parser = HeaderParser(body=True)
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    fp = StringIO(
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        'This body is not consumed.\n'
    )
    with pytest.raises(MissingBodyError):
        parser.parse_next_stanza(fp)

def test_body_true_string():
    parser = HeaderParser(body=True)
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    with pytest.raises(MissingBodyError):
        parser.parse_next_stanza_string(
            'Foo: red\n'
            'Bar: green\n'
            'Baz: blue\n'
            '\n'
            'This body is not consumed.\n'
        )

headerparser-0.4.0/test/test_parser_stanzas.py000066400000000000000000000143141347354115200216170ustar00rootroot00000000000000
import pytest
from six import StringIO
import headerparser
from headerparser import HeaderParser, scan_stanzas_string

def parse_stanzas_string(p, s):
    return p.parse_stanzas_string(s)

def parse_stanzas_string_as_file(p, s):
    return p.parse_stanzas(StringIO(s))

def parse_stanzas_string_as_stream(p, s):
    return p.parse_stanzas_stream(scan_stanzas_string(s))

@pytest.fixture(params=[
    parse_stanzas_string,
    parse_stanzas_string_as_file,
    parse_stanzas_string_as_stream,
])
def pmethod(request):
    return request.param

def test_simple(pmethod):
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    m1, m2, m3 = pmethod(
        parser,
        'Foo: red\nBar: green\nBaz: blue\n\n'
        'Baz: sapphire\nBar: emerald\nFoo: ruby\n\n'
        'Bar: earth\nBaz: water\nFoo: fire\n\n'
    )
    assert dict(m1) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'}
    assert m1.body is None
    assert dict(m2) == {'Foo': 'ruby', 'Bar': 'emerald', 'Baz': 'sapphire'}
    assert m2.body is None
    assert dict(m3) == {'Foo': 'fire', 'Bar': 'earth', 'Baz': 'water'}
    assert m3.body is None

def test_invalid_stanza(pmethod):
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    msgs = pmethod(
        parser,
        'Foo: red\nBar: green\nBaz: blue\n\n'
        'Baz: sapphire\nBar: emerald\nFoo: ruby\n\n'
        'Bar: earth\nBaz: water\nFoo: fire\nQuux: aether\nCleesh: air\n\n'
        'Baz: ice\nFoo: lightning\nBar: mud\n\n'
    )
    m1 = next(msgs)
    assert dict(m1) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'}
    assert m1.body is None
    m2 = next(msgs)
    assert dict(m2) == {'Foo': 'ruby', 'Bar': 'emerald', 'Baz': 'sapphire'}
    assert m2.body is None
    with pytest.raises(headerparser.UnknownFieldError) as excinfo:
        next(msgs)
    assert str(excinfo.value) == "Unknown header field 'Quux'"
    assert excinfo.value.name == 'Quux'

def test_some_required(pmethod):
    parser = HeaderParser()
    parser.add_field('Foo', required=True)
    parser.add_field('Bar')
    parser.add_field('Baz')
    msgs = pmethod(
        parser,
        'Foo: red\nBar: green\nBaz: blue\n\n'
        'Baz: sapphire\nBar: emerald\nFoo: ruby\n\n'
        'Bar: earth\nBaz: water\n\n'
        'Baz: ice\nFoo: lightning\nBar: mud\n\n'
    )
    m1 = next(msgs)
    assert dict(m1) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'}
    assert m1.body is None
    m2 = next(msgs)
    assert dict(m2) == {'Foo': 'ruby', 'Bar': 'emerald', 'Baz': 'sapphire'}
    assert m2.body is None
    with pytest.raises(headerparser.MissingFieldError) as excinfo:
        next(msgs)
    assert str(excinfo.value) == "Required header field 'Foo' is not present"
    assert excinfo.value.name == 'Foo'

def test_disjoint_keys(pmethod):
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    m1, m2, m3 = pmethod(
        parser,
        'Foo: red\n\n'
        'Bar: green\n\n'
        'Baz: blue\n\n'
    )
    assert dict(m1) == {'Foo': 'red'}
    assert m1.body is None
    assert dict(m2) == {'Bar': 'green'}
    assert m2.body is None
    assert dict(m3) == {'Baz': 'blue'}
    assert m3.body is None

def test_overlapping_keys(pmethod):
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    m1, m2, m3 = pmethod(
        parser,
        'Foo: red\n\n'
        'Bar: green\nFoo: yellow\n\n'
        'Foo: white\nBaz: blue\n\n'
    )
    assert dict(m1) == {'Foo': 'red'}
    assert m1.body is None
    assert dict(m2) == {'Foo': 'yellow', 'Bar': 'green'}
    assert m2.body is None
    assert dict(m3) == {'Foo': 'white', 'Baz': 'blue'}
    assert m3.body is None

def test_multiple(pmethod):
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar', multiple=True)
    parser.add_field('Baz')
    m1, m2, m3 = pmethod(
        parser,
        'Foo: red\nBar: green\nBaz: blue\nBar: lime\n\n'
        'Baz: sapphire\nBar: emerald\nBar: jade\nBar: green\nFoo: ruby\n\n'
        'Bar: earth\nBaz: water\nFoo: fire\nBar: mud\nBar: land\nBar: solid\n\n'
    )
    assert dict(m1) == {'Foo': 'red', 'Bar': ['green', 'lime'], 'Baz': 'blue'}
    assert m1.body is None
    assert dict(m2) == {
        'Foo': 'ruby',
        'Bar': ['emerald', 'jade', 'green'],
        'Baz': 'sapphire',
    }
    assert m2.body is None
    assert dict(m3) == {
        'Foo': 'fire',
        'Bar': ['earth', 'mud', 'land', 'solid'],
        'Baz': 'water',
    }
    assert m3.body is None

def test_default(pmethod):
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz', default='DEF')
    m1, m2, m3 = pmethod(
        parser,
        'Foo: red\nBar: green\nBaz: blue\n\n'
        'Bar: emerald\nFoo: ruby\n\n'
        'Bar: earth\nBaz: water\nFoo: fire\n\n'
    )
    assert dict(m1) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'blue'}
    assert m1.body is None
    assert dict(m2) == {'Foo': 'ruby', 'Bar': 'emerald', 'Baz': 'DEF'}
    assert m2.body is None
    assert dict(m3) == {'Foo': 'fire', 'Bar': 'earth', 'Baz': 'water'}
    assert m3.body is None

def test_default_inverted(pmethod):
    parser = HeaderParser()
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz', default='DEF')
    m1, m2, m3 = pmethod(
        parser,
        'Foo: red\nBar: green\n\n'
        'Baz: sapphire\nBar: emerald\nFoo: ruby\n\n'
        'Bar: earth\nFoo: fire\n\n'
    )
    assert dict(m1) == {'Foo': 'red', 'Bar': 'green', 'Baz': 'DEF'}
    assert m1.body is None
    assert dict(m2) == {'Foo': 'ruby', 'Bar': 'emerald', 'Baz': 'sapphire'}
    assert m2.body is None
    assert dict(m3) == {'Foo': 'fire', 'Bar': 'earth', 'Baz': 'DEF'}
    assert m3.body is None

def test_body_true(pmethod):
    parser = HeaderParser(body=True)
    parser.add_field('Foo')
    parser.add_field('Bar')
    parser.add_field('Baz')
    msgs = pmethod(
        parser,
        'Foo: red\nBar: green\nBaz: blue\n\n'
        'Baz: sapphire\nBar: emerald\nFoo: ruby\n\n'
        'Bar: earth\nBaz: water\nFoo: fire\n\n'
    )
    with pytest.raises(headerparser.MissingBodyError):
        next(msgs)

headerparser-0.4.0/test/test_parser_types.py000066400000000000000000000061721347354115200213030ustar00rootroot00000000000000
import pytest
from headerparser import BOOL, HeaderParser, FieldTypeError

def test_bool():
    parser = HeaderParser()
    parser.add_field('Boolean', type=BOOL)
    msg = parser.parse_string('Boolean: yes\n')
    assert dict(msg) == {'Boolean': True}
    assert msg.body is None

def test_multiple_bool():
    parser = HeaderParser()
    parser.add_field('Boolean', type=BOOL, multiple=True)
    msg = parser.parse_string('''\
Boolean: yes
Boolean: y
Boolean: on
Boolean: true
Boolean: 1
Boolean: YES
Boolean: TRUE
Boolean: no
Boolean: n
Boolean: off
Boolean: false
Boolean: 0
Boolean: NO
Boolean: FALSE
''')
    assert dict(msg) == {'Boolean': [True] * 7 + [False] * 7}
    assert msg.body is None

def test_default_bool():
    parser = HeaderParser()
    parser.add_field('Boolean', type=BOOL, default='foo')
    msg = parser.parse_string('Boolean: Off')
    assert dict(msg) == {'Boolean': False}
    assert msg.body is None

def test_missing_default_bool():
    parser = HeaderParser()
    parser.add_field('Boolean', type=BOOL, default='foo')
    msg = parser.parse_string('')
    assert dict(msg) == {'Boolean': 'foo'}
    assert msg.body is None

def test_invalid_bool():
    parser = HeaderParser()
    parser.add_field('Boolean', type=BOOL)
    with pytest.raises(FieldTypeError) as excinfo:
        parser.parse_string('Boolean: One\n')
    assert str(excinfo.value) == (
        "Error while parsing 'Boolean': 'One': ValueError: invalid boolean:"
        " 'One'"
    )
    assert excinfo.value.name == 'Boolean'
    assert excinfo.value.value == 'One'
    assert isinstance(excinfo.value.exc_value, ValueError)

def test_bool_and_not_bool():
    parser = HeaderParser()
    parser.add_field('Boolean', type=BOOL)
    parser.add_field('String')
    msg = parser.parse_string('Boolean: yes\nString: no\n')
    assert dict(msg) == {'Boolean': True, 'String': 'no'}
    assert msg.body is None

def test_bool_choices_bad_type():
    parser = HeaderParser()
    parser.add_field('Boolean', type=BOOL, choices=(False, 'foo'))
    with pytest.raises(FieldTypeError) as excinfo:
        parser.parse_string('BOOLEAN: foo\n')
    assert str(excinfo.value) == (
        "Error while parsing 'Boolean': 'foo': ValueError: invalid boolean:"
        " 'foo'"
    )
    assert excinfo.value.name == 'Boolean'
    assert excinfo.value.value == 'foo'
    assert isinstance(excinfo.value.exc_value, ValueError)
    assert 'invalid boolean' in str(excinfo.value.exc_value)

def test_native_type():
    parser = HeaderParser()
    parser.add_field('Number', 'No.', type=int, dest='#')
    msg = parser.parse_string('Number: 42')
    assert dict(msg) == {"#": 42}
    assert msg.body is None

def test_bad_native_type():
    parser = HeaderParser()
    parser.add_field('Number', 'No.', type=int, dest='#')
    with pytest.raises(FieldTypeError) as excinfo:
        parser.parse_string('No.: forty-two')
    assert str(excinfo.value) == (
        "Error while parsing 'Number': 'forty-two': ValueError: "
        + str(excinfo.value.exc_value)
    )
    assert excinfo.value.name == 'Number'
    assert excinfo.value.value == 'forty-two'
    assert isinstance(excinfo.value.exc_value, ValueError)
headerparser-0.4.0/test/test_scan_next_stanza.py000066400000000000000000000111361347354115200221210ustar00rootroot00000000000000
import pytest
from headerparser import scan_next_stanza, scan_next_stanza_string

def test_simple():
    lines = [
        'Foo: red\n',
        'Bar: green\n',
        'Baz: blue\n',
        '\n',
        'This is a body.\n',
    ]
    liter = iter(lines)
    assert list(scan_next_stanza(liter)) \
        == [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')]
    assert list(liter) == ['This is a body.\n']

def test_simple_string():
    lines = (
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        'This is a body.\n'
    )
    assert scan_next_stanza_string(lines) == (
        [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')],
        'This is a body.\n',
    )

@pytest.mark.parametrize('skip_leading_newlines', [True, False])
def test_extra_interstitial_blanks(skip_leading_newlines):
    lines = [
        'Foo: red\n',
        'Bar: green\n',
        'Baz: blue\n',
        '\n',
        '\n',
        'This is a body.\n',
    ]
    liter = iter(lines)
    assert list(scan_next_stanza(
        liter,
        skip_leading_newlines=skip_leading_newlines,
    )) == [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')]
    assert list(liter) == ['\n', 'This is a body.\n']

@pytest.mark.parametrize('skip_leading_newlines', [True, False])
def test_extra_interstitial_blanks_string(skip_leading_newlines):
    lines = (
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        '\n'
        'This is a body.\n'
    )
    assert scan_next_stanza_string(
        lines,
        skip_leading_newlines=skip_leading_newlines,
    ) == (
        [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')],
        '\nThis is a body.\n',
    )

def test_leading_blanks_skip():
    lines = [
        '\n',
        '\n',
        'Foo: red\n',
        'Bar: green\n',
        'Baz: blue\n',
        '\n',
        'This is a body.\n',
    ]
    liter = iter(lines)
    assert list(scan_next_stanza(liter, skip_leading_newlines=True)) \
        == [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')]
    assert list(liter) == ['This is a body.\n']

def test_leading_blanks_skip_string():
    lines = (
        '\n'
        '\n'
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        'This is a body.\n'
    )
    assert scan_next_stanza_string(lines, skip_leading_newlines=True) == (
        [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')],
        'This is a body.\n',
    )

def test_leading_blanks_no_skip():
    lines = [
        '\n',
        '\n',
        'Foo: red\n',
        'Bar: green\n',
        'Baz: blue\n',
        '\n',
        'This is a body.\n',
    ]
    liter = iter(lines)
    assert list(scan_next_stanza(liter, skip_leading_newlines=False)) == []
    assert list(liter) == [
        '\n',
        'Foo: red\n',
        'Bar: green\n',
        'Baz: blue\n',
        '\n',
        'This is a body.\n',
    ]

def test_leading_blanks_no_skip_string():
    lines = (
        '\n'
        '\n'
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        'This is a body.\n'
    )
    assert scan_next_stanza_string(lines, skip_leading_newlines=False) == (
        [],
        '\n'
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        'This is a body.\n',
    )

def test_stanza_only():
    lines = [
        'Foo: red\n',
        'Bar: green\n',
        'Baz: blue\n',
    ]
    liter = iter(lines)
    assert list(scan_next_stanza(liter)) \
        == [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')]
    assert list(liter) == []

def test_stanza_only_string():
    lines = (
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
    )
    assert scan_next_stanza_string(lines) == (
        [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')],
        '',
    )

def test_empty():
    lines = []
    liter = iter(lines)
    assert list(scan_next_stanza(liter)) == []
    assert list(liter) == []

def test_empty_string():
    assert scan_next_stanza_string('') == ([], '')

def test_all_blanks_skip():
    lines = ['\n', '\n']
    liter = iter(lines)
    assert list(scan_next_stanza(liter, skip_leading_newlines=True)) == []
    assert list(liter) == []

def test_all_blanks_no_skip():
    lines = ['\n', '\n']
    liter = iter(lines)
    assert list(scan_next_stanza(liter, skip_leading_newlines=False)) == []
    assert list(liter) == ['\n']

def test_all_blanks_skip_string():
    assert scan_next_stanza_string('\n\n', skip_leading_newlines=True) \
        == ([], '')

def test_all_blanks_no_skip_string():
    assert scan_next_stanza_string('\n\n', skip_leading_newlines=False) \
        == ([], '\n')
headerparser-0.4.0/test/test_scan_stanzas.py000066400000000000000000000111171347354115200212450ustar00rootroot00000000000000
import pytest
from six import StringIO
from headerparser import MalformedHeaderError, scan_stanzas, \
                         scan_stanzas_string

def scan_stanzas_string_as_file(s, **kwargs):
    return scan_stanzas(StringIO(s), **kwargs)

def scan_stanzas_string_as_list(s, **kwargs):
    return scan_stanzas(s.splitlines(True), **kwargs)

@pytest.fixture(params=[
    scan_stanzas_string_as_file,
    scan_stanzas_string_as_list,
    scan_stanzas_string,
])
def scanner(request):
    return request.param

def test_simple(scanner):
    assert list(scanner(
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        'Quux: ruby\n'
        'Glarch: sapphire\n'
        'Cleesh: garnet\n'
        '\n'
        'Blue: foo\n'
        'Red: bar\n'
        'Green: baz\n'
    )) == [
        [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')],
        [('Quux', 'ruby'), ('Glarch', 'sapphire'), ('Cleesh', 'garnet')],
        [('Blue', 'foo'), ('Red', 'bar'), ('Green', 'baz')],
    ]

@pytest.mark.parametrize('skip_leading_newlines', [True, False])
def test_extra_interstitial_blanks(scanner, skip_leading_newlines):
    assert list(scanner(
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        '\n'
        'Quux: ruby\n'
        'Glarch: sapphire\n'
        'Cleesh: garnet\n'
        '\n'
        '\n'
        '\n'
        'Blue: foo\n'
        'Red: bar\n'
        'Green: baz\n',
        skip_leading_newlines=skip_leading_newlines,
    )) == [
        [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')],
        [('Quux', 'ruby'), ('Glarch', 'sapphire'), ('Cleesh', 'garnet')],
        [('Blue', 'foo'), ('Red', 'bar'), ('Green', 'baz')],
    ]

@pytest.mark.parametrize('skip_leading_newlines', [True, False])
def test_trailing_blanks(scanner, skip_leading_newlines):
    assert list(scanner(
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        'Quux: ruby\n'
        'Glarch: sapphire\n'
        'Cleesh: garnet\n'
        '\n'
        'Blue: foo\n'
        'Red: bar\n'
        'Green: baz\n'
        '\n'
        '\n',
        skip_leading_newlines=skip_leading_newlines,
    )) == [
        [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')],
        [('Quux', 'ruby'), ('Glarch', 'sapphire'), ('Cleesh', 'garnet')],
        [('Blue', 'foo'), ('Red', 'bar'), ('Green', 'baz')],
    ]

def test_leading_blanks_skip(scanner):
    assert list(scanner(
        '\n'
        '\n'
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        'Quux: ruby\n'
        'Glarch: sapphire\n'
        'Cleesh: garnet\n'
        '\n'
        'Blue: foo\n'
        'Red: bar\n'
        'Green: baz\n',
        skip_leading_newlines=True,
    )) == [
        [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')],
        [('Quux', 'ruby'), ('Glarch', 'sapphire'), ('Cleesh', 'garnet')],
        [('Blue', 'foo'), ('Red', 'bar'), ('Green', 'baz')],
    ]

def test_leading_blanks_no_skip(scanner):
    assert list(scanner(
        '\n'
        '\n'
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        'Quux: ruby\n'
        'Glarch: sapphire\n'
        'Cleesh: garnet\n'
        '\n'
        'Blue: foo\n'
        'Red: bar\n'
        'Green: baz\n',
        skip_leading_newlines=False,
    )) == [
        [],
        [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')],
        [('Quux', 'ruby'), ('Glarch', 'sapphire'), ('Cleesh', 'garnet')],
        [('Blue', 'foo'), ('Red', 'bar'), ('Green', 'baz')],
    ]

def test_invalid_stanza(scanner):
    stanzas = scanner(
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        'Quux: ruby\n'
        'Glarch: sapphire\n'
        'Cleesh: garnet\n'
        '\n'
        'Blue: foo\n'
        "Wait, this isn't a header.\n"
        'Green: baz\n'
    )
    assert next(stanzas) == [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')]
    assert next(stanzas) \
        == [('Quux', 'ruby'), ('Glarch', 'sapphire'), ('Cleesh', 'garnet')]
    with pytest.raises(MalformedHeaderError) as excinfo:
        next(stanzas)
    assert str(excinfo.value) == (
        "Invalid header line encountered: \"Wait, this isn't a header.\""
    )

@pytest.mark.parametrize('skip_leading_newlines', [True, False])
def test_empty(scanner, skip_leading_newlines):
    assert list(scanner('', skip_leading_newlines=skip_leading_newlines)) == []

def test_all_blanks_skip(scanner):
    assert list(scanner('\n\n', skip_leading_newlines=True)) == []

def test_all_blanks_no_skip(scanner):
    assert list(scanner('\n\n', skip_leading_newlines=False)) == [[]]
headerparser-0.4.0/test/test_scanner.py000066400000000000000000000222131347354115200202060ustar00rootroot00000000000000
import re
import pytest
from six import StringIO
import headerparser
from headerparser import scan, scan_file, scan_lines, scan_string

def scan_string_as_file(s, **kwargs):
    return scan(StringIO(s), **kwargs)

def scan_string_as_list(s, **kwargs):
    return scan(s.splitlines(True), **kwargs)

@pytest.fixture(params=[scan_string_as_file, scan_string_as_list, scan_string])
def scanner(request):
    return request.param

def test_simple(scanner):
    assert list(scanner('Foo: red\nBar: green\nBaz: blue\n')) == \
        [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')]

@pytest.mark.parametrize('skip_leading_newlines', [True, False])
def test_empty_body(scanner, skip_leading_newlines):
    assert list(scanner(
        'Foo: red\nBar: green\nBaz: blue\n\n',
        skip_leading_newlines=skip_leading_newlines,
    )) == [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue'), (None, '')]

@pytest.mark.parametrize('skip_leading_newlines', [True, False])
def test_blank_body(scanner, skip_leading_newlines):
    assert list(scanner(
        'Foo: red\nBar: green\nBaz: blue\n\n\n',
        skip_leading_newlines=skip_leading_newlines,
    )) == [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue'), (None, '\n')]

@pytest.mark.parametrize('skip_leading_newlines', [True, False])
def test_body(scanner, skip_leading_newlines):
    assert list(scanner(
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        'This is a test.',
        skip_leading_newlines=skip_leading_newlines,
    )) == [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue'),
           (None, 'This is a test.')]

@pytest.mark.parametrize('skip_leading_newlines', [True, False])
def test_body_extra_blanks(scanner, skip_leading_newlines):
    assert list(scanner(
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        '\n'
        'This is a test.',
        skip_leading_newlines=skip_leading_newlines,
    )) == [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue'),
           (None, '\nThis is a test.')]

def test_headerlike_body(scanner):
    assert list(scanner(
        'Foo: red\n'
        'Bar: green\n'
        'Baz: blue\n'
        '\n'
        'Foo: quux\n'
        'Bar: glarch\n'
        'Baz: cleesh\n'
    )) == [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue'),
           (None, 'Foo: quux\nBar: glarch\nBaz: cleesh\n')]

def test_circumcolon_whitespace(scanner):
    assert list(scanner(
        'Key1: Value1\n'
        'Key2 :Value2\n'
        'Key3 : Value3\n'
        'Key4:Value4\n'
    )) == [('Key1', 'Value1'), ('Key2', 'Value2'), ('Key3', 'Value3'),
           ('Key4', 'Value4')]

def test_circumcolon_whitespace_spaceless_separator_regex(scanner):
    assert list(scanner(
        'Key1: Value1\n'
        'Key2 :Value2\n'
        'Key3 : Value3\n'
        'Key4:Value4\n',
        separator_regex=':',
    )) == [('Key1', ' Value1'), ('Key2 ', 'Value2'), ('Key3 ', ' Value3'),
           ('Key4', 'Value4')]

def test_folding(scanner):
    assert list(scanner(
        'Key1: Value1\n'
        ' Folded\n'
        ' More folds\n'
        'Key2: Value2\n'
        ' Folded\n'
        ' Fewer folds\n'
        'Key3: Value3\n'
        ' Key4: Not a real header\n'
        'Key4: \n'
        '\tTab after empty line\n'
        ' \n'
        ' After an "empty" folded line\n'
        'Key5:\n'
        ' After a line without even a space!\n'
    )) == [
        ('Key1', 'Value1\n Folded\n More folds'),
        ('Key2', 'Value2\n Folded\n Fewer folds'),
        ('Key3', 'Value3\n Key4: Not a real header'),
        ('Key4', '\n\tTab after empty line\n \n After an "empty" folded line'),
        ('Key5', '\n After a line without even a space!'),
    ]

def test_no_final_newline(scanner):
    assert list(scanner('Foo: red\nBar: green\nBaz: blue')) == \
        [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')]

def test_leading_newline(scanner):
    assert list(scanner('\nFoo: red\nBar: green\nBaz: blue\n')) == \
        [(None, 'Foo: red\nBar: green\nBaz: blue\n')]

def test_skip_leading_newlines(scanner):
    assert list(scanner(
        '\nFoo: red\nBar: green\nBaz: blue\n',
        skip_leading_newlines=True,
    )) == [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')]

def test_cr_terminated():
    assert list(scan_string('Foo: red\rBar: green\rBaz: blue\r')) == \
        [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')]

def test_crlf_terminated(scanner):
    assert list(scanner('Foo: red\r\nBar: green\r\nBaz: blue\r\n')) == \
        [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')]

def test_mixed_terminators():
    assert list(scan_string('Foo: red\nBar: green\rBaz: blue\r\n')) == \
        [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')]

def test_mixed_folding():
    assert list(scan_string(
        'Foo: line\n'
        ' feed\n'
        'Bar: carriage\r'
        ' return\r'
        'Baz: CR\r\n'
        ' LF\r\n'
    )) == [
        ('Foo', 'line\n feed'),
        ('Bar', 'carriage\n return'),
        ('Baz', 'CR\n LF'),
    ]

def test_malformed_header(scanner):
    with pytest.raises(headerparser.MalformedHeaderError) as excinfo:
        list(scanner('Foo: red\nBar green\nBaz: blue\n'))
    assert str(excinfo.value) == "Invalid header line encountered: 'Bar green'"
    assert excinfo.value.line == 'Bar green'

def test_unexpected_folding(scanner):
    with pytest.raises(headerparser.UnexpectedFoldingError) as excinfo:
        list(scanner(' Foo: red\nBar green\nBaz: blue\n'))
    assert str(excinfo.value) == (
        "Indented line without preceding header line encountered: ' Foo: red'"
    )
    assert excinfo.value.line == ' Foo: red'

def test_multiple(scanner):
    assert list(scanner(
        'Foo: value1\n'
        'Foo: value2\n'
        'FOO: VALUE3\n'
        'fOO: valueFour\n'
    )) == [
        ('Foo', 'value1'),
        ('Foo', 'value2'),
        ('FOO', 'VALUE3'),
        ('fOO', 'valueFour'),
    ]

@pytest.mark.parametrize('skip_leading_newlines', [True, False])
def test_empty(scanner, skip_leading_newlines):
    assert list(scanner('', skip_leading_newlines=skip_leading_newlines)) == []

def test_one_empty_line(scanner):
    assert list(scanner('\n')) == [(None, '')]

def test_one_empty_line_skip_leading_newlines(scanner):
    assert list(scanner('\n', skip_leading_newlines=True)) == []

def test_two_empty_lines(scanner):
    assert list(scanner('\n\n')) == [(None, '\n')]

def test_two_empty_lines_skip_leading_newlines(scanner):
    assert list(scanner('\n\n', skip_leading_newlines=True)) == []

def test_lines_no_ends():
    assert list(scan([
        'Key: value',
        'Folded: hold on',
        ' let me check',
        ' ',
        ' yes',
        '',
        'Newlines will not be added to this body.',
        "So it'll look bad.",
    ])) == [
        ('Key', 'value'),
        ('Folded', 'hold on\n let me check\n \n yes'),
        (None, "Newlines will not be added to this body.So it'll look bad."),
    ]

def test_untrimmed_value(scanner):
    assert list(scanner(
        'Leading: value\n'
        'Trailing: value \n'
        'Leading-Tab:\tvalue\n'
        'Trailing-Tab:value\t\n'
    )) == [
        ('Leading', 'value'),
        ('Trailing', 'value '),
        ('Leading-Tab', 'value'),
        ('Trailing-Tab', 'value\t'),
    ]

def test_space_in_name(scanner):
    assert list(scanner('Key Name: value')) == [('Key Name', 'value')]

@pytest.mark.parametrize('separator_regex', [
    r'\s*=\s*',
    re.compile(r'\s*=\s*'),
])
def test_separator_regex(scanner, separator_regex):
    assert list(scanner(
        'Foo = red\nBar =green\nBaz= blue\n',
        separator_regex=separator_regex,
    )) == [('Foo', 'red'), ('Bar', 'green'), ('Baz', 'blue')]

def test_multi_colon(scanner):
    assert list(scanner('Foo: red : crimson: scarlet\n')) == \
        [('Foo', 'red : crimson: scarlet')]

def test_separator_regex_multi_match(scanner):
    assert list(scanner(
        'Foo = red = crimson=scarlet\n',
        separator_regex=r'\s*=\s*',
    )) == [('Foo', 'red = crimson=scarlet')]

def test_separator_regex_mixed_multi_match(scanner):
    assert list(scanner(
        'Key: Value = foo\nKey = Value: foo\n',
        separator_regex=r'\s*=\s*',
    )) == [('Key: Value', 'foo'), ('Key', 'Value: foo')]

def test_separator_regex_default_separator(scanner):
    with pytest.raises(headerparser.MalformedHeaderError) as excinfo:
        list(scanner('Foo = red\nBar: green\n', separator_regex=r'\s*=\s*'))
    assert str(excinfo.value) == "Invalid header line encountered: 'Bar: green'"
    assert excinfo.value.line == 'Bar: green'

def test_deprecated_scan_lines(mocker):
    mockscan = mocker.patch(
        'headerparser.scanner.scan',
        return_value=mocker.sentinel.OUTPUT,
    )
    with pytest.warns(DeprecationWarning):
        r = scan_lines(mocker.sentinel.INPUT)
    mockscan.assert_called_once_with(mocker.sentinel.INPUT)
    assert r is mocker.sentinel.OUTPUT

def test_deprecated_scan_file(mocker):
    mockscan = mocker.patch(
        'headerparser.scanner.scan',
        return_value=mocker.sentinel.OUTPUT,
    )
    with pytest.warns(DeprecationWarning):
        r = scan_file(mocker.sentinel.INPUT)
    mockscan.assert_called_once_with(mocker.sentinel.INPUT)
    assert r is mocker.sentinel.OUTPUT
headerparser-0.4.0/test/test_unfold.py000066400000000000000000000040451347354115200200470ustar00rootroot00000000000000
from headerparser import unfold

def test_unfold_single_line():
    assert unfold('some value') == 'some value'

def test_unfold_two_lines():
    assert unfold('some\nvalue') == 'some value'

def test_unfold_folded_lines():
    assert unfold('some\n value') == 'some value'

def test_unfold_leading_space():
    assert unfold(' some value') == 'some value'

def test_unfold_leading_empty_line():
    assert unfold('\nsome value') == 'some value'

def test_unfold_leading_space_line():
    assert unfold(' \nsome value') == 'some value'

def test_unfold_leading_line_space():
    assert unfold('\n some value') == 'some value'

def test_unfold_trailing_space():
    assert unfold('some value ') == 'some value'

def test_unfold_trailing_empty_line():
    assert unfold('some value\n') == 'some value'

def test_unfold_trailing_line_space():
    assert unfold('some value\n ') == 'some value'

def test_unfold_trailing_space_line():
    assert unfold('some value \n') == 'some value'

def test_unfold_embedded_spaces():
    assert unfold('A period ends a sentence. It is followed by two spaces.') \
        == 'A period ends a sentence. It is followed by two spaces.'
def test_unfold_embedded_tabs():
    assert unfold('x\ty\n0\t1\n') == 'x\ty 0\t1'

def test_unfold_varying_indent():
    assert unfold('Value1\n Folded\n More folds\n Fewer folds\n') \
        == 'Value1 Folded More folds Fewer folds'

def test_unfold_tab_indent():
    assert unfold('some\n\tvalue') == 'some value'

def test_unfold_tab_and_space_indent():
    assert unfold('some\n\t value') == 'some value'

def test_unfold_form_field():
    assert unfold('some\n \f value') == 'some \f value'

def test_unfold_spaceful_line():
    assert unfold('some \n \n value') == 'some value'

def test_unfold_parbreak():
    assert unfold('some\n\nvalue') == 'some value'

def test_unfold_cr():
    assert unfold('some\r value') == 'some value'

def test_unfold_crlf():
    assert unfold('some\r\n value') == 'some value'

def test_unfold_mixed():
    assert unfold('some\nsort\rof\r\nvalue') == 'some sort of value'
headerparser-0.4.0/tox.ini000066400000000000000000000012471347354115200155040ustar00rootroot00000000000000
[tox]
envlist = py27,py34,py35,py36,py37,pypy,pypy3
skip_missing_interpreters = True

[testenv]
usedevelop = True
deps =
    pytest~=4.0
    pytest-cov~=2.0
    pytest-flakes~=4.0
    pytest-mock~=1.6
commands = pytest {posargs} headerparser test README.rst docs/index.rst

[pytest]
addopts = --cache-clear --cov=headerparser --no-cov-on-fail --doctest-modules --flakes
doctest_optionflags = IGNORE_EXCEPTION_DETAIL
filterwarnings = error

[coverage:run]
branch = True

[coverage:report]
precision = 2
show_missing = True

[testenv:docs]
basepython = python3
deps = -rdocs/requirements.txt
changedir = docs
commands = sphinx-build -E -W -b html . _build/html