headerparser-0.5.1 (git commit 1bb7268ace7beb8b0d8fc48e6f2662f58de76c26)

headerparser-0.5.1/.github/dependabot.yml:

version: 2
updates:
  - package-ecosystem: github-actions
    directory: /
    schedule:
      interval: weekly
    commit-message:
      prefix: "[gh-actions]"
      include: scope
    labels:
      - dependencies
      - d:github-actions

headerparser-0.5.1/.github/workflows/test.yml:

name: Test

on:
  pull_request:
  push:
    branches-ignore:
      - 'dependabot/**'
  schedule:
    - cron: '0 6 * * *'

concurrency:
  group: ${{ github.workflow }}-${{ github.event_name }}-${{ github.ref_name }}
  cancel-in-progress: true

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version:
          - '3.7'
          - '3.8'
          - '3.9'
          - '3.10'
          - '3.11'
          - '3.12'
          - 'pypy-3.7'
          - 'pypy-3.8'
          - 'pypy-3.9'
          - 'pypy-3.10'
        toxenv: [py]
        include:
          - python-version: '3.7'
            toxenv: lint
          - python-version: '3.7'
            toxenv: typing
    steps:
      - name: Check out repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip wheel
          python -m pip install --upgrade --upgrade-strategy=eager coverage tox

      - name: Run tests
        run: tox -e ${{ matrix.toxenv }}

      - name: Generate XML coverage report
        if: matrix.toxenv == 'py'
        run: coverage xml

      - name: Upload coverage to Codecov
        if: matrix.toxenv == 'py'
        uses: codecov/codecov-action@v3
        with:
          fail_ci_if_error: false

# vim:set et
sts=2:

headerparser-0.5.1/.gitignore:

*.egg
*.egg-info/
*.pyc
.cache/
.coverage*
.eggs/
.pytest_cache/
.tox/
__pycache__/
build/
dist/
docs/.doctrees/
docs/_build/
venv/

headerparser-0.5.1/.pre-commit-config.yaml:

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: check-added-large-files
      - id: check-json
      - id: check-toml
      - id: check-yaml
      - id: end-of-file-fixer
      - id: trailing-whitespace
        exclude: '^docs/format\.rst$'

  - repo: https://github.com/psf/black
    rev: 23.1.0
    hooks:
      - id: black

  - repo: https://github.com/PyCQA/isort
    rev: 5.12.0
    hooks:
      - id: isort

  - repo: https://github.com/PyCQA/flake8
    rev: 6.0.0
    hooks:
      - id: flake8
        additional_dependencies:
          - flake8-bugbear
          - flake8-builtins
          - flake8-unused-arguments
        exclude: ^test/data

headerparser-0.5.1/.readthedocs.yaml:

version: 2
formats: all
python:
  install:
    - requirements: docs/requirements.txt
    - method: pip
      path: .
build:
  os: ubuntu-22.04
  tools:
    python: "3"
sphinx:
  configuration: docs/conf.py
  fail_on_warning: true

headerparser-0.5.1/CHANGELOG.md:

v0.5.1 (2023-10-04)
-------------------
- Include `py.typed` file in distribution

v0.5.0 (2023-10-04)
-------------------
- Support Python 3.8 through 3.12
- Drop support for Python 2.7, 3.4, 3.5, and 3.6
- Removed `scan_file()`, `scan_lines()`, `HeaderParser.parse_file()`, and
  `HeaderParser.parse_lines()` (all deprecated in v0.4.0)
- Type annotations added
- The scanner options to the scanner functions are now keyword-only
- `scan()` and `scan_stanzas()` can now parse strings directly.  As a result,
  `scan_string()` and `scan_stanzas_string()` are now deprecated.
- The `HeaderParser` methods `parse()` and `parse_stanzas()` can now parse strings directly. As a result, the `parse_string()` and `parse_stanzas_string()` methods are now deprecated. - Added a `Scanner` class with methods for scanning a shared input. As a result, the following are now deprecated: - `scan_next_stanza()` - `scan_next_stanza_string()` - `HeaderParser.parse_next_stanza()` - `HeaderParser.parse_next_stanza_string()` v0.4.0 (2019-05-29) ------------------- - Added a `scan()` function combining the behavior of `scan_file()` and `scan_lines()`, which are now deprecated - Gave `HeaderParser` a `parse()` method combining the behavior of `parse_file()` and `parse_lines()`, which are now deprecated - Added `scan_next_stanza()` and `scan_next_stanza_string()` functions for scanning & consuming input only up to the end of the first header section - Added `scan_stanzas()` and `scan_stanzas_string()` functions for scanning input composed entirely of multiple stanzas/header sections - Gave `HeaderParser` `parse_next_stanza()` and `parse_next_stanza_string()` methods for parsing & consuming input only up to the end of the first header section - Gave `HeaderParser` `parse_stanzas()` and `parse_stanzas_string()` methods for parsing input composed entirely of multiple stanzas/header sections v0.3.0 (2018-10-12) ------------------- - Drop support for Python 3.3 - Gave `HeaderParser` and the scanner functions options for configuring scanning behavior: - `separator_regex` - `skip_leading_newlines` - Fixed a `DeprecationWarning` in Python 3.7 v0.2.0 (2018-02-14) ------------------- - `NormalizedDict`'s default normalizer (exposed as the `lower()` function) now passes non-strings through unchanged - `HeaderParser` instances can now be compared for non-identity equality - `HeaderParser.add_field()` and `HeaderParser.add_additional()` now take an optional `action` argument for customizing the parser's behavior when a field is encountered - Made the `unfold()` function public 
v0.1.0 (2017-03-17) ------------------- Initial release headerparser-0.5.1/LICENSE000066400000000000000000000021071450730324400151710ustar00rootroot00000000000000The MIT License (MIT) Copyright (c) 2017-2023 John Thorvald Wodder II Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. headerparser-0.5.1/MANIFEST.in000066400000000000000000000001761450730324400157260ustar00rootroot00000000000000include CHANGELOG.* CONTRIBUTORS.* LICENSE tox.ini graft src graft docs prune docs/_build graft test global-exclude *.py[cod] headerparser-0.5.1/NOTES.md000066400000000000000000000041151450730324400153770ustar00rootroot00000000000000Relevant extracts from : §2.1: Note: Common parlance and earlier versions of this specification use the term "header" to either refer to the entire header section or to refer to an individual header field. 
To avoid ambiguity, this document does not use the terms "header" or "headers" in isolation, but instead always uses "header field" to refer to the individual field and "header section" to refer to the entire collection. §2.2: Header fields are lines beginning with a field name, followed by a colon (":"), followed by a field body, and terminated by CRLF. A field name MUST be composed of printable US-ASCII characters (i.e., characters that have values between 33 and 126, inclusive), except colon. A field body may be composed of printable US-ASCII characters as well as the space (SP, ASCII value 32) and horizontal tab (HTAB, ASCII value 9) characters (together known as the white space characters, WSP). A field body MUST NOT include CR and LF except when used in "folding" and "unfolding", as described in section 2.2.3. All field bodies MUST conform to the syntax described in sections 3 and 4 of this specification. -------------------------------------------------------------------------------- Additional relevant RFCs: - On internationalization: - — MIME Part Three: Message Header Extensions for Non-ASCII Text - — MIME Parameter Value and Encoded Word Extensions: Character Sets, Languages, and Continuations - On header fields with "parameterized" syntax: - , §5.1 — Syntax of the Content-Type Header Field - , §5 — The Link Header Field - , §4 — Forwarded HTTP Header Field - — The Content-Disposition Header Field See also: headerparser-0.5.1/README.rst000066400000000000000000000120351450730324400156540ustar00rootroot00000000000000.. image:: http://www.repostatus.org/badges/latest/active.svg :target: http://www.repostatus.org/#active :alt: Project Status: Active — The project has reached a stable, usable state and is being actively developed. .. image:: https://github.com/jwodder/headerparser/workflows/Test/badge.svg?branch=master :target: https://github.com/jwodder/headerparser/actions?workflow=Test :alt: CI Status .. 
image:: https://codecov.io/gh/jwodder/headerparser/branch/master/graph/badge.svg :target: https://codecov.io/gh/jwodder/headerparser .. image:: https://img.shields.io/pypi/pyversions/headerparser.svg :target: https://pypi.org/project/headerparser .. image:: https://img.shields.io/github/license/jwodder/headerparser.svg :target: https://opensource.org/licenses/MIT :alt: MIT License `GitHub `_ | `PyPI `_ | `Documentation `_ | `Issues `_ | `Changelog `_ ``headerparser`` parses key-value pairs in the style of RFC 822 (e-mail) headers and converts them into case-insensitive dictionaries with the trailing message body (if any) attached. Fields can be converted to other types, marked required, or given default values using an API based on the standard library's ``argparse`` module. (Everyone loves ``argparse``, right?) Low-level functions for just scanning header fields (breaking them into sequences of key-value pairs without any further processing) are also included. The Format ========== RFC 822-style headers are header fields that follow the general format of e-mail headers as specified by RFC 822 and friends: each field is a line of the form "``Name: Value``", with long values continued onto multiple lines ("folded") by indenting the extra lines. A blank line marks the end of the header section and the beginning of the message body. This basic grammar has been used by numerous textual formats besides e-mail, including but not limited to: - HTTP request & response headers - Usenet messages - most Python packaging metadata files - Debian packaging control files - ``META-INF/MANIFEST.MF`` files in Java JARs - a subset of the `YAML `_ serialization format — all of which this package can parse. Installation ============ ``headerparser`` requires Python 3.7 or higher. Just use `pip `_ for Python 3 (You have pip, right?) 
to install ``headerparser``:: python3 -m pip install headerparser Examples ======== Define a parser: >>> import headerparser >>> parser = headerparser.HeaderParser() >>> parser.add_field('Name', required=True) >>> parser.add_field('Type', choices=['example', 'demonstration', 'prototype'], default='example') >>> parser.add_field('Public', type=headerparser.BOOL, default=False) >>> parser.add_field('Tag', multiple=True) >>> parser.add_field('Data') Parse some headers and inspect the results: >>> msg = parser.parse('''\ ... Name: Sample Input ... Public: yes ... tag: doctest, examples, ... whatever ... TAG: README ... ... Wait, why I am using a body instead of the "Data" field? ... ''') >>> sorted(msg.keys()) ['Name', 'Public', 'Tag', 'Type'] >>> msg['Name'] 'Sample Input' >>> msg['Public'] True >>> msg['Tag'] ['doctest, examples,\n whatever', 'README'] >>> msg['TYPE'] 'example' >>> msg['Data'] Traceback (most recent call last): ... KeyError: 'data' >>> msg.body 'Wait, why I am using a body instead of the "Data" field?\n' Fail to parse headers that don't meet your requirements: >>> parser.parse('Type: demonstration') Traceback (most recent call last): ... headerparser.errors.MissingFieldError: Required header field 'Name' is not present >>> parser.parse('Name: Bad type\nType: other') Traceback (most recent call last): ... headerparser.errors.InvalidChoiceError: 'other' is not a valid choice for 'Type' >>> parser.parse('Name: unknown field\nField: Value') Traceback (most recent call last): ... headerparser.errors.UnknownFieldError: Unknown header field 'Field' Allow fields you didn't even think of: >>> parser.add_additional() >>> msg = parser.parse('Name: unknown field\nField: Value') >>> msg['Field'] 'Value' Just split some headers into names & values and worry about validity later: >>> for field in headerparser.scan('''\ ... Name: Scanner Sample ... Unknown headers: no problem ... Unparsed-Boolean: yes ... CaSe-SeNsItIvE-rEsUlTs: true ... 
Whitespace around colons:optional ... Whitespace around colons : I already said it's optional. ... That means you have the _option_ to use as much as you want! ... ... And there's a body, too, I guess. ... '''): print(field) ('Name', 'Scanner Sample') ('Unknown headers', 'no problem') ('Unparsed-Boolean', 'yes') ('CaSe-SeNsItIvE-rEsUlTs', 'true') ('Whitespace around colons', 'optional') ('Whitespace around colons', "I already said it's optional.\n That means you have the _option_ to use as much as you want!") (None, "And there's a body, too, I guess.\n") headerparser-0.5.1/TODO.md000066400000000000000000000164421450730324400152620ustar00rootroot00000000000000- Should string `default` values be passed through `type` etc. like in argparse? - Rethink how the original exception data is attached to `FieldTypeError`s - Include everything from `sys.exc_info()`? - Rename `NormalizedDict.normalized_dict()` to something that doesn't imply it returns a `NormalizedDict`? - Add docstrings to private classes and attributes - Write more tests - different header name normalizers (identity, hyphens=underscores, titlecase?, etc.) 
- `add_additional` - calling `add_additional` multiple times (some times with `allow=False`) - `add_additional(False, extra arguments ...)` - `add_additional` when a header has a `dest` that's just a normalized form of one of its names - calling `add_field`/`add_additional` on a `HeaderParser` after a previous call raised an error - scanning & parsing Unicode - normalizer that returns a non-string - non-string keys in `NormalizedDict` with the default normalizer - equality of `HeaderParser` objects - Test that `HeaderParser.parse_stream()` won't choke on non-string inputs - passing scanner options to `HeaderParser` - scanning files not opened in universal newlines mode - Improve documentation & examples - Contrast handling of multi-occurrence fields with that of the standard library - Draw attention to the case-insensitivity of field names when parsing and when retrieving from the dict - Give examples of custom normalization (or at least explain what it is and why it's worth having) - Add `action` examples - Add example recipes to the documentation of `HeaderParser`s for common mail-like formats - Write more user-friendly documentation that goes through `HeaderParser` feature by feature like `attrs`' documentation Features ======== - Add some sort of handling for "From " lines - Give `NormalizedDict` a `from_line` attribute - Give the scanner a `from_line_regex` parameter; if the first line of a stanza matches the regex, it is assumed to be a "From" line - Create a "`SpecialHeader`" enum with `FromLine` and `Body` values for use as the first element of `(header, value)` pairs yielded by the scanner representing "From " lines and bodies - Use the enum values as keys in `NormalizedDict`s instead of having dedicated `from_line` and `body` attributes? 
- Give the parser an option for requiring a "From " line - Export premade regexes for matching Unix mail "From " lines, HTTP request lines, and HTTP response status lines - Write an entry point for converting RFC822-style files/headers to JSON - name: `mail2json`? `headers2json`? - include options for: - parsing multiple stanzas into an array of JSON objects - setting the key name for the "message body" - handling of multiple occurrences of the same header in a single stanza; choices: - raise an error - combine multi-occurrence headers into an array of values - use an array of values for all headers regardless of multiplicity (default?) - output an array of `{"header": ..., "value": ...}` objects - handling of non-ASCII characters and the various ways in which they can be escaped - handling of "From " lines (and/or other non-header headers like the first line of an HTTP request or response?) - handling of header lettercases? Scanning -------- - Give the scanner options for: - definition of "whitespace" for purposes of folding (standard: 0x20 and TAB) - line separator/terminator (default: CR, LF, and CRLF; standard: only CRLF, with lone CR and LF being obsolete) - using Unicode definitions of line endings and horizontal whitespace - stripping leading whitespace from folded lines? (standard: no) - handling "From " lines and the like - ignoring all blank lines? - comments? (cf. robots.txt) - internationalization of header names - treating `---` as a blank line? - Error handling: - header lines without a colon or indentation (options: error, header with empty value, or start of body) - empty header name (options: error, header with empty name, look for next colon, or start of body) - all-whitespace line (considered obsolete by RFC 5322) Parsing ------- - Include utility callables for header types: - RFC822 dates, addresses, etc. - Content-Type-style "parameterized" headers - Include an `object_pairs_hook` for the parameters? - cf. 
`cgi.parse_header()` - internationalized strings - converting lines with just '.' to blank lines - Somehow support the types in `email.headerregistry` - Provide a `Normalizer` class with options for casing, trimming whitespace, squashing whitespace, converting hyphens and underscores to the same character, squashing hyphens & underscores, etc. - unfolding if & only if the first line of the value contains any non-whitespace? (cf. most multiline fields in Debian control files) - DKIM headers? - removing RFC 822 comments? - comma-and-space-separated lists? - cf. `urllib.request.parse_http_list()`? - New `add_field` and `add_additional` options to add: - `default_action=callable` for defining what to do when a header is absent - `multiple_type` and `multiple_action` — like `type` and `action`, but called on a list of all values encountered for a `multiple` field - `i18n=bool` — turns on decoding of internationalized mail headers before passing to `type` (Do this via a custom type instead?) - Give `add_additional` an option for controlling whether to normalize additional header names before adding them to the dict? - Requiring/forbidding nonempty/non-whitespace bodies - Add public methods for removing, inspecting, & modifying header definitions - Make the `body`, `scanner_opts`, etc. attributes public - Support constructing a complete `HeaderParser` in a single expression from a `dict` rather than having to make multiple calls to `add_field` - Support converting a `HeaderParser` instance to such a `dict` - Support modifying a `HeaderParser`'s field definitions after they're defined? - Allow two different named fields to have the same `dest` if they both have `multiple=True`? (or both `multiple=False`?) - Give `add_additional` an argument for putting all additional fields in a given subdict (or a presupplied arbitrary mapping object?) so that named fields can still use custom dests? 
- Give parsers a way to store parsed fields in a presupplied arbitrary mapping object (or one created from a `dict_factory`/`dict_cls` callable?) instead of creating a new NormalizedDict? - Give `HeaderParser` an option (`body_key`?) for storing the body in a given `dict` key - Create a `BODY` token to use as a `dict` key for storing bodies instead of storing them as an attribute? - Add an option/method for ignoring & discarding any unknown/"additional" fields - Add handling for fields that can either occur in the header or be the body (e.g., "Description" in Python packaging METADATA) - Require scanner options to be passed to `HeaderParser`'s constructor in a `scanner_opts={}` `dict` instead of as `**kwargs` headerparser-0.5.1/docs/000077500000000000000000000000001450730324400151145ustar00rootroot00000000000000headerparser-0.5.1/docs/changelog.rst000066400000000000000000000054271450730324400176050ustar00rootroot00000000000000.. currentmodule:: headerparser Changelog ========= v0.5.1 (2023-10-04) ------------------- - Include :file:`py.typed` file in distribution v0.5.0 (2023-10-04) ------------------- - Support Python 3.8 through 3.12 - Drop support for Python 2.7, 3.4, 3.5, and 3.6 - Removed ``scan_file()``, ``scan_lines()``, ``HeaderParser.parse_file()``, and ``HeaderParser.parse_lines()`` (all deprecated in v0.4.0) - Type annotations added - The scanner options to the scanner functions are now keyword-only - `scan()` and `scan_stanzas()` can now parse strings directly. As a result, `scan_string()` and `scan_stanzas_string()` are now deprecated. - The `HeaderParser` methods `~HeaderParser.parse()` and `~HeaderParser.parse_stanzas()` can now parse strings directly. As a result, the `~HeaderParser.parse_string()` and `~HeaderParser.parse_stanzas_string()` methods are now deprecated. - Added a `Scanner` class with methods for scanning a shared input. 
As a result, the following are now deprecated: - `scan_next_stanza()` - `scan_next_stanza_string()` - `HeaderParser.parse_next_stanza()` - `HeaderParser.parse_next_stanza_string()` v0.4.0 (2019-05-29) ------------------- - Added a `scan()` function combining the behavior of ``scan_file()`` and ``scan_lines()``, which are now deprecated - Gave `HeaderParser` a `~HeaderParser.parse()` method combining the behavior of ``parse_file()`` and ``parse_lines()``, which are now deprecated - Added `scan_next_stanza()` and `scan_next_stanza_string()` functions for scanning & consuming input only up to the end of the first header section - Added `scan_stanzas()` and `scan_stanzas_string()` functions for scanning input composed entirely of multiple stanzas/header sections - Gave `HeaderParser` `parse_next_stanza()` and `parse_next_stanza_string()` methods for parsing & consuming input only up to the end of the first header section - Gave `HeaderParser` `parse_stanzas()` and `parse_stanzas_string()` methods for parsing input composed entirely of multiple stanzas/header sections v0.3.0 (2018-10-12) ------------------- - Drop support for Python 3.3 - Gave `HeaderParser` and the scanner functions options for configuring scanning behavior: - ``separator_regex`` - ``skip_leading_newlines`` - Fixed a `DeprecationWarning` in Python 3.7 v0.2.0 (2018-02-14) ------------------- - `NormalizedDict`'s default normalizer (exposed as the `lower()` function) now passes non-strings through unchanged - `HeaderParser` instances can now be compared for non-identity equality - `HeaderParser.add_field()` and `HeaderParser.add_additional()` now take an optional ``action`` argument for customizing the parser's behavior when a field is encountered - Made the `unfold()` function public v0.1.0 (2017-03-17) ------------------- Initial release headerparser-0.5.1/docs/conf.py000066400000000000000000000017221450730324400164150ustar00rootroot00000000000000from headerparser import __version__ project = 
"headerparser" author = "John T. Wodder II" copyright = "2017-2023 John T. Wodder II" # noqa: A001 extensions = [ "sphinx.ext.autodoc", "sphinx.ext.intersphinx", "sphinx.ext.viewcode", "sphinx_copybutton", ] autodoc_default_options = { "members": True, "undoc-members": True, } intersphinx_mapping = { "python": ("https://docs.python.org/3", None), } exclude_patterns = ["_build"] source_suffix = ".rst" source_encoding = "utf-8" master_doc = "index" version = __version__ release = __version__ today_fmt = "%Y %b %d" default_role = "py:obj" pygments_style = "sphinx" html_theme = "sphinx_rtd_theme" html_theme_options = { "collapse_navigation": False, "prev_next_buttons_location": "both", } html_last_updated_fmt = "%Y %b %d" html_show_sourcelink = True html_show_sphinx = True html_show_copyright = True copybutton_prompt_text = r">>> |\.\.\. |\$ " copybutton_prompt_is_regexp = True headerparser-0.5.1/docs/errors.rst000066400000000000000000000015571450730324400171720ustar00rootroot00000000000000.. currentmodule:: headerparser Exceptions ========== .. autoexception:: Error :show-inheritance: Parser Errors ------------- .. autoexception:: ParserError :show-inheritance: .. autoexception:: BodyNotAllowedError :show-inheritance: .. autoexception:: DuplicateFieldError :show-inheritance: .. autoexception:: FieldTypeError :show-inheritance: .. autoexception:: InvalidChoiceError :show-inheritance: .. autoexception:: MissingBodyError :show-inheritance: .. autoexception:: MissingFieldError :show-inheritance: .. autoexception:: UnknownFieldError :show-inheritance: Scanner Errors -------------- .. autoexception:: ScannerError :show-inheritance: .. autoexception:: MalformedHeaderError :show-inheritance: .. autoexception:: UnexpectedFoldingError :show-inheritance: .. 
autoexception:: ScannerEOFError :show-inheritance: headerparser-0.5.1/docs/format.rst000066400000000000000000000061721450730324400171440ustar00rootroot00000000000000Input Format ============ `headerparser` accepts a syntax that is intended to be a simplified superset of the Internet Message (e-mail) Format specified in :rfc:`822`, :rfc:`2822`, and :rfc:`5322`. Specifically: - Everything in the input up to (but not including) the first blank line (i.e., a line containing only a line ending) constitutes a :dfn:`stanza` or :dfn:`header section`. Everything after the first blank line is a free-form :dfn:`message body`. If there are no blank lines, the entire input is used as the header section, and there is no body. .. note:: By default, blank lines at the beginning of a document are interpreted as the ending of a zero-length stanza. Such blank lines can instead be ignored by setting the ``skip_leading_newlines`` `Scanner` option to true. - A stanza or header section is composed of zero or more :dfn:`header fields`. A header field is composed of one or more lines, with all lines after the first beginning with a space or tab. Additionally, the first line must contain a colon (optionally surrounded by whitespace); everything before the colon is the :dfn:`header field name`, while everything after (including subsequent lines) is the :dfn:`header field value`. .. note:: Name-value separators other than a colon can be used by setting the ``separator_regex`` `Scanner` option appropriately. .. note:: `headerparser` only recognizes CR, LF, and CR LF sequences as line endings. An example:: Key: Value Foo: Bar Bar:Whitespace around the colon is optional Baz : Very optional Long-Field: This field has a very long value, so I'm going to split it across multiple lines. The above line is all whitespace. This counts as line folding, and so we're still in the "Long Field" value, but the RFCs consider such lines obsolete, so you should avoid using them. . 
One alternative to an all-whitespace line is a line with just indentation and a period. Debian package description fields use this. Foo: Wait, I already defined a value for this key. What happens now? What happens now: It depends on whether the `multiple` option for the "Foo" field was set in the HeaderParser. If multiple=True: The "Foo" key in the dictionary returned by HeaderParser.parse() would map to a list of all of Foo's values If multiple=False: A ParserError is raised If multiple=False but there's only one "Foo" anyway: The "Foo" key in the result dictionary would map to just a single string. Compare this to: the standard library's `email` package, which accepts multi-occurrence fields, but *which* occurrence Message.__getitem__ returns is unspecified! Are we still in the header: no There was a blank line above, so we're now in the body, which isn't processed for headers. Good thing, too, because this isn't a valid header line. On the other hand, this is not a valid RFC 822-style document:: An indented first line — without a "Name:" line before it! A header line without a colon isn't good, either. Does this make up for the above: no headerparser-0.5.1/docs/index.rst000066400000000000000000000073741450730324400167700ustar00rootroot00000000000000.. module:: headerparser ============================================== headerparser — argparse for mail-style headers ============================================== `GitHub `_ | `PyPI `_ | `Documentation `_ | `Issues `_ | :doc:`Changelog ` .. toctree:: :hidden: format parser scanner util errors changelog `headerparser` parses key-value pairs in the style of :rfc:`822` (e-mail) headers and converts them into case-insensitive dictionaries with the trailing message body (if any) attached. Fields can be converted to other types, marked required, or given default values using an API based on the standard library's `argparse` module. (Everyone loves `argparse`, right?) 
Low-level functions for just scanning header fields (breaking them into sequences of key-value pairs without any further processing) are also included. Installation ============ ``headerparser`` requires Python 3.7 or higher. Just use `pip `_ for Python 3 (You have pip, right?) to install ``headerparser``:: python3 -m pip install headerparser Examples ======== Define a parser: >>> import headerparser >>> parser = headerparser.HeaderParser() >>> parser.add_field('Name', required=True) >>> parser.add_field('Type', choices=['example', 'demonstration', 'prototype'], default='example') >>> parser.add_field('Public', type=headerparser.BOOL, default=False) >>> parser.add_field('Tag', multiple=True) >>> parser.add_field('Data') Parse some headers and inspect the results: >>> msg = parser.parse('''\ ... Name: Sample Input ... Public: yes ... tag: doctest, examples, ... whatever ... TAG: README ... ... Wait, why I am using a body instead of the "Data" field? ... ''') >>> sorted(msg.keys()) ['Name', 'Public', 'Tag', 'Type'] >>> msg['Name'] 'Sample Input' >>> msg['Public'] True >>> msg['Tag'] ['doctest, examples,\n whatever', 'README'] >>> msg['TYPE'] 'example' >>> msg['Data'] Traceback (most recent call last): ... KeyError: 'data' >>> msg.body 'Wait, why I am using a body instead of the "Data" field?\n' Fail to parse headers that don't meet your requirements: >>> parser.parse('Type: demonstration') Traceback (most recent call last): ... headerparser.errors.MissingFieldError: Required header field 'Name' is not present >>> parser.parse('Name: Bad type\nType: other') Traceback (most recent call last): ... headerparser.errors.InvalidChoiceError: 'other' is not a valid choice for 'Type' >>> parser.parse('Name: unknown field\nField: Value') Traceback (most recent call last): ... 
headerparser.errors.UnknownFieldError: Unknown header field 'Field' Allow fields you didn't even think of: >>> parser.add_additional() >>> msg = parser.parse('Name: unknown field\nField: Value') >>> msg['Field'] 'Value' Just split some headers into names & values and worry about validity later: >>> for field in headerparser.scan('''\ ... Name: Scanner Sample ... Unknown headers: no problem ... Unparsed-Boolean: yes ... CaSe-SeNsItIvE-rEsUlTs: true ... Whitespace around colons:optional ... Whitespace around colons : I already said it's optional. ... That means you have the _option_ to use as much as you want! ... ... And there's a body, too, I guess. ... '''): print(field) ('Name', 'Scanner Sample') ('Unknown headers', 'no problem') ('Unparsed-Boolean', 'yes') ('CaSe-SeNsItIvE-rEsUlTs', 'true') ('Whitespace around colons', 'optional') ('Whitespace around colons', "I already said it's optional.\n That means you have the _option_ to use as much as you want!") (None, "And there's a body, too, I guess.\n") Indices and tables ================== * :ref:`genindex` * :ref:`search` headerparser-0.5.1/docs/parser.rst000066400000000000000000000001141450730324400171360ustar00rootroot00000000000000.. currentmodule:: headerparser Parser ====== .. autoclass:: HeaderParser headerparser-0.5.1/docs/requirements.txt000066400000000000000000000000731450730324400204000ustar00rootroot00000000000000Sphinx~=7.0 sphinx-copybutton~=0.5.0 sphinx_rtd_theme~=1.0 headerparser-0.5.1/docs/scanner.rst000066400000000000000000000030331450730324400172760ustar00rootroot00000000000000.. currentmodule:: headerparser Scanner ======= The `Scanner` class and related functions perform basic parsing of RFC 822-style header fields, splitting formatted input up into sequences of ``(name, value)`` pairs without any further validation or transformation. Each pair returned by a scanner method or function represents an individual header field. 
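As a rough illustration of the splitting just described, the following is a minimal pure-Python sketch. It is not the library's actual scanner (there is no error handling, the separator is a hard-coded colon, and no scanner options are supported); it only demonstrates the name/value split, line folding, and body handling that this section and the README examples describe.

```python
import re

def scan_sketch(text):
    """Split RFC 822-style input into (name, value) pairs plus a
    final (None, body) pair.  Illustrative sketch only."""
    fields = []
    body = None
    lines = re.split(r"\r\n|\r|\n", text)  # CR, LF, and CRLF all end lines
    i = 0
    while i < len(lines):
        line = lines[i]
        if line == "":  # blank line: everything after it is the body
            body = "\n".join(lines[i + 1:])
            break
        name, _, value = line.partition(":")
        parts = [value.lstrip()]  # whitespace around the colon is dropped
        i += 1
        # Folded (indented) continuation lines keep their leading whitespace:
        while i < len(lines) and lines[i][:1] in (" ", "\t"):
            parts.append(lines[i])
            i += 1
        fields.append((name.rstrip(), "\n".join(parts)))
    if body is not None:
        fields.append((None, body))
    return fields
```

For example, ``scan_sketch("Name: Sample\nLong: one\n  two\n\nbody\n")`` returns ``[('Name', 'Sample'), ('Long', 'one\n  two'), (None, 'body\n')]``, mirroring the folded-value and body behavior shown in the doctests above.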
The first element (the header field name) is the substring up to but not including the first whitespace-padded colon (or other delimiter specified by ``separator_regex``) in the first source line of the header field. The second element (the header field value) is a single string, the concatenation of one or more lines, starting with the substring after the first colon in the first source line, with leading whitespace on lines after the first preserved; the ending of each line is converted to ``"\n"`` (added if there is no line ending in the actual input), and the last line of the field value has its trailing line ending (if any) removed. .. note:: "Line ending" here means a CR, LF, or CR LF sequence. Unicode line separators are not treated as line endings and are not trimmed or converted to ``"\n"``. Scanner Class ------------- .. autoclass:: Scanner :exclude-members: separator_regex, skip_leading_newlines Functions --------- .. autofunction:: scan .. autofunction:: scan_stanzas Deprecated Functions -------------------- .. autofunction:: scan_string .. autofunction:: scan_stanzas_string .. autofunction:: scan_next_stanza .. autofunction:: scan_next_stanza_string headerparser-0.5.1/docs/util.rst000066400000000000000000000002331450730324400166210ustar00rootroot00000000000000.. currentmodule:: headerparser Utilities ========= .. autoclass:: NormalizedDict .. autofunction:: BOOL .. autofunction:: lower .. 
autofunction:: unfold headerparser-0.5.1/pyproject.toml000066400000000000000000000001331450730324400170750ustar00rootroot00000000000000[build-system] requires = ["setuptools >= 46.4.0"] build-backend = "setuptools.build_meta" headerparser-0.5.1/setup.cfg000066400000000000000000000036521450730324400160130ustar00rootroot00000000000000[metadata] name = headerparser version = attr:headerparser.__version__ description = argparse for mail-style headers long_description = file:README.rst long_description_content_type = text/x-rst author = John Thorvald Wodder II author_email = headerparser@varonathe.org license = MIT license_files = LICENSE url = https://github.com/jwodder/headerparser keywords = e-mail email mail rfc822 headers rfc2822 rfc5322 parser classifiers = Programming Language :: Python :: 3 :: Only Programming Language :: Python :: 3 Programming Language :: Python :: 3.7 Programming Language :: Python :: 3.8 Programming Language :: Python :: 3.9 Programming Language :: Python :: 3.10 Programming Language :: Python :: 3.11 Programming Language :: Python :: 3.12 Programming Language :: Python :: Implementation :: CPython Programming Language :: Python :: Implementation :: PyPy License :: OSI Approved :: MIT License Intended Audience :: Developers Topic :: Communications :: Email Topic :: Communications :: Usenet News Topic :: Internet :: WWW/HTTP Topic :: Text Processing Typing :: Typed project_urls = Source Code = https://github.com/jwodder/headerparser Bug Tracker = https://github.com/jwodder/headerparser/issues Documentation = https://headerparser.readthedocs.io [options] packages = find: package_dir = =src include_package_data = True python_requires = >=3.7 install_requires = attrs >= 20.1.0 Deprecated ~= 1.2 [options.packages.find] where = src [mypy] allow_incomplete_defs = False allow_untyped_defs = False ignore_missing_imports = True # : no_implicit_optional = True implicit_reexport = False local_partial_types = True pretty = True show_error_codes = True 
show_traceback = True strict_equality = True warn_redundant_casts = True warn_return_any = True warn_unreachable = True headerparser-0.5.1/src/000077500000000000000000000000001450730324400147535ustar00rootroot00000000000000headerparser-0.5.1/src/headerparser/000077500000000000000000000000001450730324400174205ustar00rootroot00000000000000headerparser-0.5.1/src/headerparser/__init__.py000066400000000000000000000036731450730324400215420ustar00rootroot00000000000000""" argparse for mail-style headers ``headerparser`` parses key-value pairs in the style of RFC 822 (e-mail) headers and converts them into case-insensitive dictionaries with the trailing message body (if any) attached. Fields can be converted to other types, marked required, or given default values using an API based on the standard library's ``argparse`` module. (Everyone loves ``argparse``, right?) Low-level functions for just scanning header fields (breaking them into sequences of key-value pairs without any further processing) are also included. Visit or for more information. 
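A quick example:

>>> import headerparser
>>> parser = headerparser.HeaderParser()
>>> parser.add_field('Name', required=True)
>>> msg = parser.parse('Name: Example')
>>> msg['name']
'Example'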
""" from .errors import ( BodyNotAllowedError, DuplicateFieldError, Error, FieldTypeError, InvalidChoiceError, MalformedHeaderError, MissingBodyError, MissingFieldError, ParserError, ScannerEOFError, ScannerError, UnexpectedFoldingError, UnknownFieldError, ) from .normdict import NormalizedDict from .parser import HeaderParser from .scanner import ( Scanner, scan, scan_next_stanza, scan_next_stanza_string, scan_stanzas, scan_stanzas_string, scan_string, ) from .types import BOOL, lower, unfold __version__ = "0.5.1" __author__ = "John Thorvald Wodder II" __author_email__ = "headerparser@varonathe.org" __license__ = "MIT" __url__ = "https://github.com/jwodder/headerparser" __all__ = [ "BOOL", "BodyNotAllowedError", "DuplicateFieldError", "Error", "HeaderParser", "FieldTypeError", "InvalidChoiceError", "MalformedHeaderError", "MissingBodyError", "MissingFieldError", "NormalizedDict", "ParserError", "Scanner", "ScannerEOFError", "ScannerError", "UnexpectedFoldingError", "UnknownFieldError", "lower", "scan", "scan_next_stanza", "scan_next_stanza_string", "scan_stanzas", "scan_stanzas_string", "scan_string", "unfold", ] headerparser-0.5.1/src/headerparser/errors.py000066400000000000000000000077421450730324400213200ustar00rootroot00000000000000from typing import Any class Error(Exception): """Superclass for all custom exceptions raised by the package""" pass class ParserError(Error, ValueError): """Superclass for all custom exceptions related to errors in parsing""" pass class MissingFieldError(ParserError): """ Raised when a header field marked as required is not present in the input """ def __init__(self, name: str) -> None: #: The name of the missing header field self.name: str = name def __str__(self) -> str: return f"Required header field {self.name!r} is not present" class UnknownFieldError(ParserError): """ Raised when an unknown header field is encountered and additional header fields are not enabled """ def __init__(self, name: str) -> None: #: The name of the 
unknown header field self.name: str = name def __str__(self) -> str: return f"Unknown header field {self.name!r}" class DuplicateFieldError(ParserError): """ Raised when a header field not marked as multiple occurs two or more times in the input """ def __init__(self, name: str) -> None: #: The name of the duplicated header field self.name: str = name def __str__(self) -> str: return f"Header field {self.name!r} occurs more than once" class FieldTypeError(ParserError): """Raised when a ``type`` callable raises an exception""" def __init__(self, name: str, value: str, exc_value: BaseException) -> None: #: The name of the header field for which the ``type`` callable was #: called self.name: str = name #: The value on which the ``type`` callable was called self.value: str = value #: The exception raised by the ``type`` callable self.exc_value: BaseException = exc_value def __str__(self) -> str: return ( f"Error while parsing {self.name!r}: {self.value!r}:" f" {self.exc_value.__class__.__name__}: {self.exc_value}" ) class InvalidChoiceError(ParserError): """ Raised when a header field is given a value that is not one of its allowed choices """ def __init__(self, name: str, value: Any) -> None: #: The name of the header field self.name: str = name #: The invalid value self.value: Any = value def __str__(self) -> str: return f"{self.value!r} is not a valid choice for {self.name!r}" class MissingBodyError(ParserError): """Raised when ``body=True`` but there is no message body in the input""" def __str__(self) -> str: return "Message body is required but missing" class BodyNotAllowedError(ParserError): """Raised when ``body=False`` and the parser encounters a message body""" def __str__(self) -> str: return "Message body is present but not allowed" class ScannerError(Error, ValueError): """Superclass for all custom exceptions related to errors in scanning""" pass class MalformedHeaderError(ScannerError): """ Raised when the scanner encounters an invalid header line, i.e., 
a line without either a colon or leading whitespace """ def __init__(self, line: str) -> None: #: The invalid header line self.line: str = line def __str__(self) -> str: return f"Invalid header line encountered: {self.line!r}" class UnexpectedFoldingError(ScannerError): """ Raised when the scanner encounters a folded (indented) line that is not preceded by a valid header line """ def __init__(self, line: str) -> None: #: The line containing the unexpected folding (indentation) self.line: str = line def __str__(self) -> str: return f"Indented line without preceding header line encountered: {self.line!r}" class ScannerEOFError(Error): """ Raised when a `Scanner` method is called after all input has been exhausted """ def __str__(self) -> str: return "Scanner has reached end of input" headerparser-0.5.1/src/headerparser/normdict.py000066400000000000000000000114411450730324400216120ustar00rootroot00000000000000from __future__ import annotations from collections.abc import Callable, Iterable, Iterator, Mapping, MutableMapping from typing import Any, Optional from .types import lower class NormalizedDict(MutableMapping): """ A generalization of a case-insensitive dictionary. `NormalizedDict` takes a callable (the "normalizer") that is applied to any key passed to its `~object.__getitem__`, `~object.__setitem__`, or `~object.__delitem__` method, and the result of the call is then used for the actual lookup. When iterating over a `NormalizedDict`, each key is returned as the "pre-normalized" form passed to `~object.__setitem__` the last time the key was set (but see `normalized()` below). Aside from this, `NormalizedDict` behaves like a normal `~collections.abc.MutableMapping` class. If a normalizer is not specified upon instantiation, a default will be used that converts strings to lowercase and leaves everything else unchanged, so `NormalizedDict` defaults to yet another case-insensitive dictionary. 
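For example, with the default normalizer:

>>> nd = NormalizedDict()
>>> nd['Content-Type'] = 'text/plain'
>>> nd['CONTENT-TYPE']
'text/plain'
>>> list(nd)
['Content-Type']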
Two `NormalizedDict` instances compare equal iff their normalizers, bodies, and `normalized_dict()` return values are equal. When comparing a `NormalizedDict` to any other type of mapping, the other mapping is first converted to a `NormalizedDict` using the same normalizer. :param mapping data: a mapping or iterable of ``(key, value)`` pairs with which to initialize the instance :param callable normalizer: A callable to apply to keys before looking them up; defaults to `lower`. The callable MUST be idempotent (i.e., ``normalizer(x)`` must equal ``normalizer(normalizer(x))`` for all inputs) or else bad things will happen to your dictionary. :param body: initial value for the `body` attribute :type body: string or `None` """ def __init__( self, data: None | Mapping | Iterable[tuple[Any, Any]] = None, normalizer: Optional[Callable[[Any], Any]] = None, body: Optional[str] = None, ) -> None: self._data: dict[Any, tuple[Any, Any]] = {} self.normalizer: Callable[[Any], Any] = ( normalizer if normalizer is not None else lower ) #: This is where `HeaderParser` stores the message body (if any) #: accompanying the header section represented by the mapping self.body: Optional[str] = body if data is not None: # Don't call `update` until after `normalizer` is set. 
self.update(data) def __getitem__(self, key: Any) -> Any: return self._data[self.normalizer(key)][1] def __setitem__(self, key: Any, value: Any) -> None: self._data[self.normalizer(key)] = (key, value) def __delitem__(self, key: Any) -> None: del self._data[self.normalizer(key)] def __iter__(self) -> Iterator: return (key for key, _ in self._data.values()) def __len__(self) -> int: return len(self._data) def __eq__(self, other: Any) -> bool: if isinstance(other, NormalizedDict): if self.normalizer != other.normalizer or self.body != other.body: return False elif isinstance(other, Mapping): if self.body is not None: return False other = NormalizedDict(other, normalizer=self.normalizer) else: return NotImplemented return self.normalized_dict() == other.normalized_dict() def __repr__(self) -> str: return ( "{0.__module__}.{0.__name__}" "({2!r}, normalizer={1.normalizer!r}, body={1.body!r})".format( type(self), self, dict(self) ) ) def normalized(self) -> NormalizedDict: """ Return a copy of the instance such that iterating over it will return normalized keys instead of the keys passed to `~object.__setitem__` >>> normdict = NormalizedDict() >>> normdict['Foo'] = 23 >>> normdict['bar'] = 42 >>> sorted(normdict) ['Foo', 'bar'] >>> sorted(normdict.normalized()) ['bar', 'foo'] :rtype: NormalizedDict """ return NormalizedDict( self.normalized_dict(), normalizer=self.normalizer, body=self.body, ) def normalized_dict(self) -> dict: """ Convert to a `dict` with all keys normalized. (A `dict` with non-normalized keys can be obtained with ``dict(normdict)``.) 
:rtype: dict """ return {key: value for key, (_, value) in self._data.items()} def copy(self) -> NormalizedDict: """Create a shallow copy of the mapping""" dup = type(self)() dup._data = self._data.copy() dup.normalizer = self.normalizer dup.body = self.body return dup headerparser-0.5.1/src/headerparser/parser.py000066400000000000000000000543771450730324400213060ustar00rootroot00000000000000from __future__ import annotations from collections.abc import Callable, Iterable, Iterator from typing import Any, Optional from deprecated import deprecated from . import errors, scanner from .normdict import NormalizedDict from .scanner import Scanner, scan_stanzas from .types import lower, unfold class HeaderParser: """ A parser for RFC 822-style header sections. Define the fields the parser should recognize with the `add_field()` method, configure handling of unrecognized fields with `add_additional()`, and then parse input with `parse()` or another `!parse_*()` method. :param callable normalizer: By default, the parser will consider two field names to be equal iff their lowercased forms are equal. This can be overridden by setting ``normalizer`` to a custom callable that takes a field name and returns a "normalized" name for use in equality testing. The normalizer will also be used when looking up keys in the `NormalizedDict` instances returned by the parser's `!parse_*()` methods. 
:param bool body: whether the parser should allow or forbid a body after the header section; `True` means a body is required, `False` means a body is prohibited, and `None` (the default) means a body is optional :param kwargs: Passed to the `Scanner` constructor """ def __init__( self, normalizer: Optional[Callable[[str], Any]] = None, body: Optional[bool] = None, **kwargs: Any, ) -> None: #: The ``normalizer`` argument passed to the constructor, or `lower` if #: no normalizer was supplied self._normalizer = normalizer if normalizer is not None else lower #: The ``body`` argument passed to the constructor self._body = body #: Scanner options self._scan_opts = kwargs #: A mapping from normalized field names to `NamedField` instances self._fielddefs: dict[Any, NamedField] = {} #: The set of all normalized ``dest`` values for all named fields #: defined so far self._dests: set = set() #: If additional fields are enabled, this is the `FieldDef` instance #: used to process them; otherwise, it is `None`. self._additional: Optional[FieldDef] = None #: Whether any fields with custom ``dest`` values have been defined, #: thereby precluding `add_additional()` self._custom_dests: bool = False def __eq__(self, other: Any) -> bool: if isinstance(other, HeaderParser): return vars(self) == vars(other) else: return NotImplemented def add_field(self, name: str, *altnames: str, **kwargs: Any) -> None: """ Define a header field for the parser to parse. During parsing, if a field is encountered whose name (*modulo* normalization) equals either ``name`` or one of the ``altnames``, the field's value will be processed according to the options in ``**kwargs``. (If no options are specified, the value will just be stored in the result dictionary.) .. 
versionchanged:: 0.2.0 ``action`` argument added :param string name: the primary name for the field, used in error messages and as the default value of ``dest`` :param strings altnames: field name synonyms :param dest: The key in the result dictionary in which the field's value(s) will be stored; defaults to ``name``. When additional headers are enabled (see `add_additional`), ``dest`` must equal (after normalization) one of the field's names. :param bool required: If `True` (default `False`), the ``parse_*`` methods will raise a `~headerparser.errors.MissingFieldError` if the field is not present in the input :param default: The value to associate with the field if it is not present in the input. If no default value is specified, the field will be omitted from the result dictionary if it is not present in the input. ``default`` cannot be set when the field is required. ``type``, ``unfold``, and ``action`` will not be applied to the default value, and the default value need not belong to ``choices``. :param bool multiple: If `True`, the header field will be allowed to occur more than once in the input, and all of the field's values will be stored in a list. If `False` (the default), a `~headerparser.errors.DuplicateFieldError` will be raised if the field occurs more than once in the input. :param bool unfold: If `True` (default `False`), the field value will be "unfolded" (i.e., line breaks will be removed and whitespace around line breaks will be converted to a single space) before applying ``type`` :param callable type: a callable to apply to the field value before storing it in the result dictionary :param iterable choices: A sequence of values which the field is allowed to have. If ``choices`` is defined, all occurrences of the field in the input must have one of the given values (after applying ``type``) or else an `~headerparser.errors.InvalidChoiceError` is raised. :param callable action: A callable to invoke whenever the field is encountered in the input. 
The callable will be passed the current dictionary of header fields, the field's ``name``, and the field's value (after processing with ``type`` and ``unfold`` and checking against ``choices``). The callable replaces the default behavior of storing the field's values in the result dictionary, and so the callable must explicitly store the values if desired. When ``action`` is defined for a field, ``dest`` cannot be. :return: `None` :raises ValueError: - if another field with the same name or ``dest`` was already defined - if ``dest`` is not one of the field's names and `add_additional` is enabled - if ``default`` is defined and ``required`` is true - if ``choices`` is an empty sequence - if both ``dest`` and ``action`` are defined :raises TypeError: if ``name`` or one of the ``altnames`` is not a string """ if "action" in kwargs and "dest" in kwargs: raise ValueError("`action` and `dest` are mutually exclusive") kwargs.setdefault("dest", name) if "type" in kwargs: kwargs["type_"] = kwargs.pop("type") hd = NamedField(name=name, **kwargs) normed: set = set(map(self._normalizer, (name,) + altnames)) # Error before modifying anything: redefs = [n for n in self._fielddefs if n in normed] if redefs: raise ValueError(f"field defined more than once: {redefs[0]!r}") if self._normalizer(hd.dest) in self._dests: raise ValueError(f"destination defined more than once: {hd.dest!r}") if self._normalizer(hd.dest) not in normed: if self._additional is not None: raise ValueError("add_additional and `dest` are mutually exclusive") self._custom_dests = True for n in normed: self._fielddefs[n] = hd self._dests.add(self._normalizer(hd.dest)) def add_additional(self, enable: bool = True, **kwargs: Any) -> None: """ Specify how the parser should handle fields in the input that were not previously registered with `add_field`. 
By default, unknown fields will cause the ``parse_*`` methods to raise an `~headerparser.errors.UnknownFieldError`, but calling this method with ``enable=True`` (the default) will change the parser's behavior so that all unregistered fields are processed according to the options in ``**kwargs``. (If no options are specified, the additional values will just be stored in the result dictionary.) If this method is called more than once, only the settings from the last call will be used. Note that additional field values are always stored in the result dictionary using their field name as the key, and two fields are considered the same (for the purposes of ``multiple``) iff their names are the same after normalization. Customization of the dictionary key and field name can only be done through `add_field`. .. versionchanged:: 0.2.0 ``action`` argument added :param bool enable: whether the parser should accept input fields that were not registered with `add_field`; setting this to `False` disables additional fields and restores the parser's default behavior :param bool multiple: If `True`, each additional header field will be allowed to occur more than once in the input, and each field's values will be stored in a list. If `False` (the default), a `~headerparser.errors.DuplicateFieldError` will be raised if an additional field occurs more than once in the input. :param bool unfold: If `True` (default `False`), additional field values will be "unfolded" (i.e., line breaks will be removed and whitespace around line breaks will be converted to a single space) before applying ``type`` :param callable type: a callable to apply to additional field values before storing them in the result dictionary :param iterable choices: A sequence of values which additional fields are allowed to have. If ``choices`` is defined, all additional field values in the input must have one of the given values (after applying ``type``) or else an `~headerparser.errors.InvalidChoiceError` is raised. 
:param callable action: A callable to invoke whenever the field is encountered in the input. The callable will be passed the current dictionary of header fields, the field's name, and the field's value (after processing with ``type`` and ``unfold`` and checking against ``choices``). The callable replaces the default behavior of storing the field's values in the result dictionary, and so the callable must explicitly store the values if desired. :return: `None` :raises ValueError: - if ``enable`` is true and a previous call to `add_field` used a custom ``dest`` - if ``choices`` is an empty sequence """ if enable: if self._custom_dests: raise ValueError("add_additional and `dest` are mutually exclusive") if "type" in kwargs: kwargs["type_"] = kwargs.pop("type") self._additional = FieldDef(**kwargs) else: self._additional = None def parse_stream( self, fields: Iterable[tuple[Optional[str], str]] ) -> NormalizedDict: """ Process a sequence of ``(name, value)`` pairs as returned by `scan()` and return a dictionary of header fields (possibly with body attached). This is a low-level method that you will usually not need to call. 
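For example, a body is represented by a ``(None, value)`` pair:

>>> parser = HeaderParser()
>>> parser.add_field('Name')
>>> msg = parser.parse_stream([('Name', 'Example'), (None, 'A body.\n')])
>>> msg['Name']
'Example'
>>> msg.body
'A body.\n'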
:param fields: a sequence of ``(name, value)`` pairs representing the input fields :type fields: iterable of pairs of strings :rtype: NormalizedDict :raises ParserError: if the input fields do not conform to the field definitions declared with `add_field` and `add_additional` :raises ValueError: if the input contains more than one body pair """ data: NormalizedDict = NormalizedDict(normalizer=self._normalizer) fields_seen: set[str] = set() body_seen = False for k, v in fields: if k is None: if body_seen: raise ValueError("Body appears twice in input") if self._body is not None and not self._body: raise errors.BodyNotAllowedError() data.body = v body_seen = True else: hd: FieldDef try: hd = self._fielddefs[self._normalizer(k)] except KeyError: if self._additional is not None: hd = self._additional else: raise errors.UnknownFieldError(k) else: fields_seen.add(hd.name) hd.process(data, k, v) for hd in self._fielddefs.values(): if hd.name not in fields_seen: if hd.required: raise errors.MissingFieldError(hd.name) elif hasattr(hd, "default"): data[hd.dest] = hd.default if self._body and not body_seen: raise errors.MissingBodyError() return data def parse(self, data: str | Iterable[str]) -> NormalizedDict: """ .. versionadded:: 0.4.0 Parse an RFC 822-style header field section (possibly followed by a message body) from the contents of the given string, filehandle, or sequence of lines and return a dictionary of the header fields (possibly with body attached). If ``data`` is an iterable of `str`, newlines will be appended to lines in multiline header fields where not already present but will not be inserted where missing inside the body. .. versionchanged:: 0.5.0 ``data`` can now be a string. 
        :param data: a string, text-file-like object, or iterable of lines to
            parse
        :rtype: NormalizedDict
        :raises ParserError: if the input fields do not conform to the field
            definitions declared with `add_field` and `add_additional`
        :raises ScannerError: if the header section is malformed
        """
        return self.parse_stream(scanner.scan(data, **self._scan_opts))

    @deprecated(version="0.5.0", reason="use parse() instead")
    def parse_string(self, s: str) -> NormalizedDict:
        """
        Parse an RFC 822-style header field section (possibly followed by a
        message body) from the given string and return a dictionary of the
        header fields (possibly with body attached)

        .. deprecated:: 0.5.0

            Use `parse()` instead.

        :param string s: the text to parse
        :rtype: NormalizedDict
        :raises ParserError: if the input fields do not conform to the field
            definitions declared with `add_field` and `add_additional`
        :raises ScannerError: if the header section is malformed
        """
        return self.parse_stream(scanner.scan(s, **self._scan_opts))  # pragma: no cover

    def parse_stanzas(self, data: str | Iterable[str]) -> Iterator[NormalizedDict]:
        """
        .. versionadded:: 0.4.0

        Parse zero or more stanzas of RFC 822-style header fields from the
        given string, filehandle, or sequence of lines and return a generator
        of dictionaries of header fields.

        All of the input is treated as header sections, not message bodies; as
        a result, calling this method when ``body`` is true will produce a
        `MissingBodyError`.

        .. versionchanged:: 0.5.0

            ``data`` can now be a string.
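For example, with additional fields enabled:

>>> parser = HeaderParser()
>>> parser.add_additional()
>>> [dict(d) for d in parser.parse_stanzas('Foo: 1\n\nBar: 2\n')]
[{'Foo': '1'}, {'Bar': '2'}]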
:param data: a string, text-file-like object, or iterable of lines to parse :rtype: generator of `NormalizedDict` :raises ParserError: if the input fields do not conform to the field definitions declared with `add_field` and `add_additional` :raises ScannerError: if a header section is malformed """ return self.parse_stanzas_stream(scan_stanzas(data, **self._scan_opts)) @deprecated(version="0.5.0", reason="use parse_stanzas() instead") def parse_stanzas_string(self, s: str) -> Iterator[NormalizedDict]: """ .. versionadded:: 0.4.0 Parse zero or more stanzas of RFC 822-style header fields from the given string and return a generator of dictionaries of header fields. All of the input is treated as header sections, not message bodies; as a result, calling this method when ``body`` is true will produce a `MissingBodyError`. .. deprecated:: 0.5.0 Use `parse_stanzas()` instead. :param string s: the text to parse :rtype: generator of `NormalizedDict` :raises ParserError: if the input fields do not conform to the field definitions declared with `add_field` and `add_additional` :raises ScannerError: if a header section is malformed """ return self.parse_stanzas_stream( # pragma: no cover scan_stanzas(s, **self._scan_opts) ) def parse_stanzas_stream( self, fields: Iterable[Iterable[tuple[str, str]]] ) -> Iterator[NormalizedDict]: """ .. versionadded:: 0.4.0 Parse an iterable of iterables of ``(name, value)`` pairs as returned by `scan_stanzas()` and return a generator of dictionaries of header fields. This is a low-level method that you will usually not need to call. 
:param fields: an iterable of iterables of pairs of strings :rtype: generator of `NormalizedDict` :raises ParserError: if the input fields do not conform to the field definitions declared with `add_field` and `add_additional` :raises ScannerError: if a header section is malformed """ for stanza in fields: yield self.parse_stream(stanza) @deprecated(version="0.5.0") def parse_next_stanza(self, iterator: Iterator[str]) -> NormalizedDict: """ .. versionadded:: 0.4.0 Parse a RFC 822-style header field section from the contents of the given filehandle or iterator of lines and return a dictionary of the header fields. Input processing stops at the end of the header section, leaving the rest of the iterator unconsumed. As a message body is not consumed, calling this method when ``body`` is true will produce a `MissingBodyError`. .. deprecated:: 0.5.0 Instead combine `Scanner.scan_next_stanza()` with `parse_stream()` :param iterator: a text-file-like object or iterator of lines to parse :rtype: NormalizedDict :raises ParserError: if the input fields do not conform to the field definitions declared with `add_field` and `add_additional` :raises ScannerError: if a header section is malformed """ sc = Scanner(iterator, **self._scan_opts) return self.parse_stream(sc.scan_next_stanza()) @deprecated(version="0.5.0") def parse_next_stanza_string(self, s: str) -> tuple[NormalizedDict, str]: """ .. versionadded:: 0.4.0 Parse a RFC 822-style header field section from the given string and return a pair of a dictionary of the header fields and the rest of the string. As a message body is not consumed, calling this method when ``body`` is true will produce a `MissingBodyError`. .. 
deprecated:: 0.5.0 Instead combine `Scanner.scan_next_stanza()` with `parse_stream()` :param string s: the text to parse :rtype: pair of `NormalizedDict` and a string :raises ParserError: if the input fields do not conform to the field definitions declared with `add_field` and `add_additional` :raises ScannerError: if a header section is malformed """ sc = Scanner(s, **self._scan_opts) fields = list(sc.scan_next_stanza()) try: extra = sc.get_unscanned() except errors.ScannerEOFError: extra = "" return (self.parse_stream(fields), extra) class FieldDef: def __init__( self, type_: Optional[Callable[[str], Any]] = None, multiple: bool = False, unfold: bool = False, choices: Optional[Iterable] = None, action: Optional[Callable[[NormalizedDict, str, Any], Any]] = None, ): self.type_ = type_ self.multiple = multiple self.unfold = unfold if choices is not None: choices = list(choices) if not choices: raise ValueError("empty list supplied for choices") self.choices: Optional[list] = choices self.action = action def __eq__(self, other: Any) -> bool: if isinstance(other, FieldDef): return vars(self) == vars(other) else: # pragma: no cover return NotImplemented def _process(self, data: NormalizedDict, name: str, dest: Any, value: str) -> None: if self.unfold: value = unfold(value) if self.type_ is not None: try: value = self.type_(value) except errors.FieldTypeError: raise except Exception as e: raise errors.FieldTypeError(name, value, e) if self.choices is not None and value not in self.choices: raise errors.InvalidChoiceError(name, value) if self.action is not None: self.action(data, name, value) elif self.multiple: data.setdefault(dest, []).append(value) elif dest in data: raise errors.DuplicateFieldError(name) else: data[dest] = value def process(self, data: NormalizedDict, name: str, value: str) -> None: self._process(data, name, name, value) class NamedField(FieldDef): def __init__( self, name: str, dest: Any, required: bool = False, **kwargs: Any ) -> None: if not 
isinstance(name, str): raise TypeError("field names must be strings") self.name = name self.dest = dest self.required = required if "default" in kwargs: if self.required: raise ValueError("required and default are mutually exclusive") self.default = kwargs.pop("default") super(NamedField, self).__init__(**kwargs) def process(self, data: NormalizedDict, _: str, value: str) -> None: self._process(data, self.name, self.dest, value) headerparser-0.5.1/src/headerparser/py.typed000066400000000000000000000000001450730324400211050ustar00rootroot00000000000000headerparser-0.5.1/src/headerparser/scanner.py000066400000000000000000000350311450730324400214250ustar00rootroot00000000000000from __future__ import annotations from collections.abc import Iterable, Iterator import re from typing import Optional, Tuple, Union import attr from deprecated import deprecated from .errors import MalformedHeaderError, ScannerEOFError, UnexpectedFoldingError from .util import ascii_splitlines RgxType = Union[str, "re.Pattern[str]"] FieldType = Tuple[Optional[str], str] DEFAULT_SEPARATOR_REGEX = re.compile(r"[ \t]*:[ \t]*") def data2iter(data: str | Iterable[str]) -> Iterator[str]: if isinstance(data, str): data = ascii_splitlines(data) return iter(data) def convert_sep(v: Optional[RgxType]) -> re.Pattern[str]: if v is None: return DEFAULT_SEPARATOR_REGEX else: return re.compile(v) def none2false(v: Optional[bool]) -> bool: return False if v is None else v @attr.define class Scanner: """ .. versionadded:: 0.5.0 A class for scanning text for RFC 822-style header fields. Each method processes some portion of the input yet unscanned; the `scan()`, `scan_stanzas()`, and `get_unscanned()` methods process the entirety of the remaining input, while the `scan_next_stanza()` method only processes up through the first blank line. :param data: The text to scan. This may be a string, a text-file-like object, or an iterable of lines. 
        If it is a string, it will be broken into lines on CR, LF, and CR LF
        boundaries.
    :param separator_regex: A regex (as a `str` or compiled regex object)
        defining the name-value separator; defaults to
        :regexp:`[ \\\\t]*:[ \\\\t]*`.  When the regex is found in a line,
        everything before the matched substring becomes the field name, and
        everything after becomes the first line of the field value.  Note
        that the regex must match any surrounding whitespace in order for it
        to be trimmed from the key & value.
    :param bool skip_leading_newlines: If `True`, blank lines at the
        beginning of the input will be discarded.  If `False`, a blank line
        at the beginning of the input marks the end of an empty header
        section.
    """

    _data: Iterator[str] = attr.field(converter=data2iter)
    separator_regex: re.Pattern[str] = attr.field(
        default=DEFAULT_SEPARATOR_REGEX,
        converter=convert_sep,
        kw_only=True,
    )
    skip_leading_newlines: bool = attr.field(
        default=False, kw_only=True, converter=none2false
    )
    _eof: bool = attr.field(default=False, init=False)

    def scan(self) -> Iterator[FieldType]:
        """
        Scan the remaining input for RFC 822-style header fields and return a
        generator of ``(name, value)`` pairs for each header field
        encountered, plus a ``(None, body)`` pair representing the body (if
        any) after the header section.

        All lines after the first blank line are concatenated & yielded as-is
        in a ``(None, body)`` pair.  (Note that body lines which do not end
        with a line terminator will not have one appended.)  If there is no
        empty line in the input, then no body pair is yielded.  If the empty
        line is the last line in the input, the body will be the empty
        string.  If the empty line is the *first* line in the input and the
        ``skip_leading_newlines`` option is false (the default), then all
        other lines will be treated as part of the body and will not be
        scanned for header fields.
        :raises ScannerError: if the header section is malformed
        :raises ScannerEOFError: if all of the input has already been
            consumed
        """
        yield from self.scan_next_stanza()
        try:
            body = self.get_unscanned()
        except ScannerEOFError:
            pass
        else:
            yield (None, body)

    def scan_next_stanza(self) -> Iterator[tuple[str, str]]:
        """
        Scan the remaining input for RFC 822-style header fields and return a
        generator of ``(name, value)`` pairs for each header field in the
        input.  Input processing stops as soon as a blank line is
        encountered.  (If ``skip_leading_newlines`` is true, the function
        only stops on a blank line after a non-blank line.)

        :raises ScannerError: if the header section is malformed
        :raises ScannerEOFError: if all of the input has already been
            consumed
        """
        if self._eof:
            raise ScannerEOFError()
        name: Optional[str] = None
        value = ""
        begun = False
        more_left = False
        for line in self._data:
            line = line.rstrip("\r\n")
            if line.startswith((" ", "\t")):
                begun = True
                if name is not None:
                    value += "\n" + line
                else:
                    raise UnexpectedFoldingError(line)
            else:
                m = self.separator_regex.search(line)
                if m:
                    begun = True
                    if name is not None:
                        yield (name, value)
                    name = line[: m.start()]
                    value = line[m.end() :]
                elif line == "":
                    if self.skip_leading_newlines and not begun:
                        continue
                    else:
                        more_left = True
                        break
                else:
                    raise MalformedHeaderError(line)
        if name is not None:
            yield (name, value)
        if not more_left:
            self._eof = True

    def scan_stanzas(self) -> Iterator[list[tuple[str, str]]]:
        """
        Scan the remaining input for zero or more stanzas of RFC 822-style
        header fields and return a generator of lists of ``(name, value)``
        pairs, where each list represents a stanza of header fields in the
        input.

        The stanzas are terminated by blank lines.  Consecutive blank lines
        between stanzas are treated as a single blank line.  Blank lines at
        the end of the input are discarded without creating a new stanza.
        :raises ScannerError: if the header section is malformed
        :raises ScannerEOFError: if all of the input has already been
            consumed
        """
        if self._eof:
            raise ScannerEOFError()
        while True:
            try:
                fields = list(self.scan_next_stanza())
            except ScannerEOFError:
                break
            if fields or not self._eof:
                yield fields
            else:
                break  # type: ignore[unreachable]
            self.skip_leading_newlines = True

    def get_unscanned(self) -> str:
        """
        Return all of the input that has not yet been processed.  After
        calling this method, calling any method again on the same `Scanner`
        instance will raise `ScannerEOFError`.

        :raises ScannerEOFError: if all of the input has already been
            consumed
        """
        if self._eof:
            raise ScannerEOFError()
        else:
            return "".join(self._data)


@deprecated(version="0.5.0", reason="use scan() instead")
def scan_string(
    s: str,
    *,
    separator_regex: Optional[RgxType] = None,
    skip_leading_newlines: bool = False,
) -> Iterator[FieldType]:
    """
    Scan a string for RFC 822-style header fields and return a generator of
    ``(name, value)`` pairs for each header field in the input, plus a
    ``(None, body)`` pair representing the body (if any) after the header
    section.  See `scan()` for more information on the exact behavior of the
    scanner.

    .. deprecated:: 0.5.0
        Use `scan()` instead.

    :param s: a string which will be broken into lines on CR, LF, and CR LF
        boundaries and passed to `scan()`
    :param kwargs: Passed to the `Scanner` constructor
    :rtype: generator of pairs of strings
    :raises ScannerError: if the header section is malformed
    """
    return scan(  # pragma: no cover
        s,
        separator_regex=separator_regex,
        skip_leading_newlines=skip_leading_newlines,
    )


def scan(
    data: str | Iterable[str],
    *,
    separator_regex: Optional[RgxType] = None,
    skip_leading_newlines: bool = False,
) -> Iterator[FieldType]:
    """
    .. versionadded:: 0.4.0

    Scan a string, text-file-like object, or iterable of lines for RFC
    822-style header fields and return a generator of ``(name, value)``
    pairs for each header field in the input, plus a ``(None, body)`` pair
    representing the body (if any) after the header section.

    If ``data`` is a string, it will be broken into lines on CR, LF, and CR
    LF boundaries.

    All lines after the first blank line are concatenated & yielded as-is in
    a ``(None, body)`` pair.  (Note that body lines which do not end with a
    line terminator will not have one appended.)  If there is no empty line
    in ``data``, then no body pair is yielded.  If the empty line is the
    last line in ``data``, the body will be the empty string.  If the empty
    line is the *first* line in ``data`` and the ``skip_leading_newlines``
    option is false (the default), then all other lines will be treated as
    part of the body and will not be scanned for header fields.

    .. versionchanged:: 0.5.0
        ``data`` can now be a string.

    :param data: a string, text-file-like object, or iterable of strings
        representing lines of input
    :param kwargs: Passed to the `Scanner` constructor
    :rtype: generator of pairs of strings
    :raises ScannerError: if the header section is malformed
    """
    return Scanner(
        data,
        separator_regex=separator_regex,
        skip_leading_newlines=skip_leading_newlines,
    ).scan()


@deprecated(version="0.5.0", reason="use Scanner.scan_next_stanza() instead")
def scan_next_stanza(
    iterator: Iterator[str],
    *,
    separator_regex: Optional[RgxType] = None,
    skip_leading_newlines: bool = False,
) -> Iterator[tuple[str, str]]:
    """
    .. versionadded:: 0.4.0

    Scan a text-file-like object or iterator of lines for RFC 822-style
    header fields and return a generator of ``(name, value)`` pairs for each
    header field in the input.  Input processing stops as soon as a blank
    line is encountered, leaving the rest of the iterator unconsumed.  (If
    ``skip_leading_newlines`` is true, the function only stops on a blank
    line after a non-blank line.)

    .. deprecated:: 0.5.0
        Use `Scanner.scan_next_stanza()` instead

    :param iterator: a text-file-like object or iterator of strings
        representing lines of input
    :param kwargs: Passed to the `Scanner` constructor
    :rtype: generator of pairs of strings
    :raises ScannerError: if the header section is malformed
    """
    return Scanner(
        iterator,
        separator_regex=separator_regex,
        skip_leading_newlines=skip_leading_newlines,
    ).scan_next_stanza()


@deprecated(version="0.5.0", reason="use Scanner.scan_next_stanza() instead")
def scan_next_stanza_string(
    s: str,
    *,
    separator_regex: Optional[RgxType] = None,
    skip_leading_newlines: bool = False,
) -> tuple[list[tuple[str, str]], str]:
    """
    .. versionadded:: 0.4.0

    Scan a string for RFC 822-style header fields and return a pair
    ``(fields, extra)`` where ``fields`` is a list of ``(name, value)``
    pairs for each header field in the input up to the first blank line and
    ``extra`` is everything after the first blank line (If
    ``skip_leading_newlines`` is true, the dividing point is instead the
    first blank line after a non-blank line); if there is no appropriate
    blank line in the input, ``extra`` is the empty string.

    .. deprecated:: 0.5.0
        Use `Scanner.scan_next_stanza()` instead

    :param s: a string to scan
    :param kwargs: Passed to the `Scanner` constructor
    :rtype: pair of a list of pairs of strings and a string
    :raises ScannerError: if the header section is malformed
    """
    sc = Scanner(
        s,
        separator_regex=separator_regex,
        skip_leading_newlines=skip_leading_newlines,
    )
    fields = list(sc.scan_next_stanza())
    try:
        extra = sc.get_unscanned()
    except ScannerEOFError:
        extra = ""
    return (fields, extra)


def scan_stanzas(
    data: str | Iterable[str],
    *,
    separator_regex: Optional[RgxType] = None,
    skip_leading_newlines: bool = False,
) -> Iterator[list[tuple[str, str]]]:
    """
    .. versionadded:: 0.4.0

    Scan a string, text-file-like object, or iterable of lines for zero or
    more stanzas of RFC 822-style header fields and return a generator of
    lists of ``(name, value)`` pairs, where each list represents a stanza of
    header fields in the input.

    If ``data`` is a string, it will be broken into lines on CR, LF, and CR
    LF boundaries.

    The stanzas are terminated by blank lines.  Consecutive blank lines
    between stanzas are treated as a single blank line.  Blank lines at the
    end of the input are discarded without creating a new stanza.

    .. versionchanged:: 0.5.0
        ``data`` can now be a string.

    :param data: a string, text-file-like object, or iterable of strings
        representing lines of input
    :param kwargs: Passed to the `Scanner` constructor
    :rtype: generator of lists of pairs of strings
    :raises ScannerError: if the header section is malformed
    """
    return Scanner(
        data,
        separator_regex=separator_regex,
        skip_leading_newlines=skip_leading_newlines,
    ).scan_stanzas()


@deprecated(version="0.5.0", reason="use scan_stanzas() instead")
def scan_stanzas_string(
    s: str,
    *,
    separator_regex: Optional[RgxType] = None,
    skip_leading_newlines: bool = False,
) -> Iterator[list[tuple[str, str]]]:
    """
    .. versionadded:: 0.4.0

    Scan a string for zero or more stanzas of RFC 822-style header fields
    and return a generator of lists of ``(name, value)`` pairs, where each
    list represents a stanza of header fields in the input.  The stanzas are
    terminated by blank lines.  Consecutive blank lines between stanzas are
    treated as a single blank line.  Blank lines at the end of the input are
    discarded without creating a new stanza.

    .. deprecated:: 0.5.0
        Use `scan_stanzas()` instead

    :param s: a string which will be broken into lines on CR, LF, and CR LF
        boundaries and passed to `scan_stanzas()`
    :param kwargs: Passed to the `Scanner` constructor
    :rtype: generator of lists of pairs of strings
    :raises ScannerError: if the header section is malformed
    """
    return scan_stanzas(  # pragma: no cover
        s,
        separator_regex=separator_regex,
        skip_leading_newlines=skip_leading_newlines,
    )


# ===== headerparser-0.5.1/src/headerparser/types.py =====

import re
from typing import Any

TRUTHY = {"yes", "y", "on", "true", "1"}
FALSEY = {"no", "n", "off", "false", "0"}


def BOOL(s: str) -> bool:
    """
    Convert boolean-like strings to `bool` values.  The strings ``'yes'``,
    ``'y'``, ``'on'``, ``'true'``, and ``'1'`` are converted to `True`, and
    the strings ``'no'``, ``'n'``, ``'off'``, ``'false'``, and ``'0'`` are
    converted to `False`.  The conversion is case-insensitive and ignores
    leading & trailing whitespace.  Any value that cannot be converted to a
    `bool` results in a `ValueError`.

    :param string s: a boolean-like string to convert to a `bool`
    :rtype: bool
    :raises ValueError: if ``s`` is not one of the values listed above
    """
    b = s.strip().lower()
    if b in TRUTHY:
        return True
    elif b in FALSEY:
        return False
    else:
        raise ValueError(f"invalid boolean: {s!r}")


def lower(s: Any) -> Any:
    """
    .. versionadded:: 0.2.0

    Convert ``s`` to lowercase by calling its :meth:`~str.lower()` method if
    it has one; otherwise, return ``s`` unchanged
    """
    try:
        return s.lower()
    except (TypeError, AttributeError):
        return s


def unfold(s: str) -> str:
    r"""
    .. versionadded:: 0.2.0

    Remove folding whitespace from a string by converting line breaks (and
    any whitespace adjacent to line breaks) to a single space and removing
    leading & trailing whitespace.

    >>> unfold('This is a \n folded string.\n')
    'This is a folded string.'
    :param string s: a string to unfold
    :rtype: string
    """
    return re.sub(r"[ \t]*[\r\n][ \t\r\n]*", " ", s).strip(" ")


# ===== headerparser-0.5.1/src/headerparser/util.py =====

from __future__ import annotations
import re


def ascii_splitlines(s: str) -> list[str]:
    lines = []
    lastend = 0
    for m in re.finditer(r"\r\n?|\n", s):
        lines.append(s[lastend : m.end()])
        lastend = m.end()
    if lastend < len(s):
        lines.append(s[lastend:])
    return lines


# ===== headerparser-0.5.1/test/test_normdict.py =====

from __future__ import annotations
from collections.abc import Callable
import re
from typing import Optional
import pytest
from headerparser import NormalizedDict, lower


def test_empty() -> None:
    nd = NormalizedDict()
    assert dict(nd) == {}
    assert nd.body is None
    assert len(nd) == 0
    assert not bool(nd)
    assert nd.normalizer is lower


def test_one() -> None:
    nd = NormalizedDict({"Foo": "bar"})
    assert dict(nd) == {"Foo": "bar"}
    assert nd.body is None
    assert len(nd) == 1
    assert bool(nd)
    assert nd.normalizer is lower


def test_get_cases() -> None:
    nd = NormalizedDict({"Foo": "bar"})
    assert nd["Foo"] == "bar"
    assert nd["Foo"] == nd["foo"] == nd["FOO"] == nd["fOO"]


def test_set() -> None:
    nd = NormalizedDict()
    assert dict(nd) == {}
    nd["Foo"] = "bar"
    assert dict(nd) == {"Foo": "bar"}
    assert nd["Foo"] == "bar"
    assert nd["Foo"] == nd["foo"] == nd["FOO"] == nd["fOO"]
    nd["fOO"] = "quux"
    assert dict(nd) == {"fOO": "quux"}
    assert nd["Foo"] == "quux"
    assert nd["Foo"] == nd["foo"] == nd["FOO"] == nd["fOO"]


def test_del() -> None:
    nd = NormalizedDict({"Foo": "bar", "Bar": "FOO"})
    del nd["Foo"]
    assert dict(nd) == {"Bar": "FOO"}
    del nd["BAR"]
    assert dict(nd) == {}


def test_del_nexists() -> None:
    nd = NormalizedDict({"Foo": "bar", "Bar": "FOO"})
    with pytest.raises(KeyError):
        del nd["Baz"]


def test_eq_empty() -> None:
    nd = NormalizedDict()
    nd2 = NormalizedDict()
    assert nd == nd2


def test_eq_nonempty() -> None:
    nd = NormalizedDict({"Foo": "bar"})
    nd2 = NormalizedDict({"Foo": "bar"})
    assert nd == nd2


def test_eq_cases() -> None:
    nd = NormalizedDict({"Foo": "bar"})
    nd2 = NormalizedDict({"fOO": "bar"})
    assert nd == nd2


def test_neq() -> None:
    assert NormalizedDict({"Foo": "bar"}) != NormalizedDict({"Foo": "BAR"})


def test_normalized() -> None:
    nd = NormalizedDict({"Foo": "BAR"})
    nd2 = nd.normalized()
    assert isinstance(nd2, NormalizedDict)
    assert dict(nd2) == {"foo": "BAR"}
    assert nd2.body is None
    assert nd == nd2


def test_normalized_with_body() -> None:
    nd = NormalizedDict({"Foo": "BAR"}, body="Glarch.")
    nd2 = nd.normalized()
    assert isinstance(nd2, NormalizedDict)
    assert dict(nd2) == {"foo": "BAR"}
    assert nd2.body == "Glarch."
    assert nd == nd2


def test_normalized_dict() -> None:
    nd = NormalizedDict({"Foo": "BAR"})
    nd2 = nd.normalized_dict()
    assert isinstance(nd2, dict)
    assert nd2 == {"foo": "BAR"}


def test_eq_dict() -> None:
    nd = NormalizedDict({"Foo": "BAR"})
    assert nd == {"Foo": "BAR"}
    assert {"Foo": "BAR"} == nd
    assert nd == {"FOO": "BAR"}
    assert {"FOO": "BAR"} == nd
    assert nd == {"foo": "BAR"}
    assert {"foo": "BAR"} == nd
    assert nd != {"Foo": "bar"}
    assert {"Foo": "bar"} != nd


def test_body_neq_dict() -> None:
    nd = NormalizedDict({"Foo": "BAR"}, body="")
    assert nd != {"Foo": "BAR"}
    assert {"Foo": "BAR"} != nd


def test_eq_body() -> None:
    nd = NormalizedDict({"Foo": "bar"}, body="")
    nd2 = NormalizedDict({"fOO": "bar"}, body="")
    assert nd == nd2


def test_neq_body() -> None:
    nd = NormalizedDict({"Foo": "bar"}, body="yes")
    nd2 = NormalizedDict({"fOO": "bar"}, body="no")
    assert nd != nd2


def test_neq_none() -> None:
    assert NormalizedDict() != None  # noqa: E711
    assert None != NormalizedDict()  # noqa: E711


def test_neq_bool() -> None:
    assert NormalizedDict() != False  # noqa: E712
    assert False != NormalizedDict()  # noqa: E712


def test_neq_int() -> None:
    assert NormalizedDict() != 42
    assert 42 != NormalizedDict()


def test_init_list() -> None:
    nd = NormalizedDict([("Foo", "bar"), ("Bar", "baz"), ("FOO", "quux")])
    assert dict(nd) == {"FOO": "quux", "Bar": "baz"}


def test_copy() -> None:
    nd = NormalizedDict({"Foo": "bar"})
    nd2 = nd.copy()
    assert nd is not nd2
    assert isinstance(nd2, NormalizedDict)
    assert dict(nd2) == {"Foo": "bar"}
    assert nd2.body is None
    assert nd == nd2
    nd2["Foo"] = "gnusto"
    assert dict(nd) == {"Foo": "bar"}
    assert dict(nd2) == {"Foo": "gnusto"}
    assert nd != nd2
    nd2["fOO"] = "quux"
    assert dict(nd) == {"Foo": "bar"}
    assert dict(nd2) == {"fOO": "quux"}
    assert nd != nd2
    nd2["Glarch"] = "baz"
    assert dict(nd) == {"Foo": "bar"}
    assert dict(nd2) == {"fOO": "quux", "Glarch": "baz"}
    assert nd != nd2


def test_copy_with_body() -> None:
    nd = NormalizedDict({"Foo": "bar"}, body="Glarch.")
    nd2 = nd.copy()
    assert nd is not nd2
    assert isinstance(nd2, NormalizedDict)
    assert dict(nd2) == {"Foo": "bar"}
    assert nd2.body == "Glarch."
    assert nd == nd2
    nd2.body = "quux"
    assert nd.body == "Glarch."
    assert nd2.body == "quux"
    assert nd != nd2


def test_neq_normalizers_empty() -> None:
    nd = NormalizedDict()
    nd2 = NormalizedDict(normalizer=lambda x: x)
    assert dict(nd) == dict(nd2) == {}
    assert nd != nd2


def test_neq_normalizers_nonempty() -> None:
    nd = NormalizedDict({"Foo": "bar"})
    nd2 = NormalizedDict({"Foo": "bar"}, normalizer=lambda x: x)
    assert dict(nd) == dict(nd2) == {"Foo": "bar"}
    assert nd != nd2


def normdash(s: str) -> str:
    return re.sub(r"[-_\s]+", "-", s.lower())


def identity(s: str) -> str:
    return s


@pytest.mark.parametrize(
    "data",
    [
        {},
        {"Foo": "Bar"},
        {"foo": "Bar"},
        {"FOO_BAR": "BAZ"},
    ],
)
@pytest.mark.parametrize("normalizer", [None, lower, normdash, identity])
@pytest.mark.parametrize("body", [None, "Glarch."])
def test_repr(
    data: dict[str, str],
    normalizer: Optional[Callable[[str], str]],
    body: Optional[str],
) -> None:
    nd = NormalizedDict(data, body=body, normalizer=normalizer)
    if normalizer is None:
        normalizer = lower
    assert repr(nd) == (
        f"headerparser.normdict.NormalizedDict({data!r},"
        f" normalizer={normalizer!r}, body={body!r})"
    )


# ===== headerparser-0.5.1/test/test_normdict_custom.py =====

import re
import pytest
from headerparser import NormalizedDict


def normdash(s: str) -> str:
    return re.sub(r"[-_\s]+", "-", s.lower())


def test_empty() -> None:
    nd = NormalizedDict(normalizer=normdash)
    assert dict(nd) == {}
    assert nd.body is None
    assert len(nd) == 0
    assert not bool(nd)
    assert nd.normalizer is normdash


def test_one() -> None:
    nd = NormalizedDict({"A Key": "bar"}, normalizer=normdash)
    assert dict(nd) == {"A Key": "bar"}
    assert nd.body is None
    assert len(nd) == 1
    assert bool(nd)
    assert nd.normalizer is normdash


def test_get_cases() -> None:
    nd = NormalizedDict({"A Key": "bar"}, normalizer=normdash)
    assert nd["A Key"] == "bar"
    assert nd["A Key"] == nd["a_key"] == nd["A-KEY"] == nd["A - key"]


def test_set() -> None:
    nd = NormalizedDict(normalizer=normdash)
    assert dict(nd) == {}
    nd["A Key"] = "bar"
    assert dict(nd) == {"A Key": "bar"}
    assert nd["A Key"] == "bar"
    assert nd["A Key"] == nd["a_key"] == nd["A-KEY"] == nd["A - key"]
    nd["A-Key"] = "quux"
    assert dict(nd) == {"A-Key": "quux"}
    assert nd["A Key"] == "quux"
    assert nd["A Key"] == nd["a_key"] == nd["A-KEY"] == nd["A - key"]


def test_del() -> None:
    nd = NormalizedDict(
        {"A Key": "bar", "Another-Key": "FOO"},
        normalizer=normdash,
    )
    del nd["A Key"]
    assert dict(nd) == {"Another-Key": "FOO"}
    del nd["ANOTHER_KEY"]
    assert dict(nd) == {}


def test_del_nexists() -> None:
    nd = NormalizedDict(
        {"A Key": "bar", "Another-Key": "FOO"},
        normalizer=normdash,
    )
    with pytest.raises(KeyError):
        del nd["AKey"]


def test_eq_empty() -> None:
    nd = NormalizedDict(normalizer=normdash)
    nd2 = NormalizedDict(normalizer=normdash)
    assert nd == nd2


def test_eq_nonempty() -> None:
    nd = NormalizedDict({"Foo": "bar"}, normalizer=normdash)
    nd2 = NormalizedDict({"Foo": "bar"}, normalizer=normdash)
    assert nd == nd2


def test_eq_cases() -> None:
    nd = NormalizedDict({"A Key": "bar"}, normalizer=normdash)
    nd2 = NormalizedDict({"a_key": "bar"}, normalizer=normdash)
    assert nd == nd2


def test_neq() -> None:
    assert NormalizedDict({"A Key": "A Value"}, normalizer=normdash) != NormalizedDict(
        {"A Key": "a_value"}, normalizer=normdash
    )


def test_normalized() -> None:
    nd = NormalizedDict({"A Key": "BAR"}, normalizer=normdash)
    nd2 = nd.normalized()
    assert isinstance(nd2, NormalizedDict)
    assert dict(nd2) == {"a-key": "BAR"}
    assert nd2.body is None
    assert nd2.normalizer is normdash
    assert nd == nd2


def test_normalized_with_body() -> None:
    nd = NormalizedDict({"A Key": "BAR"}, body="Foo Baz", normalizer=normdash)
    nd2 = nd.normalized()
    assert isinstance(nd2, NormalizedDict)
    assert dict(nd2) == {"a-key": "BAR"}
    assert nd2.body == "Foo Baz"
    assert nd2.normalizer is normdash
    assert nd == nd2


def test_normalized_dict() -> None:
    nd = NormalizedDict({"A Key": "BAR"}, normalizer=normdash)
    nd2 = nd.normalized_dict()
    assert isinstance(nd2, dict)
    assert nd2 == {"a-key": "BAR"}


def test_eq_dict() -> None:
    nd = NormalizedDict({"A Key": "BAR"}, normalizer=normdash)
    assert nd == {"A Key": "BAR"}
    assert {"A Key": "BAR"} == nd
    assert nd == {"A_KEY": "BAR"}
    assert {"A_KEY": "BAR"} == nd
    assert nd == {"a-key": "BAR"}
    assert {"a-key": "BAR"} == nd
    assert nd != {"A Key": "bar"}
    assert {"A Key": "bar"} != nd


def test_body_neq_dict() -> None:
    nd = NormalizedDict({"A Key": "BAR"}, body="", normalizer=normdash)
    assert nd != {"A Key": "BAR"}
    assert {"A Key": "BAR"} != nd


def test_eq_body() -> None:
    nd = NormalizedDict({"A Key": "bar"}, body="", normalizer=normdash)
    nd2 = NormalizedDict({"a_KEY": "bar"}, body="", normalizer=normdash)
    assert nd == nd2


def test_neq_body() -> None:
    nd = NormalizedDict({"A Key": "bar"}, body="yes", normalizer=normdash)
    nd2 = NormalizedDict({"a_KEY": "bar"}, body="no", normalizer=normdash)
    assert nd != nd2


def test_init_list() -> None:
    nd = NormalizedDict(
        [("A Key", "bar"), ("Another-Key", "baz"), ("A_KEY", "quux")],
        normalizer=normdash,
    )
    assert dict(nd) == {"A_KEY": "quux", "Another-Key": "baz"}


def test_copy() -> None:
    nd = NormalizedDict({"A Key": "bar"}, normalizer=normdash)
    nd2 = nd.copy()
    assert isinstance(nd2, NormalizedDict)
    assert dict(nd2) == {"A Key": "bar"}
    assert nd2.body is None
    assert nd2.normalizer is normdash
    assert nd == nd2
    nd2["A Key"] = "gnusto"
    assert dict(nd) == {"A Key": "bar"}
    assert dict(nd2) == {"A Key": "gnusto"}
    assert nd != nd2
    nd2["a-key"] = "quux"
    assert dict(nd) == {"A Key": "bar"}
    assert dict(nd2) == {"a-key": "quux"}
    assert nd != nd2
    nd2["Another_Key"] = "baz"
    assert dict(nd) == {"A Key": "bar"}
    assert dict(nd2) == {"a-key": "quux", "Another_Key": "baz"}
    assert nd != nd2


def test_copy_with_body() -> None:
    nd = NormalizedDict({"A Key": "bar"}, body="Glarch.", normalizer=normdash)
    nd2 = nd.copy()
    assert isinstance(nd2, NormalizedDict)
    assert dict(nd2) == {"A Key": "bar"}
    assert nd2.body == "Glarch."
    assert nd2.normalizer is normdash
    assert nd == nd2
    nd2.body = "quux"
    assert nd.body == "Glarch."
    assert nd2.body == "quux"
    assert nd != nd2


# ===== headerparser-0.5.1/test/test_normdict_identity.py =====

import pytest
from headerparser import NormalizedDict


def identity(s: str) -> str:
    return s


def test_empty() -> None:
    nd = NormalizedDict(normalizer=identity)
    assert dict(nd) == {}
    assert nd.body is None
    assert len(nd) == 0
    assert not bool(nd)
    assert nd.normalizer is identity


def test_one() -> None:
    nd = NormalizedDict({"Foo": "bar"}, normalizer=identity)
    assert dict(nd) == {"Foo": "bar"}
    assert nd.body is None
    assert len(nd) == 1
    assert bool(nd)
    assert nd.normalizer is identity


def test_get_cases() -> None:
    nd = NormalizedDict({"Foo": "bar"}, normalizer=identity)
    assert nd["Foo"] == "bar"
    assert "foo" not in nd
    assert "FOO" not in nd
    assert "fOO" not in nd


def test_set() -> None:
    nd = NormalizedDict(normalizer=identity)
    assert dict(nd) == {}
    nd["Foo"] = "bar"
    assert dict(nd) == {"Foo": "bar"}
    assert len(nd) == 1
    assert nd["Foo"] == "bar"
    nd["fOO"] = "quux"
    assert dict(nd) == {"Foo": "bar", "fOO": "quux"}
    assert len(nd) == 2
    assert nd["Foo"] == "bar"
    assert nd["fOO"] == "quux"


def test_del() -> None:
    nd = NormalizedDict({"Foo": "bar", "fOO": "BAR"}, normalizer=identity)
    del nd["Foo"]
    assert dict(nd) == {"fOO": "BAR"}
    del nd["fOO"]
    assert dict(nd) == {}


def test_del_nexists() -> None:
    nd = NormalizedDict({"Foo": "bar", "Bar": "FOO"}, normalizer=identity)
    with pytest.raises(KeyError):
        del nd["fOO"]


def test_eq_empty() -> None:
    nd = NormalizedDict(normalizer=identity)
    nd2 = NormalizedDict(normalizer=identity)
    assert nd == nd2


def test_eq_nonempty() -> None:
    nd = NormalizedDict({"Foo": "bar"}, normalizer=identity)
    nd2 = NormalizedDict({"Foo": "bar"}, normalizer=identity)
    assert nd == nd2


def test_neq_cases() -> None:
    nd = NormalizedDict({"Foo": "bar"}, normalizer=identity)
    nd2 = NormalizedDict({"fOO": "bar"}, normalizer=identity)
    assert nd != nd2


def test_neq() -> None:
    assert NormalizedDict({"Foo": "bar"}, normalizer=identity) != NormalizedDict(
        {"Foo": "BAR"}, normalizer=identity
    )


def test_normalized() -> None:
    nd = NormalizedDict({"Foo": "BAR"}, normalizer=identity)
    nd2 = nd.normalized()
    assert isinstance(nd2, NormalizedDict)
    assert dict(nd2) == {"Foo": "BAR"}
    assert nd2.body is None
    assert nd2.normalizer is identity
    assert nd == nd2


def test_normalized_with_body() -> None:
    nd = NormalizedDict({"Foo": "BAR"}, body="Glarch.", normalizer=identity)
    nd2 = nd.normalized()
    assert isinstance(nd2, NormalizedDict)
    assert dict(nd2) == {"Foo": "BAR"}
    assert nd2.body == "Glarch."
    assert nd2.normalizer is identity
    assert nd == nd2


def test_normalized_dict() -> None:
    nd = NormalizedDict({"Foo": "BAR"}, normalizer=identity)
    nd2 = nd.normalized_dict()
    assert isinstance(nd2, dict)
    assert nd2 == {"Foo": "BAR"}


def test_eq_dict() -> None:
    nd = NormalizedDict({"Foo": "BAR"}, normalizer=identity)
    assert nd == {"Foo": "BAR"}
    assert {"Foo": "BAR"} == nd
    assert nd != {"FOO": "BAR"}
    assert {"FOO": "BAR"} != nd
    assert nd != {"foo": "BAR"}
    assert {"foo": "BAR"} != nd
    assert nd != {"Foo": "bar"}
    assert {"Foo": "bar"} != nd


def test_body_neq_dict() -> None:
    nd = NormalizedDict({"Foo": "BAR"}, normalizer=identity, body="")
    assert nd != {"Foo": "BAR"}
    assert {"Foo": "BAR"} != nd


def test_eq_body() -> None:
    nd = NormalizedDict({"Foo": "bar"}, normalizer=identity, body="")
    nd2 = NormalizedDict({"Foo": "bar"}, normalizer=identity, body="")
    assert nd == nd2


def test_neq_body() -> None:
    nd = NormalizedDict({"Foo": "bar"}, normalizer=identity, body="yes")
    nd2 = NormalizedDict({"Foo": "bar"}, normalizer=identity, body="no")
    assert nd != nd2


def test_init_list() -> None:
    nd = NormalizedDict(
        [("Foo", "bar"), ("Bar", "baz"), ("FOO", "quux")], normalizer=identity
    )
    assert dict(nd) == {"Foo": "bar", "FOO": "quux", "Bar": "baz"}


def test_copy() -> None:
    nd = NormalizedDict({"Foo": "bar"}, normalizer=identity)
    nd2 = nd.copy()
    assert isinstance(nd2, NormalizedDict)
    assert dict(nd2) == {"Foo": "bar"}
    assert nd2.body is None
    assert nd2.normalizer is identity
    assert nd == nd2
    nd2["Foo"] = "gnusto"
    assert dict(nd) == {"Foo": "bar"}
    assert dict(nd2) == {"Foo": "gnusto"}
    assert nd != nd2
    nd2["fOO"] = "quux"
    assert dict(nd) == {"Foo": "bar"}
    assert dict(nd2) == {"Foo": "gnusto", "fOO": "quux"}
    assert nd != nd2


def test_copy_with_body() -> None:
    nd = NormalizedDict({"Foo": "bar"}, body="Glarch.", normalizer=identity)
    nd2 = nd.copy()
    assert isinstance(nd2, NormalizedDict)
    assert dict(nd2) == {"Foo": "bar"}
    assert nd2.body == "Glarch."
    assert nd2.normalizer is identity
    assert nd == nd2
    nd2.body = "quux"
    assert nd.body == "Glarch."
    assert nd2.body == "quux"
    assert nd != nd2


# ===== headerparser-0.5.1/test/test_parser/test_parser.py =====

from typing import Any
import pytest
from pytest_mock import MockerFixture
import headerparser
from headerparser import HeaderParser


def test_simple() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("Foo: red\nBar: green\nBaz: blue\n")
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"}
    assert msg.body is None


def test_out_of_order() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("Foo: red\nBaz: blue\nBar: green\n")
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"}
    assert msg.body is None


def test_different_cases() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("Foo: red\nBAR: green\nbaz: blue\n")
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"}
    assert msg.body is None


def test_empty_body() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("Foo: red\nBar: green\nBaz: blue\n\n")
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"}
    assert msg.body == ""


def test_blank_body() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("Foo: red\nBar: green\nBaz: blue\n\n\n")
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"}
    assert msg.body == "\n"


def test_body() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("Foo: red\nBar: green\nBaz: blue\n\nThis is a test.")
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"}
    assert msg.body == "This is a test."


def test_headerlike_body() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse(
        "Foo: red\n"
        "Bar: green\n"
        "Baz: blue\n"
        "\n"
        "Foo: quux\n"
        "Bar: glarch\n"
        "Baz: cleesh\n"
    )
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"}
    assert msg.body == "Foo: quux\nBar: glarch\nBaz: cleesh\n"


def test_missing() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("Foo: red\nBar: green\n")
    assert dict(msg) == {"Foo": "red", "Bar": "green"}
    assert msg.body is None


def test_required() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz", required=True)
    msg = parser.parse("Foo: red\nBar: green\nBaz: blue\n")
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"}
    assert msg.body is None


def test_required_default() -> None:
    parser = HeaderParser()
    with pytest.raises(ValueError) as excinfo:
        parser.add_field("Foo", required=True, default="Why?")
    assert "required and default are mutually exclusive" in str(excinfo.value)


def test_required_none() -> None:
    parser = HeaderParser()
    parser.add_field("None", required=True, type=lambda _: None)
    msg = parser.parse("None: whatever")
    assert dict(msg) == {"None": None}
    assert msg.body is None


def test_missing_required() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz", required=True)
    with pytest.raises(headerparser.MissingFieldError) as excinfo:
        parser.parse("Foo: red\nBar: green\n")
    assert str(excinfo.value) == "Required header field 'Baz' is not present"
    assert excinfo.value.name == "Baz"


def test_present_default() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz", default=42)
    msg = parser.parse("Foo: red\nBar: green\nBaz: blue\n")
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"}
    assert msg.body is None


def test_missing_default() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz", default=42)
    msg = parser.parse("Foo: red\nBar: green\n")
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": 42}
    assert msg.body is None


def test_missing_None_default() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz", default=None)
    msg = parser.parse("Foo: red\nBar: green\n")
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": None}
    assert msg.body is None


def test_multiple() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", multiple=True)
    parser.add_field("Bar")
    msg = parser.parse("Foo: red\nFOO: magenta\nBar: green\nfoo : crimson\n")
    assert dict(msg) == {"Foo": ["red", "magenta", "crimson"], "Bar": "green"}
    assert msg.body is None


def test_one_multiple() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", multiple=True)
    parser.add_field("Bar")
    msg = parser.parse("Foo: red\nBar: green\n")
    assert dict(msg) == {"Foo": ["red"], "Bar": "green"}
    assert msg.body is None


def test_no_multiple() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", multiple=True)
    parser.add_field("Bar")
    msg = parser.parse("Bar: green\n")
    assert dict(msg) == {"Bar": "green"}
    assert msg.body is None


def test_bad_multiple() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", multiple=True)
    parser.add_field("Bar")
    with pytest.raises(headerparser.DuplicateFieldError) as excinfo:
        parser.parse("Foo: red\nFOO: magenta\nBar: green\nBar: lime\n")
    assert str(excinfo.value) == "Header field 'Bar' occurs more than once"
    assert excinfo.value.name == "Bar"


def test_default_multiple() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", multiple=True, default=42)
    parser.add_field("Bar")
    msg = parser.parse("Bar: green\n")
    assert dict(msg) == {"Foo": 42, "Bar": "green"}
    assert msg.body is None


def test_present_default_multiple() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", multiple=True, default=42)
    parser.add_field("Bar")
    msg = parser.parse("Foo: red\nBar: green\n")
    assert dict(msg) == {"Foo": ["red"], "Bar": "green"}
    assert msg.body is None


def test_present_default_many_multiple() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", multiple=True, default=42)
    parser.add_field("Bar")
    msg = parser.parse("Foo: red\nFOO: magenta\nBar: green\n")
    assert dict(msg) == {"Foo": ["red", "magenta"], "Bar": "green"}
    assert msg.body is None


def test_required_multiple() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", multiple=True, required=True)
    parser.add_field("Bar")
    msg = parser.parse("Foo: red\nBar: green\n")
    assert dict(msg) == {"Foo": ["red"], "Bar": "green"}
    assert msg.body is None


def test_required_many_multiple() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", multiple=True, required=True)
    parser.add_field("Bar")
    msg = parser.parse("Foo: red\nFOO: magenta\nBar: green\n")
    assert dict(msg) == {"Foo": ["red", "magenta"], "Bar": "green"}
    assert msg.body is None


def test_missing_required_multiple() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", multiple=True, required=True)
    parser.add_field("Bar")
    with pytest.raises(headerparser.MissingFieldError) as excinfo:
        parser.parse("Bar: green\n")
    assert str(excinfo.value) == "Required header field 'Foo' is not present"
    assert excinfo.value.name == "Foo"


def test_unknown() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    with pytest.raises(headerparser.UnknownFieldError) as excinfo:
        parser.parse("Foo: red\nBar: green\nQuux: blue\n")
    assert str(excinfo.value) == "Unknown header field 'Quux'"
    assert excinfo.value.name == "Quux"


def test_empty_input() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("")
    assert dict(msg) == {}
    assert msg.body is None


def test_trailing_whitespace() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("Foo: red \nBar: green\n (ish) \nBaz: blue\n ")
    assert dict(msg) == {
        "Foo": "red ",
        "Bar": "green\n (ish) ",
        "Baz": "blue\n ",
    }
    assert msg.body is None


def test_redefinition() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    with pytest.raises(ValueError) as excinfo:
        parser.add_field("FOO")
    assert "field defined more than once" in str(excinfo.value)


def test_many_missing_required() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", required=True)
    parser.add_field("Bar", required=True)
    parser.add_field("Baz", required=True)
    with pytest.raises(headerparser.MissingFieldError) as excinfo:
        parser.parse("")
    assert excinfo.value.name in ("Foo", "Bar", "Baz")


def test_unfold() -> None:
    parser = HeaderParser()
    parser.add_field("Folded")
    parser.add_field("Unfolded", unfold=True)
    msg = parser.parse(
        "Folded: This is\n"
        " test\n"
        "\ttext.\n"
        "UnFolded: This is\n"
        " test\n"
        "\ttext.\n"
    )
    assert dict(msg) == {
        "Folded": "This is\n test\n\ttext.",
        "Unfolded": "This is test text.",
    }
    assert msg.body is None


def test_space_in_name() -> None:
    parser =
HeaderParser() parser.add_field("Key Name") parser.add_field("Bar") parser.add_field("Baz") msg = parser.parse("key name: red\nBar: green\nBaz: blue\n") assert dict(msg) == {"Key Name": "red", "Bar": "green", "Baz": "blue"} assert msg.body is None def test_scan_opts_passed(mocker: MockerFixture) -> None: m = mocker.patch("headerparser.scanner.scan", wraps=headerparser.scanner.scan) parser = HeaderParser( separator_regex=r"\s*:\s*", skip_leading_newlines=True, ) parser.add_field("Foo") parser.add_field("Bar") parser.add_field("Baz") parser.parse("Foo: red\nBar: green\nBaz: blue\n") m.assert_called_with( "Foo: red\nBar: green\nBaz: blue\n", separator_regex=r"\s*:\s*", skip_leading_newlines=True, ) def test_body_twice() -> None: parser = HeaderParser() parser.add_field("Foo") parser.add_field("Bar") parser.add_field("Baz") with pytest.raises(ValueError) as excinfo: parser.parse_stream( [ ("Foo", "red"), ("Bar", "green"), ("Baz", "blue"), (None, "Body #1"), (None, "Body #2"), ] ) assert str(excinfo.value) == "Body appears twice in input" @pytest.mark.parametrize("name", [42, None, 3.14, True, ["B", "a", "r"]]) def test_nonstr_field_name(name: Any) -> None: parser = HeaderParser() parser.add_field("Foo") with pytest.raises(TypeError) as excinfo: parser.add_field(name) assert str(excinfo.value) == "field names must be strings" headerparser-0.5.1/test/test_parser/test_parser_action.py000066400000000000000000000224161450730324400237450ustar00rootroot00000000000000from unittest.mock import ANY, Mock import pytest from pytest_mock import MockerFixture import headerparser from headerparser import BOOL, HeaderParser, NormalizedDict @pytest.fixture def use_as_body() -> Mock: def _use(nd: NormalizedDict, _name: str, value: str) -> None: nd.body = value return Mock(side_effect=_use) def test_action(mocker: MockerFixture) -> None: stub = mocker.stub() parser = HeaderParser() parser.add_field("Foo", action=stub) parser.add_field("Bar") parser.add_field("Baz") msg = 
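The duplicate- and multiple-field behavior checked in the tests above (case-insensitive name matching, `multiple=True` collecting every occurrence into a list, and an error otherwise) can be modeled with the standard library alone. This is an illustrative sketch of those semantics only, not headerparser's implementation; the name `parse_headers` is a hypothetical:

```python
def parse_headers(text: str, multiple: frozenset = frozenset()) -> dict:
    """Collect "Name: value" lines case-insensitively (illustrative sketch)."""
    result: dict = {}
    canonical: dict = {}  # lowercased name -> first-seen spelling
    for line in text.splitlines():
        name, _, value = line.partition(":")
        key = canonical.setdefault(name.strip().lower(), name.strip())
        if key.lower() in multiple:
            # Fields declared "multiple" accumulate every occurrence.
            result.setdefault(key, []).append(value.strip())
        elif key in result:
            # Everything else may appear at most once.
            raise ValueError(f"Header field {key!r} occurs more than once")
        else:
            result[key] = value.strip()
    return result
```

For example, `parse_headers("Foo: red\nFOO: magenta\nfoo: crimson\n", multiple=frozenset({"foo"}))` yields `{"Foo": ["red", "magenta", "crimson"]}`, mirroring the shape that `test_multiple` above asserts.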
def test_action(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field("Foo", action=stub)
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("Foo: red\nBar: green\nBaz: blue\n")
    assert dict(msg) == {"Bar": "green", "Baz": "blue"}
    assert msg.body is None
    stub.assert_called_once_with(msg, "Foo", "red")


def test_action_missing(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field("Foo", action=stub)
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("Bar: green\nBaz: blue\n")
    assert dict(msg) == {"Bar": "green", "Baz": "blue"}
    assert msg.body is None
    assert not stub.called


def test_action_type(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field("Foo", action=stub, type=BOOL)
    parser.add_field("Bar")
    msg = parser.parse("Foo: yes\nBar: green\n")
    assert dict(msg) == {"Bar": "green"}
    assert msg.body is None
    stub.assert_called_once_with(msg, "Foo", True)


def test_action_type_error(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field("Foo", action=stub, type=BOOL)
    parser.add_field("Bar")
    with pytest.raises(headerparser.FieldTypeError):
        parser.parse("Foo: maybe\nBar: green\n")
    assert not stub.called


def test_action_required(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field("Foo", action=stub, required=True)
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("Foo: red\nBar: green\nBaz: blue\n")
    assert dict(msg) == {"Bar": "green", "Baz": "blue"}
    assert msg.body is None
    stub.assert_called_once_with(msg, "Foo", "red")


def test_action_required_missing(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field("Foo", action=stub, required=True)
    parser.add_field("Bar")
    parser.add_field("Baz")
    with pytest.raises(headerparser.MissingFieldError):
        parser.parse("Bar: green\nBaz: blue\n")
    assert not stub.called


def test_action_choices(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field("Foo", action=stub, choices=["red", "green", "blue"])
    parser.add_field("Bar")
    msg = parser.parse("Foo: red\nBar: green\n")
    assert dict(msg) == {"Bar": "green"}
    assert msg.body is None
    stub.assert_called_once_with(msg, "Foo", "red")


def test_action_bad_choice(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field("Foo", action=stub, choices=["red", "green", "blue"])
    parser.add_field("Bar")
    with pytest.raises(headerparser.InvalidChoiceError):
        parser.parse("Foo: taupe\nBar: green\n")
    assert not stub.called


def test_action_unfold(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field("Foo", action=stub, unfold=True)
    parser.add_field("Bar")
    msg = parser.parse("Foo: folded\n text \nBar: green\n")
    assert dict(msg) == {"Bar": "green"}
    assert msg.body is None
    stub.assert_called_once_with(msg, "Foo", "folded text")


def test_action_no_unfold(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field("Foo", action=stub)
    parser.add_field("Bar")
    msg = parser.parse("Foo: folded\n text \nBar: green\n")
    assert dict(msg) == {"Bar": "green"}
    assert msg.body is None
    stub.assert_called_once_with(msg, "Foo", "folded\n text ")


def test_action_default(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field("Foo", action=stub, default="orange")
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("Foo: red\nBar: green\nBaz: blue\n")
    assert dict(msg) == {"Bar": "green", "Baz": "blue"}
    assert msg.body is None
    stub.assert_called_once_with(msg, "Foo", "red")


def test_action_default_missing(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field("Foo", action=stub, default="orange")
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("Bar: green\nBaz: blue\n")
    assert dict(msg) == {"Foo": "orange", "Bar": "green", "Baz": "blue"}
    assert msg.body is None
    assert not stub.called


def test_action_different_case(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field("Foo", action=stub)
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("FOO: red\nBar: green\nBaz: blue\n")
    assert dict(msg) == {"Bar": "green", "Baz": "blue"}
    assert msg.body is None
    stub.assert_called_once_with(msg, "Foo", "red")


def test_action_multiname(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field("Foo", "Quux", action=stub)
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("quux: red\nBar: green\nBaz: blue\n")
    assert dict(msg) == {"Bar": "green", "Baz": "blue"}
    assert msg.body is None
    stub.assert_called_once_with(msg, "Foo", "red")


def test_action_multiple(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field("Foo", action=stub, multiple=True)
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("Foo: red\nBar: green\nFOO: purple\nBaz: blue\nfoo: orange\n")
    assert dict(msg) == {"Bar": "green", "Baz": "blue"}
    assert msg.body is None
    assert stub.call_args_list == [
        mocker.call(msg, "Foo", "red"),
        mocker.call(msg, "Foo", "purple"),
        mocker.call(msg, "Foo", "orange"),
    ]


def test_action_dest(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    with pytest.raises(ValueError) as excinfo:
        parser.add_field("Foo", action=stub, dest="bar")
    assert "`action` and `dest` are mutually exclusive" in str(excinfo.value)
    assert not stub.called


def test_action_normalized_dest(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    with pytest.raises(ValueError) as excinfo:
        parser.add_field("Foo", action=stub, dest="foo")
    assert "`action` and `dest` are mutually exclusive" in str(excinfo.value)
    assert not stub.called


def test_action_additional(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_additional(action=stub)
    msg = parser.parse("Bar: green\nFoo: red\nBaz: blue\n")
    assert dict(msg) == {"Foo": "red"}
    assert msg.body is None
    assert stub.call_args_list == [
        mocker.call(msg, "Bar", "green"),
        mocker.call(msg, "Baz", "blue"),
    ]


def test_action_multiple_additional(mocker: MockerFixture) -> None:
    stub = mocker.stub()
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_additional(action=stub, multiple=True)
    msg = parser.parse("Bar: green\nFoo: red\nBaz: blue\nbaz: mauve\nBAR: taupe\n")
    assert dict(msg) == {"Foo": "red"}
    assert msg.body is None
    assert stub.call_args_list == [
        mocker.call(msg, "Bar", "green"),
        mocker.call(msg, "Baz", "blue"),
        mocker.call(msg, "baz", "mauve"),
        mocker.call(msg, "BAR", "taupe"),
    ]


@pytest.mark.parametrize("body", [True, None])
def test_action_set_body_overwritten(body: bool, use_as_body: Mock) -> None:
    parser = HeaderParser(body=body)
    parser.add_field("Foo", action=use_as_body)
    parser.add_field("Bar")
    msg = parser.parse("Foo: red\nBar: green\n\nThis is the body.\n")
    assert dict(msg) == {"Bar": "green"}
    assert msg.body == "This is the body.\n"
    use_as_body.assert_called_once_with(msg, "Foo", "red")


def test_action_set_body_forbidden(use_as_body: Mock) -> None:
    parser = HeaderParser(body=False)
    parser.add_field("Foo", action=use_as_body)
    parser.add_field("Bar")
    with pytest.raises(headerparser.BodyNotAllowedError):
        parser.parse("Foo: red\nBar: green\n\nThis is the body.\n")
    use_as_body.assert_called_once_with(ANY, "Foo", "red")


@pytest.mark.parametrize("body", [False, None])
def test_action_set_body(body: bool, use_as_body: Mock) -> None:
    parser = HeaderParser(body=body)
    parser.add_field("Foo", action=use_as_body)
    parser.add_field("Bar")
    msg = parser.parse("Foo: red\nBar: green\n")
    assert dict(msg) == {"Bar": "green"}
    assert msg.body == "red"
    use_as_body.assert_called_once_with(msg, "Foo", "red")


def test_action_set_body_missing(use_as_body: Mock) -> None:
    parser = HeaderParser(body=True)
    parser.add_field("Foo", action=use_as_body)
    parser.add_field("Bar")
    with pytest.raises(headerparser.MissingBodyError):
        parser.parse("Foo: red\nBar: green\n")
    use_as_body.assert_called_once_with(ANY, "Foo", "red")


# File: headerparser-0.5.1/test/test_parser/test_parser_additional.py

import pytest

import headerparser
from headerparser import HeaderParser


def test_additional() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_additional()
    msg = parser.parse("Foo: red\nBar: green\nBaz: blue\n")
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"}
    assert msg.body is None


def test_many_additional() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_additional()
    msg = parser.parse(
        "Foo: red\nBar: green\nBaz: blue\nQUUX: purple\nglarch: orange\n"
    )
    assert dict(msg) == {
        "Foo": "red",
        "Bar": "green",
        "Baz": "blue",
        "QUUX": "purple",
        "glarch": "orange",
    }
    assert msg.body is None


def test_intermixed_additional() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_additional()
    msg = parser.parse(
        "QUUX: purple\nBar: green\nglarch: orange\nFoo: red\nBaz: blue\n"
    )
    assert dict(msg) == {
        "Foo": "red",
        "Bar": "green",
        "Baz": "blue",
        "QUUX": "purple",
        "glarch": "orange",
    }
    assert msg.body is None


def test_additional_only() -> None:
    parser = HeaderParser()
    parser.add_additional()
    msg = parser.parse("Foo: red\nBar: green\nBaz: blue\n")
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"}
    assert msg.body is None


def test_dest_additional() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", dest="dest")
    parser.add_field("Bar")
    with pytest.raises(ValueError) as excinfo:
        parser.add_additional()
    assert "add_additional and `dest` are mutually exclusive" in str(excinfo.value)
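Many of the cases in this file come down to one routing rule: a name registered with `add_field` is always accepted, any other name is accepted only when `add_additional` is enabled, and unknown names otherwise raise. A minimal stdlib sketch of that rule follows; `route_field` is a hypothetical name, not part of headerparser:

```python
def route_field(name: str, known: frozenset, allow_additional: bool) -> str:
    """Classify a header name the way these tests describe (sketch only)."""
    # Registered names match case-insensitively.
    if name.lower() in {k.lower() for k in known}:
        return "known"
    # Unregistered names are fine only when "additional" handling is on.
    if allow_additional:
        return "additional"
    raise KeyError(f"Unknown header field {name!r}")
```

With `allow_additional=False` this reproduces the unknown-field failure shape these tests assert on.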
def test_additional_dest() -> None:
    parser = HeaderParser()
    parser.add_additional()
    parser.add_field("Foo")
    with pytest.raises(ValueError) as excinfo:
        parser.add_field("Bar", dest="dest")
    assert "add_additional and `dest` are mutually exclusive" in str(excinfo.value)


def test_additional_bad_named_multiple() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_additional()
    with pytest.raises(headerparser.DuplicateFieldError) as excinfo:
        parser.parse("Foo: red\nFOO: magenta\nBar: green\n")
    assert str(excinfo.value) == "Header field 'Foo' occurs more than once"
    assert excinfo.value.name == "Foo"


def test_additional_named_multiple() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", multiple=True)
    parser.add_additional()
    msg = parser.parse("Foo: red\nFOO: magenta\nBar: green\n")
    assert dict(msg) == {"Foo": ["red", "magenta"], "Bar": "green"}
    assert msg.body is None


def test_additional_bad_multiple() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_additional()
    with pytest.raises(headerparser.DuplicateFieldError) as excinfo:
        parser.parse("Foo: red\nBar: green\nBar: lime\n")
    assert str(excinfo.value) == "Header field 'Bar' occurs more than once"
    assert excinfo.value.name == "Bar"


def test_additional_bad_multiple_cases() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_additional()
    with pytest.raises(headerparser.DuplicateFieldError) as excinfo:
        parser.parse("Foo: red\nBar: green\nBAR: lime\n")
    assert str(excinfo.value) == "Header field 'BAR' occurs more than once"
    assert excinfo.value.name == "BAR"


def test_multiple_additional() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_additional(multiple=True)
    msg = parser.parse("Foo: red\nBar: green\nBAR: lime\n")
    assert dict(msg) == {"Foo": "red", "Bar": ["green", "lime"]}
    assert msg.body is None


def test_one_multiple_additional() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_additional(multiple=True)
    msg = parser.parse("Foo: red\nBAR: lime\n")
    assert dict(msg) == {"Foo": "red", "BAR": ["lime"]}
    assert msg.body is None


def test_multiple_additional_bad_named_multiple() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_additional(multiple=True)
    with pytest.raises(headerparser.DuplicateFieldError) as excinfo:
        parser.parse("Foo: red\nBar: green\nBaz: blue\nFOO: magenta\n")
    assert str(excinfo.value) == "Header field 'Foo' occurs more than once"
    assert excinfo.value.name == "Foo"


def test_additional_missing_named() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_additional()
    msg = parser.parse("Baz: blue\nQUUX: purple\nglarch: orange\n")
    assert dict(msg) == {"Baz": "blue", "QUUX": "purple", "glarch": "orange"}
    assert msg.body is None


def test_additional_missing_required_named() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", required=True)
    parser.add_field("Bar")
    parser.add_additional()
    with pytest.raises(headerparser.MissingFieldError) as excinfo:
        parser.parse("Baz: blue\nQUUX: purple\nglarch: orange\n")
    assert str(excinfo.value) == "Required header field 'Foo' is not present"
    assert excinfo.value.name == "Foo"


def test_missing_additional() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_additional()
    msg = parser.parse("Foo: red\nBar: green\n")
    assert dict(msg) == {"Foo": "red", "Bar": "green"}
    assert msg.body is None


def test_additional_type() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_additional(type=int)
    msg = parser.parse("Foo: 1\nBar: 2\nBaz: 3\n")
    assert dict(msg) == {"Foo": "1", "Bar": "2", "Baz": 3}
    assert msg.body is None


def test_additional_bad_type() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_additional(type=int)
    with pytest.raises(headerparser.FieldTypeError) as excinfo:
        parser.parse("Foo: 1\nBar: 2\nBaz: three\n")
    assert str(excinfo.value) == (
        "Error while parsing 'Baz': 'three': ValueError: "
        + str(excinfo.value.exc_value)
    )
    assert excinfo.value.name == "Baz"
    assert excinfo.value.value == "three"
    assert isinstance(excinfo.value.exc_value, ValueError)


def test_additional_choices() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_additional(choices=["red", "green", "blue"])
    msg = parser.parse("Foo: mauve\nBar: red\nBaz: green\nQuux: blue\n")
    assert dict(msg) == {
        "Foo": "mauve",
        "Bar": "red",
        "Baz": "green",
        "Quux": "blue",
    }
    assert msg.body is None


def test_additional_bad_choices() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_additional(choices=["red", "green", "blue"])
    with pytest.raises(headerparser.InvalidChoiceError) as excinfo:
        parser.parse("Foo: mauve\nBar: red\nBaz: green\nQuux: taupe\n")
    assert str(excinfo.value) == "'taupe' is not a valid choice for 'Quux'"
    assert excinfo.value.name == "Quux"
    assert excinfo.value.value == "taupe"


def test_additional_unfold() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_additional(unfold=True)
    msg = parser.parse(
        "Foo: This is\n"
        " test\n"
        " text.\n"
        "Bar: This is\n"
        " test\n"
        " text.\n"
    )
    assert dict(msg) == {
        "Foo": "This is\n test\n text.",
        "Bar": "This is test text.",
    }
    assert msg.body is None


def test_bad_additional_dest() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    with pytest.raises(TypeError):
        parser.add_additional(dest="somewhere")


def test_bad_additional_required() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    with pytest.raises(TypeError):
        parser.add_additional(required=True)


def test_bad_additional_default() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    with pytest.raises(TypeError):
        parser.add_additional(default="")


def test_additional_multiname() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", "Oof")
    parser.add_field("Bar", "Baz")
    parser.add_additional()
    msg = parser.parse("Oof: red\nBar: green\nQuux: blue\n")
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Quux": "blue"}
    assert msg.body is None


def test_additional_off() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_additional(False)
    with pytest.raises(headerparser.UnknownFieldError) as excinfo:
        parser.parse("Foo: red\nBar: green\nBaz: blue\n")
    assert str(excinfo.value) == "Unknown header field 'Baz'"
    assert excinfo.value.name == "Baz"


# File: headerparser-0.5.1/test/test_parser/test_parser_body.py

import pytest

import headerparser
from headerparser import HeaderParser


def test_require_body() -> None:
    parser = HeaderParser(body=True)
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse(
        "Foo: red\n"
        "Bar: green\n"
        "Baz: blue\n"
        "\n"
        "This space intentionally left nonblank.\n"
    )
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"}
    assert msg.body == "This space intentionally left nonblank.\n"


def test_empty_required_body() -> None:
    parser = HeaderParser(body=True)
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("Foo: red\nBar: green\nBaz: blue\n\n")
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"}
    assert msg.body == ""


def test_missing_required_body() -> None:
    parser = HeaderParser(body=True)
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    with pytest.raises(headerparser.MissingBodyError) as excinfo:
        parser.parse("Foo: red\nBar: green\nBaz: blue\n")
    assert str(excinfo.value) == "Message body is required but missing"


def test_forbid_body() -> None:
    parser = HeaderParser(body=False)
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("Foo: red\nBar: green\nBaz: blue\n")
    assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"}
    assert msg.body is None
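The body tests in this file all reduce to where the first blank line falls: no blank line means no body at all, while a blank line followed by nothing means an empty body. Ignoring edge cases such as a leading blank line, the split can be sketched with a single `partition` call. This is an illustrative model, not headerparser's implementation, and `split_body` is a hypothetical name:

```python
from typing import Optional, Tuple


def split_body(text: str) -> Tuple[str, Optional[str]]:
    """Split header text from its body at the first blank line (sketch)."""
    head, sep, body = text.partition("\n\n")
    # No blank line anywhere means there is no body at all (None), which is
    # distinct from a blank line with nothing after it (empty-string body).
    return head, (body if sep else None)
```

`split_body("Foo: red\n")` yields a `None` body while `split_body("Foo: red\n\n")` yields an empty one — the same distinction these tests draw between a missing and an empty body.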
def test_empty_forbidden_body() -> None:
    parser = HeaderParser(body=False)
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    with pytest.raises(headerparser.BodyNotAllowedError) as excinfo:
        parser.parse("Foo: red\nBar: green\nBaz: blue\n\n")
    assert str(excinfo.value) == "Message body is present but not allowed"


def test_present_forbidden_body() -> None:
    parser = HeaderParser(body=False)
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    with pytest.raises(headerparser.BodyNotAllowedError) as excinfo:
        parser.parse(
            "Foo: red\n"
            "Bar: green\n"
            "Baz: blue\n"
            "\n"
            "This space intentionally left nonblank.\n"
        )
    assert str(excinfo.value) == "Message body is present but not allowed"


def test_headers_as_required_body() -> None:
    parser = HeaderParser(body=True)
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    msg = parser.parse("\nFoo: red\nBar: green\nBaz: blue\n")
    assert dict(msg) == {}
    assert msg.body == "Foo: red\nBar: green\nBaz: blue\n"


def test_headers_as_forbidden_body() -> None:
    parser = HeaderParser(body=False)
    parser.add_field("Foo")
    parser.add_field("Bar")
    parser.add_field("Baz")
    with pytest.raises(headerparser.BodyNotAllowedError) as excinfo:
        parser.parse("\nFoo: red\nBar: green\nBaz: blue\n")
    assert str(excinfo.value) == "Message body is present but not allowed"


def test_required_body_only() -> None:
    parser = HeaderParser(body=True)
    msg = parser.parse("\nFoo: red\nBar: green\nBaz: blue\n")
    assert dict(msg) == {}
    assert msg.body == "Foo: red\nBar: green\nBaz: blue\n"


def test_body_as_unknown_headers() -> None:
    parser = HeaderParser(body=True)
    with pytest.raises(headerparser.UnknownFieldError) as excinfo:
        parser.parse("Foo: red\nBar: green\nBaz: blue\n")
    assert str(excinfo.value) == "Unknown header field 'Foo'"
    assert excinfo.value.name == "Foo"


def test_require_body_all_empty() -> None:
    parser = HeaderParser(body=True)
    msg = parser.parse("\n")
    assert dict(msg) == {}
    assert msg.body == ""


def test_forbid_body_all_empty() -> None:
    parser = HeaderParser(body=False)
    with pytest.raises(headerparser.BodyNotAllowedError) as excinfo:
        parser.parse("\n\n")
    assert str(excinfo.value) == "Message body is present but not allowed"


# File: headerparser-0.5.1/test/test_parser/test_parser_choices.py

import pytest

from headerparser import BOOL, HeaderParser, InvalidChoiceError


def test_choices() -> None:
    parser = HeaderParser()
    parser.add_field("Color", choices=["red", "green", "blue"])
    msg = parser.parse("Color: green")
    assert dict(msg) == {"Color": "green"}
    assert msg.body is None


def test_invalid_choice() -> None:
    parser = HeaderParser()
    parser.add_field("Color", choices=["red", "green", "blue"])
    with pytest.raises(InvalidChoiceError) as excinfo:
        parser.parse("Color: taupe")
    assert str(excinfo.value) == "'taupe' is not a valid choice for 'Color'"
    assert excinfo.value.name == "Color"
    assert excinfo.value.value == "taupe"


def test_no_choice() -> None:
    parser = HeaderParser()
    with pytest.raises(ValueError) as excinfo:
        parser.add_field("Unicorn", choices=[])
    assert str(excinfo.value) == "empty list supplied for choices"


def test_default_choice() -> None:
    parser = HeaderParser()
    parser.add_field("Color", choices=["red", "green", "blue"], default="beige")
    msg = parser.parse("Color: blue")
    assert dict(msg) == {"Color": "blue"}
    assert msg.body is None


def test_missing_default_choice() -> None:
    parser = HeaderParser()
    parser.add_field("Color", choices=["red", "green", "blue"], default="beige")
    msg = parser.parse("")
    assert dict(msg) == {"Color": "beige"}
    assert msg.body is None


def test_unfold_multiple_choices() -> None:
    parser = HeaderParser()
    parser.add_field(
        "Corner",
        choices=["upper left", "upper right", "lower left", "lower right"],
        unfold=True,
        multiple=True,
    )
    msg = parser.parse("Corner: lower right\nCorner: upper\n left\n")
    assert dict(msg) == {"Corner": ["lower right", "upper left"]}
    assert msg.body is None
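The `unfold=True` cases here (and in the other test files) all expect the same transformation: strip the value, then collapse each newline plus its surrounding indentation into a single space. A one-line regex sketch of that behavior, illustrative only and not the library's code:

```python
import re


def unfold(value: str) -> str:
    """Join folded continuation lines into one physical line (sketch)."""
    # Collapse a newline plus any surrounding spaces/tabs to a single space.
    return re.sub(r"[ \t]*\n[ \t]*", " ", value.strip())
```

For example, `unfold("upper\n right")` gives `"upper right"`, matching the unfolded `Corner` values these tests assert against their `choices` lists.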
def test_unfold_indented_choices() -> None:
    parser = HeaderParser()
    parser.add_field(
        "Corner",
        choices=["upper left", "upper right", "lower left", "lower right"],
        unfold=True,
    )
    msg = parser.parse("Corner: upper\n right")
    assert dict(msg) == {"Corner": "upper right"}
    assert msg.body is None


def test_lower_choices() -> None:
    parser = HeaderParser()
    parser.add_field("Color", choices=["red", "green", "blue"], type=str.lower)
    msg = parser.parse("Color: RED")
    assert dict(msg) == {"Color": "red"}
    assert msg.body is None


def test_lower_invalid_choice() -> None:
    parser = HeaderParser()
    parser.add_field("Color", choices=["red", "green", "blue"], type=str.lower)
    with pytest.raises(InvalidChoiceError) as excinfo:
        parser.parse("Color: MAUVE")
    assert str(excinfo.value) == "'mauve' is not a valid choice for 'Color'"
    assert excinfo.value.name == "Color"
    assert excinfo.value.value == "mauve"


def test_bool_choices() -> None:
    parser = HeaderParser()
    parser.add_field("Boolean", type=BOOL, choices=(False, "foo"))
    msg = parser.parse("Boolean: N\n")
    assert dict(msg) == {"Boolean": False}
    assert msg.body is None


def test_bool_choices_invalid_choice() -> None:
    parser = HeaderParser()
    parser.add_field("Boolean", type=BOOL, choices=(False, "foo"))
    with pytest.raises(InvalidChoiceError) as excinfo:
        parser.parse("BOOLEAN: Y\n")
    assert str(excinfo.value) == "True is not a valid choice for 'Boolean'"
    assert excinfo.value.name == "Boolean"
    assert excinfo.value.value is True


# File: headerparser-0.5.1/test/test_parser/test_parser_dest.py

import pytest

import headerparser
from headerparser import HeaderParser


def test_dest() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", dest="notfoo")
    parser.add_field("Bar", dest="notbar")
    parser.add_field("Baz")
    msg = parser.parse("Foo: red\nBar: green\nBaz: blue\n")
    assert dict(msg) == {"notfoo": "red", "notbar": "green", "Baz": "blue"}
    assert msg.body is None


def test_dest_conflict() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", dest="quux")
    with pytest.raises(ValueError) as excinfo:
        parser.add_field("Bar", dest="QUUX")
    assert "destination defined more than once" in str(excinfo.value)


def test_header_vs_eq_dest() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    with pytest.raises(ValueError) as excinfo:
        parser.add_field("Bar", dest="Foo")
    assert "destination defined more than once" in str(excinfo.value)


def test_header_vs_like_dest() -> None:
    parser = HeaderParser()
    parser.add_field("Foo")
    with pytest.raises(ValueError) as excinfo:
        parser.add_field("Bar", dest="foo")
    assert "destination defined more than once" in str(excinfo.value)


def test_dest_vs_eq_header() -> None:
    parser = HeaderParser()
    parser.add_field("Bar", dest="Foo")
    with pytest.raises(ValueError) as excinfo:
        parser.add_field("Foo")
    assert "destination defined more than once" in str(excinfo.value)


def test_header_eq_dest() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", dest="Foo")
    msg = parser.parse("foo: red")
    assert dict(msg) == {"Foo": "red"}
    assert msg.body is None


def test_header_like_dest() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", dest="FOO")
    msg = parser.parse("foo: red")
    assert dict(msg) == {"FOO": "red"}
    assert msg.body is None


def test_header_missing_default_dest() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", dest="FOO", default=42)
    msg = parser.parse("")
    assert dict(msg) == {"FOO": 42}
    assert msg.body is None


def test_switched_dest() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", dest="Bar")
    parser.add_field("Bar", dest="Foo")
    msg = parser.parse("Foo: foo\nBar: bar\n")
    assert dict(msg) == {"Bar": "foo", "Foo": "bar"}
    assert msg.body is None


def test_one_missing_required_switched_dest() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", dest="Bar", required=True)
    parser.add_field("Bar", dest="Foo", required=True)
    with pytest.raises(headerparser.MissingFieldError) as excinfo:
        parser.parse("Foo: foo\n")
    assert str(excinfo.value) == "Required header field 'Bar' is not present"
    assert excinfo.value.name == "Bar"


def test_missing_default_switched_dest() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", dest="Bar", default=42)
    parser.add_field("Bar", dest="Foo", default="17")
    msg = parser.parse("")
    assert dict(msg) == {"Bar": 42, "Foo": "17"}
    assert msg.body is None


def test_one_missing_default_switched_dest() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", dest="Bar", default=42)
    parser.add_field("Bar", dest="Foo", default="17")
    msg = parser.parse("Foo: 42")
    assert dict(msg) == {"Bar": "42", "Foo": "17"}
    assert msg.body is None


def test_dest_multiple() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", dest="list", multiple=True)
    msg = parser.parse("Foo: red\nFoo: green\nFoo: blue")
    assert dict(msg) == {"list": ["red", "green", "blue"]}
    assert msg.body is None


def test_dest_as_unknown_header() -> None:
    parser = HeaderParser()
    parser.add_field("Foo", dest="Bar")
    with pytest.raises(headerparser.UnknownFieldError) as excinfo:
        parser.parse("Bar: not a header")
    assert str(excinfo.value) == "Unknown header field 'Bar'"
    assert excinfo.value.name == "Bar"


# File: headerparser-0.5.1/test/test_parser/test_parser_eq.py

from typing import Any

import pytest

from headerparser import HeaderParser


def test_eq_empty() -> None:
    p1 = HeaderParser()
    p2 = HeaderParser()
    assert p1 == p2


@pytest.mark.parametrize("other", [None, False, True, 42, "", [], {}])
def test_neq_empty_other(other: Any) -> None:
    p = HeaderParser()
    assert p != other
    assert other != p


def test_eq_one_field() -> None:
    p1 = HeaderParser()
    p1.add_field("Foo")
    p2 = HeaderParser()
    p2.add_field("Foo")
    assert p1 == p2


def test_neq_empty_one_field() -> None:
    p1 = HeaderParser()
    p2 = HeaderParser()
    p2.add_field("Foo")
    assert p1 != p2


def test_eq_two_fields() -> None:
    p1 = HeaderParser()
    p1.add_field("Foo")
    p1.add_field("Bar")
    p2 = HeaderParser()
    p2.add_field("Foo")
    p2.add_field("Bar")
    assert p1 == p2


def test_eq_out_of_order() -> None:
HeaderParser() p1.add_field("Foo") p1.add_field("Bar") p2 = HeaderParser() p2.add_field("Bar") p2.add_field("Foo") assert p1 == p2 # multiple, type, action, default, required, custom dest, additional, # normalizer, body, altnames, altnames with different cases, unfold, choices headerparser-0.5.1/test/test_parser/test_parser_multiname.py000066400000000000000000000031601450730324400244560ustar00rootroot00000000000000import pytest import headerparser from headerparser import HeaderParser def test_multiname_use_first() -> None: parser = HeaderParser() parser.add_field("Foo", "Bar") msg = parser.parse("Foo: red") assert dict(msg) == {"Foo": "red"} assert msg.body is None def test_multiname_use_second() -> None: parser = HeaderParser() parser.add_field("Foo", "Bar") msg = parser.parse("Bar: red") assert dict(msg) == {"Foo": "red"} assert msg.body is None def test_multiname_multiple() -> None: parser = HeaderParser() parser.add_field("Foo", "Bar", multiple=True) parser.add_field("Baz") msg = parser.parse("Foo: red\nBar: green\nBaz: blue\n") assert dict(msg) == {"Foo": ["red", "green"], "Baz": "blue"} assert msg.body is None def test_multiname_bad_multiple() -> None: parser = HeaderParser() parser.add_field("Foo", "Bar") parser.add_field("Baz") with pytest.raises(headerparser.DuplicateFieldError) as excinfo: parser.parse("Foo: red\nBar: green\nBaz: blue\n") assert str(excinfo.value) == "Header field 'Foo' occurs more than once" assert excinfo.value.name == "Foo" def test_multiname_conflict() -> None: parser = HeaderParser() parser.add_field("Foo", "Bar", multiple=True) with pytest.raises(ValueError) as excinfo: parser.add_field("Baz", "BAR") assert "field defined more than once" in str(excinfo.value) def test_multiname_dest() -> None: parser = HeaderParser() parser.add_field("Foo", "Bar", dest="Baz") msg = parser.parse("Bar: red") assert dict(msg) == {"Baz": "red"} assert msg.body is None 
headerparser-0.5.1/test/test_parser/test_parser_next_stanza.py000066400000000000000000000033021450730324400250170ustar00rootroot00000000000000from io import StringIO import pytest from headerparser import HeaderParser, MissingBodyError pytestmark = pytest.mark.filterwarnings("ignore:.*_next_stanza:DeprecationWarning") def test_simple() -> None: parser = HeaderParser() parser.add_field("Foo") parser.add_field("Bar") parser.add_field("Baz") fp = StringIO("Foo: red\nBar: green\nBaz: blue\n\nThis body is not consumed.\n") msg = parser.parse_next_stanza(fp) assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"} assert msg.body is None assert fp.read() == "This body is not consumed.\n" def test_simple_string() -> None: parser = HeaderParser() parser.add_field("Foo") parser.add_field("Bar") parser.add_field("Baz") msg, rest = parser.parse_next_stanza_string( "Foo: red\nBar: green\nBaz: blue\n\nThis body is not consumed.\n" ) assert dict(msg) == {"Foo": "red", "Bar": "green", "Baz": "blue"} assert msg.body is None assert rest == "This body is not consumed.\n" def test_body_true() -> None: parser = HeaderParser(body=True) parser.add_field("Foo") parser.add_field("Bar") parser.add_field("Baz") fp = StringIO("Foo: red\nBar: green\nBaz: blue\n\nThis body is not consumed.\n") with pytest.raises(MissingBodyError): parser.parse_next_stanza(fp) def test_body_true_string() -> None: parser = HeaderParser(body=True) parser.add_field("Foo") parser.add_field("Bar") parser.add_field("Baz") with pytest.raises(MissingBodyError): parser.parse_next_stanza_string( "Foo: red\n" "Bar: green\n" "Baz: blue\n" "\n" "This body is not consumed.\n" ) headerparser-0.5.1/test/test_parser/test_parser_stanzas.py000066400000000000000000000153511450730324400241530ustar00rootroot00000000000000from __future__ import annotations from collections.abc import Iterator as IteratorABC from io import StringIO from typing import Callable, Iterator, cast import pytest import headerparser from headerparser 
import HeaderParser, NormalizedDict, scan_stanzas PMethod = Callable[[HeaderParser, str], Iterator[NormalizedDict]] def parse_stanzas_string(p: HeaderParser, s: str) -> IteratorABC[NormalizedDict]: return p.parse_stanzas(s) def parse_stanzas_string_as_file( p: HeaderParser, s: str ) -> IteratorABC[NormalizedDict]: return p.parse_stanzas(StringIO(s)) def parse_stanzas_string_as_stream( p: HeaderParser, s: str ) -> IteratorABC[NormalizedDict]: return p.parse_stanzas_stream(scan_stanzas(s)) @pytest.fixture( params=[ parse_stanzas_string, parse_stanzas_string_as_file, parse_stanzas_string_as_stream, ] ) def pmethod(request: pytest.FixtureRequest) -> PMethod: return cast(PMethod, request.param) # type: ignore[attr-defined] def test_simple(pmethod: PMethod) -> None: parser = HeaderParser() parser.add_field("Foo") parser.add_field("Bar") parser.add_field("Baz") m1, m2, m3 = pmethod( parser, "Foo: red\nBar: green\nBaz: blue\n\n" "Baz: sapphire\nBar: emerald\nFoo: ruby\n\n" "Bar: earth\nBaz: water\nFoo: fire\n\n", ) assert dict(m1) == {"Foo": "red", "Bar": "green", "Baz": "blue"} assert m1.body is None assert dict(m2) == {"Foo": "ruby", "Bar": "emerald", "Baz": "sapphire"} assert m2.body is None assert dict(m3) == {"Foo": "fire", "Bar": "earth", "Baz": "water"} assert m3.body is None def test_invalid_stanza(pmethod: PMethod) -> None: parser = HeaderParser() parser.add_field("Foo") parser.add_field("Bar") parser.add_field("Baz") msgs = pmethod( parser, "Foo: red\nBar: green\nBaz: blue\n\n" "Baz: sapphire\nBar: emerald\nFoo: ruby\n\n" "Bar: earth\nBaz: water\nFoo: fire\nQuux: aether\nCleesh: air\n\n" "Baz: ice\nFoo: lightning\nBar: mud\n\n", ) m1 = next(msgs) assert dict(m1) == {"Foo": "red", "Bar": "green", "Baz": "blue"} assert m1.body is None m2 = next(msgs) assert dict(m2) == {"Foo": "ruby", "Bar": "emerald", "Baz": "sapphire"} assert m2.body is None with pytest.raises(headerparser.UnknownFieldError) as excinfo: next(msgs) assert str(excinfo.value) == "Unknown header 
field 'Quux'" assert excinfo.value.name == "Quux" def test_some_required(pmethod: PMethod) -> None: parser = HeaderParser() parser.add_field("Foo", required=True) parser.add_field("Bar") parser.add_field("Baz") msgs = pmethod( parser, "Foo: red\nBar: green\nBaz: blue\n\n" "Baz: sapphire\nBar: emerald\nFoo: ruby\n\n" "Bar: earth\nBaz: water\n\n" "Baz: ice\nFoo: lightning\nBar: mud\n\n", ) m1 = next(msgs) assert dict(m1) == {"Foo": "red", "Bar": "green", "Baz": "blue"} assert m1.body is None m2 = next(msgs) assert dict(m2) == {"Foo": "ruby", "Bar": "emerald", "Baz": "sapphire"} assert m2.body is None with pytest.raises(headerparser.MissingFieldError) as excinfo: next(msgs) assert str(excinfo.value) == "Required header field 'Foo' is not present" assert excinfo.value.name == "Foo" def test_disjoint_keys(pmethod: PMethod) -> None: parser = HeaderParser() parser.add_field("Foo") parser.add_field("Bar") parser.add_field("Baz") m1, m2, m3 = pmethod(parser, "Foo: red\n\nBar: green\n\nBaz: blue\n\n") assert dict(m1) == {"Foo": "red"} assert m1.body is None assert dict(m2) == {"Bar": "green"} assert m2.body is None assert dict(m3) == {"Baz": "blue"} assert m3.body is None def test_overlapping_keys(pmethod: PMethod) -> None: parser = HeaderParser() parser.add_field("Foo") parser.add_field("Bar") parser.add_field("Baz") m1, m2, m3 = pmethod( parser, "Foo: red\n\nBar: green\nFoo: yellow\n\nFoo: white\nBaz: blue\n\n" ) assert dict(m1) == {"Foo": "red"} assert m1.body is None assert dict(m2) == {"Foo": "yellow", "Bar": "green"} assert m2.body is None assert dict(m3) == {"Foo": "white", "Baz": "blue"} assert m3.body is None def test_multiple(pmethod: PMethod) -> None: parser = HeaderParser() parser.add_field("Foo") parser.add_field("Bar", multiple=True) parser.add_field("Baz") m1, m2, m3 = pmethod( parser, "Foo: red\nBar: green\nBaz: blue\nBar: lime\n\n" "Baz: sapphire\nBar: emerald\nBar: jade\nBar: green\nFoo: ruby\n\n" "Bar: earth\nBaz: water\nFoo: fire\nBar: mud\nBar: 
land\nBar: solid\n\n", ) assert dict(m1) == {"Foo": "red", "Bar": ["green", "lime"], "Baz": "blue"} assert m1.body is None assert dict(m2) == { "Foo": "ruby", "Bar": ["emerald", "jade", "green"], "Baz": "sapphire", } assert m2.body is None assert dict(m3) == { "Foo": "fire", "Bar": ["earth", "mud", "land", "solid"], "Baz": "water", } assert m3.body is None def test_default(pmethod: PMethod) -> None: parser = HeaderParser() parser.add_field("Foo") parser.add_field("Bar") parser.add_field("Baz", default="DEF") m1, m2, m3 = pmethod( parser, "Foo: red\nBar: green\nBaz: blue\n\n" "Bar: emerald\nFoo: ruby\n\n" "Bar: earth\nBaz: water\nFoo: fire\n\n", ) assert dict(m1) == {"Foo": "red", "Bar": "green", "Baz": "blue"} assert m1.body is None assert dict(m2) == {"Foo": "ruby", "Bar": "emerald", "Baz": "DEF"} assert m2.body is None assert dict(m3) == {"Foo": "fire", "Bar": "earth", "Baz": "water"} assert m3.body is None def test_default_inverted(pmethod: PMethod) -> None: parser = HeaderParser() parser.add_field("Foo") parser.add_field("Bar") parser.add_field("Baz", default="DEF") m1, m2, m3 = pmethod( parser, "Foo: red\nBar: green\n\n" "Baz: sapphire\nBar: emerald\nFoo: ruby\n\n" "Bar: earth\nFoo: fire\n\n", ) assert dict(m1) == {"Foo": "red", "Bar": "green", "Baz": "DEF"} assert m1.body is None assert dict(m2) == {"Foo": "ruby", "Bar": "emerald", "Baz": "sapphire"} assert m2.body is None assert dict(m3) == {"Foo": "fire", "Bar": "earth", "Baz": "DEF"} assert m3.body is None def test_body_true(pmethod: PMethod) -> None: parser = HeaderParser(body=True) parser.add_field("Foo") parser.add_field("Bar") parser.add_field("Baz") msgs = pmethod( parser, "Foo: red\nBar: green\nBaz: blue\n\n" "Baz: sapphire\nBar: emerald\nFoo: ruby\n\n" "Bar: earth\nBaz: water\nFoo: fire\n\n", ) with pytest.raises(headerparser.MissingBodyError): next(msgs) headerparser-0.5.1/test/test_parser/test_parser_types.py000066400000000000000000000076371450730324400236440ustar00rootroot00000000000000from 
typing import Any import pytest from headerparser import BOOL, FieldTypeError, HeaderParser def test_bool() -> None: parser = HeaderParser() parser.add_field("Boolean", type=BOOL) msg = parser.parse("Boolean: yes\n") assert dict(msg) == {"Boolean": True} assert msg.body is None def test_multiple_bool() -> None: parser = HeaderParser() parser.add_field("Boolean", type=BOOL, multiple=True) msg = parser.parse( "Boolean: yes\n" "Boolean: y\n" "Boolean: on\n" "Boolean: true\n" "Boolean: 1\n" "Boolean: YES\n" "Boolean: TRUE\n" "Boolean: no\n" "Boolean: n\n" "Boolean: off\n" "Boolean: false\n" "Boolean: 0\n" "Boolean: NO\n" "Boolean: FALSE\n" ) assert dict(msg) == {"Boolean": [True] * 7 + [False] * 7} assert msg.body is None def test_default_bool() -> None: parser = HeaderParser() parser.add_field("Boolean", type=BOOL, default="foo") msg = parser.parse("Boolean: Off") assert dict(msg) == {"Boolean": False} assert msg.body is None def test_missing_default_bool() -> None: parser = HeaderParser() parser.add_field("Boolean", type=BOOL, default="foo") msg = parser.parse("") assert dict(msg) == {"Boolean": "foo"} assert msg.body is None def test_invalid_bool() -> None: parser = HeaderParser() parser.add_field("Boolean", type=BOOL) with pytest.raises(FieldTypeError) as excinfo: parser.parse("Boolean: One\n") assert str(excinfo.value) == ( "Error while parsing 'Boolean': 'One': ValueError: invalid boolean: 'One'" ) assert excinfo.value.name == "Boolean" assert excinfo.value.value == "One" assert isinstance(excinfo.value.exc_value, ValueError) def test_bool_and_not_bool() -> None: parser = HeaderParser() parser.add_field("Boolean", type=BOOL) parser.add_field("String") msg = parser.parse("Boolean: yes\nString: no\n") assert dict(msg) == {"Boolean": True, "String": "no"} assert msg.body is None def test_bool_choices_bad_type() -> None: parser = HeaderParser() parser.add_field("Boolean", type=BOOL, choices=(False, "foo")) with pytest.raises(FieldTypeError) as excinfo: 
parser.parse("BOOLEAN: foo\n") assert str(excinfo.value) == ( "Error while parsing 'Boolean': 'foo': ValueError: invalid boolean: 'foo'" ) assert excinfo.value.name == "Boolean" assert excinfo.value.value == "foo" assert isinstance(excinfo.value.exc_value, ValueError) assert "invalid boolean" in str(excinfo.value.exc_value) def test_native_type() -> None: parser = HeaderParser() parser.add_field("Number", "No.", type=int, dest="#") msg = parser.parse("Number: 42") assert dict(msg) == {"#": 42} assert msg.body is None def test_bad_native_type() -> None: parser = HeaderParser() parser.add_field("Number", "No.", type=int, dest="#") with pytest.raises(FieldTypeError) as excinfo: parser.parse("No.: forty-two") assert str(excinfo.value) == ( "Error while parsing 'Number': 'forty-two': ValueError: " + str(excinfo.value.exc_value) ) assert excinfo.value.name == "Number" assert excinfo.value.value == "forty-two" assert isinstance(excinfo.value.exc_value, ValueError) def fieldtypeerror_raiser(_: Any) -> None: raise FieldTypeError("name", "value", ValueError("foobar")) def test_fieldtypeerror_raiser() -> None: parser = HeaderParser() parser.add_field("Foo", type=fieldtypeerror_raiser) with pytest.raises(FieldTypeError) as excinfo: parser.parse("Foo: Bar\n") assert str(excinfo.value) == ( "Error while parsing 'name': 'value': ValueError: foobar" ) assert excinfo.value.name == "name" assert excinfo.value.value == "value" assert isinstance(excinfo.value.exc_value, ValueError) assert str(excinfo.value.exc_value) == "foobar" headerparser-0.5.1/test/test_scanner/000077500000000000000000000000001450730324400176335ustar00rootroot00000000000000headerparser-0.5.1/test/test_scanner/test_scan_next_stanza.py000066400000000000000000000147701450730324400246170ustar00rootroot00000000000000from __future__ import annotations from typing import Optional import pytest from headerparser import ( Scanner, ScannerEOFError, scan_next_stanza, scan_next_stanza_string, ) @pytest.mark.parametrize( 
"lines,fields,trailer,skip_leading_newlines", [ ([], [], None, True), ([], [], None, False), (["\n", "\n"], [], None, True), (["\n", "\n"], [], "\n", False), ( [ "Foo: red\n", "Bar: green\n", "Baz: blue\n", "\n", "This is a body.\n", ], [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], "This is a body.\n", True, ), ( [ "Foo: red\n", "Bar: green\n", "Baz: blue\n", "\n", "This is a body.\n", ], [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], "This is a body.\n", False, ), ( [ "Foo: red\n", "Bar: green\n", "Baz: blue\n", "\n", "\n", "This is a body.\n", ], [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], "\nThis is a body.\n", True, ), ( [ "Foo: red\n", "Bar: green\n", "Baz: blue\n", "\n", "\n", "This is a body.\n", ], [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], "\nThis is a body.\n", False, ), ( ["Foo: red\n", "Bar: green\n", "Baz: blue\n"], [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], None, True, ), ( ["Foo: red\n", "Bar: green\n", "Baz: blue\n"], [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], None, False, ), ( [ "\n", "\n", "Foo: red\n", "Bar: green\n", "Baz: blue\n", "\n", "This is a body.\n", ], [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], "This is a body.\n", True, ), ( [ "\n", "\n", "Foo: red\n", "Bar: green\n", "Baz: blue\n", "\n", "This is a body.\n", ], [], "\nFoo: red\nBar: green\nBaz: blue\n\nThis is a body.\n", False, ), ], ) def test_scanner_next_stanza( lines: list[str], fields: list[tuple[str, str]], trailer: Optional[str], skip_leading_newlines: bool, ) -> None: for data in (lines, "".join(lines)): sc = Scanner(data, skip_leading_newlines=skip_leading_newlines) assert list(sc.scan_next_stanza()) == fields try: remainder = sc.get_unscanned() except ScannerEOFError: assert trailer is None else: assert remainder == trailer @pytest.mark.filterwarnings("ignore:.*scan_next_stanza:DeprecationWarning") @pytest.mark.parametrize( "lines,fields,trailer,skip_leading_newlines", [ ([], [], [], True), ([], [], [], False), 
(["\n", "\n"], [], [], True), (["\n", "\n"], [], ["\n"], False), ( [ "Foo: red\n", "Bar: green\n", "Baz: blue\n", "\n", "This is a body.\n", ], [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], ["This is a body.\n"], True, ), ( [ "Foo: red\n", "Bar: green\n", "Baz: blue\n", "\n", "This is a body.\n", ], [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], ["This is a body.\n"], False, ), ( [ "Foo: red\n", "Bar: green\n", "Baz: blue\n", "\n", "\n", "This is a body.\n", ], [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], ["\n", "This is a body.\n"], True, ), ( [ "Foo: red\n", "Bar: green\n", "Baz: blue\n", "\n", "\n", "This is a body.\n", ], [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], ["\n", "This is a body.\n"], False, ), ( ["Foo: red\n", "Bar: green\n", "Baz: blue\n"], [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], [], True, ), ( ["Foo: red\n", "Bar: green\n", "Baz: blue\n"], [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], [], False, ), ( [ "\n", "\n", "Foo: red\n", "Bar: green\n", "Baz: blue\n", "\n", "This is a body.\n", ], [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], ["This is a body.\n"], True, ), ( [ "\n", "\n", "Foo: red\n", "Bar: green\n", "Baz: blue\n", "\n", "This is a body.\n", ], [], [ "\n", "Foo: red\n", "Bar: green\n", "Baz: blue\n", "\n", "This is a body.\n", ], False, ), ], ) def test_scan_next_stanza( lines: list[str], fields: list[tuple[str, str]], trailer: list[str], skip_leading_newlines: bool, ) -> None: liter = iter(lines) assert ( list(scan_next_stanza(liter, skip_leading_newlines=skip_leading_newlines)) == fields ) assert list(liter) == trailer assert scan_next_stanza_string( "".join(lines), skip_leading_newlines=skip_leading_newlines ) == (fields, "".join(trailer)) headerparser-0.5.1/test/test_scanner/test_scan_stanzas.py000066400000000000000000000165101450730324400237360ustar00rootroot00000000000000from __future__ import annotations from collections.abc import Iterator as IteratorABC from io import 
StringIO from typing import Callable, Iterator, List, Tuple, cast import pytest from headerparser import MalformedHeaderError, Scanner, ScannerEOFError, scan_stanzas ScannerType = Callable[..., Iterator[List[Tuple[str, str]]]] def scan_stanzas_string_as_file( s: str, skip_leading_newlines: bool = False ) -> IteratorABC[list[tuple[str, str]]]: return scan_stanzas(StringIO(s), skip_leading_newlines=skip_leading_newlines) def scan_stanzas_string_as_list( s: str, skip_leading_newlines: bool = False ) -> IteratorABC[list[tuple[str, str]]]: return scan_stanzas(s.splitlines(True), skip_leading_newlines=skip_leading_newlines) def scan_stanzas_string( s: str, skip_leading_newlines: bool = False ) -> IteratorABC[list[tuple[str, str]]]: return scan_stanzas(s, skip_leading_newlines=skip_leading_newlines) @pytest.fixture( params=[ scan_stanzas_string_as_file, scan_stanzas_string_as_list, scan_stanzas_string, ] ) def scanner(request: pytest.FixtureRequest) -> ScannerType: return cast(ScannerType, request.param) # type: ignore[attr-defined] @pytest.mark.parametrize( "lines,fields,skip_leading_newlines", [ ("", [], True), ("", [], False), ("\n\n", [], True), ("\n\n", [[]], False), ( "Foo: red\n" "Bar: green\n" "Baz: blue\n" "\n" "Quux: ruby\n" "Glarch: sapphire\n" "Cleesh: garnet\n" "\n" "Blue: foo\n" "Red: bar\n" "Green: baz\n", [ [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], [("Quux", "ruby"), ("Glarch", "sapphire"), ("Cleesh", "garnet")], [("Blue", "foo"), ("Red", "bar"), ("Green", "baz")], ], True, ), ( "Foo: red\n" "Bar: green\n" "Baz: blue\n" "\n" "Quux: ruby\n" "Glarch: sapphire\n" "Cleesh: garnet\n" "\n" "Blue: foo\n" "Red: bar\n" "Green: baz\n", [ [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], [("Quux", "ruby"), ("Glarch", "sapphire"), ("Cleesh", "garnet")], [("Blue", "foo"), ("Red", "bar"), ("Green", "baz")], ], False, ), ( "Foo: red\n" "Bar: green\n" "Baz: blue\n" "\n" "\n" "Quux: ruby\n" "Glarch: sapphire\n" "Cleesh: garnet\n" "\n" "\n" "\n" "Blue: foo\n" 
"Red: bar\n" "Green: baz\n", [ [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], [("Quux", "ruby"), ("Glarch", "sapphire"), ("Cleesh", "garnet")], [("Blue", "foo"), ("Red", "bar"), ("Green", "baz")], ], True, ), ( "Foo: red\n" "Bar: green\n" "Baz: blue\n" "\n" "\n" "Quux: ruby\n" "Glarch: sapphire\n" "Cleesh: garnet\n" "\n" "\n" "\n" "Blue: foo\n" "Red: bar\n" "Green: baz\n", [ [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], [("Quux", "ruby"), ("Glarch", "sapphire"), ("Cleesh", "garnet")], [("Blue", "foo"), ("Red", "bar"), ("Green", "baz")], ], False, ), ( "Foo: red\n" "Bar: green\n" "Baz: blue\n" "\n" "Quux: ruby\n" "Glarch: sapphire\n" "Cleesh: garnet\n" "\n" "Blue: foo\n" "Red: bar\n" "Green: baz\n" "\n" "\n", [ [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], [("Quux", "ruby"), ("Glarch", "sapphire"), ("Cleesh", "garnet")], [("Blue", "foo"), ("Red", "bar"), ("Green", "baz")], ], True, ), ( "Foo: red\n" "Bar: green\n" "Baz: blue\n" "\n" "Quux: ruby\n" "Glarch: sapphire\n" "Cleesh: garnet\n" "\n" "Blue: foo\n" "Red: bar\n" "Green: baz\n" "\n" "\n", [ [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], [("Quux", "ruby"), ("Glarch", "sapphire"), ("Cleesh", "garnet")], [("Blue", "foo"), ("Red", "bar"), ("Green", "baz")], ], False, ), ( "\n" "\n" "Foo: red\n" "Bar: green\n" "Baz: blue\n" "\n" "Quux: ruby\n" "Glarch: sapphire\n" "Cleesh: garnet\n" "\n" "Blue: foo\n" "Red: bar\n" "Green: baz\n", [ [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], [("Quux", "ruby"), ("Glarch", "sapphire"), ("Cleesh", "garnet")], [("Blue", "foo"), ("Red", "bar"), ("Green", "baz")], ], True, ), ( "\n" "\n" "Foo: red\n" "Bar: green\n" "Baz: blue\n" "\n" "Quux: ruby\n" "Glarch: sapphire\n" "Cleesh: garnet\n" "\n" "Blue: foo\n" "Red: bar\n" "Green: baz\n", [ [], [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], [("Quux", "ruby"), ("Glarch", "sapphire"), ("Cleesh", "garnet")], [("Blue", "foo"), ("Red", "bar"), ("Green", "baz")], ], False, ), ], ) def test_scan_stanzas( 
lines: str, fields: list[list[tuple[str, str]]], skip_leading_newlines: bool, scanner: ScannerType, ) -> None: assert list(scanner(lines, skip_leading_newlines=skip_leading_newlines)) == fields def test_invalid_stanza(scanner: ScannerType) -> None: stanzas = scanner( "Foo: red\n" "Bar: green\n" "Baz: blue\n" "\n" "Quux: ruby\n" "Glarch: sapphire\n" "Cleesh: garnet\n" "\n" "Blue: foo\n" "Wait, this isn't a header.\n" "Green: baz\n", skip_leading_newlines=True, ) assert next(stanzas) == [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")] assert next(stanzas) == [ ("Quux", "ruby"), ("Glarch", "sapphire"), ("Cleesh", "garnet"), ] with pytest.raises(MalformedHeaderError) as excinfo: next(stanzas) assert str(excinfo.value) == ( 'Invalid header line encountered: "Wait, this isn\'t a header."' ) def test_scan_stanzas_empty() -> None: sc = Scanner("") assert list(sc.scan_stanzas()) == [] with pytest.raises(ScannerEOFError) as excinfo: next(sc.scan_stanzas()) assert str(excinfo.value) == "Scanner has reached end of input" headerparser-0.5.1/test/test_scanner/test_scanner.py000066400000000000000000000216651450730324400227070ustar00rootroot00000000000000from __future__ import annotations from collections.abc import Iterator as IteratorABC from io import StringIO import re from typing import Any, Callable, Iterator, cast import pytest import headerparser from headerparser import scan from headerparser.scanner import FieldType ScannerType = Callable[..., Iterator[FieldType]] def scan_string_as_file(s: str, **kwargs: Any) -> IteratorABC[FieldType]: return scan(StringIO(s), **kwargs) def scan_string_as_list(s: str, **kwargs: Any) -> IteratorABC[FieldType]: return scan(s.splitlines(True), **kwargs) def scan_string(s: str, **kwargs: Any) -> IteratorABC[FieldType]: return scan(s, **kwargs) @pytest.fixture(params=[scan_string_as_file, scan_string_as_list, scan_string]) def scanner(request: pytest.FixtureRequest) -> ScannerType: return cast(ScannerType, request.param) # type: 
ignore[attr-defined] @pytest.mark.parametrize( "lines,fields", [ ("", []), ( "Foo: red\nBar: green\nBaz: blue\n", [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], ), ( "Foo: red\nBar: green\nBaz: blue\n\n", [("Foo", "red"), ("Bar", "green"), ("Baz", "blue"), (None, "")], ), ( "Foo: red\nBar: green\nBaz: blue\n\n\n", [("Foo", "red"), ("Bar", "green"), ("Baz", "blue"), (None, "\n")], ), ( "Foo: red\nBar: green\nBaz: blue\n\nThis is a test.", [ ("Foo", "red"), ("Bar", "green"), ("Baz", "blue"), (None, "This is a test."), ], ), ( "Foo: red\nBar: green\nBaz: blue\n\n\nThis is a test.", [ ("Foo", "red"), ("Bar", "green"), ("Baz", "blue"), (None, "\nThis is a test."), ], ), ( "Foo: red\n" "Bar: green\n" "Baz: blue\n" "\n" "Foo: quux\n" "Bar: glarch\n" "Baz: cleesh\n", [ ("Foo", "red"), ("Bar", "green"), ("Baz", "blue"), (None, "Foo: quux\nBar: glarch\nBaz: cleesh\n"), ], ), ( "Key1: Value1\nKey2 :Value2\nKey3 : Value3\nKey4:Value4\n", [ ("Key1", "Value1"), ("Key2", "Value2"), ("Key3", "Value3"), ("Key4", "Value4"), ], ), ( "Key1: Value1\n" " Folded\n" " More folds\n" "Key2: Value2\n" " Folded\n" " Fewer folds\n" "Key3: Value3\n" " Key4: Not a real header\n" "Key4: \n" "\tTab after empty line\n" " \n" ' After an "empty" folded line\n' "Key5:\n" " After a line without even a space!\n", [ ("Key1", "Value1\n Folded\n More folds"), ("Key2", "Value2\n Folded\n Fewer folds"), ("Key3", "Value3\n Key4: Not a real header"), ("Key4", '\n\tTab after empty line\n \n After an "empty" folded line'), ("Key5", "\n After a line without even a space!"), ], ), ( "Foo: red\nBar: green\nBaz: blue", [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], ), ( "Foo: red\r\nBar: green\r\nBaz: blue\r\n", [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], ), ( "Foo: value1\nFoo: value2\nFOO: VALUE3\nfOO: valueFour\n", [ ("Foo", "value1"), ("Foo", "value2"), ("FOO", "VALUE3"), ("fOO", "valueFour"), ], ), ( "Leading: value\n" "Trailing: value \n" "Leading-Tab:\tvalue\n" "Trailing-Tab:value\t\n", [ 
("Leading", "value"), ("Trailing", "value "), ("Leading-Tab", "value"), ("Trailing-Tab", "value\t"), ], ), ("Key Name: value", [("Key Name", "value")]), ("Foo: red : crimson: scarlet\n", [("Foo", "red : crimson: scarlet")]), ], ) @pytest.mark.parametrize("skip_leading_newlines", [True, False]) def test_scan( lines: str, fields: list[FieldType], skip_leading_newlines: bool, scanner: ScannerType, ) -> None: assert list(scanner(lines, skip_leading_newlines=skip_leading_newlines)) == fields @pytest.mark.parametrize( "lines,fields,skip_leading_newlines", [ ( "\nFoo: red\nBar: green\nBaz: blue\n", [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], True, ), ( "\nFoo: red\nBar: green\nBaz: blue\n", [(None, "Foo: red\nBar: green\nBaz: blue\n")], False, ), ("\n", [(None, "")], False), ("\n", [], True), ("\n\n", [(None, "\n")], False), ("\n\n", [], True), ], ) def test_scan_skip( lines: str, fields: list[FieldType], skip_leading_newlines: bool, scanner: ScannerType, ) -> None: assert list(scanner(lines, skip_leading_newlines=skip_leading_newlines)) == fields @pytest.mark.parametrize( "lines,fields,separator_regex", [ ( "Key1: Value1\nKey2 :Value2\nKey3 : Value3\nKey4:Value4\n", [ ("Key1", " Value1"), ("Key2 ", "Value2"), ("Key3 ", " Value3"), ("Key4", "Value4"), ], ":", ), ( "Foo = red\nBar =green\nBaz= blue\n", [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], r"\s*=\s*", ), ( "Foo = red\nBar =green\nBaz= blue\n", [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], re.compile(r"\s*=\s*"), ), ( "Foo = red = crimson=scarlet\n", [("Foo", "red = crimson=scarlet")], r"\s*=\s*", ), ( "Key: Value = foo\nKey = Value: foo\n", [("Key: Value", "foo"), ("Key", "Value: foo")], r"\s*=\s*", ), ], ) def test_scan_separator_regex( lines: str, fields: list[FieldType], separator_regex: bool, scanner: ScannerType ) -> None: assert list(scanner(lines, separator_regex=separator_regex)) == fields @pytest.mark.parametrize( "lines,fields", [ ( "Foo: red\rBar: green\rBaz: blue\r", [("Foo", 
"red"), ("Bar", "green"), ("Baz", "blue")], ), ( "Foo: red\nBar: green\rBaz: blue\r\n", [("Foo", "red"), ("Bar", "green"), ("Baz", "blue")], ), ( "Foo: line\n" " feed\n" "Bar: carriage\r" " return\r" "Baz: CR\r\n" " LF\r\n", [ ("Foo", "line\n feed"), ("Bar", "carriage\n return"), ("Baz", "CR\n LF"), ], ), ], ) @pytest.mark.parametrize("skip_leading_newlines", [True, False]) def test_scan_string( lines: str, fields: list[FieldType], skip_leading_newlines: bool ) -> None: assert list(scan(lines, skip_leading_newlines=skip_leading_newlines)) == fields def test_lines_no_ends() -> None: assert list( scan( [ "Key: value", "Folded: hold on", " let me check", " ", " yes", "", "Newlines will not be added to this body.", "So it'll look bad.", ] ) ) == [ ("Key", "value"), ("Folded", "hold on\n let me check\n \n yes"), (None, "Newlines will not be added to this body.So it'll look bad."), ] def test_malformed_header(scanner: ScannerType) -> None: with pytest.raises(headerparser.MalformedHeaderError) as excinfo: list(scanner("Foo: red\nBar green\nBaz: blue\n")) assert str(excinfo.value) == "Invalid header line encountered: 'Bar green'" assert excinfo.value.line == "Bar green" def test_unexpected_folding(scanner: ScannerType) -> None: with pytest.raises(headerparser.UnexpectedFoldingError) as excinfo: list(scanner(" Foo: red\nBar green\nBaz: blue\n")) assert str(excinfo.value) == ( "Indented line without preceding header line encountered: ' Foo: red'" ) assert excinfo.value.line == " Foo: red" def test_separator_regex_default_separator(scanner: ScannerType) -> None: with pytest.raises(headerparser.MalformedHeaderError) as excinfo: list(scanner("Foo = red\nBar: green\n", separator_regex=r"\s*=\s*")) assert str(excinfo.value) == "Invalid header line encountered: 'Bar: green'" assert excinfo.value.line == "Bar: green" headerparser-0.5.1/test/test_unfold.py000066400000000000000000000024631450730324400200500ustar00rootroot00000000000000import pytest from headerparser import unfold 
@pytest.mark.parametrize( "sin,sout", [ ("some value", "some value"), ("some\nvalue", "some value"), ("some\n value", "some value"), (" some value", "some value"), ("\nsome value", "some value"), (" \nsome value", "some value"), ("\n some value", "some value"), ("some value ", "some value"), ("some value\n", "some value"), ("some value\n ", "some value"), ("some value \n", "some value"), ( "A period ends a sentence. It is followed by two spaces.", "A period ends a sentence. It is followed by two spaces.", ), ("x\ty\n0\t1\n", "x\ty 0\t1"), ( "Value1\n Folded\n More folds\n Fewer folds\n", "Value1 Folded More folds Fewer folds", ), ("some\n\tvalue", "some value"), ("some\n\t value", "some value"), ("some\n \f value", "some \f value"), ("some \n \n value", "some value"), ("some\n\nvalue", "some value"), ("some\r value", "some value"), ("some\r\n value", "some value"), ("some\nsort\rof\r\nvalue", "some sort of value"), ], ) def test_unfold_single_line(sin: str, sout: str) -> None: assert unfold(sin) == sout headerparser-0.5.1/tox.ini000066400000000000000000000027731450730324400155100ustar00rootroot00000000000000[tox] envlist = lint,typing,py37,py38,py39,py310,py311,py312,pypy3 skip_missing_interpreters = True isolated_build = True minversion = 3.3.0 [testenv] deps = coverage pytest pytest-mock commands = coverage erase coverage run -m pytest {posargs} --doctest-modules --pyargs headerparser coverage run -m pytest {posargs} test README.rst docs/index.rst coverage combine coverage report [testenv:lint] skip_install = True deps = flake8 flake8-bugbear flake8-builtins flake8-unused-arguments commands = flake8 src test [testenv:typing] deps = mypy types-Deprecated {[testenv]deps} commands = mypy src test [pytest] doctest_optionflags = IGNORE_EXCEPTION_DETAIL filterwarnings = error [coverage:run] branch = True parallel = True source = headerparser [coverage:paths] source = src .tox/**/site-packages [coverage:report] precision = 2 show_missing = True [flake8] doctests = True 
exclude = .*/,build/,dist/,test/data,venv/ hang-closing = False max-doc-length = 100 max-line-length = 80 unused-arguments-ignore-stub-functions = True select = A,B,B902,B950,C,E,E242,F,U100,W ignore = B005,E203,E262,E266,E501,W503 [isort] atomic = True force_sort_within_sections = True honor_noqa = True lines_between_sections = 0 profile = black reverse_relative = True sort_relative_in_force_sorted_sections = True src_paths = src [testenv:docs] basepython = python3 deps = -rdocs/requirements.txt changedir = docs commands = sphinx-build -E -W -b html . _build/html