././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1703526249.2927737 pyaml-23.12.0/0000755000175000017500000000000014542337551012656 5ustar00fraggodfraggod././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1478060463.0 pyaml-23.12.0/COPYING0000644000175000017500000000075313006264657013716 0ustar00fraggodfraggod DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE Version 2, December 2004 Copyright (C) 2012 Mike Kazantsev Everyone is permitted to copy and distribute verbatim or modified copies of this license document, and changing it is allowed as long as the name is changed. DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. You just DO WHAT THE FUCK YOU WANT TO. ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1432061116.0 pyaml-23.12.0/MANIFEST.in0000644000175000017500000000003312526702274014406 0ustar00fraggodfraggodinclude COPYING README.rst ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1703526249.2927737 pyaml-23.12.0/PKG-INFO0000644000175000017500000002624714542337551013766 0ustar00fraggodfraggodMetadata-Version: 2.1 Name: pyaml Version: 23.12.0 Summary: PyYAML-based module to produce a bit more pretty and readable YAML-serialized data Home-page: https://github.com/mk-fg/pretty-yaml Author: Mike Kazantsev Author-email: Mike Kazantsev License: WTFPL Project-URL: Homepage, https://github.com/mk-fg/pretty-yaml Keywords: yaml,serialization,pretty-print,formatter,human,readability Classifier: Development Status :: 4 - Beta Classifier: Intended Audience :: Developers Classifier: License :: Public Domain Classifier: Programming Language :: Python Classifier: Programming Language :: Python :: 3.8 Classifier: Topic :: Software Development Classifier: Topic :: Software Development :: Libraries :: Python Modules Classifier: Topic :: Utilities Requires-Python: >=3.8 Description-Content-Type: text/x-rst License-File: COPYING Requires-Dist: PyYAML Provides-Extra: anchors Requires-Dist: unidecode; extra == "anchors" pretty-yaml (or pyaml) ====================== PyYAML_-based python module to produce a bit more pretty and human-readable YAML-serialized data. This module is for serialization only, see `ruamel.yaml`_ module for literate YAML parsing (keeping track of comments, spacing, line/column numbers of values, etc). (side-note: to dump stuff parsed by ruamel.yaml with this module, use only ``YAML(typ='safe')`` there) It's a small module, and for projects that only need part of its functionality, I'd recommend copy-pasting that in, instead of adding janky dependency. .. _PyYAML: http://pyyaml.org/ .. _ruamel.yaml: https://bitbucket.org/ruamel/yaml/ .. contents:: :backlinks: none Repository URLs: - https://github.com/mk-fg/pretty-yaml - https://codeberg.org/mk-fg/pretty-yaml - https://fraggod.net/code/git/pretty-yaml Warning ------- Prime goal of this module is to produce human-readable output that can be easily diff'ed, manipulated and re-used, but maybe with occasional issues. So please do not rely on the thing to produce output that can always be deserialized exactly to what was exported, at least - use PyYAML directly for that (but maybe with options from the next section). What this module does and why ----------------------------- YAML is generally nice and easy format to read *if* it was written by humans. 
PyYAML can do a fairly decent job of making stuff readable, and the best
combination of parameters for such output that I've seen so far is probably this one::

  >>> m = [123, 45.67, {1: None, 2: False}, 'some text']
  >>> data = dict(a='asldnsa\nasldpáknsa\n', b='whatever text', ma=m, mb=m)
  >>> yaml.safe_dump( data, sys.stdout, width=100, allow_unicode=True, default_flow_style=False )

  a: 'asldnsa

    asldpáknsa

    '
  b: whatever text
  ma: &id001
  - 123
  - 45.67
  - 1: null
    2: false
  - some text
  mb: *id001

pyaml (this module) tries to improve on that a bit, with the following tweaks:

* Most human-friendly representation options in PyYAML (that I know of) are used
  as defaults - unicode, flow-style, width=100 (old default is 80).

* Dump "null" values as empty values, if possible, which have the same meaning
  but reduce visual clutter and are easier to edit.

* Dicts, sets, OrderedDicts, defaultdicts, namedtuples, enums, dataclasses, etc
  are represented as their safe YAML-compatible base (like int, list or mapping),
  with mappings key-sorted by default for more diff-friendly output.

* Use shorter and simpler yes/no for booleans.

* List items get indented, as they should be.

* An attempt is made to pick more readable string representation styles,
  depending on the value, e.g.::

    >>> yaml.safe_dump(cert, sys.stdout)
    cert: '-----BEGIN CERTIFICATE-----

      MIIH3jCCBcagAwIBAgIJAJi7AjQ4Z87OMA0GCSqGSIb3DQEBCwUAMIHBMRcwFQYD

      VQQKFA52YWxlcm9uLm5vX2lzcDEeMBwGA1UECxMVQ2VydGlmaWNhdGUgQXV0aG9y
    ...

    >>> pyaml.p(cert):
    cert: |
      -----BEGIN CERTIFICATE-----
      MIIH3jCCBcagAwIBAgIJAJi7AjQ4Z87OMA0GCSqGSIb3DQEBCwUAMIHBMRcwFQYD
      VQQKFA52YWxlcm9uLm5vX2lzcDEeMBwGA1UECxMVQ2VydGlmaWNhdGUgQXV0aG9y
    ...

* "force_embed" option (default=yes) to avoid having &id stuff scattered all
  over the output. Might be more useful to disable it in some specific cases though.

* "&idXYZ" anchors, when needed, get labels from the keys they get attached to,
  not just meaningless enumerators, e.g. "&users_-_admin" instead.

* "string_val_style" option to only apply to strings that are values, not keys, i.e.::

    >>> pyaml.p(data, string_val_style='"')
    key: "value\nasldpáknsa\n"
    >>> yaml.safe_dump(data, sys.stdout, allow_unicode=True, default_style='"')
    "key": "value\nasldpáknsa\n"

* Add vertical spacing (empty lines) between keys on different depths,
  to separate long YAML sections in the output visually, making it more seekable.

* Discard end-of-document "..." indicators for simple values.
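As a separate toy illustration of a few of the points above - nulls dumped as empty
values, yes/no booleans, key-sorted mappings, and enum members dumped as their base
value with a trailing comment - here's a hypothetical snippet (the ``Color`` class is
made up for this example, and the output shown is only indicative of current defaults)::

  >>> import enum, pyaml
  >>> class Color(enum.Enum): red = 1
  >>> pyaml.p(dict(shade=Color.red, comment=None, enabled=True))
  comment:
  enabled: yes
  shade: 1 # Color.red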
Result for the (rather meaningless) example above:: >>> pyaml.p(data, force_embed=False, vspacing=dict(split_lines=10)) a: | asldnsa asldpáknsa b: whatever text ma: &ma - 123 - 45.67 - 1: 2: no - some text mb: *ma (force_embed=False enabled deduplication with ``&ma`` anchor, vspacing is adjusted to split even this tiny output) ---------- Extended example:: >>> pyaml.dump(data, vspacing=dict(split_lines=10)) destination: encoding: xz: enabled: yes min_size: 5120 options: path_filter: - \.(gz|bz2|t[gb]z2?|xz|lzma|7z|zip|rar)$ - \.(rpm|deb|iso)$ - \.(jpe?g|gif|png|mov|avi|ogg|mkv|webm|mp[34g]|flv|flac|ape|pdf|djvu)$ - \.(sqlite3?|fossil|fsl)$ - \.git/objects/[0-9a-f]+/[0-9a-f]+$ result: append_to_file: append_to_lafs_dir: print_to_stdout: yes url: http://localhost:3456/uri filter: - /(CVS|RCS|SCCS|_darcs|\{arch\})/$ - /\.(git|hg|bzr|svn|cvs)(/|ignore|attributes|tags)?$ - /=(RELEASE-ID|meta-update|update)$ http: ca_certs_files: /etc/ssl/certs/ca-certificates.crt debug_requests: no request_pool_options: cachedConnectionTimeout: 600 maxPersistentPerHost: 10 retryAutomatically: yes logging: formatters: basic: datefmt: '%Y-%m-%d %H:%M:%S' format: '%(asctime)s :: %(name)s :: %(levelname)s: %(message)s' handlers: console: class: logging.StreamHandler formatter: basic level: custom stream: ext://sys.stderr loggers: twisted: handlers: - console level: 0 root: handlers: - console level: custom Note that unless there are many moderately wide and deep trees of data, which are expected to be read and edited by people, it might be preferrable to directly use PyYAML regardless, as it won't introduce another (rather pointless in that case) dependency and a point of failure. Features and Tricks ------------------- * Pretty-print any yaml or json (yaml subset) file from the shell:: % python -m pyaml /path/to/some/file.yaml % pyaml < myfile.yml % curl -s https://www.githubstatus.com/api/v2/summary.json | pyaml ``pipx install pyaml`` can be a good way to only install "pyaml" command-line script. * Process and replace json/yaml file in-place:: % python -m pyaml -r mydata.yml * Easier "debug printf" for more complex data (all funcs below are aliases to same thing):: pyaml.p(stuff) pyaml.pprint(my_data) pyaml.pprint('----- HOW DOES THAT BREAKS!?!?', input_data, some_var, more_stuff) pyaml.print(data, file=sys.stderr) # needs "from __future__ import print_function" * Force all string values to a certain style (see info on these in `PyYAML docs`_):: pyaml.dump(many_weird_strings, string_val_style='|') pyaml.dump(multiline_words, string_val_style='>') pyaml.dump(no_want_quotes, string_val_style='plain') Using ``pyaml.add_representer()`` (note \*p\*yaml) as suggested in `this SO thread`_ (or `github-issue-7`_) should also work. See also this `amazing reply to StackOverflow#3790454`_ for everything about the many different string styles in YAML. * Control indent and width of the results:: pyaml.dump(wide_and_deep, indent=4, width=120) These are actually keywords for PyYAML Emitter (passed to it from Dumper), see more info on these in `PyYAML docs`_. * Dump multiple yaml documents into a file: ``pyaml.dump_all([data1, data2, data3], dst_file)`` explicit_start=True is implied, unless overidden by explicit_start=False. 
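  A small usage sketch for such multi-document dumps (the file name and data below
  are made up for this example, not taken from the module docs)::

    import pyaml

    docs = [dict(stage='build'), dict(stage='test'), dict(stage='deploy')]
    with open('pipeline.yaml', 'w') as dst:
      pyaml.dump_all(docs, dst)  # each document prefixed with a "---" marker
    ys = pyaml.dump_all(docs, str, explicit_start=False)  # returned as str, without "---" lines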
* Control thresholds for vertical spacing of values (0 = always space stuff out), and clump all oneliner ones at the top:: >>> pyaml.dump( data, sort_dicts=pyaml.PYAMLSort.oneline_group, vspacing=dict(split_lines=0, split_count=0) ) chart: axisCenteredZero: no axisColorMode: text axisLabel: '' axisPlacement: auto barAlignment: 0 drawStyle: line ... hideFrom: legend: no tooltip: no viz: no scaleDistribution: type: linear stacking: group: A mode: none Or same thing with cli tool ``-v/--vspacing`` option: ``pyaml -v 0/0g mydata.yaml`` .. _PyYAML docs: http://pyyaml.org/wiki/PyYAMLDocumentation#Scalars .. _this SO thread: http://stackoverflow.com/a/7445560 .. _github-issue-7: https://github.com/mk-fg/pretty-yaml/issues/7 .. _amazing reply to StackOverflow#3790454: https://stackoverflow.com/questions/3790454/how-do-i-break-a-string-in-yaml-over-multiple-lines/21699210#21699210 Installation ------------ It's a regular Python 3.8+ module/package, published on PyPI (as pyaml_). Module uses PyYAML_ for processing of the actual YAML files and should pull it in as a dependency. Dependency on unidecode_ module is optional and should only be useful with force_embed=False keyword (defaults to True), and same-id objects or recursion is used within serialized data - i.e. only when generating &some_key_id anchors is needed. If module is unavailable at runtime, anchor ids will be less like their keys and maybe not as nice. Using pip_ is how you generally install it, usually coupled with venv_ usage (which will also provide "pip" tool itself):: % pip install pyaml Current-git version can be installed like this:: % pip install git+https://github.com/mk-fg/pretty-yaml pip will default to installing into currently-active venv, then user's home directory (under ``~/.local/lib/python...``), and maybe system-wide when running as root (only useful in specialized environments like docker containers). There are many other python packaging tools - pipenv_, poetry_, pdm_, etc - use whatever is most suitable for specific project/environment. pipx_ can be used to install command-line script without a module. More general info on python packaging can be found at `packaging.python.org`_. When changing code, unit tests can be run with ``python -m unittest`` from the local repository checkout. .. _pyaml: https://pypi.org/project/pyaml/ .. _unidecode: https://pypi.python.org/pypi/Unidecode .. _pip: https://pip.pypa.io/en/stable/ .. _venv: https://docs.python.org/3/library/venv.html .. _poetry: https://python-poetry.org/ .. _pipenv: https://pipenv.pypa.io/ .. _pdm: https://pdm.fming.dev/ .. _pipx: https://pypa.github.io/pipx/ .. _packaging.python.org: https://packaging.python.org/installing/ ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1703525951.0 pyaml-23.12.0/README.rst0000644000175000017500000002436314542337077014360 0ustar00fraggodfraggodpretty-yaml (or pyaml) ====================== PyYAML_-based python module to produce a bit more pretty and human-readable YAML-serialized data. This module is for serialization only, see `ruamel.yaml`_ module for literate YAML parsing (keeping track of comments, spacing, line/column numbers of values, etc). (side-note: to dump stuff parsed by ruamel.yaml with this module, use only ``YAML(typ='safe')`` there) It's a small module, and for projects that only need part of its functionality, I'd recommend copy-pasting that in, instead of adding janky dependency. .. _PyYAML: http://pyyaml.org/ .. _ruamel.yaml: https://bitbucket.org/ruamel/yaml/ .. 
contents:: :backlinks: none Repository URLs: - https://github.com/mk-fg/pretty-yaml - https://codeberg.org/mk-fg/pretty-yaml - https://fraggod.net/code/git/pretty-yaml Warning ------- Prime goal of this module is to produce human-readable output that can be easily diff'ed, manipulated and re-used, but maybe with occasional issues. So please do not rely on the thing to produce output that can always be deserialized exactly to what was exported, at least - use PyYAML directly for that (but maybe with options from the next section). What this module does and why ----------------------------- YAML is generally nice and easy format to read *if* it was written by humans. PyYAML can a do fairly decent job of making stuff readable, and the best combination of parameters for such output that I've seen so far is probably this one:: >>> m = [123, 45.67, {1: None, 2: False}, 'some text'] >>> data = dict(a='asldnsa\nasldpáknsa\n', b='whatever text', ma=m, mb=m) >>> yaml.safe_dump( data, sys.stdout, width=100, allow_unicode=True, default_flow_style=False ) a: 'asldnsa asldpáknsa ' b: whatever text ma: &id001 - 123 - 45.67 - 1: null 2: false - some text mb: *id001 pyaml (this module) tries to improve on that a bit, with the following tweaks: * Most human-friendly representation options in PyYAML (that I know of) are used as defaults - unicode, flow-style, width=100 (old default is 80). * Dump "null" values as empty values, if possible, which have the same meaning but reduce visual clutter and are easier to edit. * Dicts, sets, OrderedDicts, defaultdicts, namedtuples, enums, dataclasses, etc are represented as their safe YAML-compatible base (like int, list or mapping), with mappings key-sorted by default for more diff-friendly output. * Use shorter and simplier yes/no for booleans. * List items get indented, as they should be. * Attempt is made to pick more readable string representation styles, depending on the value, e.g.:: >>> yaml.safe_dump(cert, sys.stdout) cert: '-----BEGIN CERTIFICATE----- MIIH3jCCBcagAwIBAgIJAJi7AjQ4Z87OMA0GCSqGSIb3DQEBCwUAMIHBMRcwFQYD VQQKFA52YWxlcm9uLm5vX2lzcDEeMBwGA1UECxMVQ2VydGlmaWNhdGUgQXV0aG9y ... >>> pyaml.p(cert): cert: | -----BEGIN CERTIFICATE----- MIIH3jCCBcagAwIBAgIJAJi7AjQ4Z87OMA0GCSqGSIb3DQEBCwUAMIHBMRcwFQYD VQQKFA52YWxlcm9uLm5vX2lzcDEeMBwGA1UECxMVQ2VydGlmaWNhdGUgQXV0aG9y ... * "force_embed" option (default=yes) to avoid having &id stuff scattered all over the output. Might be more useful to disable it in some specific cases though. * "&idXYZ" anchors, when needed, get labels from the keys they get attached to, not just meaningless enumerators, e.g. "&users_-_admin" instead. * "string_val_style" option to only apply to strings that are values, not keys, i.e:: >>> pyaml.p(data, string_val_style='"') key: "value\nasldpáknsa\n" >>> yaml.safe_dump(data, sys.stdout, allow_unicode=True, default_style='"') "key": "value\nasldpáknsa\n" * Add vertical spacing (empty lines) between keys on different depths, to separate long YAML sections in the output visually, make it more seekable. * Discard end-of-document "..." indicators for simple values. 
Result for the (rather meaningless) example above:: >>> pyaml.p(data, force_embed=False, vspacing=dict(split_lines=10)) a: | asldnsa asldpáknsa b: whatever text ma: &ma - 123 - 45.67 - 1: 2: no - some text mb: *ma (force_embed=False enabled deduplication with ``&ma`` anchor, vspacing is adjusted to split even this tiny output) ---------- Extended example:: >>> pyaml.dump(data, vspacing=dict(split_lines=10)) destination: encoding: xz: enabled: yes min_size: 5120 options: path_filter: - \.(gz|bz2|t[gb]z2?|xz|lzma|7z|zip|rar)$ - \.(rpm|deb|iso)$ - \.(jpe?g|gif|png|mov|avi|ogg|mkv|webm|mp[34g]|flv|flac|ape|pdf|djvu)$ - \.(sqlite3?|fossil|fsl)$ - \.git/objects/[0-9a-f]+/[0-9a-f]+$ result: append_to_file: append_to_lafs_dir: print_to_stdout: yes url: http://localhost:3456/uri filter: - /(CVS|RCS|SCCS|_darcs|\{arch\})/$ - /\.(git|hg|bzr|svn|cvs)(/|ignore|attributes|tags)?$ - /=(RELEASE-ID|meta-update|update)$ http: ca_certs_files: /etc/ssl/certs/ca-certificates.crt debug_requests: no request_pool_options: cachedConnectionTimeout: 600 maxPersistentPerHost: 10 retryAutomatically: yes logging: formatters: basic: datefmt: '%Y-%m-%d %H:%M:%S' format: '%(asctime)s :: %(name)s :: %(levelname)s: %(message)s' handlers: console: class: logging.StreamHandler formatter: basic level: custom stream: ext://sys.stderr loggers: twisted: handlers: - console level: 0 root: handlers: - console level: custom Note that unless there are many moderately wide and deep trees of data, which are expected to be read and edited by people, it might be preferrable to directly use PyYAML regardless, as it won't introduce another (rather pointless in that case) dependency and a point of failure. Features and Tricks ------------------- * Pretty-print any yaml or json (yaml subset) file from the shell:: % python -m pyaml /path/to/some/file.yaml % pyaml < myfile.yml % curl -s https://www.githubstatus.com/api/v2/summary.json | pyaml ``pipx install pyaml`` can be a good way to only install "pyaml" command-line script. * Process and replace json/yaml file in-place:: % python -m pyaml -r mydata.yml * Easier "debug printf" for more complex data (all funcs below are aliases to same thing):: pyaml.p(stuff) pyaml.pprint(my_data) pyaml.pprint('----- HOW DOES THAT BREAKS!?!?', input_data, some_var, more_stuff) pyaml.print(data, file=sys.stderr) # needs "from __future__ import print_function" * Force all string values to a certain style (see info on these in `PyYAML docs`_):: pyaml.dump(many_weird_strings, string_val_style='|') pyaml.dump(multiline_words, string_val_style='>') pyaml.dump(no_want_quotes, string_val_style='plain') Using ``pyaml.add_representer()`` (note \*p\*yaml) as suggested in `this SO thread`_ (or `github-issue-7`_) should also work. See also this `amazing reply to StackOverflow#3790454`_ for everything about the many different string styles in YAML. * Control indent and width of the results:: pyaml.dump(wide_and_deep, indent=4, width=120) These are actually keywords for PyYAML Emitter (passed to it from Dumper), see more info on these in `PyYAML docs`_. * Dump multiple yaml documents into a file: ``pyaml.dump_all([data1, data2, data3], dst_file)`` explicit_start=True is implied, unless overidden by explicit_start=False. 
* Control thresholds for vertical spacing of values (0 = always space stuff out), and clump all oneliner ones at the top:: >>> pyaml.dump( data, sort_dicts=pyaml.PYAMLSort.oneline_group, vspacing=dict(split_lines=0, split_count=0) ) chart: axisCenteredZero: no axisColorMode: text axisLabel: '' axisPlacement: auto barAlignment: 0 drawStyle: line ... hideFrom: legend: no tooltip: no viz: no scaleDistribution: type: linear stacking: group: A mode: none Or same thing with cli tool ``-v/--vspacing`` option: ``pyaml -v 0/0g mydata.yaml`` .. _PyYAML docs: http://pyyaml.org/wiki/PyYAMLDocumentation#Scalars .. _this SO thread: http://stackoverflow.com/a/7445560 .. _github-issue-7: https://github.com/mk-fg/pretty-yaml/issues/7 .. _amazing reply to StackOverflow#3790454: https://stackoverflow.com/questions/3790454/how-do-i-break-a-string-in-yaml-over-multiple-lines/21699210#21699210 Installation ------------ It's a regular Python 3.8+ module/package, published on PyPI (as pyaml_). Module uses PyYAML_ for processing of the actual YAML files and should pull it in as a dependency. Dependency on unidecode_ module is optional and should only be useful with force_embed=False keyword (defaults to True), and same-id objects or recursion is used within serialized data - i.e. only when generating &some_key_id anchors is needed. If module is unavailable at runtime, anchor ids will be less like their keys and maybe not as nice. Using pip_ is how you generally install it, usually coupled with venv_ usage (which will also provide "pip" tool itself):: % pip install pyaml Current-git version can be installed like this:: % pip install git+https://github.com/mk-fg/pretty-yaml pip will default to installing into currently-active venv, then user's home directory (under ``~/.local/lib/python...``), and maybe system-wide when running as root (only useful in specialized environments like docker containers). There are many other python packaging tools - pipenv_, poetry_, pdm_, etc - use whatever is most suitable for specific project/environment. pipx_ can be used to install command-line script without a module. More general info on python packaging can be found at `packaging.python.org`_. When changing code, unit tests can be run with ``python -m unittest`` from the local repository checkout. .. _pyaml: https://pypi.org/project/pyaml/ .. _unidecode: https://pypi.python.org/pypi/Unidecode .. _pip: https://pip.pypa.io/en/stable/ .. _venv: https://docs.python.org/3/library/venv.html .. _poetry: https://python-poetry.org/ .. _pipenv: https://pipenv.pypa.io/ .. _pdm: https://pdm.fming.dev/ .. _pipx: https://pypa.github.io/pipx/ .. 
_packaging.python.org: https://packaging.python.org/installing/ ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1703526249.2917736 pyaml-23.12.0/pyaml/0000755000175000017500000000000014542337551014000 5ustar00fraggodfraggod././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1703526021.0 pyaml-23.12.0/pyaml/__init__.py0000644000175000017500000002567414542337205016123 0ustar00fraggodfraggodimport os, sys, io, re, string, warnings, enum, pathlib, collections as cs import yaml PYAMLSort = enum.Enum('PYAMLSort', 'none keys oneline_group') class PYAMLDumper(yaml.dumper.SafeDumper): class str_ext(str): __slots__ = 'ext', pyaml_anchor_decode = None # imported from unidecode module when needed pyaml_sort_dicts = None def __init__( self, *args, sort_dicts=None, force_embed=True, string_val_style=None, anchor_len_max=40, **kws ): self.pyaml_force_embed = force_embed self.pyaml_string_val_style = string_val_style self.pyaml_anchor_len_max = anchor_len_max if isinstance(sort_dicts, PYAMLSort): if sort_dicts is sort_dicts.none: kws['sort_keys'] = False elif sort_dicts is sort_dicts.keys: kws['sort_keys'] = True else: self.pyaml_sort_dicts, kws['sort_keys'] = sort_dicts, False elif sort_dicts is not None: kws['sort_keys'] = sort_dicts # for compatibility return super().__init__(*args, **kws) @staticmethod def pyaml_transliterate(s): if unidecode_missing := not all(ord(c) < 128 for c in s): if (unidecode := PYAMLDumper.pyaml_anchor_decode) is None: try: from unidecode import unidecode except ImportError: unidecode = False PYAMLDumper.pyaml_anchor_decode = unidecode if unidecode: unidecode_missing, s = None, unidecode(s) return re.sub(r'[^-_a-z0-9]+', '_', s.lower()), unidecode_missing def anchor_node(self, node, hints=list()): if node in self.anchors: if self.anchors[node] is None and not self.pyaml_force_embed: if hints: nid, uc = self.pyaml_transliterate('_-_'.join(h.value for h in hints)) if len(nid) > (n := self.pyaml_anchor_len_max - 9) + 9: nid = f'{nid[:n//2]}-_-{nid[-n//2:]}_{self.generate_anchor(node)}' elif uc is True: nid = f'{nid}_{self.generate_anchor(node)}' else: nid = self.generate_anchor(node) self.anchors[node] = nid else: self.anchors[node] = None if isinstance(node, yaml.nodes.SequenceNode): for item in node.value: self.anchor_node(item) elif isinstance(node, yaml.nodes.MappingNode): for key, value in node.value: self.anchor_node(key) self.anchor_node(value, hints=hints+[key]) def serialize_node(self, node, parent, index): if self.pyaml_force_embed: self.anchors[node] = self.serialized_nodes.clear() return super().serialize_node(node, parent, index) def expect_block_sequence(self): self.increase_indent(flow=False, indentless=False) self.state = self.expect_first_block_sequence_item def expect_block_sequence_item(self, first=False): if not first and isinstance(self.event, yaml.events.SequenceEndEvent): self.indent = self.indents.pop() self.state = self.states.pop() else: self.write_indent() self.write_indicator('-', True, indention=True) self.states.append(self.expect_block_sequence_item) self.expect_node(sequence=True) def check_simple_key(self): res = super().check_simple_key() if self.analysis: self.analysis.allow_flow_plain = False return res def choose_scalar_style(self, _re1=re.compile(r':(\s|$)')): if self.states[-1] == self.expect_block_mapping_simple_value: # Mapping keys - disable overriding string style, strip comments if self.pyaml_string_val_style: self.event.style = 'plain' if isinstance(self.analysis.scalar, 
self.str_ext): self.analysis.scalar = str(self.event.value) # Do default thing for complicated stuff if self.event.style != 'plain': return super().choose_scalar_style() # Make sure style isn't overidden for strings like list/mapping items if (s := self.event.value).startswith('- ') or _re1.search(s): return "'" # Returned style=None picks write_plain in Emitter.process_scalar def write_indicator(self, indicator, *args, **kws): if indicator == '...': return # presumably it's useful somewhere, but don't care super().write_indicator(indicator, *args, **kws) def represent_str(self, data): if not (style := self.pyaml_string_val_style): if '\n' in data[:-1]: style = 'literal' for line in data.splitlines(): if len(line) > self.best_width: break else: style = '|' return yaml.representer.ScalarNode('tag:yaml.org,2002:str', data, style=style) def represent_mapping_sort_oneline(self, kv): key, value = kv if not value or isinstance(value, (int, float)): v = 1 elif isinstance(value, str) and '\n' not in value: v = 1 else: v = 2 if isinstance(key, (int, float)): k = 1 elif isinstance(key, str): k = 2 elif key is None: k = 4 else: k, key = 3, f'{type(key)}\0{key}' # best-effort sort for all other types return v, k, key def represent_mapping(self, tag, mapping, *args, **kws): if self.pyaml_sort_dicts is PYAMLSort.oneline_group: try: mapping = dict(sorted( mapping.items(), key=self.represent_mapping_sort_oneline )) except TypeError: pass # for subtype comparison fails return super().represent_mapping(tag, mapping, *args, **kws) def represent_undefined(self, data): if isinstance(data, tuple) and hasattr(data, '_make') and hasattr(data, '_asdict'): return self.represent_dict(data._asdict()) # assuming namedtuple if isinstance(data, cs.abc.Mapping): return self.represent_dict(data) # dict-like if type(data).__class__.__module__ == 'enum': node = self.represent_data(data.value) node.value = self.str_ext(node.value) node.value.ext = f'# {data.__class__.__name__}.{data.name}' return node if hasattr(type(data), '__dataclass_fields__'): try: import dataclasses as dcs except ImportError: pass # can still be something else else: return self.represent_dict(dcs.asdict(data)) try: # this is for numpy arrays, and the likes if not callable(getattr(data, 'tolist', None)): raise AttributeError except: pass # can raise other errors with custom types else: return self.represent_data(data.tolist()) return super().represent_undefined(data) # will raise RepresenterError def write_ext(self, func, text, *args, **kws): # Emitter write-funcs extension to append comments to values getattr(super(), f'write_{func}')(text, *args, **kws) if ext := getattr(text, 'ext', None): super().write_plain(ext) write_folded = lambda s,v,*a,**kw: s.write_ext('folded', v, *a, **kw) write_literal = lambda s,v,*a,**kw: s.write_ext('literal', v, *a, **kw) write_plain = lambda s,v,*a,**kw: s.write_ext('plain', v, *a, **kw) # Unsafe was a separate class in <23.x versions, left here for compatibility UnsafePYAMLDumper = PYAMLDumper add_representer = PYAMLDumper.add_representer add_representer( bool, lambda s,o: s.represent_scalar('tag:yaml.org,2002:bool', ['no', 'yes'][o]) ) add_representer( type(None), lambda s,o: s.represent_scalar('tag:yaml.org,2002:null', '') ) add_representer(str, PYAMLDumper.represent_str) add_representer(cs.defaultdict, PYAMLDumper.represent_dict) add_representer(cs.OrderedDict, PYAMLDumper.represent_dict) add_representer(set, PYAMLDumper.represent_list) add_representer(type(pathlib.Path('')), lambda cls,o: 
cls.represent_data(str(o))) add_representer(None, PYAMLDumper.represent_undefined) def dump_add_vspacing(yaml_str, split_lines=40, split_count=2, oneline_group=False): '''Add some newlines to separate overly long YAML lists/mappings. "long" means both >split_lines in length and has >split_count items.''' def _add_vspacing(lines): a = a_seq = ind_re = ind_re_sub = None blocks, item_lines = list(), list() for n, line in enumerate(lines): if ind_re is None and (m := re.match(r'( *)([^# ].?)', line)): ind_re = re.compile(m[1] + r'\S') lines.append(f'{m[1]}.') # for last add_vspacing if ind_re_sub: if ind_re_sub.match(line): continue if n - a > split_lines and (block := lines[a:n]): if a_seq: block.insert(0, lines[a-1].replace('- ', ' ', 1)) blocks.append((a, n, _add_vspacing(block)[a_seq:])) ind_re_sub = None if ind_re.match(line): item_lines.append(n) if m := re.match(r'( *)(- )?\S.*:\s*$', line): a, a_seq, ind_re_sub = n+1, bool(m[2]), re.compile(m[1] + ' ') if split_items := len(lines) > split_lines and len(item_lines) > split_count: for n in item_lines: try: if oneline_group and ind_re and ( ind_re.match(lines[n-1]) and ind_re.match(lines[n+1]) ): continue except IndexError: continue lines[n] = f'\n{lines[n]}' for a, b, block in reversed(blocks): lines[a:b] = block if ind_re: lines.pop() if split_items: lines.append('') return lines yaml_str = '\n'.join(_add_vspacing(yaml_str.splitlines())) return re.sub(r'\n\n+', '\n\n', yaml_str.strip() + '\n') def dump( data, dst=None, safe=None, force_embed=True, vspacing=True, string_val_style=None, sort_dicts=None, multiple_docs=False, width=100, **pyyaml_kws ): '''Serialize data as pretty-YAML to specified dst file-like object, or return as str with dst=str (default) or encoded to bytes with dst=bytes.''' if safe is not None: cat = DeprecationWarning if not safe else UserWarning warnings.warn( 'pyaml module "safe" arg/keyword is ignored as implicit' ' safe=maybe-true?, as of pyaml >= 23.x', category=cat, stacklevel=2 ) if sort_dicts is not None and not isinstance(sort_dicts, PYAMLSort): warnings.warn( 'Using pyaml module sort_dicts as boolean is deprecated as of' ' pyaml >= 23.x - translated to sort_keys PyYAML keyword, use that instead', DeprecationWarning, stacklevel=2 ) if stream := pyyaml_kws.pop('stream', None): if dst is not None and stream is not dst: raise TypeError( 'Using different pyaml dst=' ' and pyyaml stream= options at the same time is not supported' ) dst = stream elif dst is None: dst = str # old default buff = io.StringIO() Dumper = lambda *a,**kw: PYAMLDumper( *a, **kw, force_embed=force_embed, string_val_style=string_val_style, sort_dicts=sort_dicts ) if not multiple_docs: data = [data] else: pyyaml_kws.setdefault('explicit_start', True) yaml.dump_all( data, buff, Dumper=Dumper, width=width, default_flow_style=False, allow_unicode=True, **pyyaml_kws ) buff = buff.getvalue() if vspacing not in [None, False]: if vspacing is True: vspacing = dict() elif not isinstance(vspacing, dict): warnings.warn( 'Unsupported pyaml "vspacing" parameter type:' f' [{vspacing.__class__.__name__}] {vspacing}\n' 'As of pyaml >= 23.x it should be either True or keywords-dict' ' for pyaml_add_vspacing, and any other values are ignored,' ' enabling default vspacing behavior.', DeprecationWarning, stacklevel=2 ) vspacing = dict() if sort_dicts is PYAMLSort.oneline_group: vspacing.setdefault('oneline_group', True) buff = dump_add_vspacing(buff, **vspacing) if dst is bytes: return buff.encode() elif dst is str: return buff else: try: dst.write(b'') # tests 
if dst is str- or bytestream except: dst.write(buff) else: dst.write(buff.encode()) # Simplier pyaml.dump() aliases def dump_all(data, *dump_args, **dump_kws): return dump(data, *dump_args, multiple_docs=True, **dump_kws) def dumps(data, **dump_kws): return dump(data, **dump_kws) def pprint(*data, **dump_kws): dst = dump_kws.pop('file', dump_kws.pop('dst', sys.stdout)) if len(data) == 1: data, = data dump(data, dst=dst, **dump_kws) p, _p = pprint, print print = pprint ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1694280036.0 pyaml-23.12.0/pyaml/__main__.py0000644000175000017500000000011514477124544016072 0ustar00fraggodfraggodimport sys, pyaml.cli if __name__ == '__main__': sys.exit(pyaml.cli.main()) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1694579200.0 pyaml-23.12.0/pyaml/cli.py0000644000175000017500000001015214500235000015074 0ustar00fraggodfraggodimport os, sys, re, stat, json, tempfile, contextlib import yaml, pyaml @contextlib.contextmanager def safe_replacement(path, *open_args, mode=None, xattrs=None, **open_kws): 'Context to atomically create/replace file-path in-place unless errors are raised' path, xattrs = str(path), None if mode is None: try: mode = stat.S_IMODE(os.lstat(path).st_mode) except FileNotFoundError: pass if xattrs is None and getattr(os, 'getxattr', None): # MacOS try: xattrs = dict((k, os.getxattr(path, k)) for k in os.listxattr(path)) except FileNotFoundError: pass open_kws.update( delete=False, dir=os.path.dirname(path), prefix=os.path.basename(path)+'.' ) if not open_args: open_kws.setdefault('mode', 'w') with tempfile.NamedTemporaryFile(*open_args, **open_kws) as tmp: try: if mode is not None: os.fchmod(tmp.fileno(), mode) if xattrs: for k, v in xattrs.items(): os.setxattr(path, k, v) yield tmp if not tmp.closed: tmp.flush() try: os.fdatasync(tmp) except AttributeError: pass # MacOS os.rename(tmp.name, path) finally: try: os.unlink(tmp.name) except FileNotFoundError: pass def main(argv=None, stdin=sys.stdin, stdout=sys.stdout, stderr=sys.stderr): import argparse, textwrap dd = lambda text: re.sub( r' \t+', ' ', textwrap.dedent(text).strip('\n') + '\n' ).replace('\t', ' ') parser = argparse.ArgumentParser( formatter_class=argparse.RawTextHelpFormatter, description='Process and dump prettified YAML to stdout.') parser.add_argument('path', nargs='?', metavar='path', help='Path to YAML to read (default: use stdin).') parser.add_argument('-r', '--replace', action='store_true', help='Replace specified path with prettified version in-place.') parser.add_argument('-w', '--width', type=int, metavar='chars', help=dd(''' Max line width hint to pass to pyyaml for the dump. Only used to format scalars and collections (e.g. lists).''')) parser.add_argument('-v', '--vspacing', metavar='N[/M][g]', help=dd(''' Custom thresholds for when to add vertical spacing (empty lines), to visually separate items in overly long YAML lists/mappings. "long" means both >split-lines in line-length and has >split-count items. Value has N[/M][g] format, with default being something like 40/2. N = min number of same-indent lines in a section to split. M = min count of values in a list/mapping to split. "g" can be added to clump single-line values at the top of such lists/maps. 
Values examples: 20g, 5/1g, 60/4, g, 10.''')) parser.add_argument('-q', '--quiet', action='store_true', help='Disable sanity-check on the output and suppress stderr warnings.') opts = parser.parse_args(sys.argv[1:] if argv is None else argv) if opts.replace and not opts.path: parser.error('-r/--replace option can only be used with a file path, not stdin') src = open(opts.path) if opts.path else stdin try: data = yaml.safe_load(src) finally: src.close() pyaml_kwargs = dict() if opts.width: pyaml_kwargs['width'] = opts.width if vspacing := opts.vspacing: if vspacing.endswith('g'): pyaml_kwargs['sort_dicts'] = pyaml.PYAMLSort.oneline_group vspacing = vspacing.strip('g') if vspacing: vspacing, (lines, _, count) = dict(), vspacing.strip().strip('/').partition('/') if lines: vspacing['split_lines'] = int(lines.strip()) if count: vspacing['split_count'] = int(count.strip()) pyaml_kwargs['vspacing'] = vspacing ys = pyaml.dump(data, **pyaml_kwargs) if not opts.quiet: try: data_chk = yaml.safe_load(ys) try: data_hash = json.dumps(data, sort_keys=True) except: pass # too complex for checking with json else: if json.dumps(data_chk, sort_keys=True) != data_hash: raise AssertionError('Data from before/after pyaml does not match') except Exception as err: p_err = lambda *a,**kw: print(*a, **kw, file=stderr, flush=True) p_err( 'WARNING: Failed to parse produced YAML' ' output back to data, it is likely too complicated for pyaml' ) err = f'[{err.__class__.__name__}] {err}' p_err(' raised error: ' + ' // '.join(map(str.strip, err.split('\n')))) if opts.replace: with safe_replacement(opts.path) as tmp: tmp.write(ys) else: stdout.write(ys) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1703526249.2927737 pyaml-23.12.0/pyaml/tests/0000755000175000017500000000000014542337551015142 5ustar00fraggodfraggod././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1554927782.0 pyaml-23.12.0/pyaml/tests/__init__.py0000644000175000017500000000000013453450246017236 0ustar00fraggodfraggod././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1694580351.0 pyaml-23.12.0/pyaml/tests/test_cli.py0000644000175000017500000001217314500237177017323 0ustar00fraggodfraggodimport collections as cs, functools as ft import os, sys, io, enum, unittest, json, tempfile import yaml try: import pyaml.cli except ImportError: sys.path.insert(1, os.path.join(__file__, *['..']*3)) import pyaml.cli class test_const(enum.IntEnum): dispatch = 2455 heartbeat = 123 data = dict( key='value', path='/some/path', query_dump=dict( key1='тест1', key2='тест2', key3='тест3', последний=None ), ids=dict(), a=[1,None,'asd', 'не-ascii'], b=3.5, c=None ) data['query_dump_clone'] = data['query_dump'] data['ids']['id в уникоде'] = [4, 5, 6] data['ids']['id2 в уникоде'] = data['ids']['id в уникоде'] data["'asd'\n!\0\1"] = dict(b=1, a=2) class CliToolTests(unittest.TestCase): def data_hash(self, data): return json.dumps(data, sort_keys=True) def pyaml_dump_corrupted(self, dump, *args, append=None, **kws): out = dump(*args, **kws) if append: out += append return out def test_success(self): d, out, err = data.copy(), io.StringIO(), io.StringIO() ys = yaml.safe_dump(d) pyaml.cli.main( argv=list(), stdin=io.StringIO(ys), stdout=out, stderr=err ) yaml.safe_load(out.getvalue()) self.assertGreater(len(out.getvalue()), 150) self.assertEqual(err.getvalue(), '') d.update( d=test_const.heartbeat, asd=cs.OrderedDict(b=1, a=2) ) ys = pyaml.dump(d) pyaml.cli.main( argv=list(), 
stdin=io.StringIO(ys), stdout=out, stderr=err ) yaml.safe_load(out.getvalue()) self.assertGreater(len(out.getvalue()), 150) self.assertEqual(err.getvalue(), '') def test_load_fail(self): d, out, err = data.copy(), io.StringIO(), io.StringIO() ys = yaml.safe_dump(d) + '\0asd : fgh : ghj\0' with self.assertRaises(yaml.YAMLError): pyaml.cli.main( argv=list(), stdin=io.StringIO(ys), stdout=out, stderr=err ) def test_out_broken(self): d, out, err = data.copy(), io.StringIO(), io.StringIO() pyaml_dump, pyaml.dump = pyaml.dump, ft.partial( self.pyaml_dump_corrupted, pyaml.dump, append='\0asd : fgh : ghj\0' ) try: ys = yaml.safe_dump(d) pyaml.cli.main( argv=list(), stdin=io.StringIO(ys), stdout=out, stderr=err ) with self.assertRaises(yaml.YAMLError): yaml.safe_load(out.getvalue()) self.assertGreater(len(out.getvalue()), 150) self.assertRegex(err.getvalue(), r'^WARNING:') finally: pyaml.dump = pyaml_dump def test_out_mismatch(self): d, out, err = data.copy(), io.StringIO(), io.StringIO() pyaml_dump, pyaml.dump = pyaml.dump, ft.partial( self.pyaml_dump_corrupted, pyaml.dump, append='\nextra-key: value' ) try: ys = yaml.safe_dump(d) pyaml.cli.main( argv=list(), stdin=io.StringIO(ys), stdout=out, stderr=err ) yaml.safe_load(out.getvalue()) self.assertGreater(len(out.getvalue()), 150) self.assertRegex(err.getvalue(), r'^WARNING:') finally: pyaml.dump = pyaml_dump def test_out_err_nocheck(self): d, out, err = data.copy(), io.StringIO(), io.StringIO() pyaml_dump, pyaml.dump = pyaml.dump, ft.partial( self.pyaml_dump_corrupted, pyaml.dump, append='\0asd : fgh : ghj\0' ) try: ys = yaml.safe_dump(d) pyaml.cli.main( argv=['-q'], stdin=io.StringIO(ys), stdout=out, stderr=err ) with self.assertRaises(yaml.YAMLError): yaml.safe_load(out.getvalue()) self.assertGreater(len(out.getvalue()), 150) self.assertEqual(err.getvalue(), '') finally: pyaml.dump = pyaml_dump def test_replace(self): d, out, err = data.copy(), io.StringIO(), io.StringIO() sys_out, sys_err = sys.stdout, sys.stderr with self.assertRaises(SystemExit): sys.stdout, sys.stderr = out, err try: pyaml.cli.main( argv=['-r'], stdin=io.StringIO(), stdout=out, stderr=err ) finally: sys.stdout, sys.stderr = sys_out, sys_err self.assertEqual(out.getvalue(), '') self.assertGreater(len(err.getvalue()), 50) err.seek(0); err.truncate() with tempfile.NamedTemporaryFile(prefix='.pyaml.test.') as tmp: d_json, d_yaml = json.dumps(d).encode(), pyaml.dump(d, bytes) tmp.write(d_json); tmp.flush() os.fchmod(tmp.fileno(), 0o1510) stat_tmp = os.fstat(tmp.fileno()) pyaml.cli.main( argv=['-r', tmp.name], stdin=io.StringIO(), stdout=out, stderr=err ) with open(tmp.name, 'rb') as tmp_new: d_new, stat_new = tmp_new.read(), os.fstat(tmp_new.fileno()) self.assertEqual(out.getvalue(), '') self.assertEqual(err.getvalue(), '') tmp.seek(0); d_tmp = tmp.read() self.assertEqual(d_tmp, d_json) self.assertNotEqual(d_tmp, d_new) self.assertNotIn(d_json, d_new) self.assertEqual(yaml.safe_load(d_new), d) self.assertEqual( (stat_tmp.st_mode, stat_tmp.st_uid, stat_tmp.st_gid), (stat_new.st_mode, stat_new.st_uid, stat_new.st_gid) ) os.chmod(tmp.name, 0o600) with open(tmp.name, 'r+') as tmp_new: tmp_new.write('\0asd : fgh : ghj\0') tmp_new.seek(0); d_new = tmp_new.read() with self.assertRaises(yaml.YAMLError): pyaml.cli.main( argv=['-r', tmp.name], stdin=io.StringIO(), stdout=out, stderr=err ) self.assertEqual(out.getvalue(), '') self.assertEqual(err.getvalue(), '') with open(tmp.name, 'r') as tmp_new: tmp_new.seek(0); d_new2 = tmp_new.read() self.assertEqual(d_new, d_new2) if __name__ == 
'__main__': unittest.main() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1703525748.0 pyaml-23.12.0/pyaml/tests/test_dump.py0000644000175000017500000005175714542336564017542 0ustar00fraggodfraggodimport os, sys, io, re, unittest, json, enum, textwrap, collections as cs import yaml try: import pyaml except ImportError: sys.path.insert(1, os.path.join(__file__, *['..']*3)) import pyaml try: import unidecode except ImportError: unidecode = None large_yaml = br''' ### Default (baseline) configuration parameters. ### DO NOT ever change this config, use -c commandline option instead! # Note that this file is YAML, so YAML types can be used here, see http://yaml.org/type/ # For instance, large number can be specified as "10_000_000" or "!!float 10e6". source: # Path or glob pattern (to match path) to backup, required path: # example: /srv/backups/weekly.* queue: # Path to intermediate backup queue-file (list of paths to upload), required path: # example: /srv/backups/queue.txt # Don't rebuild queue-file if it's newer than source.path check_mtime: true entry_cache: # Path to persistent db (sqlite) of remote directory nodes, required path: # example: /srv/backups/dentries.sqlite # How to pick a path among those matched by "path" glob pick_policy: alphasort_last # only one supported destination: # URL of Tahoe-LAFS node webapi url: http://localhost:3456/uri result: # what to do with a cap (URI) of a resulting tree (with full backup) print_to_stdout: true # Append the entry to the specified file (creating it, if doesn't exists) # Example entry: "2012-10-10T23:12:43.904543 /srv/backups/weekly.2012-10-10 URI:DIR2-CHK:..." append_to_file: # example: /srv/backups/lafs_caps # Append the entry to specified tahoe-lafs directory (i.e. put it into that dir) append_to_lafs_dir: # example: URI:DIR2:... encoding: xz: enabled: true options: # see lzma.LZMAOptions, empty = module defaults min_size: 5120 # don't compress files smaller than 5 KiB (unless overidden in "path_filter") path_filter: # List of include/exclude regexp path-rules, similar to "filter" section below. # Same as with "filter", rules can be tuples with '+' or '-' (implied for strings) as first element. # '+' will indicate that file is compressible, if it's size >= "min_size" option. # Unlike "filter", first element of rule-tuple can also be a number, # overriding "min_size" parameter for matched (by that rule) paths. # If none of the patterns match path, file is handled as if it was matched by '+' rule. - '\.(gz|bz2|t[gb]z2?|xz|lzma|7z|zip|rar)$' - '\.(rpm|deb|iso)$' - '\.(jpe?g|gif|png|mov|avi|ogg|mkv|webm|mp[34g]|flv|flac|ape|pdf|djvu)$' - '\.(sqlite3?|fossil|fsl)$' - '\.git/objects/[0-9a-f]+/[0-9a-f]+$' # - [500, '\.(txt|csv|log|md|rst|cat|(ba|z|k|c|fi)?sh|env)$'] # - [500, '\.(cgi|py|p[lm]|php|c|h|[ce]l|lisp|hs|patch|diff|xml|xsl|css|x?html[45]?|js)$'] # - [500, '\.(co?nf|cfg?|li?st|ini|ya?ml|jso?n|vg|tab)(\.(sample|default|\w+-new))?$'] # - [500, '\.(unit|service|taget|mount|desktop|rules|rc|menu)$'] # - [2000, '^/etc/'] http: request_pool_options: maxPersistentPerHost: 10 cachedConnectionTimeout: 600 retryAutomatically: true ca_certs_files: /etc/ssl/certs/ca-certificates.crt # can be a list debug_requests: false # insecure! logs will contain tahoe caps filter: # Either tuples like "[action ('+' or '-'), regexp]" or just exclude-patterns (python # regexps) to match relative (to source.path, starting with "/") paths to backup. # Patterns are matched against each path in order they're listed here. 
# Leaf directories are matched with the trailing slash # (as with rsync) to be distinguishable from files with the same name. # If path doesn't match any regexp on the list, it will be included. # # Examples: # - ['+', '/\.git/config$'] # backup git repository config files # - '/\.git/' # *don't* backup any repository objects # - ['-', '/\.git/'] # exactly same thing as above (redundant) # - '/(?i)\.?svn(/.*|ignore)$' # exclude (case-insensitive) svn (or .svn) paths and ignore-lists - '/(CVS|RCS|SCCS|_darcs|\{arch\})/$' - '/\.(git|hg|bzr|svn|cvs)(/|ignore|attributes|tags)?$' - '/=(RELEASE-ID|meta-update|update)$' operation: queue_only: false # only generate upload queue file, don't upload anything reuse_queue: false # don't generate upload queue file, use existing one as-is disable_deduplication: false # make no effort to de-duplicate data (should still work on tahoe-level for files) # Rate limiting might be useful to avoid excessive cpu/net usage on nodes, # and especially when uploading to rate-limited api's (like free cloud storages). # Only used when uploading objects to the grid, not when building queue file. # Format of each value is "interval[:burst]", where "interval" can be specified as rate (e.g. "1/3e5"). # Simple token bucket algorithm is used. Empty values mean "no limit". # Examples: # "objects: 1/10:50" - 10 objects per second, up to 50 at once (if rate was lower before). # "objects: 0.1:50" - same as above. # "objects: 10:20" - 1 object in 10 seconds, up to 20 at once. # "objects: 5" - make interval between object uploads equal 5 seconds. # "bytes: 1/3e6:50e6" - 3 MB/s max, up to 50 MB/s if connection was underutilized before. rate_limit: bytes: # limit on rate of *file* bytes upload, example: 1/3e5:20e6 objects: # limit on rate of uploaded objects, example: 10:50 logging: # see http://docs.python.org/library/logging.config.html # "custom" level means WARNING/DEBUG/NOISE, depending on CLI options warnings: true # capture python warnings sql_queries: false # log executed sqlite queries (very noisy, caps will be there) version: 1 formatters: basic: format: '%(asctime)s :: %(name)s :: %(levelname)s: %(message)s' datefmt: '%Y-%m-%d %H:%M:%S' handlers: console: class: logging.StreamHandler stream: ext://sys.stderr formatter: basic level: custom debug_logfile: class: logging.handlers.RotatingFileHandler filename: /srv/backups/debug.log formatter: basic encoding: utf-8 maxBytes: 5242880 # 5 MiB backupCount: 2 level: NOISE loggers: twisted: handlers: [console] level: 0 root: level: custom handlers: [console] ''' class test_const(enum.IntEnum): dispatch = 2455 heartbeat = 123 data = dict( path='/some/path', query_dump=cs.OrderedDict([ ('key1', 'тест1'), ('key2', 'тест2'), ('key3', 'тест3'), ('последний', None) ]), ids=cs.OrderedDict(), a=[1,None,'asd', 'не-ascii'], b=3.5, c=None, d=test_const.dispatch, asd=cs.OrderedDict([('b', 1), ('a', 2)]) ) data['query_dump_clone'] = data['query_dump'] data['ids']['id в уникоде'] = [4, 5, 6] data['ids']['id2 в уникоде'] = data['ids']['id в уникоде'] data["'asd'\n!\0\1"] = cs.OrderedDict([('b', 1), ('a', 2)]) data_str_multiline = dict(cert=( '-----BEGIN CERTIFICATE-----\n' 'MIIDUjCCAjoCCQD0/aLLkLY/QDANBgkqhkiG9w0BAQUFADBqMRAwDgYDVQQKFAdm\n' 'Z19jb3JlMRYwFAYDVQQHEw1ZZWthdGVyaW5idXJnMR0wGwYDVQQIExRTdmVyZGxv\n' 'dnNrYXlhIG9ibGFzdDELMAkGA1UEBhMCUlUxEjAQBgNVBAMTCWxvY2FsaG9zdDAg\n' 'Fw0xMzA0MjQwODUxMTRaGA8yMDUzMDQxNDA4NTExNFowajEQMA4GA1UEChQHZmdf\n' 'Y29yZTEWMBQGA1UEBxMNWWVrYXRlcmluYnVyZzEdMBsGA1UECBMUU3ZlcmRsb3Zz\n' 
'a2F5YSBvYmxhc3QxCzAJBgNVBAYTAlJVMRIwEAYDVQQDEwlsb2NhbGhvc3QwggEi\n' 'MA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQCnZr3jbhfb5bUhORhmXOXOml8N\n' 'fAli/ak6Yv+LRBtmOjke2gFybPZFuXYr0lYGQ4KgarN904vEg7WUbSlwwJuszJxQ\n' 'Lz3xSDqQDqF74m1XeBYywZQIywKIbA/rfop3qiMeDWo3WavYp2kaxW28Xd/ZcsTd\n' 'bN/eRo+Ft1bor1VPiQbkQKaOOi6K8M9a/2TK1ei2MceNbw6YrlCZe09l61RajCiz\n' 'y5eZc96/1j436wynmqJn46hzc1gC3APjrkuYrvUNKORp8y//ye+6TX1mVbYW+M5n\n' 'CZsIjjm9URUXf4wsacNlCHln1nwBxUe6D4e2Hxh2Oc0cocrAipxuNAa8Afn5AgMB\n' 'AAEwDQYJKoZIhvcNAQEFBQADggEBADUHf1UXsiKCOYam9u3c0GRjg4V0TKkIeZWc\n' 'uN59JWnpa/6RBJbykiZh8AMwdTonu02g95+13g44kjlUnK3WG5vGeUTrGv+6cnAf\n' '4B4XwnWTHADQxbdRLja/YXqTkZrXkd7W3Ipxdi0bDCOSi/BXSmiblyWdbNU4cHF/\n' 'Ex4dTWeGFiTWY2upX8sa+1PuZjk/Ry+RPMLzuamvzP20mVXmKtEIfQTzz4b8+Pom\n' 'T1gqPkNEbe2j1DciRNUOH1iuY+cL/b7JqZvvdQK34w3t9Cz7GtMWKo+g+ZRdh3+q\n' '2sn5m3EkrUb1hSKQbMWTbnaG4C/F3i4KVkH+8AZmR9OvOmZ+7Lo=\n' '-----END CERTIFICATE-----' )) data_str_long = dict(cert=( 'MIIDUjCCAjoCCQD0/aLLkLY/QDANBgkqhkiG9w0BAQUFADBqMRAwDgYDVQQKFAdm' 'Z19jb3JlMRYwFAYDVQQHEw1ZZWthdGVyaW5idXJnMR0wGwYDVQQIExRTdmVyZGxv' 'dnNrYXlhIG9ibGFzdDELMAkGA1UEBhMCUlUxEjAQBgNVBAMTCWxvY2FsaG9zdDAg' 'Fw0xMzA0MjQwODUxMTRaGA8yMDUzMDQxNDA4NTExNFowajEQMA4GA1UEChQHZmdf' 'Y29yZTEWMBQGA1UEBxMNWWVrYXRlcmluYnVyZzEdMBsGA1UECBMUU3ZlcmRsb3Zz' 'a2F5YSBvYmxhc3QxCzAJBgNVBAYTAlJVMRIwEAYDVQQDEwlsb2NhbGhvc3QwggEi' 'MA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQCnZr3jbhfb5bUhORhmXOXOml8N' 'fAli/ak6Yv+LRBtmOjke2gFybPZFuXYr0lYGQ4KgarN904vEg7WUbSlwwJuszJxQ' 'Lz3xSDqQDqF74m1XeBYywZQIywKIbA/rfop3qiMeDWo3WavYp2kaxW28Xd/ZcsTd' 'bN/eRo+Ft1bor1VPiQbkQKaOOi6K8M9a/2TK1ei2MceNbw6YrlCZe09l61RajCiz' 'y5eZc96/1j436wynmqJn46hzc1gC3APjrkuYrvUNKORp8y//ye+6TX1mVbYW+M5n' 'CZsIjjm9URUXf4wsacNlCHln1nwBxUe6D4e2Hxh2Oc0cocrAipxuNAa8Afn5AgMB' 'AAEwDQYJKoZIhvcNAQEFBQADggEBADUHf1UXsiKCOYam9u3c0GRjg4V0TKkIeZWc' 'uN59JWnpa/6RBJbykiZh8AMwdTonu02g95+13g44kjlUnK3WG5vGeUTrGv+6cnAf' '4B4XwnWTHADQxbdRLja/YXqTkZrXkd7W3Ipxdi0bDCOSi/BXSmiblyWdbNU4cHF/' 'Ex4dTWeGFiTWY2upX8sa+1PuZjk/Ry+RPMLzuamvzP20mVXmKtEIfQTzz4b8+Pom' 'T1gqPkNEbe2j1DciRNUOH1iuY+cL/b7JqZvvdQK34w3t9Cz7GtMWKo+g+ZRdh3+q' '2sn5m3EkrUb1hSKQbMWTbnaG4C/F3i4KVkH+8AZmR9OvOmZ+7Lo=' )) class DumpTests(unittest.TestCase): def yaml_var(self, ys, raw=False): ys = textwrap.dedent(ys).replace('\t', ' ') return ys if raw else yaml.safe_load(ys) def flatten(self, data, path=tuple()): dst = list() if isinstance(data, (tuple, list)): for v in data: dst.extend(self.flatten(v, path + ('!!list',))) elif isinstance(data, dict): for k,v in data.items(): dst.extend(self.flatten(v, path + (k,))) else: dst.append((path, data)) return tuple(sorted(dst, key=lambda v: json.dumps(v, sort_keys=True))) def pos_list(self, ys, sep='\n'): pos, pos_list = 0, list() while True: pos = ys.find(sep, pos+1) if pos < 0: break pos_list.append(pos) return pos_list def empty_line_list(self, ys): return list(n for n, line in enumerate(ys.splitlines()) if not line) def test_dst(self): buff = io.BytesIO() self.assertIs(pyaml.dump(data, buff), None) self.assertIsInstance(pyaml.dump(data, str), str) self.assertIsInstance(pyaml.dump(data, bytes), bytes) def test_simple(self): a = self.flatten(data) b = pyaml.dump(data) self.assertEqual(a, self.flatten(yaml.safe_load(b))) def test_vspacing(self): data = yaml.safe_load(large_yaml) a = self.flatten(data) b = pyaml.dump(data, vspacing=dict(split_lines=10, split_count=2)) self.assertEqual(a, self.flatten(yaml.safe_load(b))) self.assertEqual( self.pos_list(b, '\n'), [12, 13, 25, 33, 52, 73, 88, 107, 157, 184, 264, 299, 344, 345, 355, 375, 399, 424, 425, 458, 459, 467, 
505, 561, 600, 601, 607, 660, 681, 705, 738, 767, 795, 796, 805, 806, 820, 831, 866, 936, 937, 949, 950, 963, 998, 1021, 1041, 1072, 1073, 1092, 1113, 1163, 1185, 1224, 1247, 1266, 1290, 1291, 1302, 1315, 1331, 1349, 1364, 1365, 1373, 1387, 1403, 1421, 1422, 1440, 1441, 1454, 1455, 1471, 1472, 1483, 1511, 1528, 1542, 1553, 1566, 1584, 1585, 1593, 1608, 1618, 1626, 1656, 1665, 1686, 1696] ) b = pyaml.dump(data, vspacing=False) self.assertNotIn('\n\n', b) def test_ids(self): b = pyaml.dump(data, force_embed=False) self.assertNotIn('&id00', b) self.assertIn('query_dump_clone: *query_dump_clone', b) self.assertIn('id в уникоде: &ids_-_id2_', b) if not unidecode: self.assertIn('id в уникоде: &ids_-_id2__id00', b) def test_ids_unidecode(self): if not unidecode: self.skipTest('No unidecode module to test ids from non-ascii keys') b = pyaml.dump(data, force_embed=False) self.assertNotIn('&id00', b) self.assertNotIn('_id00', b) self.assertIn('id в уникоде: &ids_-_id2_v_unikode', b) def test_force_embed(self): for check, fe in (self.assertNotIn, True), (self.assertIn, False): dump = pyaml.dump(data, force_embed=fe) for c in '*&': check(c, dump) def test_encoding(self): b = pyaml.dump(data, force_embed=True) b_lines = list(map(str.strip, b.splitlines())) chk = ['query_dump:', 'key1: тест1', 'key2: тест2', 'key3: тест3', 'последний:'] pos = b_lines.index('query_dump:') self.assertEqual(b_lines[pos:pos + len(chk)], chk) def test_str_long(self): b = pyaml.dump(data_str_long) self.assertNotIn('"', b) self.assertNotIn("'", b) self.assertEqual(len(b.splitlines()), 1) def test_str_multiline(self): b = pyaml.dump(data_str_multiline) b_lines = b.splitlines() self.assertGreater(len(b_lines), len(data_str_multiline['cert'].splitlines())) for line in b_lines: self.assertLess(len(line), 100) def test_dumps(self): b = pyaml.dumps(data_str_multiline) self.assertIsInstance(b, str) def test_print(self): self.assertIs(pyaml.print, pyaml.pprint) self.assertIs(pyaml.print, pyaml.p) buff = io.BytesIO() b = pyaml.dump(data_str_multiline, dst=bytes) pyaml.print(data_str_multiline, file=buff) self.assertEqual(b, buff.getvalue()) def test_print_args(self): buff = io.BytesIO() args = 1, 2, 3 b = pyaml.dump(args, dst=bytes) pyaml.print(*args, file=buff) self.assertEqual(b, buff.getvalue()) def test_str_styles(self): a = pyaml.dump(data_str_multiline) b = pyaml.dump(data_str_multiline, string_val_style='|') self.assertEqual(a, b) b = pyaml.dump(data_str_multiline, string_val_style='plain') self.assertNotEqual(a, b) c = pyaml.dump(data_str_multiline, string_val_style='literal') self.assertNotEqual(c, a) self.assertNotEqual(c, b) self.assertTrue(pyaml.dump('waka waka', string_val_style='|').startswith('|-\n')) a = pyaml.dump(data_int := dict(a=123)) self.assertEqual(a, 'a: 123\n') self.assertEqual(pyaml.dump(data_int, string_val_style='|'), a) self.assertEqual(pyaml.dump(data_int, string_val_style='literal'), a) a = pyaml.dump(data_str := dict(a='123')) b = pyaml.dump(data_str, string_val_style='|') self.assertEqual(a, "a: '123'\n") self.assertEqual(self.flatten(data_str), self.flatten(yaml.safe_load(a))) self.assertNotEqual(a, b) self.assertEqual(self.flatten(data_str), self.flatten(yaml.safe_load(b))) def test_colons_in_strings(self): val1 = {'foo': ['bar:', 'baz', 'bar:bazzo', 'a: b'], 'foo:': 'yak:'} val1_str = pyaml.dump(val1) val2 = yaml.safe_load(val1_str) val2_str = pyaml.dump(val2) val3 = yaml.safe_load(val2_str) self.assertEqual(val1, val2) self.assertEqual(val1_str, val2_str) self.assertEqual(val2, val3) def 
test_unqouted_spaces(self): val1 = {'key': 'word1 word2 word3', 'key key': 'asd', 'k3': 'word: stuff'} val1_str = pyaml.dump(val1) val2 = yaml.safe_load(val1_str) self.assertEqual(val1, val2) self.assertIn('key: word1 word2 word3', val1_str) def test_empty_strings(self): val1 = {'key': ['', 'stuff', '', 'more'], '': 'value', 'k3': ''} val1_str = pyaml.dump(val1) val2 = yaml.safe_load(val1_str) val2_str = pyaml.dump(val2) val3 = yaml.safe_load(val2_str) self.assertEqual(val1, val2) self.assertEqual(val1_str, val2_str) self.assertEqual(val2, val3) def test_single_dash_strings(self): strip_seq_dash = lambda line: line.lstrip().lstrip('-').lstrip() val1 = {'key': ['-', '-stuff', '- -', '- more-', 'more-', '--']} val1_str = pyaml.dump(val1) val2 = yaml.safe_load(val1_str) val2_str = pyaml.dump(val2) val3 = yaml.safe_load(val2_str) self.assertEqual(val1, val2) self.assertEqual(val1_str, val2_str) self.assertEqual(val2, val3) val1_str_lines = val1_str.splitlines() self.assertEqual(strip_seq_dash(val1_str_lines[2]), '-stuff') self.assertEqual(strip_seq_dash(val1_str_lines[5]), 'more-') self.assertEqual(strip_seq_dash(val1_str_lines[6]), '--') val1 = {'key': '-'} val1_str = pyaml.dump(val1) val2 = yaml.safe_load(val1_str) val2_str = pyaml.dump(val2) val3 = yaml.safe_load(val2_str) def test_namedtuple(self): TestTuple = cs.namedtuple('TestTuple', 'y x z') val = TestTuple(1, 2, 3) val_str = pyaml.dump(val, sort_keys=False) self.assertEqual(val_str, 'y: 1\nx: 2\nz: 3\n') # namedtuple order was preserved def test_ordereddict(self): d = cs.OrderedDict((i, '') for i in reversed(range(10))) lines = pyaml.dump(d, sort_keys=False).splitlines() self.assertEqual(lines, list(reversed(sorted(lines)))) def test_enum(self): c = test_const.heartbeat d1 = {'a': c, 'b': c.value, c: 'testx'} self.assertEqual(d1['a'], d1['b']) s = pyaml.dump(d1) d2 = yaml.safe_load(s) self.assertEqual(d1['a'], d2['a']) self.assertEqual(d1['a'], c) self.assertEqual(d1[c], 'testx') self.assertIn('a: 123 # test_const.heartbeat', s) def test_pyyaml_params(self): d = {'foo': 'lorem ipsum ' * 30} # 300+ chars for w in 40, 80, 200: lines = pyaml.dump(d, width=w, indent=10).splitlines() for n, line in enumerate(lines, 1): self.assertLess(len(line), w*1.2) if n != len(lines): self.assertGreater(len(line), w*0.8) def test_multiple_docs(self): docs = [yaml.safe_load(large_yaml), dict(a=1, b=2, c=3)] docs_str = pyaml.dump_all(docs, vspacing=True) self.assertTrue(docs_str.startswith('---')) self.assertIn('---\n\na: 1\n\nb: 2\n\nc: 3\n', docs_str) docs_str2 = pyaml.dump(docs, vspacing=True, multiple_docs=True) self.assertEqual(docs_str, docs_str2) docs_str2 = pyaml.dump(docs, vspacing=True) self.assertNotEqual(docs_str, docs_str2) docs_str2 = pyaml.dump_all(docs, explicit_start=False) self.assertFalse(docs_str2.startswith('---')) self.assertNotEqual(docs_str, docs_str2) docs_str = pyaml.dump(docs, multiple_docs=True, explicit_start=False) self.assertEqual(docs_str, docs_str2) def test_ruamel_yaml(self): try: from ruamel.yaml import YAML except ImportError: self.skipTest('No ruamel.yaml module to test it') data = YAML(typ='safe').load(large_yaml) yaml_str = pyaml.dump(data) def test_dump_stream_kws(self): data = [1, 2, 3] buff1, buff2 = io.StringIO(), io.StringIO() pyaml.dump(data, dst=buff1) pyaml.dump(data, stream=buff2) self.assertEqual(buff1.getvalue(), buff2.getvalue()) buff1.seek(0); buff1.truncate() pyaml.dump(data, dst=buff1, stream=buff1) self.assertEqual(buff1.getvalue(), buff2.getvalue()) ys = pyaml.dump(data, dst=str, stream=str) 
self.assertEqual(ys, buff2.getvalue()) buff1.seek(0); buff1.truncate(); buff2.seek(0); buff2.truncate() with self.assertRaises(TypeError): pyaml.dump(data, dst=buff1, stream=buff2) with self.assertRaises(TypeError): pyaml.dump(data, dst=str, stream=buff2) self.assertEqual(buff1.getvalue(), '') self.assertEqual(buff2.getvalue(), '') def test_list_vspacing(self): itm = self.yaml_var(''' builtIn: 1 datasource: type: grafana uid: -- Grafana -- enable: yes hide: yes iconColor: rgba(0, 211, 255, 1) name: Annotations & Alerts type: dashboard''') ys = pyaml.dump(dict(mylist=[itm]*10)) self.assertEqual( self.empty_line_list(ys), [1, 11, 21, 31, 41, 51, 61, 71, 81, 91] ) ys = self.yaml_var(''' panels: - datasource: type: datasource uid: grafana fieldConfig:''', raw=True) for n in range(60): ys += '\n' + ' '*3 + f'field{n}: value-{n}' ys = pyaml.dump(yaml.safe_load(ys)) self.assertEqual(self.empty_line_list(ys), list(range(4, 126, 2))) def test_anchor_cutoff(self): data = self.yaml_var(''' similique-natus-inventore-deserunt-amet-explicabo-cum-accusamus-temporibus: quam-nulla-dolorem-dolore-velit-quis-deserunt-est-ullam-exercitationem: culpa-quia-incidunt-accusantium-ad-dicta-nobis-rerum-veritatis: &test test: 1 similique-commodi-aperiam-libero-error-eos-quidem-eius: ipsam-labore-enim,-vero-voluptatem-eaque-dolores-blanditiis-recusandae: quas-atque-maxime-itaque-ullam-sequi-suscipit-quis-vitae-veritatis: *test''') ys = pyaml.dump(data, force_embed=False) for c in '&', r'\*': self.assertTrue(m := re.search(fr'(?<= ){c}\S+', ys)) self.assertLess(len(m[0]), 50) self.assertIn('similique', m[0]) self.assertIn('veritatis', m[0]) data = dict(test1=dict(test2=(v := dict(a=1, b=2, c=3))), test3=dict(test4=v)) ys = pyaml.dump(data, force_embed=False) for c in '&', r'\*': self.assertTrue(m := re.search(fr'(?<= ){c}\S+', ys)) self.assertLess(len(m[0]), 30) self.assertEqual(len(re.findall(r'test\d', m[0])), 2) def test_group_online_values(self): data = self.yaml_var(''' similique-natus: 1 similique-commodi: aperiam-libero: 2 "111": digit-string deserunt-est-2: asdasd deserunt-est-1: | line1 line2 culpa-quia: 1234 deserunt-est-3: asdasd 10: test1 200: test 30: test2''') ys1 = pyaml.dump(data, sort_dicts=pyaml.PYAMLSort.oneline_group, vspacing=dict(split_lines=0, split_count=0) ) self.assertEqual(self.empty_line_list(ys1), [8, 12]) ys2 = pyaml.dump(data, vspacing=dict(split_lines=0, split_count=0)) self.assertNotEqual(ys1, ys2) self.assertEqual(self.empty_line_list(ys2), [1, 3, 5, 7, 9, 13, 15, 17, 19, 21]) if __name__ == '__main__': unittest.main() # print('-'*80) # pyaml.dump(yaml.safe_load(large_yaml), sys.stdout) # print('-'*80) # pyaml.dump(data, sys.stdout) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1703526249.2927737 pyaml-23.12.0/pyaml.egg-info/0000755000175000017500000000000014542337551015472 5ustar00fraggodfraggod././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1703526249.0 pyaml-23.12.0/pyaml.egg-info/PKG-INFO0000644000175000017500000002624714542337551016602 0ustar00fraggodfraggodMetadata-Version: 2.1 Name: pyaml Version: 23.12.0 Summary: PyYAML-based module to produce a bit more pretty and readable YAML-serialized data Home-page: https://github.com/mk-fg/pretty-yaml Author: Mike Kazantsev Author-email: Mike Kazantsev License: WTFPL Project-URL: Homepage, https://github.com/mk-fg/pretty-yaml Keywords: yaml,serialization,pretty-print,formatter,human,readability Classifier: Development Status :: 4 - Beta Classifier: Intended Audience :: 
Developers Classifier: License :: Public Domain Classifier: Programming Language :: Python Classifier: Programming Language :: Python :: 3.8 Classifier: Topic :: Software Development Classifier: Topic :: Software Development :: Libraries :: Python Modules Classifier: Topic :: Utilities Requires-Python: >=3.8 Description-Content-Type: text/x-rst License-File: COPYING Requires-Dist: PyYAML Provides-Extra: anchors Requires-Dist: unidecode; extra == "anchors" pretty-yaml (or pyaml) ====================== PyYAML_-based python module to produce a bit more pretty and human-readable YAML-serialized data. This module is for serialization only, see `ruamel.yaml`_ module for literate YAML parsing (keeping track of comments, spacing, line/column numbers of values, etc). (side-note: to dump stuff parsed by ruamel.yaml with this module, use only ``YAML(typ='safe')`` there) It's a small module, and for projects that only need part of its functionality, I'd recommend copy-pasting that in, instead of adding janky dependency. .. _PyYAML: http://pyyaml.org/ .. _ruamel.yaml: https://bitbucket.org/ruamel/yaml/ .. contents:: :backlinks: none Repository URLs: - https://github.com/mk-fg/pretty-yaml - https://codeberg.org/mk-fg/pretty-yaml - https://fraggod.net/code/git/pretty-yaml Warning ------- Prime goal of this module is to produce human-readable output that can be easily diff'ed, manipulated and re-used, but maybe with occasional issues. So please do not rely on the thing to produce output that can always be deserialized exactly to what was exported, at least - use PyYAML directly for that (but maybe with options from the next section). What this module does and why ----------------------------- YAML is generally nice and easy format to read *if* it was written by humans. PyYAML can a do fairly decent job of making stuff readable, and the best combination of parameters for such output that I've seen so far is probably this one:: >>> m = [123, 45.67, {1: None, 2: False}, 'some text'] >>> data = dict(a='asldnsa\nasldpáknsa\n', b='whatever text', ma=m, mb=m) >>> yaml.safe_dump( data, sys.stdout, width=100, allow_unicode=True, default_flow_style=False ) a: 'asldnsa asldpáknsa ' b: whatever text ma: &id001 - 123 - 45.67 - 1: null 2: false - some text mb: *id001 pyaml (this module) tries to improve on that a bit, with the following tweaks: * Most human-friendly representation options in PyYAML (that I know of) are used as defaults - unicode, flow-style, width=100 (old default is 80). * Dump "null" values as empty values, if possible, which have the same meaning but reduce visual clutter and are easier to edit. * Dicts, sets, OrderedDicts, defaultdicts, namedtuples, enums, dataclasses, etc are represented as their safe YAML-compatible base (like int, list or mapping), with mappings key-sorted by default for more diff-friendly output. * Use shorter and simplier yes/no for booleans. * List items get indented, as they should be. * Attempt is made to pick more readable string representation styles, depending on the value, e.g.:: >>> yaml.safe_dump(cert, sys.stdout) cert: '-----BEGIN CERTIFICATE----- MIIH3jCCBcagAwIBAgIJAJi7AjQ4Z87OMA0GCSqGSIb3DQEBCwUAMIHBMRcwFQYD VQQKFA52YWxlcm9uLm5vX2lzcDEeMBwGA1UECxMVQ2VydGlmaWNhdGUgQXV0aG9y ... >>> pyaml.p(cert): cert: | -----BEGIN CERTIFICATE----- MIIH3jCCBcagAwIBAgIJAJi7AjQ4Z87OMA0GCSqGSIb3DQEBCwUAMIHBMRcwFQYD VQQKFA52YWxlcm9uLm5vX2lzcDEeMBwGA1UECxMVQ2VydGlmaWNhdGUgQXV0aG9y ... * "force_embed" option (default=yes) to avoid having &id stuff scattered all over the output. 
Might be more useful to disable it in some specific cases though. * "&idXYZ" anchors, when needed, get labels from the keys they get attached to, not just meaningless enumerators, e.g. "&users_-_admin" instead. * "string_val_style" option to only apply to strings that are values, not keys, i.e:: >>> pyaml.p(data, string_val_style='"') key: "value\nasldpáknsa\n" >>> yaml.safe_dump(data, sys.stdout, allow_unicode=True, default_style='"') "key": "value\nasldpáknsa\n" * Add vertical spacing (empty lines) between keys on different depths, to separate long YAML sections in the output visually, make it more seekable. * Discard end-of-document "..." indicators for simple values. Result for the (rather meaningless) example above:: >>> pyaml.p(data, force_embed=False, vspacing=dict(split_lines=10)) a: | asldnsa asldpáknsa b: whatever text ma: &ma - 123 - 45.67 - 1: 2: no - some text mb: *ma (force_embed=False enabled deduplication with ``&ma`` anchor, vspacing is adjusted to split even this tiny output) ---------- Extended example:: >>> pyaml.dump(data, vspacing=dict(split_lines=10)) destination: encoding: xz: enabled: yes min_size: 5120 options: path_filter: - \.(gz|bz2|t[gb]z2?|xz|lzma|7z|zip|rar)$ - \.(rpm|deb|iso)$ - \.(jpe?g|gif|png|mov|avi|ogg|mkv|webm|mp[34g]|flv|flac|ape|pdf|djvu)$ - \.(sqlite3?|fossil|fsl)$ - \.git/objects/[0-9a-f]+/[0-9a-f]+$ result: append_to_file: append_to_lafs_dir: print_to_stdout: yes url: http://localhost:3456/uri filter: - /(CVS|RCS|SCCS|_darcs|\{arch\})/$ - /\.(git|hg|bzr|svn|cvs)(/|ignore|attributes|tags)?$ - /=(RELEASE-ID|meta-update|update)$ http: ca_certs_files: /etc/ssl/certs/ca-certificates.crt debug_requests: no request_pool_options: cachedConnectionTimeout: 600 maxPersistentPerHost: 10 retryAutomatically: yes logging: formatters: basic: datefmt: '%Y-%m-%d %H:%M:%S' format: '%(asctime)s :: %(name)s :: %(levelname)s: %(message)s' handlers: console: class: logging.StreamHandler formatter: basic level: custom stream: ext://sys.stderr loggers: twisted: handlers: - console level: 0 root: handlers: - console level: custom Note that unless there are many moderately wide and deep trees of data, which are expected to be read and edited by people, it might be preferrable to directly use PyYAML regardless, as it won't introduce another (rather pointless in that case) dependency and a point of failure. Features and Tricks ------------------- * Pretty-print any yaml or json (yaml subset) file from the shell:: % python -m pyaml /path/to/some/file.yaml % pyaml < myfile.yml % curl -s https://www.githubstatus.com/api/v2/summary.json | pyaml ``pipx install pyaml`` can be a good way to only install "pyaml" command-line script. * Process and replace json/yaml file in-place:: % python -m pyaml -r mydata.yml * Easier "debug printf" for more complex data (all funcs below are aliases to same thing):: pyaml.p(stuff) pyaml.pprint(my_data) pyaml.pprint('----- HOW DOES THAT BREAKS!?!?', input_data, some_var, more_stuff) pyaml.print(data, file=sys.stderr) # needs "from __future__ import print_function" * Force all string values to a certain style (see info on these in `PyYAML docs`_):: pyaml.dump(many_weird_strings, string_val_style='|') pyaml.dump(multiline_words, string_val_style='>') pyaml.dump(no_want_quotes, string_val_style='plain') Using ``pyaml.add_representer()`` (note \*p\*yaml) as suggested in `this SO thread`_ (or `github-issue-7`_) should also work. See also this `amazing reply to StackOverflow#3790454`_ for everything about the many different string styles in YAML. 
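For example, here is a minimal sketch of such a custom representer, roughly along the lines of the linked answers - forcing literal ``|`` blocks for multiline strings (the function name and the style-picking logic are only illustrative, not something pyaml itself ships)::

    import pyaml

    def str_block_representer(dumper, data):
        # Literal block style for multiline strings, let the dumper decide otherwise
        style = '|' if '\n' in data else None
        return dumper.represent_scalar('tag:yaml.org,2002:str', data, style=style)

    # Registered on pyaml's Dumper class, so this should affect later dumps as well
    pyaml.add_representer(str, str_block_representer)
    print(pyaml.dump(dict(one='single line', many='line1\nline2\n')))

Note that this replaces pyaml's own string-style heuristics for str values, so the string_val_style option above is usually the simpler way to get the same effect.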
* Control indent and width of the results:: pyaml.dump(wide_and_deep, indent=4, width=120) These are actually keywords for PyYAML Emitter (passed to it from Dumper), see more info on these in `PyYAML docs`_. * Dump multiple yaml documents into a file: ``pyaml.dump_all([data1, data2, data3], dst_file)`` explicit_start=True is implied, unless overridden by explicit_start=False. * Control thresholds for vertical spacing of values (0 = always space stuff out), and clump all oneliner ones at the top:: >>> pyaml.dump( data, sort_dicts=pyaml.PYAMLSort.oneline_group, vspacing=dict(split_lines=0, split_count=0) ) chart: axisCenteredZero: no axisColorMode: text axisLabel: '' axisPlacement: auto barAlignment: 0 drawStyle: line ... hideFrom: legend: no tooltip: no viz: no scaleDistribution: type: linear stacking: group: A mode: none Or the same thing with the cli tool ``-v/--vspacing`` option: ``pyaml -v 0/0g mydata.yaml`` .. _PyYAML docs: http://pyyaml.org/wiki/PyYAMLDocumentation#Scalars .. _this SO thread: http://stackoverflow.com/a/7445560 .. _github-issue-7: https://github.com/mk-fg/pretty-yaml/issues/7 .. _amazing reply to StackOverflow#3790454: https://stackoverflow.com/questions/3790454/how-do-i-break-a-string-in-yaml-over-multiple-lines/21699210#21699210 Installation ------------ It's a regular Python 3.8+ module/package, published on PyPI (as pyaml_). The module uses PyYAML_ for processing of the actual YAML files and should pull it in as a dependency. Dependency on the unidecode_ module is optional and only matters with the force_embed=False keyword (defaults to True), when same-id objects or recursion are used within serialized data - i.e. only when generating &some_key_id anchors is needed. If that module is unavailable at runtime, anchor ids will be less like their keys and maybe not as nice. Using pip_ is how you generally install it, usually coupled with venv_ usage (which will also provide the "pip" tool itself):: % pip install pyaml Current-git version can be installed like this:: % pip install git+https://github.com/mk-fg/pretty-yaml pip will default to installing into the currently-active venv, then the user's home directory (under ``~/.local/lib/python...``), and maybe system-wide when running as root (only useful in specialized environments like docker containers). There are many other python packaging tools - pipenv_, poetry_, pdm_, etc - use whatever is most suitable for a specific project/environment. pipx_ can be used to install the command-line script without the module. More general info on python packaging can be found at `packaging.python.org`_. When changing code, unit tests can be run with ``python -m unittest`` from the local repository checkout. .. _pyaml: https://pypi.org/project/pyaml/ .. _unidecode: https://pypi.python.org/pypi/Unidecode .. _pip: https://pip.pypa.io/en/stable/ .. _venv: https://docs.python.org/3/library/venv.html .. _poetry: https://python-poetry.org/ .. _pipenv: https://pipenv.pypa.io/ .. _pdm: https://pdm.fming.dev/ .. _pipx: https://pypa.github.io/pipx/ ..
_packaging.python.org: https://packaging.python.org/installing/ ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1703526249.0 pyaml-23.12.0/pyaml.egg-info/SOURCES.txt0000644000175000017500000000054014542337551017355 0ustar00fraggodfraggodCOPYING MANIFEST.in README.rst pyproject.toml setup.py pyaml/__init__.py pyaml/__main__.py pyaml/cli.py pyaml.egg-info/PKG-INFO pyaml.egg-info/SOURCES.txt pyaml.egg-info/dependency_links.txt pyaml.egg-info/entry_points.txt pyaml.egg-info/requires.txt pyaml.egg-info/top_level.txt pyaml/tests/__init__.py pyaml/tests/test_cli.py pyaml/tests/test_dump.py././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1703526249.0 pyaml-23.12.0/pyaml.egg-info/dependency_links.txt0000644000175000017500000000000114542337551021540 0ustar00fraggodfraggod ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1703526249.0 pyaml-23.12.0/pyaml.egg-info/entry_points.txt0000644000175000017500000000005614542337551020771 0ustar00fraggodfraggod[console_scripts] pyaml = pyaml.__main__:main ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1703526249.0 pyaml-23.12.0/pyaml.egg-info/requires.txt0000644000175000017500000000003414542337551020067 0ustar00fraggodfraggodPyYAML [anchors] unidecode ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1703526249.0 pyaml-23.12.0/pyaml.egg-info/top_level.txt0000644000175000017500000000000614542337551020220 0ustar00fraggodfraggodpyaml ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1703526188.0 pyaml-23.12.0/pyproject.toml0000644000175000017500000000166114542337454015600 0ustar00fraggodfraggod[project] name = "pyaml" version = "23.12.0" description = "PyYAML-based module to produce a bit more pretty and readable YAML-serialized data" authors = [{name="Mike Kazantsev", email="mk.fraggod@gmail.com"}] license = {text="WTFPL"} classifiers = [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "License :: Public Domain", "Programming Language :: Python", "Programming Language :: Python :: 3.8", "Topic :: Software Development", "Topic :: Software Development :: Libraries :: Python Modules", "Topic :: Utilities" ] keywords = ["yaml", "serialization", "pretty-print", "formatter", "human", "readability"] requires-python = ">=3.8" dependencies = ["PyYAML"] dynamic = ["readme"] [project.urls] Homepage = "https://github.com/mk-fg/pretty-yaml" [tool.setuptools.dynamic] readme = {file="README.rst"} [project.optional-dependencies] anchors = ["unidecode"] [project.scripts] pyaml = "pyaml.__main__:main" ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1703526249.2927737 pyaml-23.12.0/setup.cfg0000644000175000017500000000004614542337551014477 0ustar00fraggodfraggod[egg_info] tag_build = tag_date = 0 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1688624488.0 pyaml-23.12.0/setup.py0000644000175000017500000000225114451456550014370 0ustar00fraggodfraggod# For compatibility - pyproject.toml should work instead # ...but it does not atm, so ends up being parsed to setup() values here anyway import setuptools, re, pathlib as pl toml_lines = (pl.Path(__file__).parent / 'pyproject.toml').read_text().split('\n') def toml_lines_rollup(lines, n=0): if not lines[n:]: return if re.match('\s+', lines[n]): return [lines[n].strip(), *(toml_lines_rollup(lines, n+1) or list())] if tail := toml_lines_rollup(lines, n+1): lines[n] 
+= ' '.join(tail) toml_lines_rollup(toml_lines) def toml_str(key): for line, line_next in zip(toml_lines, toml_lines[1:]): if (m := re.fullmatch(r'(\w[\w-]+)\s*=\s*(.*?)\s*', line)) and m[1] == key: return re.findall(r'"([^"]+?)"', m[2]) else: raise KeyError(key) setup_kws = dict( ((k, k) for k in 'name version license description classifiers'.split()), url='Homepage', install_requires='dependencies' ) setup_kws = dict((k1, toml_str(k2)[0]) for k1, k2 in setup_kws.items()) setup_kws['keywords'] = toml_str('keywords') setup_kws['author'], setup_kws['author_email'] = toml_str('authors') if fp := getattr(setuptools, 'find_packages', None): setup_kws['packages'] = fp() setuptools.setup(**setup_kws)
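The setup.py above hand-parses pyproject.toml, presumably because the stdlib tomllib module only exists on Python 3.11+, while this package supports 3.8. For reference, a rough sketch of how that [project] table could be loaded on newer pythons (load_project_table is a made-up helper name, not part of this package)::

    import tomllib, pathlib as pl

    def load_project_table(path='pyproject.toml'):
        # tomllib.load() expects a binary file object
        with pl.Path(path).open('rb') as src:
            return tomllib.load(src)['project']

    proj = load_project_table()
    print(proj['name'], proj['version']) # -> pyaml 23.12.0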