././@PaxHeader0000000000000000000000000000003200000000000010210 xustar0026 mtime=1715810318.92847 genson-1.3.0/0000755000076500000240000000000014621230017011525 5ustar00jrwstaff././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 genson-1.3.0/.gitignore0000644000076500000240000000034114407651467013535 0ustar00jrwstaff# See: github.com/github/gitignore/blob/master/Python.gitignore # Byte-compiled / optimized / DLL files __pycache__/ *.py[cod] # Distribution / packaging build/ dist/ *.egg-info/ *.egg *.eggs # Testing / CI .coverage .tox ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1715654774.0 genson-1.3.0/.travis.yml0000644000076500000240000000047214620550166013652 0ustar00jrwstafflanguage: python # run tests for all envs python: - 3.7 - 3.8 - 3.9 - 3.10 - 3.11 - 3.12 env: ISOLATED=false install: pip install tox tox-travis script: tox # lint only needs run once jobs: include: - python: 3.9 env: ISOLATED=false install: pip install flake8 script: flake8 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1715807078.0 genson-1.3.0/AUTHORS.rst0000644000076500000240000000072414621221546013416 0ustar00jrwstaffCredits ======= **GenSON** is written and maintained by `Jon Wolverton `_. 
Contributors ------------ - `David Kay `_ - `KOLANICH `_ - `YehudaCorsia `_ - `Brad Sokol `_ - `John Vandenberg `_ - `shtutzim `_ - `Mike Ralphson `_ ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1715805140.0 genson-1.3.0/HISTORY.rst0000644000076500000240000001017214621215724013431 0ustar00jrwstaffHistory ======= 1.3.0 ----- Modernization * add support for Python versions up through 3.12 * remove support for Python versions older than 3.7 since test dependencies no longer support them * remove Python 2.7 support * remove tests & test commands only relevant to Python 2.7 * remove backwards-compatibility from code * enable running as a module (``python -m genson``) * modernize package configuration (issue #68) * Use a valid ``schema_uri`` in tests (issue #69) 1.2.2 ----- * add ``__version__`` attr to module and ``--version`` option to CLI tool * add ``--encoding`` option to CLI tool that overrides default file encoding (fixes #47) 1.2.1 ----- * expose ``SchemaStrategy.__eq__()`` for extension * add support for Python 3.8 * update Trove classifiers * **Bugfix**: ``SchemaBuilder.__eq__()`` wasn't matching the ``$schema`` keyword correctly * **Bugfix**: only activate empty ``required`` option when ``required`` is actually empty 1.2.0 ----- * ``SchemaStrategies`` are now extendable, enabling custom ``SchemaBuilder`` classes.
* optimize ``__eq__`` logic 1.1.0 ----- * add support for Python 3.7 * drop support for Python 3.3 * drop support for JSON-Schema Draft 4 (because it doesn't allow empty ``required`` arrays) * **Bugfix**: preserve empty ``required`` arrays (fixes #25) * **Bugfix**: handle nested ``anyOf`` keywords (fixes #35) 1.0.2 ----- * add support for ``long`` integers in Python 2.7 * update test-skipping decorator to use standard version requirement strings 1.0.1 ----- * **Bugfix**: seeding an object schema with a ``"required"`` keyword caused an error * **Docs**: fix mislabeled method 1.0.0 ----- This version was a total overhaul. The main change was to split Schema into three separate classes, making it simpler to add more complicated functionality by having different generator types to handle the different schema types. 1. ``SchemaNode`` to manage the tree structure 2. ``SchemaGenerator`` for the schema generation logic 3. ``SchemaBuilder`` to manage the public API Interface Changes +++++++++++++++++ * ``SchemaBuilder`` is the new ``Schema`` * ``to_dict()`` is now called ``to_schema()`` To make the transition easier, there is still a ``Schema`` class that wraps ``SchemaBuilder`` with a backwards-compatibility layer, but you will trigger a ``PendingDeprecationWarning`` if you use it. Seed Schemas ++++++++++++ The ``merge_arrays`` option has been removed in favor of seed schemas. You can now seed specific nodes as list or tuple instead of setting a global option for every node in the schema tree. You can also now seed object nodes with ``patternProperties``, which was a highly requested feature. 
Other Changes +++++++++++++ * include ``"$schema"`` keyword * accept schemas without ``"type"`` keyword * use ``"anyOf"`` keyword to help combine schemas * add ``SchemaGenerationError`` for better error handling * empty ``"properties"`` and ``"items"`` are not included in generated schemas * ``genson`` executable * new ``--schema-uri`` option * auto-detect object boundaries by default 0.2.3 ----- * **Docs**: add installation instructions 0.2.2 ----- * **Docs**: Python 3.6 is now explicitly tested and listed as compatible. 0.2.1 ----- * **Bugfix**: ``add_schema`` failed when adding list-style array schemas * **Bugfix**: typo in readme 0.2.0 ----- * **Bugfix**: Options were not propagated down to subschemas. * **Bugfix**: Empty arrays resulted in invalid schemas because it still included an ``items`` property. * **Bugfix**: ``items`` was being set to a list even when ``merge_arrays`` was set to ``True``. This resulted in overly permissive schemas because ``items`` are matched optionally by default. * **Improvement**: Positional Array Matching - In order to be more consistent with the way JSON Schema works, the alternate to ``merge_arrays`` is no longer never to merge list items, but instead to merge them based on their position in the list. * **Improvement**: Schema Incompatibility Warning - A schema incompatibility used to cause a fatal error with a nondescript warning. The message has been improved and it has been reduced to a warning. 
0.1.0 (2014-11-29) ------------------ * Initial release ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1618759799.0 genson-1.3.0/LICENSE0000644000076500000240000000211614037050167012541 0ustar00jrwstaffThe MIT License (MIT) Copyright (c) 2014 Jon Wolverton github.com/wolverdude Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 genson-1.3.0/MANIFEST.in0000644000076500000240000000007414407651467013306 0ustar00jrwstaffinclude *.rst include LICENSE recursive-include test *.json ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1715810318.9283743 genson-1.3.0/PKG-INFO0000644000076500000240000006757614621230017012647 0ustar00jrwstaffMetadata-Version: 2.1 Name: genson Version: 1.3.0 Summary: GenSON is a powerful, user-friendly JSON Schema generator. 
Home-page: https://github.com/wolverdude/genson/ Download-URL: https://github.com/wolverdude/GenSON/tarball/v0.2s.0 Author: Jon Wolverton Author-email: wolverton.jr@gmail.com License: MIT Keywords: json, schema, json-schema, jsonschema, object, generate,generator, builder, merge, draft 7, validate, validation Classifier: Development Status :: 5 - Production/Stable Classifier: Environment :: Console Classifier: Intended Audience :: Developers Classifier: Natural Language :: English Classifier: License :: OSI Approved :: MIT License Classifier: Operating System :: OS Independent Classifier: Programming Language :: Python Classifier: Programming Language :: Python :: 3 Classifier: Programming Language :: Python :: 3.7 Classifier: Programming Language :: Python :: 3.8 Classifier: Programming Language :: Python :: 3.9 Classifier: Programming Language :: Python :: 3.10 Classifier: Programming Language :: Python :: 3.11 Classifier: Programming Language :: Python :: 3.12 Classifier: Topic :: Software Development :: Code Generators Classifier: Topic :: Software Development :: Libraries :: Python Modules Classifier: Topic :: Utilities Description-Content-Type: text/x-rst License-File: LICENSE License-File: AUTHORS.rst GenSON ====== **GenSON** is a powerful, user-friendly `JSON Schema`_ generator built in Python. .. note:: This is *not* the Python equivalent of the `Java Genson library`_. If you are coming from Java and need to create JSON objects in Python, you want `Python's builtin json library`_. GenSON's core function is to take JSON objects and generate schemas that describe them, but it is unique in its ability to *merge* schemas. It was originally built to describe the common structure of a large number of JSON objects, and it uses its merging ability to generate a single schema from any number of JSON objects and/or schemas. GenSON's schema builder follows these three rules: 1. *Every* object it is given must validate under the generated schema. 2.
*Any* object that is valid under *any* schema it is given must also validate under the generated schema. (there is one glaring exception to this, detailed `below`_) 3. The generated schema should be as strict as possible given the first 2 rules. JSON Schema Implementation -------------------------- **GenSON** is compatible with JSON Schema Draft 6 and above. It is important to note that GenSON uses only a subset of JSON Schema's capabilities. This is mainly because it doesn't know the specifics of your data model, and it tries to avoid guessing them. Its purpose is to generate the basic structure so that you can skip the boilerplate and focus on the details of the schema. Currently, GenSON only deals with these keywords: * ``"$schema"`` * ``"type"`` * ``"items"`` * ``"properties"`` * ``"patternProperties"`` * ``"required"`` * ``"anyOf"`` You should be aware that this limited vocabulary could cause GenSON to violate rules 1 and 2. If you feed it schemas with advanced keywords, it will just blindly pass them on to the final schema. Note that ``"$ref"`` and ``id`` are also not supported, so GenSON will not dereference linked nodes when building a schema. Installation ------------ .. code-block:: bash $ pip install genson CLI Tool -------- The package includes a ``genson`` executable that allows you to access this functionality from the command line. For usage info, run with ``--help``: .. code-block:: bash $ genson --help .. code-block:: usage: genson [-h] [--version] [-d DELIM] [-e ENCODING] [-i SPACES] [-s SCHEMA] [-$ SCHEMA_URI] ... Generate one, unified JSON Schema from one or more JSON objects and/or JSON Schemas. Compatible with JSON-Schema Draft 4 and above. positional arguments: object Files containing JSON objects (defaults to stdin if no arguments are passed). optional arguments: -h, --help Show this help message and exit. --version Show version number and exit. -d DELIM, --delimiter DELIM Set a delimiter. 
Use this option if the input files contain multiple JSON objects/schemas. You can pass any string. A few cases ('newline', 'tab', 'space') will get converted to a whitespace character. If this option is omitted, the parser will try to auto-detect boundaries. -e ENCODING, --encoding ENCODING Use ENCODING instead of the default system encoding when reading files. ENCODING must be a valid codec name or alias. -i SPACES, --indent SPACES Pretty-print the output, indenting SPACES spaces. -s SCHEMA, --schema SCHEMA File containing a JSON Schema (can be specified multiple times to merge schemas). -$ SCHEMA_URI, --schema-uri SCHEMA_URI The value of the '$schema' keyword (defaults to 'http://json-schema.org/schema#' or can be specified in a schema with the -s option). If 'NULL' is passed, the "$schema" keyword will not be included in the result. GenSON Python API ----------------- ``SchemaBuilder`` is the basic schema generator class. ``SchemaBuilder`` instances can be loaded up with existing schemas and objects before being serialized. .. code-block:: python >>> from genson import SchemaBuilder >>> builder = SchemaBuilder() >>> builder.add_schema({"type": "object", "properties": {}}) >>> builder.add_object({"hi": "there"}) >>> builder.add_object({"hi": 5}) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'object', 'properties': { 'hi': {'type': ['integer', 'string']}}, 'required': ['hi']} >>> print(builder.to_json(indent=2)) { "$schema": "http://json-schema.org/schema#", "type": "object", "properties": { "hi": { "type": [ "integer", "string" ] } }, "required": [ "hi" ] } ``SchemaBuilder`` API +++++++++++++++++++++ ``__init__(schema_uri=None)`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ :param schema_uri: value of the ``$schema`` keyword. If not given, it will use the value of the first available ``$schema`` keyword on an added schema or else the default: ``'http://json-schema.org/schema#'``. 
A value of ``False`` or ``None`` will direct GenSON to leave out the ``"$schema"`` keyword. ``add_schema(schema)`` ^^^^^^^^^^^^^^^^^^^^^^ Merge in a JSON schema. This can be a ``dict`` or another ``SchemaBuilder`` object. :param schema: a JSON Schema .. note:: There is no schema validation. If you pass in a bad schema, you might get back a bad schema. ``add_object(obj)`` ^^^^^^^^^^^^^^^^^^^ Modify the schema to accommodate an object. :param obj: any object or scalar that can be serialized in JSON ``to_schema()`` ^^^^^^^^^^^^^^^ Generate a schema based on previous inputs. :rtype: ``dict`` ``to_json()`` ^^^^^^^^^^^^^ Generate a schema and convert it directly to serialized JSON. :rtype: ``str`` ``__eq__(other)`` ^^^^^^^^^^^^^^^^^ Check for equality with another ``SchemaBuilder`` object. :param other: another ``SchemaBuilder`` object. Other types are accepted, but will always return ``False`` SchemaBuilder object interaction ++++++++++++++++++++++++++++++++ ``SchemaBuilder`` objects can also interact with each other: * You can pass one schema directly to another to merge them. * You can compare schema equality directly. .. code-block:: python >>> from genson import SchemaBuilder >>> b1 = SchemaBuilder() >>> b1.add_schema({"type": "object", "properties": { ... "hi": {"type": "string"}}}) >>> b2 = SchemaBuilder() >>> b2.add_schema({"type": "object", "properties": { ... "hi": {"type": "integer"}}}) >>> b1 == b2 False >>> b1.add_schema(b2) >>> b2.add_schema(b1) >>> b1 == b2 True >>> b1.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'object', 'properties': {'hi': {'type': ['integer', 'string']}}} Seed Schemas ------------ There are several cases where multiple valid schemas could be generated from the same object. GenSON makes a default choice in all these ambiguous cases, but if you want it to choose differently, you can tell it what to do using a *seed schema*. Seeding Arrays ++++++++++++++ For example, suppose you have a simple array with two items: .. 
code-block:: python ['one', 1] There are always two ways for GenSON to interpret any array: List and Tuple. Lists use one schema for every item, whereas Tuples have a different schema for every array position. This is analogous to the (since removed) ``merge_arrays`` option from version 0. You can read more about JSON Schema `array validation here`_. List Validation ^^^^^^^^^^^^^^^ .. code-block:: json { "type": "array", "items": {"type": ["integer", "string"]} } Tuple Validation ^^^^^^^^^^^^^^^^ .. code-block:: json { "type": "array", "items": [{"type": "string"}, {"type": "integer"}] } By default, GenSON always interprets arrays using list validation, but you can tell it to use tuple validation by seeding it with a schema. .. code-block:: python >>> from genson import SchemaBuilder >>> builder = SchemaBuilder() >>> builder.add_object(['one', 1]) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'array', 'items': {'type': ['integer', 'string']}} >>> builder = SchemaBuilder() >>> seed_schema = {'type': 'array', 'items': []} >>> builder.add_schema(seed_schema) >>> builder.add_object(['one', 1]) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'array', 'items': [{'type': 'string'}, {'type': 'integer'}]} Note that in this case, the seed schema is actually invalid. You can't have an empty array as the value for an ``items`` keyword. But GenSON is a generator, not a validator, so you can fudge a little. GenSON will modify the generated schema so that it is valid, provided that there aren't invalid keywords beyond the ones it knows about. Seeding patternProperties +++++++++++++++++++++++++ Support for patternProperties_ is new in version 1; however, since GenSON's default behavior is to only use ``properties``, this powerful keyword can only be utilized with seed schemas. You will need to supply an ``object`` schema with a ``patternProperties`` object whose keys are RegEx strings.
Again, you can fudge here and set the values to null instead of creating valid subschemas. .. code-block:: python >>> from genson import SchemaBuilder >>> builder = SchemaBuilder() >>> builder.add_schema({'type': 'object', 'patternProperties': {r'^\d+$': None}}) >>> builder.add_object({'1': 1, '2': 2, '3': 3}) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'object', 'patternProperties': {'^\\d+$': {'type': 'integer'}}} There are a few gotchas you should be aware of here: * GenSON is written in Python, so it uses the `Python flavor of RegEx`_. * GenSON still prefers ``properties`` to ``patternProperties``. If a property already exists that matches one of your patterns, the normal property will be updated, *not* the pattern property. * If a key matches multiple patterns, there is *no guarantee* of which one will be updated. * The patternProperties_ docs themselves have some more useful pointers that can save you time. Typeless Schemas ++++++++++++++++ In version 0, GenSON did not accept a schema without a type, but in order to be flexible in the support of seed schemas, support was added in version 1. However, GenSON violates rule #2 in its handling of typeless schemas. Any object will validate under an empty schema, but GenSON incorporates typeless schemas into the first-available typed schema, and since typed schemas are stricter than typeless ones, objects that would validate under an added schema may not validate under the result. Customizing ``SchemaBuilder`` ----------------------------- You can extend the ``SchemaBuilder`` class to add in your own logic (e.g. recording ``minimum`` and ``maximum`` for a number). In order to do this, you need to: 1. Create a custom ``SchemaStrategy`` class. 2. Create a ``SchemaBuilder`` subclass that includes your custom ``SchemaStrategy`` class(es). 3. Use your custom ``SchemaBuilder`` just like you would the stock ``SchemaBuilder``.
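As a rough illustration of these three steps, here is a minimal sketch that records the longest string seen as ``maxLength``. It assumes that ``String`` can be imported from ``genson.schema.strategies`` in the same way as the other stock strategies. ``MaxLengthString`` and ``MaxLengthSchemaBuilder`` are hypothetical names chosen for this example and are not part of GenSON.

.. code-block:: python

    from genson import SchemaBuilder
    # assumption: String is exported alongside the other stock strategies
    from genson.schema.strategies import String


    # step 1: a custom strategy (hypothetical name) that tracks 'maxLength'
    class MaxLengthString(String):
        # declare 'maxLength' as a keyword this strategy handles itself
        KEYWORDS = (*String.KEYWORDS, 'maxLength')

        def __init__(self, node_class):
            super().__init__(node_class)
            self.max_length = None

        def _widen(self, length):
            # keep the largest length seen so every input still validates
            if self.max_length is None or length > self.max_length:
                self.max_length = length

        def add_schema(self, schema):
            super().add_schema(schema)
            if 'maxLength' in schema:
                self._widen(schema['maxLength'])

        def add_object(self, obj):
            super().add_object(obj)
            self._widen(len(obj))

        def to_schema(self):
            schema = super().to_schema()
            if self.max_length is not None:
                schema['maxLength'] = self.max_length
            return schema


    # step 2: a builder subclass (hypothetical name) that includes the strategy
    class MaxLengthSchemaBuilder(SchemaBuilder):
        EXTRA_STRATEGIES = (MaxLengthString,)


    # step 3: use it like the stock SchemaBuilder
    builder = MaxLengthSchemaBuilder()
    builder.add_object("Ada")
    builder.add_object("Grace Hopper")
    print(builder.to_schema())

The sketch only ever widens ``maxLength``, which keeps it in line with rule 1: every string that was added still validates under the generated schema.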
``SchemaStrategy`` Classes ++++++++++++++++++++++++++ GenSON uses the Strategy Pattern to parse, update, and serialize different kinds of schemas that behave in different ways. There are several ``SchemaStrategy`` classes that roughly correspond to different schema types. GenSON maps each node in an object or schema to an instance of one of these classes. Each instance stores the current schema state and updates or returns it when required. You can modify the specific ways these classes work by extending them. You can inherit from any existing ``SchemaStrategy`` class, though ``SchemaStrategy`` and ``TypedSchemaStrategy`` are the most useful base classes. You should call ``super`` and pass along all arguments when overriding any instance methods. The documentation below explains the public API and what you need to extend and override at a high level. Feel free to explore `the code`_ to see more, but know that the public API is documented here, and anything else you depend on could be subject to change. All ``SchemaStrategy`` subclasses maintain the public API though, so you can extend any of them in this way. ``SchemaStrategy`` API ++++++++++++++++++++++ [class constant] ``KEYWORDS`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This should be a tuple listing all of the JSON-schema keywords that this strategy knows how to handle. Any keywords encountered in added schemas will be naively passed on to the generated schema unless they are in this list (or you override that behavior in ``to_schema``). When adding keywords to a new ``SchemaStrategy``, it's best to splat the parent class's ``KEYWORDS`` into the new tuple. [class method] ``match_schema(cls, schema)`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Return ``True`` if this strategy should be used to handle the passed-in schema.
:param schema: a JSON Schema in ``dict`` form :rtype: ``bool`` [class method] ``match_object(cls, obj)`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Return ``True`` if this strategy should be used to handle the passed-in object. :param obj: any object or scalar that can be serialized in JSON :rtype: ``bool`` ``__init__(self, node_class)`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Override this method if you need to initialize an instance variable. :param node_class: This param is not part of the public API. Pass it along to ``super``. ``add_schema(self, schema)`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Override this to modify how a schema is parsed and stored. :param schema: a JSON Schema in ``dict`` form ``add_object(self, obj)`` ^^^^^^^^^^^^^^^^^^^^^^^^^ Override this to change the way schemas are inferred from objects. :param obj: any object or scalar that can be serialized in JSON ``to_schema(self)`` ^^^^^^^^^^^^^^^^^^^ Override this method to customize how a schema object is constructed from the inputs. It is suggested that you invoke ``super`` as the basis for the return value, but it is not required. :rtype: ``dict`` .. note:: There is no schema validation. If you return a bad schema from this method, ``SchemaBuilder`` will output a bad schema. ``__eq__(self, other)`` ^^^^^^^^^^^^^^^^^^^^^^^ When checking for ``SchemaBuilder`` equality, strategies are matched using ``__eq__``. The default implementation uses a simple ``__dict__`` equality check. Override this method if you need to override that behavior. This may be useful if you add instance variables that aren't relevant to whether two SchemaStrategies are considered equal. :rtype: ``bool`` ``TypedSchemaStrategy`` API +++++++++++++++++++++++++++ This is an abstract schema strategy for making simple schemas that only deal with the ``type`` keyword, but you can extend it to add more functionality. Subclasses must define the following two class constants, but you get the entire ``SchemaStrategy`` interface for free.
[class constant] ``JS_TYPE`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This will be the value of the ``type`` keyword in the generated schema. It is also used to match any added schemas. [class constant] ``PYTHON_TYPE`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This is a Python type or tuple of types that will be matched against an added object using ``isinstance``. Extending ``SchemaBuilder`` +++++++++++++++++++++++++++ Once you have extended ``SchemaStrategy`` types, you'll need to create a ``SchemaBuilder`` class that uses them, since the default ``SchemaBuilder`` only incorporates the default strategies. To do this, extend the ``SchemaBuilder`` class and define one of these two constants inside it: [class constant] ``EXTRA_STRATEGIES`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This is the standard (and suggested) way to add strategies. Set it to a tuple of all your new strategies, and they will be added to the existing list of strategies to check. This preserves all the existing functionality. Note that order matters. GenSON checks the list in order, so the first strategy has priority over the second and so on. All ``EXTRA_STRATEGIES`` have priority over the default strategies. [class constant] ``STRATEGIES`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This clobbers the existing list of strategies and completely replaces it. Set it to a tuple just like for ``EXTRA_STRATEGIES``, but note that if any object or schema gets added that your exhaustive list of strategies doesn't know how to handle, you'll get an error. You should avoid doing this unless you're extending most or all existing strategies in some way. Example: ``MinNumber`` ++++++++++++++++++++++ Here's some example code creating a number strategy that tracks the `minimum number`_ seen and includes it in the output schema. .. 
code-block:: python from genson import SchemaBuilder from genson.schema.strategies import Number class MinNumber(Number): # add 'minimum' to list of keywords KEYWORDS = (*Number.KEYWORDS, 'minimum') # create a new instance variable def __init__(self, node_class): super().__init__(node_class) self.min = None # capture 'minimum's from schemas def add_schema(self, schema): super().add_schema(schema) if self.min is None: self.min = schema.get('minimum') elif 'minimum' in schema: self.min = min(self.min, schema['minimum']) # adjust minimum based on the data def add_object(self, obj): super().add_object(obj) self.min = obj if self.min is None else min(self.min, obj) # include 'minimum' in the output def to_schema(self): schema = super().to_schema() schema['minimum'] = self.min return schema # new SchemaBuilder class that uses the MinNumber strategy in addition # to the existing strategies. Both MinNumber and Number are active, but # MinNumber has priority, so it effectively replaces Number. class MinNumberSchemaBuilder(SchemaBuilder): """ all number nodes include minimum """ EXTRA_STRATEGIES = (MinNumber,) # this class *ONLY* has the MinNumber strategy. Any object that is not # a number will cause an error. class ExclusiveMinNumberSchemaBuilder(SchemaBuilder): """ all number nodes include minimum, and only handles number """ STRATEGIES = (MinNumber,) Now that we have the MinNumberSchemaBuilder class, let's see how it works. .. 
code-block:: python >>> builder = MinNumberSchemaBuilder() >>> builder.add_object(5) >>> builder.add_object(7) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'integer', 'minimum': 5} >>> builder.add_object(-2) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'integer', 'minimum': -2} >>> builder.add_schema({'$schema': 'http://json-schema.org/schema#', 'type': 'integer', 'minimum': -7}) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'integer', 'minimum': -7} Note that the exclusive builder is much more particular. .. code-block:: python >>> builder = MinNumberSchemaBuilder() >>> picky_builder = ExclusiveMinNumberSchemaBuilder() >>> picky_builder.add_object(5) >>> picky_builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'integer', 'minimum': 5} >>> builder.add_object(None) # this is fine >>> picky_builder.add_object(None) # this fails genson.schema.node.SchemaGenerationError: Could not find matching schema type for object: None Contributing ------------ When contributing, please follow these steps: 1. Clone the repo and make your changes. 2. Make sure your code has test cases written against it. 3. Lint your code with `Flake8`_. 4. Run `tox`_ to make sure the test suite passes. 5. Ensure the docs are accurate. 6. Add your name to the list of contributors. 7. Submit a Pull Request. Tests +++++ Tests are written in ``unittest`` and are run using `tox`_ and `nose`_. Tox will run all tests with coverage against each supported Python version that is installed on your machine. .. code-block:: bash $ tox Integration +++++++++++ When you submit a PR, `Travis CI`_ performs the following steps: 1. Lints the code with Flake8 2. Runs the entire test suite against each supported Python version. 3. Ensures that test coverage is at least 90% If any of these steps fail, your PR cannot be merged until it is fixed.
Potential Future Features +++++++++++++++++++++++++ The following are extra features under consideration. * recognize every validation keyword and ignore any that don't apply * option to set error level * custom serializer plugins * logical support for more keywords: * ``enum`` * ``minimum``/``maximum`` * ``minLength``/``maxLength`` * ``minItems``/``maxItems`` * ``minProperties``/``maxProperties`` * ``additionalItems`` * ``additionalProperties`` * ``format`` & ``pattern`` * ``$ref`` & ``id`` .. _JSON Schema: http://json-schema.org/ .. _Java Genson library: https://owlike.github.io/genson/ .. _`Python's builtin json library`: https://docs.python.org/library/json.html .. _below: #typeless-schemas .. _array validation here: https://spacetelescope.github.io/understanding-json-schema/reference/array.html#items .. _patternProperties: https://spacetelescope.github.io/understanding-json-schema/reference/object.html#pattern-properties .. _Python flavor of RegEx: https://docs.python.org/3.6/library/re.html .. _the code: https://github.com/wolverdude/GenSON/tree/master/genson/schema/strategies .. _minimum number: https://json-schema.org/understanding-json-schema/reference/numeric.html#range .. _Flake8: https://pypi.python.org/pypi/flake8 .. _tox: https://pypi.python.org/pypi/tox .. _nose: https://pypi.python.org/pypi/nose .. 
_Travis CI: https://travis-ci.com/github/wolverdude/GenSON History ======= 1.3.0 ----- Modernization * add support for Python versions up through 3.12 * remove support for Python versions older than 3.7 since test dependencies no longer support them * remove Python 2.7 support * remove tests & test commands only relevant to Python 2.7 * remove backwards-compatibility from code * enable running as a module (``python -m genson``) * modernize package configuration (issue #68) * Use a valid ``schema_uri`` in tests (issue #69) 1.2.2 ----- * add ``__version__`` attr to module and ``--version`` option to CLI tool * add ``--encoding`` option to CLI tool that overrides default file encoding (fixes #47) 1.2.1 ----- * expose ``SchemaStrategy.__eq__()`` for extension * add support for Python 3.8 * update Trove classifiers * **Bugfix**: ``SchemaBuilder.__eq__()`` wasn't matching the ``$schema`` keyword correctly * **Bugfix**: only activate empty ``required`` option when ``required`` is actually empty 1.2.0 ----- * ``SchemaStrategies`` are now extendable, enabling custom ``SchemaBuilder`` classes. * optimize ``__eq__`` logic 1.1.0 ----- * add support for Python 3.7 * drop support for Python 3.3 * drop support for JSON-Schema Draft 4 (because it doesn't allow empty ``required`` arrays) * **Bugfix**: preserve empty ``required`` arrays (fixes #25) * **Bugfix**: handle nested ``anyOf`` keywords (fixes #35) 1.0.2 ----- * add support for ``long`` integers in Python 2.7 * update test-skipping decorator to use standard version requirement strings 1.0.1 ----- * **Bugfix**: seeding an object schema with a ``"required"`` keyword caused an error * **Docs**: fix mislabeled method 1.0.0 ----- This version was a total overhaul. The main change was to split Schema into three separate classes, making it simpler to add more complicated functionality by having different generator types to handle the different schema types. 1. ``SchemaNode`` to manage the tree structure 2.
``SchemaGenerator`` for the schema generation logic 3. ``SchemaBuilder`` to manage the public API Interface Changes +++++++++++++++++ * ``SchemaBuilder`` is the new ``Schema`` * ``to_dict()`` is now called ``to_schema()`` To make the transition easier, there is still a ``Schema`` class that wraps ``SchemaBuilder`` with a backwards-compatibility layer, but you will trigger a ``PendingDeprecationWarning`` if you use it. Seed Schemas ++++++++++++ The ``merge_arrays`` option has been removed in favor of seed schemas. You can now seed specific nodes as list or tuple instead of setting a global option for every node in the schema tree. You can also now seed object nodes with ``patternProperties``, which was a highly requested feature. Other Changes +++++++++++++ * include ``"$schema"`` keyword * accept schemas without ``"type"`` keyword * use ``"anyOf"`` keyword to help combine schemas * add ``SchemaGenerationError`` for better error handling * empty ``"properties"`` and ``"items"`` are not included in generated schemas * ``genson`` executable * new ``--schema-uri`` option * auto-detect object boundaries by default 0.2.3 ----- * **Docs**: add installation instructions 0.2.2 ----- * **Docs**: Python 3.6 is now explicitly tested and listed as compatible. 0.2.1 ----- * **Bugfix**: ``add_schema`` failed when adding list-style array schemas * **Bugfix**: typo in readme 0.2.0 ----- * **Bugfix**: Options were not propagated down to subschemas. * **Bugfix**: Empty arrays resulted in invalid schemas because it still included an ``items`` property. * **Bugfix**: ``items`` was being set to a list even when ``merge_arrays`` was set to ``True``. This resulted in overly permissive schemas because ``items`` are matched optionally by default. * **Improvement**: Positional Array Matching - In order to be more consistent with the way JSON Schema works, the alternate to ``merge_arrays`` is no longer never to merge list items, but instead to merge them based on their position in the list. 
* **Improvement**: Schema Incompatibility Warning - A schema incompatibility used to cause a fatal error with a nondescript warning. The message has been improved and it has been reduced to a warning. 0.1.0 (2014-11-29) ------------------ * Initial release Credits ======= **GenSON** is written and maintained by `Jon Wolverton `_. Contributors ------------ - `David Kay `_ - `KOLANICH `_ - `YehudaCorsia `_ - `Brad Sokol `_ - `John Vandenberg `_ - `shtutzim `_ - `Mike Ralphson `_ ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1715809925.0 genson-1.3.0/README.rst0000644000076500000240000005375314621227205013236 0ustar00jrwstaffGenSON ====== **GenSON** is a powerful, user-friendly `JSON Schema`_ generator built in Python. .. note:: This is *not* the Python equivalent of the `Java Genson library`_. If you are coming from Java and need to create JSON objects in Python, you want `Python's builtin json library`_. GenSON's core function is to take JSON objects and generate schemas that describe them, but it is unique in its ability to *merge* schemas. It was originally built to describe the common structure of a large number of JSON objects, and it uses its merging ability to generate a single schema from any number of JSON objects and/or schemas. GenSON's schema builder follows these three rules: 1. *Every* object it is given must validate under the generated schema. 2. *Any* object that is valid under *any* schema it is given must also validate under the generated schema. (There is one glaring exception to this, detailed `below`_.) 3. The generated schema should be as strict as possible given the first 2 rules. JSON Schema Implementation -------------------------- **GenSON** is compatible with JSON Schema Draft 6 and above. It is important to note that GenSON uses only a subset of JSON Schema's capabilities. This is mainly because it doesn't know the specifics of your data model, and it tries to avoid guessing them.
Its purpose is to generate the basic structure so that you can skip the boilerplate and focus on the details of the schema. Currently, GenSON only deals with these keywords: * ``"$schema"`` * ``"type"`` * ``"items"`` * ``"properties"`` * ``"patternProperties"`` * ``"required"`` * ``"anyOf"`` You should be aware that this limited vocabulary could cause GenSON to violate rules 1 and 2. If you feed it schemas with advanced keywords, it will just blindly pass them on to the final schema. Note that ``"$ref"`` and ``id`` are also not supported, so GenSON will not dereference linked nodes when building a schema. Installation ------------ .. code-block:: bash $ pip install genson CLI Tool -------- The package includes a ``genson`` executable that allows you to access this functionality from the command line. For usage info, run with ``--help``: .. code-block:: bash $ genson --help .. code-block:: usage: genson [-h] [--version] [-d DELIM] [-e ENCODING] [-i SPACES] [-s SCHEMA] [-$ SCHEMA_URI] ... Generate one, unified JSON Schema from one or more JSON objects and/or JSON Schemas. Compatible with JSON-Schema Draft 4 and above. positional arguments: object Files containing JSON objects (defaults to stdin if no arguments are passed). optional arguments: -h, --help Show this help message and exit. --version Show version number and exit. -d DELIM, --delimiter DELIM Set a delimiter. Use this option if the input files contain multiple JSON objects/schemas. You can pass any string. A few cases ('newline', 'tab', 'space') will get converted to a whitespace character. If this option is omitted, the parser will try to auto-detect boundaries. -e ENCODING, --encoding ENCODING Use ENCODING instead of the default system encoding when reading files. ENCODING must be a valid codec name or alias. -i SPACES, --indent SPACES Pretty-print the output, indenting SPACES spaces. -s SCHEMA, --schema SCHEMA File containing a JSON Schema (can be specified multiple times to merge schemas). 
-$ SCHEMA_URI, --schema-uri SCHEMA_URI The value of the '$schema' keyword (defaults to 'http://json-schema.org/schema#' or can be specified in a schema with the -s option). If 'NULL' is passed, the "$schema" keyword will not be included in the result. GenSON Python API ----------------- ``SchemaBuilder`` is the basic schema generator class. ``SchemaBuilder`` instances can be loaded up with existing schemas and objects before being serialized. .. code-block:: python >>> from genson import SchemaBuilder >>> builder = SchemaBuilder() >>> builder.add_schema({"type": "object", "properties": {}}) >>> builder.add_object({"hi": "there"}) >>> builder.add_object({"hi": 5}) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'object', 'properties': { 'hi': {'type': ['integer', 'string']}}, 'required': ['hi']} >>> print(builder.to_json(indent=2)) { "$schema": "http://json-schema.org/schema#", "type": "object", "properties": { "hi": { "type": [ "integer", "string" ] } }, "required": [ "hi" ] } ``SchemaBuilder`` API +++++++++++++++++++++ ``__init__(schema_uri='DEFAULT')`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ :param schema_uri: value of the ``$schema`` keyword. If not given, it will use the value of the first available ``$schema`` keyword on an added schema or else the default: ``'http://json-schema.org/schema#'``. A value of ``False`` or ``None`` will direct GenSON to leave out the ``"$schema"`` keyword. ``add_schema(schema)`` ^^^^^^^^^^^^^^^^^^^^^^ Merge in a JSON schema. This can be a ``dict`` or another ``SchemaBuilder`` object. :param schema: a JSON Schema .. note:: There is no schema validation. If you pass in a bad schema, you might get back a bad schema. ``add_object(obj)`` ^^^^^^^^^^^^^^^^^^^ Modify the schema to accommodate an object. :param obj: any object or scalar that can be serialized in JSON ``to_schema()`` ^^^^^^^^^^^^^^^ Generate a schema based on previous inputs.
:rtype: ``dict`` ``to_json()`` ^^^^^^^^^^^^^ Generate a schema and convert it directly to serialized JSON. :rtype: ``str`` ``__eq__(other)`` ^^^^^^^^^^^^^^^^^ Check for equality with another ``SchemaBuilder`` object. :param other: another ``SchemaBuilder`` object. Other types are accepted, but will always return ``False`` SchemaBuilder object interaction ++++++++++++++++++++++++++++++++ ``SchemaBuilder`` objects can also interact with each other: * You can pass one schema directly to another to merge them. * You can compare schema equality directly. .. code-block:: python >>> from genson import SchemaBuilder >>> b1 = SchemaBuilder() >>> b1.add_schema({"type": "object", "properties": { ... "hi": {"type": "string"}}}) >>> b2 = SchemaBuilder() >>> b2.add_schema({"type": "object", "properties": { ... "hi": {"type": "integer"}}}) >>> b1 == b2 False >>> b1.add_schema(b2) >>> b2.add_schema(b1) >>> b1 == b2 True >>> b1.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'object', 'properties': {'hi': {'type': ['integer', 'string']}}} Seed Schemas ------------ There are several cases where multiple valid schemas could be generated from the same object. GenSON makes a default choice in all these ambiguous cases, but if you want it to choose differently, you can tell it what to do using a *seed schema*. Seeding Arrays ++++++++++++++ For example, suppose you have a simple array with two items: .. code-block:: python ['one', 1] There are always two ways for GenSON to interpret any array: List and Tuple. Lists have one schema for every item, whereas Tuples have a different schema for every array position. This is analogous to the (now deprecated) ``merge_arrays`` option from version 0. You can read more about JSON Schema `array validation here`_. List Validation ^^^^^^^^^^^^^^^ .. code-block:: json { "type": "array", "items": {"type": ["integer", "string"]} } Tuple Validation ^^^^^^^^^^^^^^^^ .. 
code-block:: json { "type": "array", "items": [{"type": "integer"}, {"type": "string"}] } By default, GenSON always interprets arrays using list validation, but you can tell it to use tuple validation by seeding it with a schema. .. code-block:: python >>> from genson import SchemaBuilder >>> builder = SchemaBuilder() >>> builder.add_object(['one', 1]) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'array', 'items': {'type': ['integer', 'string']}} >>> builder = SchemaBuilder() >>> seed_schema = {'type': 'array', 'items': []} >>> builder.add_schema(seed_schema) >>> builder.add_object(['one', 1]) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'array', 'items': [{'type': 'string'}, {'type': 'integer'}]} Note that in this case, the seed schema is actually invalid. You can't have an empty array as the value for an ``items`` keyword. But GenSON is a generator, not a validator, so you can fudge a little. GenSON will modify the generated schema so that it is valid, provided that there aren't invalid keywords beyond the ones it knows about. Seeding patternProperties +++++++++++++++++++++++++ Support for patternProperties_ is new in version 1; however, since GenSON's default behavior is to only use ``properties``, this powerful keyword can only be utilized with seed schemas. You will need to supply an ``object`` schema with a ``patternProperties`` object whose keys are RegEx strings. Again, you can fudge here and set the values to null instead of creating valid subschemas. .. 
code-block:: python >>> from genson import SchemaBuilder >>> builder = SchemaBuilder() >>> builder.add_schema({'type': 'object', 'patternProperties': {r'^\d+$': None}}) >>> builder.add_object({'1': 1, '2': 2, '3': 3}) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'object', 'patternProperties': {'^\\d+$': {'type': 'integer'}}} There are a few gotchas you should be aware of here: * GenSON is written in Python, so it uses the `Python flavor of RegEx`_. * GenSON still prefers ``properties`` to ``patternProperties``: if a property already exists that matches one of your patterns, the normal property will be updated, *not* the pattern property. * If a key matches multiple patterns, there is *no guarantee* of which one will be updated. * The patternProperties_ docs themselves have some more useful pointers that can save you time. Typeless Schemas ++++++++++++++++ In version 0, GenSON did not accept a schema without a type, but in order to be flexible in the support of seed schemas, support was added in version 1. However, GenSON violates rule #2 in its handling of typeless schemas. Any object will validate under an empty schema, but GenSON incorporates typeless schemas into the first-available typed schema, and since typed schemas are stricter than typeless ones, objects that would validate under an added typeless schema may not validate under the result. Customizing ``SchemaBuilder`` ----------------------------- You can extend the ``SchemaBuilder`` class to add in your own logic (e.g. recording ``minimum`` and ``maximum`` for a number). In order to do this, you need to: 1. Create a custom ``SchemaStrategy`` class. 2. Create a ``SchemaBuilder`` subclass that includes your custom ``SchemaStrategy`` class(es). 3. Use your custom ``SchemaBuilder`` just like you would the stock ``SchemaBuilder``.
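
The three steps above can be pictured with a stand-alone toy model. This is only a sketch and does not import or reproduce GenSON: ``ToyInteger``, ``ToyMinInteger``, ``ToyBuilder``, ``ToyMinBuilder`` and ``toy_schema_for`` are hypothetical names invented here, and the single-node builder is a deliberate simplification. It only mimics the idea that a builder tries its strategy classes in order, so a custom strategy listed first takes over from the stock one.

.. code-block:: python

    class ToyInteger:
        """Stand-in for a stock strategy: it only records the type."""
        JS_TYPE = 'integer'

        @classmethod
        def match_object(cls, obj):
            # bool is a subclass of int in Python, so rule it out explicitly
            return isinstance(obj, int) and not isinstance(obj, bool)

        def add_object(self, obj):
            pass  # nothing to track beyond the type

        def to_schema(self):
            return {'type': self.JS_TYPE}


    class ToyMinInteger(ToyInteger):
        """Step 1: a custom strategy that also tracks the smallest value."""

        def __init__(self):
            super().__init__()
            self.min = None

        def add_object(self, obj):
            super().add_object(obj)
            self.min = obj if self.min is None else min(self.min, obj)

        def to_schema(self):
            schema = super().to_schema()
            schema['minimum'] = self.min
            return schema


    class ToyBuilder:
        """Stand-in for the stock builder: one node, strategies tried in order."""
        STRATEGIES = (ToyInteger,)

        def __init__(self):
            self._active = None

        def add_object(self, obj):
            if self._active is None:
                # the first strategy class that matches wins
                for strategy in self.STRATEGIES:
                    if strategy.match_object(obj):
                        self._active = strategy()
                        break
            if self._active is None or not self._active.match_object(obj):
                raise ValueError('no matching strategy for %r' % (obj,))
            self._active.add_object(obj)

        def to_schema(self):
            return {} if self._active is None else self._active.to_schema()


    class ToyMinBuilder(ToyBuilder):
        """Step 2: list the custom strategy ahead of the inherited ones."""
        STRATEGIES = (ToyMinInteger, *ToyBuilder.STRATEGIES)


    def toy_schema_for(builder_class, objects):
        """Step 3: either builder is driven the same way."""
        builder = builder_class()
        for obj in objects:
            builder.add_object(obj)
        return builder.to_schema()

With this toy, ``toy_schema_for(ToyBuilder, [5, 7, -2])`` yields only the type, while ``toy_schema_for(ToyMinBuilder, [5, 7, -2])`` also carries ``'minimum': -2``, because ``ToyMinInteger`` is checked before ``ToyInteger``.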
``SchemaStrategy`` Classes ++++++++++++++++++++++++++ GenSON uses the Strategy Pattern to parse, update, and serialize different kinds of schemas that behave in different ways. There are several ``SchemaStrategy`` classes that roughly correspond to different schema types. GenSON maps each node in an object or schema to an instance of one of these classes. Each instance stores the current schema state and updates or returns it when required. You can modify the specific ways these classes work by extending them. You can inherit from any existing ``SchemaStrategy`` class, though ``SchemaStrategy`` and ``TypedSchemaStrategy`` are the most useful base classes. You should call ``super`` and pass along all arguments when overriding any instance methods. The documentation below explains the public API and what you need to extend and override at a high level. Feel free to explore `the code`_ to see more, but know that the public API is documented here, and anything else you depend on could be subject to change. All ``SchemaStrategy`` subclasses maintain the public API though, so you can extend any of them in this way. ``SchemaStrategy`` API ++++++++++++++++++++++ [class constant] ``KEYWORDS`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This should be a tuple listing all of the JSON-schema keywords that this strategy knows how to handle. Any keywords encountered in added schemas will be naively passed on to the generated schema unless they are in this list (or you override that behavior in ``to_schema``). When adding keywords to a new ``SchemaStrategy``, it's best to splat the parent class's ``KEYWORDS`` into the new tuple. [class method] ``match_schema(cls, schema)`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Return ``True`` if this strategy should be used to handle the passed-in schema.
:param schema: a JSON Schema in ``dict`` form :rtype: ``bool`` [class method] ``match_object(cls, obj)`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Return ``True`` if this strategy should be used to handle the passed-in object. :param obj: any object or scalar that can be serialized in JSON :rtype: ``bool`` ``__init__(self, node_class)`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Override this method if you need to initialize an instance variable. :param node_class: This param is not part of the public API. Pass it along to ``super``. ``add_schema(self, schema)`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Override this to modify how a schema is parsed and stored. :param schema: a JSON Schema in ``dict`` form ``add_object(self, obj)`` ^^^^^^^^^^^^^^^^^^^^^^^^^ Override this to change the way schemas are inferred from objects. :param obj: any object or scalar that can be serialized in JSON ``to_schema(self)`` ^^^^^^^^^^^^^^^^^^^ Override this method to customize how a schema object is constructed from the inputs. It is suggested that you invoke ``super`` as the basis for the return value, but it is not required. :rtype: ``dict`` .. note:: There is no schema validation. If you return a bad schema from this method, ``SchemaBuilder`` will output a bad schema. ``__eq__(self, other)`` ^^^^^^^^^^^^^^^^^^^^^^^ When checking for ``SchemaBuilder`` equality, strategies are matched using ``__eq__``. The default implementation uses a simple ``__dict__`` equality check. Override this method if you need to override that behavior. This may be useful if you add instance variables that aren't relevant to whether two SchemaStrategies are considered equal. :rtype: ``bool`` ``TypedSchemaStrategy`` API +++++++++++++++++++++++++++ This is an abstract schema strategy for making simple schemas that only deal with the ``type`` keyword, but you can extend it to add more functionality. Subclasses must define the following two class constants, but you get the entire ``SchemaStrategy`` interface for free.
[class constant] ``JS_TYPE`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This will be the value of the ``type`` keyword in the generated schema. It is also used to match any added schemas. [class constant] ``PYTHON_TYPE`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This is a Python type or tuple of types that will be matched against an added object using ``isinstance``. Extending ``SchemaBuilder`` +++++++++++++++++++++++++++ Once you have extended ``SchemaStrategy`` types, you'll need to create a ``SchemaBuilder`` class that uses them, since the default ``SchemaBuilder`` only incorporates the default strategies. To do this, extend the ``SchemaBuilder`` class and define one of these two constants inside it: [class constant] ``EXTRA_STRATEGIES`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This is the standard (and suggested) way to add strategies. Set it to a tuple of all your new strategies, and they will be added to the existing list of strategies to check. This preserves all the existing functionality. Note that order matters. GenSON checks the list in order, so the first strategy has priority over the second and so on. All ``EXTRA_STRATEGIES`` have priority over the default strategies. [class constant] ``STRATEGIES`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This clobbers the existing list of strategies and completely replaces it. Set it to a tuple just like for ``EXTRA_STRATEGIES``, but note that if any object or schema gets added that your exhaustive list of strategies doesn't know how to handle, you'll get an error. You should avoid doing this unless you're extending most or all existing strategies in some way. Example: ``MinNumber`` ++++++++++++++++++++++ Here's some example code creating a number strategy that tracks the `minimum number`_ seen and includes it in the output schema. .. 
code-block:: python from genson import SchemaBuilder from genson.schema.strategies import Number class MinNumber(Number): # add 'minimum' to list of keywords KEYWORDS = (*Number.KEYWORDS, 'minimum') # create a new instance variable def __init__(self, node_class): super().__init__(node_class) self.min = None # capture 'minimum's from schemas def add_schema(self, schema): super().add_schema(schema) if self.min is None: self.min = schema.get('minimum') elif 'minimum' in schema: self.min = min(self.min, schema['minimum']) # adjust minimum based on the data def add_object(self, obj): super().add_object(obj) self.min = obj if self.min is None else min(self.min, obj) # include 'minimum' in the output def to_schema(self): schema = super().to_schema() schema['minimum'] = self.min return schema # new SchemaBuilder class that uses the MinNumber strategy in addition # to the existing strategies. Both MinNumber and Number are active, but # MinNumber has priority, so it effectively replaces Number. class MinNumberSchemaBuilder(SchemaBuilder): """ all number nodes include minimum """ EXTRA_STRATEGIES = (MinNumber,) # this class *ONLY* has the MinNumber strategy. Any object that is not # a number will cause an error. class ExclusiveMinNumberSchemaBuilder(SchemaBuilder): """ all number nodes include minimum, and only handles number """ STRATEGIES = (MinNumber,) Now that we have the MinNumberSchemaBuilder class, let's see how it works. .. 
code-block:: python >>> builder = MinNumberSchemaBuilder() >>> builder.add_object(5) >>> builder.add_object(7) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'integer', 'minimum': 5} >>> builder.add_object(-2) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'integer', 'minimum': -2} >>> builder.add_schema({'$schema': 'http://json-schema.org/schema#', 'type': 'integer', 'minimum': -7}) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'integer', 'minimum': -7} Note that the exclusive builder is much more particular. .. code-block:: python >>> builder = MinNumberSchemaBuilder() >>> picky_builder = ExclusiveMinNumberSchemaBuilder() >>> picky_builder.add_object(5) >>> picky_builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'integer', 'minimum': 5} >>> builder.add_object(None) # this is fine >>> picky_builder.add_object(None) # this fails genson.schema.node.SchemaGenerationError: Could not find matching schema type for object: None Contributing ------------ When contributing, please follow these steps: 1. Clone the repo and make your changes. 2. Make sure your code has test cases written against it. 3. Lint your code with `Flake8`_. 4. Run `tox`_ to make sure the test suite passes. 5. Ensure the docs are accurate. 6. Add your name to the list of contributors. 7. Submit a Pull Request. Tests +++++ Tests are written in ``unittest`` and are run using `tox`_ and `nose`_. Tox will run all tests with coverage against each supported Python version that is installed on your machine. .. code-block:: bash $ tox Integration +++++++++++ When you submit a PR, `Travis CI`_ performs the following steps: 1. Lints the code with Flake8. 2. Runs the entire test suite against each supported Python version. 3. Ensures that test coverage is at least 90%. If any of these steps fails, your PR cannot be merged until it is fixed.
Potential Future Features +++++++++++++++++++++++++ The following are extra features under consideration. * recognize every validation keyword and ignore any that don't apply * option to set error level * custom serializer plugins * logical support for more keywords: * ``enum`` * ``minimum``/``maximum`` * ``minLength``/``maxLength`` * ``minItems``/``maxItems`` * ``minProperties``/``maxProperties`` * ``additionalItems`` * ``additionalProperties`` * ``format`` & ``pattern`` * ``$ref`` & ``id`` .. _JSON Schema: http://json-schema.org/ .. _Java Genson library: https://owlike.github.io/genson/ .. _`Python's builtin json library`: https://docs.python.org/library/json.html .. _below: #typeless-schemas .. _array validation here: https://spacetelescope.github.io/understanding-json-schema/reference/array.html#items .. _patternProperties: https://spacetelescope.github.io/understanding-json-schema/reference/object.html#pattern-properties .. _Python flavor of RegEx: https://docs.python.org/3.6/library/re.html .. _the code: https://github.com/wolverdude/GenSON/tree/master/genson/schema/strategies .. _minimum number: https://json-schema.org/understanding-json-schema/reference/numeric.html#range .. _Flake8: https://pypi.python.org/pypi/flake8 .. _tox: https://pypi.python.org/pypi/tox .. _nose: https://pypi.python.org/pypi/nose .. 
_Travis CI: https://travis-ci.com/github/wolverdude/GenSON ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1715810318.9226403 genson-1.3.0/genson/0000755000076500000240000000000014621230017013016 5ustar00jrwstaff././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1715810297.0 genson-1.3.0/genson/__init__.py0000644000076500000240000000053314621227771015145 0ustar00jrwstafffrom .schema.builder import SchemaBuilder, Schema from .schema.node import SchemaNode, SchemaGenerationError from .schema.strategies.base import SchemaStrategy, TypedSchemaStrategy __version__ = '1.3.0' __all__ = [ 'SchemaBuilder', 'SchemaNode', 'SchemaGenerationError', 'Schema', 'SchemaStrategy', 'TypedSchemaStrategy'] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1715734326.0 genson-1.3.0/genson/__main__.py0000644000076500000240000001336514621003466015126 0ustar00jrwstaffimport argparse import sys import re import json from . 
import SchemaBuilder, __version__ class CLI: def __init__(self, prog=None): self._make_parser(prog) self._prepare_args() self.builder = SchemaBuilder(schema_uri=self.args.schema_uri) def run(self): if not self.args.schema and not self.args.object: self.fail('nothing to do - no schemas or objects given') self.add_schemas() self.add_objects() self.print_output() def add_schemas(self): for fp in self.args.schema: self._call_with_json_from_fp(self.builder.add_schema, fp) fp.close() def add_objects(self): for fp in self.args.object: self._call_with_json_from_fp(self.builder.add_object, fp) fp.close() def print_output(self): print(self.builder.to_json(indent=self.args.indent)) def fail(self, message): self.parser.error(message) def _make_parser(self, prog=None): file_type = argparse.FileType('r', encoding=self._get_encoding()) self.parser = argparse.ArgumentParser( add_help=False, prog=prog, description="""Generate one, unified JSON Schema from one or more JSON objects and/or JSON Schemas. Compatible with JSON-Schema Draft 4 and above.""") self.parser.add_argument( '-h', '--help', action='help', default=argparse.SUPPRESS, help='Show this help message and exit.') self.parser.add_argument( '--version', action='version', default=argparse.SUPPRESS, version='%(prog)s {}'.format(__version__), help='Show version number and exit.') self.parser.add_argument( '-d', '--delimiter', metavar='DELIM', help="""Set a delimiter. Use this option if the input files contain multiple JSON objects/schemas. You can pass any string. A few cases ('newline', 'tab', 'space') will get converted to a whitespace character. If this option is omitted, the parser will try to auto-detect boundaries.""") self.parser.add_argument( '-e', '--encoding', type=str, metavar='ENCODING', help="""Use ENCODING instead of the default system encoding when reading files.
ENCODING must be a valid codec name or alias.""") self.parser.add_argument( '-i', '--indent', type=int, metavar='SPACES', help="""Pretty-print the output, indenting SPACES spaces.""") self.parser.add_argument( '-s', '--schema', action='append', default=[], type=file_type, help="""File containing a JSON Schema (can be specified multiple times to merge schemas).""") self.parser.add_argument( '-$', '--schema-uri', metavar='SCHEMA_URI', dest='schema_uri', default=SchemaBuilder.DEFAULT_URI, help="""The value of the '$schema' keyword (defaults to {default!r} or can be specified in a schema with the -s option). If {null!r} is passed, the "$schema" keyword will not be included in the result.""".format(default=SchemaBuilder.DEFAULT_URI, null=SchemaBuilder.NULL_URI)) self.parser.add_argument( 'object', nargs=argparse.REMAINDER, type=file_type, help="""Files containing JSON objects (defaults to stdin if no arguments are passed).""") def _get_encoding(self): """ use separate arg parser to grab encoding argument before defining FileType args """ parser = argparse.ArgumentParser(add_help=False) parser.add_argument('-e', '--encoding', type=str) args, _ = parser.parse_known_args() return args.encoding def _prepare_args(self): self.args = self.parser.parse_args() self._prepare_delimiter() # default to stdin if no objects or schemas if not self.args.object and not sys.stdin.isatty(): self.args.object.append(sys.stdin) def _prepare_delimiter(self): """ manage special conversions for difficult bash characters """ if self.args.delimiter == 'newline': self.args.delimiter = '\n' elif self.args.delimiter == 'tab': self.args.delimiter = '\t' elif self.args.delimiter == 'space': self.args.delimiter = ' ' def _call_with_json_from_fp(self, method, fp): for json_string in self._get_json_strings(fp.read().strip()): try: json_obj = json.loads(json_string) except json.JSONDecodeError as err: self.fail('invalid JSON in {}: {}'.format(fp.name, err)) method(json_obj) def _get_json_strings(self, 
raw_text): if self.args.delimiter is None or self.args.delimiter == '': json_strings = self._detect_json_strings(raw_text) else: json_strings = raw_text.split(self.args.delimiter) # sanitize data before returning return [string.strip() for string in json_strings if string.strip()] @staticmethod def _detect_json_strings(raw_text): """ Use regex with lookaround to spot the boundaries between JSON objects. Unfortunately, it has to match *something*, so at least one character must be removed and replaced. """ strings = re.split(r'}\s*(?={)', raw_text) # put back the stripped character json_strings = [string + '}' for string in strings[:-1]] # the last one doesn't need to be modified json_strings.append(strings[-1]) return json_strings def main(): CLI().run() if __name__ == "__main__": CLI('genson').run() ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1715810318.9244697 genson-1.3.0/genson/schema/0000755000076500000240000000000014621230017014256 5ustar00jrwstaff././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1618759813.0 genson-1.3.0/genson/schema/__init__.py0000644000076500000240000000000014037050205016355 0ustar00jrwstaff././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 genson-1.3.0/genson/schema/builder.py0000644000076500000240000001245714407651467016311 0ustar00jrwstaffimport json from warnings import warn from .node import SchemaNode from .strategies import BASIC_SCHEMA_STRATEGIES class _MetaSchemaBuilder(type): def __init__(cls, name, bases, attrs): super().__init__(name, bases, attrs) if 'EXTRA_STRATEGIES' in attrs: schema_strategies = list(attrs['EXTRA_STRATEGIES']) # add in all strategies inherited from base classes for base in bases: schema_strategies += list(getattr(base, 'STRATEGIES', [])) unique_schema_strategies = [] for schema_strategy in schema_strategies: if schema_strategy not in unique_schema_strategies: 
unique_schema_strategies.append(schema_strategy) cls.STRATEGIES = tuple(unique_schema_strategies) # create a version of SchemaNode loaded with the custom strategies cls.NODE_CLASS = type('%sSchemaNode' % name, (SchemaNode,), {'STRATEGIES': cls.STRATEGIES}) class SchemaBuilder(metaclass=_MetaSchemaBuilder): """ ``SchemaBuilder`` is the basic schema generator class. ``SchemaBuilder`` instances can be loaded up with existing schemas and objects before being serialized. """ DEFAULT_URI = 'http://json-schema.org/schema#' NULL_URI = 'NULL' NODE_CLASS = SchemaNode STRATEGIES = BASIC_SCHEMA_STRATEGIES def __init__(self, schema_uri='DEFAULT'): """ :param schema_uri: value of the ``$schema`` keyword. If not given, it will use the value of the first available ``$schema`` keyword on an added schema or else the default: ``'http://json-schema.org/schema#'``. A value of ``False`` or ``None`` will direct GenSON to leave out the ``"$schema"`` keyword. """ if schema_uri is None or schema_uri is False: self.schema_uri = self.NULL_URI elif schema_uri == 'DEFAULT': self.schema_uri = None else: self.schema_uri = schema_uri if not issubclass(self.NODE_CLASS, SchemaNode): raise TypeError("NODE_CLASS %r is not a subclass of SchemaNode" % self.NODE_CLASS) self._root_node = self.NODE_CLASS() def add_schema(self, schema): """ Merge in a JSON schema. This can be a ``dict`` or another ``SchemaBuilder`` :param schema: a JSON Schema .. note:: There is no schema validation. If you pass in a bad schema, you might get back a bad schema. """ if isinstance(schema, SchemaBuilder): schema_uri = schema.schema_uri schema = schema.to_schema() if schema_uri is None: del schema['$schema'] elif isinstance(schema, SchemaNode): schema = schema.to_schema() if '$schema' in schema: self.schema_uri = self.schema_uri or schema['$schema'] schema = dict(schema) del schema['$schema'] self._root_node.add_schema(schema) def add_object(self, obj): """ Modify the schema to accommodate an object. 
:param obj: any object or scalar that can be serialized in JSON """ self._root_node.add_object(obj) def to_schema(self): """ Generate a schema based on previous inputs. :rtype: ``dict`` """ schema = self._base_schema() schema.update(self._root_node.to_schema()) return schema def to_json(self, *args, **kwargs): """ Generate a schema and convert it directly to serialized JSON. :rtype: ``str`` """ return json.dumps(self.to_schema(), *args, **kwargs) def __len__(self): """ Number of ``SchemaStrategy``s at the top level. This is used mostly to check for emptiness. """ return len(self._root_node) def __eq__(self, other): """ Check for equality with another ``SchemaBuilder`` object. :param other: another ``SchemaBuilder`` object. Other types are accepted, but will always return ``False`` """ if other is self: return True if not isinstance(other, self.__class__): return False # use _base_schema to get proper comparison for $schema keyword return (self._base_schema() == other._base_schema() and self._root_node == other._root_node) def _base_schema(self): if self.schema_uri == self.NULL_URI: return {} else: return {'$schema': self.schema_uri or self.DEFAULT_URI} class Schema(SchemaBuilder): def __init__(self): warn('genson.Schema is deprecated in v1.0, and it may be ' 'removed in future versions. Use genson.SchemaBuilder ' 'instead.', PendingDeprecationWarning) super().__init__(schema_uri=SchemaBuilder.NULL_URI) def to_dict(self, recurse='DEPRECATED'): warn('#to_dict is deprecated in v1.0, and it may be removed in ' 'future versions.
Use #to_schema instead.', PendingDeprecationWarning) if recurse != 'DEPRECATED': warn('the `recurse` option for #to_dict does nothing in v1.0', DeprecationWarning) return self.to_schema() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 genson-1.3.0/genson/schema/node.py0000644000076500000240000001120514407651467015576 0ustar00jrwstafffrom .strategies import BASIC_SCHEMA_STRATEGIES, Typeless class SchemaGenerationError(RuntimeError): pass class SchemaNode: """ Basic schema generator class. SchemaNode objects can be loaded up with existing schemas and objects before being serialized. """ STRATEGIES = BASIC_SCHEMA_STRATEGIES def __init__(self): self._active_strategies = [] def add_schema(self, schema): """ Merges in an existing schema. arguments: * `schema` (required - `dict` or `SchemaNode`): an existing JSON Schema to merge. """ # serialize instances of SchemaNode before parsing if isinstance(schema, SchemaNode): schema = schema.to_schema() for subschema in self._get_subschemas(schema): # delegate to SchemaType object active_strategy = self._get_strategy_for_schema(subschema) active_strategy.add_schema(subschema) # return self for easy method chaining return self def add_object(self, obj): """ Modify the schema to accommodate an object. arguments: * `obj` (required - `dict`): a JSON object to use in generating the schema. """ # delegate to SchemaType object active_strategy = self._get_strategy_for_object(obj) active_strategy.add_object(obj) # return self for easy method chaining return self def to_schema(self): """ Convert the current schema to a `dict`. 
""" types = set() generated_schemas = [] for active_strategy in self._active_strategies: generated_schema = active_strategy.to_schema() if len(generated_schema) == 1 and 'type' in generated_schema: types.add(generated_schema['type']) else: generated_schemas.append(generated_schema) if types: if len(types) == 1: (types,) = types else: types = sorted(types) generated_schemas = [{'type': types}] + generated_schemas if len(generated_schemas) == 1: (result_schema,) = generated_schemas elif generated_schemas: result_schema = {'anyOf': generated_schemas} else: result_schema = {} return result_schema def __len__(self): return len(self._active_strategies) def __eq__(self, other): """ Required for SchemaBuilder.__eq__ to work properly """ return (isinstance(other, self.__class__) and self.__dict__ == other.__dict__) # private methods def _get_subschemas(self, schema): if 'anyOf' in schema: return [subschema for anyof in schema['anyOf'] for subschema in self._get_subschemas(anyof)] elif isinstance(schema.get('type'), list): other_keys = dict(schema) del other_keys['type'] return [dict(type=tipe, **other_keys) for tipe in schema['type']] else: return [schema] def _get_strategy_for_schema(self, schema): return self._get_strategy_for_('schema', schema) def _get_strategy_for_object(self, obj): return self._get_strategy_for_('object', obj) def _get_strategy_for_(self, kind, schema_or_obj): # check existing types for active_strategy in self._active_strategies: if getattr(active_strategy, 'match_' + kind)(schema_or_obj): return active_strategy # check all potential types for strategy in self.STRATEGIES: if getattr(strategy, 'match_' + kind)(schema_or_obj): active_strategy = strategy(self.__class__) # incorporate typeless strategy if it exists if self._active_strategies and \ isinstance(self._active_strategies[-1], Typeless): typeless = self._active_strategies.pop() active_strategy.add_schema(typeless.to_schema()) self._active_strategies.append(active_strategy) return active_strategy 
# no match found, if typeless add to first strategy if kind == 'schema' and Typeless.match_schema(schema_or_obj): if not self._active_strategies: self._active_strategies.append(Typeless(self.__class__)) active_strategy = self._active_strategies[0] return active_strategy # no match found, raise an error raise SchemaGenerationError( 'Could not find matching schema type for {0}: {1!r}'.format( kind, schema_or_obj)) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1715810318.9254053 genson-1.3.0/genson/schema/strategies/0000755000076500000240000000000014621230017016430 5ustar00jrwstaff././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1618759813.0 genson-1.3.0/genson/schema/strategies/__init__.py0000644000076500000240000000101214037050205020533 0ustar00jrwstafffrom .base import ( SchemaStrategy, TypedSchemaStrategy ) from .scalar import ( Typeless, Null, Boolean, Number, String ) from .array import List, Tuple from .object import Object BASIC_SCHEMA_STRATEGIES = ( Null, Boolean, Number, String, List, Tuple, Object ) __all__ = ( 'SchemaStrategy', 'TypedSchemaStrategy', 'Null', 'Boolean', 'Number', 'String', 'List', 'Tuple', 'Object', 'Typeless', 'BASIC_SCHEMA_STRATEGIES' ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 genson-1.3.0/genson/schema/strategies/array.py0000644000076500000240000000407514407651467020150 0ustar00jrwstafffrom .base import SchemaStrategy class BaseArray(SchemaStrategy): """ abstract array schema strategy """ KEYWORDS = ('type', 'items') @staticmethod def match_object(obj): return isinstance(obj, list) def to_schema(self): schema = super().to_schema() schema['type'] = 'array' if self._items: schema['items'] = self.items_to_schema() return schema class List(BaseArray): """ strategy for list-style array schemas. This is the default strategy for arrays. 
""" @staticmethod def match_schema(schema): return schema.get('type') == 'array' \ and isinstance(schema.get('items', {}), dict) def __init__(self, node_class): super().__init__(node_class) self._items = node_class() def add_schema(self, schema): super().add_schema(schema) if 'items' in schema: self._items.add_schema(schema['items']) def add_object(self, obj): for item in obj: self._items.add_object(item) def items_to_schema(self): return self._items.to_schema() class Tuple(BaseArray): """ strategy for tuple-style array schemas. These will always have an items key to preserve the fact that it's a tuple. """ @staticmethod def match_schema(schema): return schema.get('type') == 'array' \ and isinstance(schema.get('items'), list) def __init__(self, node_class): super().__init__(node_class) self._items = [node_class()] def add_schema(self, schema): super().add_schema(schema) if 'items' in schema: self._add(schema['items'], 'add_schema') def add_object(self, obj): self._add(obj, 'add_object') def _add(self, items, func): while len(self._items) < len(items): self._items.append(self.node_class()) for subschema, item in zip(self._items, items): getattr(subschema, func)(item) def items_to_schema(self): return [item.to_schema() for item in self._items] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 genson-1.3.0/genson/schema/strategies/base.py0000644000076500000240000000422414407651467017740 0ustar00jrwstafffrom copy import copy from warnings import warn class SchemaStrategy: """ base schema strategy. 
This contains the common interface for all subclasses: * match_schema * match_object * __init__ * add_schema * add_object * to_schema * __eq__ """ KEYWORDS = ('type',) @classmethod def match_schema(cls, schema): raise NotImplementedError("'match_schema' not implemented") @classmethod def match_object(cls, obj): raise NotImplementedError("'match_object' not implemented") def __init__(self, node_class): self.node_class = node_class self._extra_keywords = {} def add_schema(self, schema): self._add_extra_keywords(schema) def _add_extra_keywords(self, schema): for keyword, value in schema.items(): if keyword in self.KEYWORDS: continue elif keyword not in self._extra_keywords: self._extra_keywords[keyword] = value elif self._extra_keywords[keyword] != value: warn(('Schema incompatible. Keyword {0!r} has conflicting ' 'values ({1!r} vs. {2!r}). Using {1!r}').format( keyword, self._extra_keywords[keyword], value)) def add_object(self, obj): pass def to_schema(self): return copy(self._extra_keywords) def __eq__(self, other): """ Required for SchemaBuilder.__eq__ to work properly """ return (isinstance(other, self.__class__) and self.__dict__ == other.__dict__) class TypedSchemaStrategy(SchemaStrategy): """ base schema strategy class for scalar types. 
Subclasses define these two class constants: * `JS_TYPE`: a valid value of the `type` keyword * `PYTHON_TYPE`: Python type objects - can be a tuple of types """ @classmethod def match_schema(cls, schema): return schema.get('type') == cls.JS_TYPE @classmethod def match_object(cls, obj): return isinstance(obj, cls.PYTHON_TYPE) def to_schema(self): schema = super().to_schema() schema['type'] = self.JS_TYPE return schema ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 genson-1.3.0/genson/schema/strategies/object.py0000644000076500000240000000631414407651467020276 0ustar00jrwstafffrom collections import defaultdict from re import search from .base import SchemaStrategy class Object(SchemaStrategy): """ object schema strategy """ KEYWORDS = ('type', 'properties', 'patternProperties', 'required') @staticmethod def match_schema(schema): return schema.get('type') == 'object' @staticmethod def match_object(obj): return isinstance(obj, dict) def __init__(self, node_class): super().__init__(node_class) self._properties = defaultdict(node_class) self._pattern_properties = defaultdict(node_class) self._required = None self._include_empty_required = False def add_schema(self, schema): super().add_schema(schema) if 'properties' in schema: for prop, subschema in schema['properties'].items(): subnode = self._properties[prop] if subschema is not None: subnode.add_schema(subschema) if 'patternProperties' in schema: for pattern, subschema in schema['patternProperties'].items(): subnode = self._pattern_properties[pattern] if subschema is not None: subnode.add_schema(subschema) if 'required' in schema: required = set(schema['required']) if not required: self._include_empty_required = True if self._required is None: self._required = required else: self._required &= required def add_object(self, obj): properties = set() for prop, subobj in obj.items(): pattern = None if prop not in self._properties: pattern = self._matching_pattern(prop) if 
pattern is not None: self._pattern_properties[pattern].add_object(subobj) else: properties.add(prop) self._properties[prop].add_object(subobj) if self._required is None: self._required = properties else: self._required &= properties def _matching_pattern(self, prop): for pattern in self._pattern_properties.keys(): if search(pattern, prop): return pattern def _add(self, items, func): while len(self._items) < len(items): self._items.append(self._schema_node_class()) for subschema, item in zip(self._items, items): getattr(subschema, func)(item) def to_schema(self): schema = super().to_schema() schema['type'] = 'object' if self._properties: schema['properties'] = self._properties_to_schema( self._properties) if self._pattern_properties: schema['patternProperties'] = self._properties_to_schema( self._pattern_properties) if self._required or self._include_empty_required: schema['required'] = sorted(self._required) return schema def _properties_to_schema(self, properties): schema_properties = {} for prop, schema_node in properties.items(): schema_properties[prop] = schema_node.to_schema() return schema_properties ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 genson-1.3.0/genson/schema/strategies/scalar.py0000644000076500000240000000354314407651467020276 0ustar00jrwstafffrom .base import SchemaStrategy, TypedSchemaStrategy class Typeless(SchemaStrategy): """ schema strategy for schemas with no type. This is only used when there is no other active strategy, and it will be merged into the first typed strategy that gets added. 
""" @classmethod def match_schema(cls, schema): return 'type' not in schema @classmethod def match_object(cls, obj): return False class Null(TypedSchemaStrategy): """ strategy for null schemas """ JS_TYPE = 'null' PYTHON_TYPE = type(None) class Boolean(TypedSchemaStrategy): """ strategy for boolean schemas """ JS_TYPE = 'boolean' PYTHON_TYPE = bool class String(TypedSchemaStrategy): """ strategy for string schemas - works for ascii and unicode strings """ JS_TYPE = 'string' PYTHON_TYPE = str class Number(SchemaStrategy): """ strategy for integer and number schemas. It automatically converts from `integer` to `number` when a float object or a number schema is added """ JS_TYPES = ('integer', 'number') PYTHON_TYPES = (int, float) @classmethod def match_schema(cls, schema): return schema.get('type') in cls.JS_TYPES @classmethod def match_object(cls, obj): # cannot use isinstance() because boolean is a subtype of int return type(obj) in cls.PYTHON_TYPES def __init__(self, node_class): super().__init__(node_class) self._type = 'integer' def add_schema(self, schema): super().add_schema(schema) if schema.get('type') == 'number': self._type = 'number' def add_object(self, obj): if isinstance(obj, float): self._type = 'number' def to_schema(self): schema = super().to_schema() schema['type'] = self._type return schema ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1715810318.9281256 genson-1.3.0/genson.egg-info/0000755000076500000240000000000014621230017014510 5ustar00jrwstaff././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1715810318.0 genson-1.3.0/genson.egg-info/PKG-INFO0000644000076500000240000006757614621230016015631 0ustar00jrwstaffMetadata-Version: 2.1 Name: genson Version: 1.3.0 Summary: GenSON is a powerful, user-friendly JSON Schema generator. 
Home-page: https://github.com/wolverdude/genson/ Download-URL: https://github.com/wolverdude/GenSON/tarball/v0.2s.0 Author: Jon Wolverton Author-email: wolverton.jr@gmail.com License: MIT Keywords: json, schema, json-schema, jsonschema, object, generate, generator, builder, merge, draft 7, validate, validation Classifier: Development Status :: 5 - Production/Stable Classifier: Environment :: Console Classifier: Intended Audience :: Developers Classifier: Natural Language :: English Classifier: License :: OSI Approved :: MIT License Classifier: Operating System :: OS Independent Classifier: Programming Language :: Python Classifier: Programming Language :: Python :: 3 Classifier: Programming Language :: Python :: 3.7 Classifier: Programming Language :: Python :: 3.8 Classifier: Programming Language :: Python :: 3.9 Classifier: Programming Language :: Python :: 3.10 Classifier: Programming Language :: Python :: 3.11 Classifier: Programming Language :: Python :: 3.12 Classifier: Topic :: Software Development :: Code Generators Classifier: Topic :: Software Development :: Libraries :: Python Modules Classifier: Topic :: Utilities Description-Content-Type: text/x-rst License-File: LICENSE License-File: AUTHORS.rst GenSON ====== **GenSON** is a powerful, user-friendly `JSON Schema`_ generator built in Python. .. note:: This is *not* the Python equivalent of the `Java Genson library`_. If you are coming from Java and need to create JSON objects in Python, you want `Python's builtin json library`_. GenSON's core function is to take JSON objects and generate schemas that describe them, but it is unique in its ability to *merge* schemas. It was originally built to describe the common structure of a large number of JSON objects, and it uses its merging ability to generate a single schema from any number of JSON objects and/or schemas. GenSON's schema builder follows these three rules: 1. *Every* object it is given must validate under the generated schema. 2. 
*Any* object that is valid under *any* schema it is given must also validate under the generated schema. (there is one glaring exception to this, detailed `below`_) 3. The generated schema should be as strict as possible given the first 2 rules. JSON Schema Implementation -------------------------- **GenSON** is compatible with JSON Schema Draft 6 and above. It is important to note that GenSON uses only a subset of JSON Schema's capabilities. This is mainly because it doesn't know the specifics of your data model, and it tries to avoid guessing them. Its purpose is to generate the basic structure so that you can skip the boilerplate and focus on the details of the schema. Currently, GenSON only deals with these keywords: * ``"$schema"`` * ``"type"`` * ``"items"`` * ``"properties"`` * ``"patternProperties"`` * ``"required"`` * ``"anyOf"`` You should be aware that this limited vocabulary could cause GenSON to violate rules 1 and 2. If you feed it schemas with advanced keywords, it will just blindly pass them on to the final schema. Note that ``"$ref"`` and ``id`` are also not supported, so GenSON will not dereference linked nodes when building a schema. Installation ------------ .. code-block:: bash $ pip install genson CLI Tool -------- The package includes a ``genson`` executable that allows you to access this functionality from the command line. For usage info, run with ``--help``: .. code-block:: bash $ genson --help .. code-block:: usage: genson [-h] [--version] [-d DELIM] [-e ENCODING] [-i SPACES] [-s SCHEMA] [-$ SCHEMA_URI] ... Generate one, unified JSON Schema from one or more JSON objects and/or JSON Schemas. Compatible with JSON-Schema Draft 4 and above. positional arguments: object Files containing JSON objects (defaults to stdin if no arguments are passed). optional arguments: -h, --help Show this help message and exit. --version Show version number and exit. -d DELIM, --delimiter DELIM Set a delimiter. 
Use this option if the input files contain multiple JSON objects/schemas. You can pass any string. A few cases ('newline', 'tab', 'space') will get converted to a whitespace character. If this option is omitted, the parser will try to auto-detect boundaries. -e ENCODING, --encoding ENCODING Use ENCODING instead of the default system encoding when reading files. ENCODING must be a valid codec name or alias. -i SPACES, --indent SPACES Pretty-print the output, indenting SPACES spaces. -s SCHEMA, --schema SCHEMA File containing a JSON Schema (can be specified multiple times to merge schemas). -$ SCHEMA_URI, --schema-uri SCHEMA_URI The value of the '$schema' keyword (defaults to 'http://json-schema.org/schema#' or can be specified in a schema with the -s option). If 'NULL' is passed, the "$schema" keyword will not be included in the result. GenSON Python API ----------------- ``SchemaBuilder`` is the basic schema generator class. ``SchemaBuilder`` instances can be loaded up with existing schemas and objects before being serialized. .. code-block:: python >>> from genson import SchemaBuilder >>> builder = SchemaBuilder() >>> builder.add_schema({"type": "object", "properties": {}}) >>> builder.add_object({"hi": "there"}) >>> builder.add_object({"hi": 5}) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'object', 'properties': { 'hi': {'type': ['integer', 'string']}}, 'required': ['hi']} >>> print(builder.to_json(indent=2)) { "$schema": "http://json-schema.org/schema#", "type": "object", "properties": { "hi": { "type": [ "integer", "string" ] } }, "required": [ "hi" ] } ``SchemaBuilder`` API +++++++++++++++++++++ ``__init__(schema_uri=None)`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ :param schema_uri: value of the ``$schema`` keyword. If not given, it will use the value of the first available ``$schema`` keyword on an added schema or else the default: ``'http://json-schema.org/schema#'``. 
A value of ``False`` or ``None`` will direct GenSON to leave out the ``"$schema"`` keyword. ``add_schema(schema)`` ^^^^^^^^^^^^^^^^^^^^^^ Merge in a JSON schema. This can be a ``dict`` or another ``SchemaBuilder`` object. :param schema: a JSON Schema .. note:: There is no schema validation. If you pass in a bad schema, you might get back a bad schema. ``add_object(obj)`` ^^^^^^^^^^^^^^^^^^^ Modify the schema to accommodate an object. :param obj: any object or scalar that can be serialized in JSON ``to_schema()`` ^^^^^^^^^^^^^^^ Generate a schema based on previous inputs. :rtype: ``dict`` ``to_json()`` ^^^^^^^^^^^^^ Generate a schema and convert it directly to serialized JSON. :rtype: ``str`` ``__eq__(other)`` ^^^^^^^^^^^^^^^^^ Check for equality with another ``SchemaBuilder`` object. :param other: another ``SchemaBuilder`` object. Other types are accepted, but will always return ``False`` SchemaBuilder object interaction ++++++++++++++++++++++++++++++++ ``SchemaBuilder`` objects can also interact with each other: * You can pass one schema directly to another to merge them. * You can compare schema equality directly. .. code-block:: python >>> from genson import SchemaBuilder >>> b1 = SchemaBuilder() >>> b1.add_schema({"type": "object", "properties": { ... "hi": {"type": "string"}}}) >>> b2 = SchemaBuilder() >>> b2.add_schema({"type": "object", "properties": { ... "hi": {"type": "integer"}}}) >>> b1 == b2 False >>> b1.add_schema(b2) >>> b2.add_schema(b1) >>> b1 == b2 True >>> b1.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'object', 'properties': {'hi': {'type': ['integer', 'string']}}} Seed Schemas ------------ There are several cases where multiple valid schemas could be generated from the same object. GenSON makes a default choice in all these ambiguous cases, but if you want it to choose differently, you can tell it what to do using a *seed schema*. Seeding Arrays ++++++++++++++ For example, suppose you have a simple array with two items: .. 
code-block:: python ['one', 1] There are always two ways for GenSON to interpret any array: List and Tuple. Lists have one schema for every item, whereas Tuples have a different schema for every array position. This is analogous to the (now deprecated) ``merge_arrays`` option from version 0. You can read more about JSON Schema `array validation here`_. List Validation ^^^^^^^^^^^^^^^ .. code-block:: json { "type": "array", "items": {"type": ["integer", "string"]} } Tuple Validation ^^^^^^^^^^^^^^^^ .. code-block:: json { "type": "array", "items": [{"type": "integer"}, {"type": "string"}] } By default, GenSON always interprets arrays using list validation, but you can tell it to use tuple validation by seeding it with a schema. .. code-block:: python >>> from genson import SchemaBuilder >>> builder = SchemaBuilder() >>> builder.add_object(['one', 1]) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'array', 'items': {'type': ['integer', 'string']}} >>> builder = SchemaBuilder() >>> seed_schema = {'type': 'array', 'items': []} >>> builder.add_schema(seed_schema) >>> builder.add_object(['one', 1]) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'array', 'items': [{'type': 'string'}, {'type': 'integer'}]} Note that in this case, the seed schema is actually invalid. You can't have an empty array as the value for an ``items`` keyword. But GenSON is a generator, not a validator, so you can fudge a little. GenSON will modify the generated schema so that it is valid, provided that there aren't invalid keywords beyond the ones it knows about. Seeding patternProperties +++++++++++++++++++++++++ Support for patternProperties_ is new in version 1; however, since GenSON's default behavior is to only use ``properties``, this powerful keyword can only be utilized with seed schemas. You will need to supply an ``object`` schema with a ``patternProperties`` object whose keys are RegEx strings. 
Again, you can fudge here and set the values to null instead of creating valid subschemas. .. code-block:: python >>> from genson import SchemaBuilder >>> builder = SchemaBuilder() >>> builder.add_schema({'type': 'object', 'patternProperties': {r'^\d+$': None}}) >>> builder.add_object({'1': 1, '2': 2, '3': 3}) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'object', 'patternProperties': {'^\\d+$': {'type': 'integer'}}} There are a few gotchas you should be aware of here: * GenSON is written in Python, so it uses the `Python flavor of RegEx`_. * GenSON still prefers ``properties`` to ``patternProperties``: if a property already exists that matches one of your patterns, the normal property will be updated, *not* the pattern property. * If a key matches multiple patterns, there is *no guarantee* of which one will be updated. * The patternProperties_ docs themselves have some more useful pointers that can save you time. Typeless Schemas ++++++++++++++++ In version 0, GenSON did not accept a schema without a type, but in order to be flexible in the support of seed schemas, support was added for version 1. However, GenSON violates rule #2 in its handling of typeless schemas. Any object will validate under an empty schema, but GenSON incorporates typeless schemas into the first-available typed schema, and since typed schemas are stricter than typeless ones, some objects that would validate under an added typeless schema will not validate under the result. Customizing ``SchemaBuilder`` ----------------------------- You can extend the ``SchemaBuilder`` class to add in your own logic (e.g. recording ``minimum`` and ``maximum`` for a number). In order to do this, you need to: 1. Create a custom ``SchemaStrategy`` class. 2. Create a ``SchemaBuilder`` subclass that includes your custom ``SchemaStrategy`` class(es). 3. Use your custom ``SchemaBuilder`` just like you would the stock ``SchemaBuilder``. 
``SchemaStrategy`` Classes ++++++++++++++++++++++++++ GenSON uses the Strategy Pattern to parse, update, and serialize different kinds of schemas that behave in different ways. There are several ``SchemaStrategy`` classes that roughly correspond to different schema types. GenSON maps each node in an object or schema to an instance of one of these classes. Each instance stores the current schema state and updates or returns it when required. You can modify the specific ways these classes work by extending them. You can inherit from any existing ``SchemaStrategy`` class, though ``SchemaStrategy`` and ``TypedSchemaStrategy`` are the most useful base classes. You should call ``super`` and pass along all arguments when overriding any instance methods. The documentation below explains the public API and what you need to extend and override at a high level. Feel free to explore `the code`_ to see more, but know that the public API is documented here, and anything else you depend on could be subject to change. All ``SchemaStrategy`` subclasses maintain the public API though, so you can extend any of them in this way. ``SchemaStrategy`` API ++++++++++++++++++++++ [class constant] ``KEYWORDS`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This should be a tuple listing all of the JSON-schema keywords that this strategy knows how to handle. Any keywords encountered in added schemas will be naively passed on to the generated schema unless they are in this list (or you override that behavior in ``to_schema``). When adding keywords to a new ``SchemaStrategy``, it's best to splat the parent class's ``KEYWORDS`` into the new tuple, e.g. ``(*Number.KEYWORDS, 'minimum')``. [class method] ``match_schema(cls, schema)`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Return ``True`` if this strategy should be used to handle the passed-in schema. 
:param schema: a JSON Schema in ``dict`` form :rtype: ``bool`` [class method] ``match_object(cls, obj)`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Return ``True`` if this strategy should be used to handle the passed-in object. :param obj: any object or scalar that can be serialized in JSON :rtype: ``bool`` ``__init__(self, node_class)`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Override this method if you need to initialize an instance variable. :param node_class: This param is not part of the public API. Pass it along to ``super``. ``add_schema(self, schema)`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Override this to modify how a schema is parsed and stored. :param schema: a JSON Schema in ``dict`` form ``add_object(self, obj)`` ^^^^^^^^^^^^^^^^^^^^^^^^^ Override this to change the way schemas are inferred from objects. :param obj: any object or scalar that can be serialized in JSON ``to_schema(self)`` ^^^^^^^^^^^^^^^^^^^ Override this method to customize how a schema object is constructed from the inputs. It is suggested that you invoke ``super`` as the basis for the return value, but it is not required. :rtype: ``dict`` .. note:: There is no schema validation. If you return a bad schema from this method, ``SchemaBuilder`` will output a bad schema. ``__eq__(self, other)`` ^^^^^^^^^^^^^^^^^^^^^^^ When checking for ``SchemaBuilder`` equality, strategies are compared using ``__eq__``. The default implementation uses a simple ``__dict__`` equality check. Override this method if you need to change that behavior. This may be useful if you add instance variables that aren't relevant to whether two SchemaStrategies are considered equal. :rtype: ``bool`` ``TypedSchemaStrategy`` API +++++++++++++++++++++++++++ This is an abstract schema strategy for making simple schemas that only deal with the ``type`` keyword, but you can extend it to add more functionality. Subclasses must define the following two class constants, but you get the entire ``SchemaStrategy`` interface for free. 
[class constant] ``JS_TYPE`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This will be the value of the ``type`` keyword in the generated schema. It is also used to match any added schemas. [class constant] ``PYTHON_TYPE`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This is a Python type or tuple of types that will be matched against an added object using ``isinstance``. Extending ``SchemaBuilder`` +++++++++++++++++++++++++++ Once you have extended ``SchemaStrategy`` types, you'll need to create a ``SchemaBuilder`` class that uses them, since the default ``SchemaBuilder`` only incorporates the default strategies. To do this, extend the ``SchemaBuilder`` class and define one of these two constants inside it: [class constant] ``EXTRA_STRATEGIES`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This is the standard (and suggested) way to add strategies. Set it to a tuple of all your new strategies, and they will be added to the existing list of strategies to check. This preserves all the existing functionality. Note that order matters. GenSON checks the list in order, so the first strategy has priority over the second and so on. All ``EXTRA_STRATEGIES`` have priority over the default strategies. [class constant] ``STRATEGIES`` ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This clobbers the existing list of strategies and completely replaces it. Set it to a tuple just like for ``EXTRA_STRATEGIES``, but note that if any object or schema gets added that your exhaustive list of strategies doesn't know how to handle, you'll get an error. You should avoid doing this unless you're extending most or all existing strategies in some way. Example: ``MinNumber`` ++++++++++++++++++++++ Here's some example code creating a number strategy that tracks the `minimum number`_ seen and includes it in the output schema. .. 
code-block:: python from genson import SchemaBuilder from genson.schema.strategies import Number class MinNumber(Number): # add 'minimum' to list of keywords KEYWORDS = (*Number.KEYWORDS, 'minimum') # create a new instance variable def __init__(self, node_class): super().__init__(node_class) self.min = None # capture 'minimum's from schemas def add_schema(self, schema): super().add_schema(schema) if self.min is None: self.min = schema.get('minimum') elif 'minimum' in schema: self.min = min(self.min, schema['minimum']) # adjust minimum based on the data def add_object(self, obj): super().add_object(obj) self.min = obj if self.min is None else min(self.min, obj) # include 'minimum' in the output def to_schema(self): schema = super().to_schema() schema['minimum'] = self.min return schema # new SchemaBuilder class that uses the MinNumber strategy in addition # to the existing strategies. Both MinNumber and Number are active, but # MinNumber has priority, so it effectively replaces Number. class MinNumberSchemaBuilder(SchemaBuilder): """ all number nodes include minimum """ EXTRA_STRATEGIES = (MinNumber,) # this class *ONLY* has the MinNumber strategy. Any object that is not # a number will cause an error. class ExclusiveMinNumberSchemaBuilder(SchemaBuilder): """ all number nodes include minimum, and only handles number """ STRATEGIES = (MinNumber,) Now that we have the MinNumberSchemaBuilder class, let's see how it works. .. 
code-block:: python >>> builder = MinNumberSchemaBuilder() >>> builder.add_object(5) >>> builder.add_object(7) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'integer', 'minimum': 5} >>> builder.add_object(-2) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'integer', 'minimum': -2} >>> builder.add_schema({'$schema': 'http://json-schema.org/schema#', 'type': 'integer', 'minimum': -7}) >>> builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'integer', 'minimum': -7} Note that the exclusive builder is much more particular. .. code-block:: python >>> builder = MinNumberSchemaBuilder() >>> picky_builder = ExclusiveMinNumberSchemaBuilder() >>> picky_builder.add_object(5) >>> picky_builder.to_schema() {'$schema': 'http://json-schema.org/schema#', 'type': 'integer', 'minimum': 5} >>> builder.add_object(None) # this is fine >>> picky_builder.add_object(None) # this fails genson.schema.node.SchemaGenerationError: Could not find matching schema type for object: None Contributing ------------ When contributing, please follow these steps: 1. Clone the repo and make your changes. 2. Make sure your code has test cases written against it. 3. Lint your code with `Flake8`_. 4. Run `tox`_ to make sure the test suite passes. 5. Ensure the docs are accurate. 6. Add your name to the list of contributors. 7. Submit a Pull Request. Tests +++++ Tests are written in ``unittest`` and are run using `tox`_ and `nose`_. Tox will run all tests with coverage against each supported Python version that is installed on your machine. .. code-block:: bash $ tox Integration +++++++++++ When you submit a PR, `Travis CI`_ performs the following steps: 1. Lints the code with Flake8. 2. Runs the entire test suite against each supported Python version. 3. Ensures that test coverage is at least 90%. If any of these steps fails, your PR cannot be merged until it is fixed. 
Potential Future Features +++++++++++++++++++++++++ The following are extra features under consideration. * recognize every validation keyword and ignore any that don't apply * option to set error level * custom serializer plugins * logical support for more keywords: * ``enum`` * ``minimum``/``maximum`` * ``minLength``/``maxLength`` * ``minItems``/``maxItems`` * ``minProperties``/``maxProperties`` * ``additionalItems`` * ``additionalProperties`` * ``format`` & ``pattern`` * ``$ref`` & ``id`` .. _JSON Schema: http://json-schema.org/ .. _Java Genson library: https://owlike.github.io/genson/ .. _`Python's builtin json library`: https://docs.python.org/library/json.html .. _below: #typeless-schemas .. _array validation here: https://spacetelescope.github.io/understanding-json-schema/reference/array.html#items .. _patternProperties: https://spacetelescope.github.io/understanding-json-schema/reference/object.html#pattern-properties .. _Python flavor of RegEx: https://docs.python.org/3.6/library/re.html .. _the code: https://github.com/wolverdude/GenSON/tree/master/genson/schema/strategies .. _minimum number: https://json-schema.org/understanding-json-schema/reference/numeric.html#range .. _Flake8: https://pypi.python.org/pypi/flake8 .. _tox: https://pypi.python.org/pypi/tox .. _nose: https://pypi.python.org/pypi/nose .. 
_Travis CI: https://travis-ci.com/github/wolverdude/GenSON History ======= 1.3.0 ----- Modernization * add support for Python versions up through 3.12 * remove support for Python versions older than 3.7 since test dependencies no longer support them * remove Python 2.7 support * remove tests & test commands only relevant to Python 2.7 * remove backwards-compatibility from code * enable running as a module (``python -m genson``) * modernize package configuration (issue #68) * use a valid ``schema_uri`` in tests (issue #69) 1.2.2 ----- * add ``__version__`` attr to module and ``--version`` option to CLI tool * add ``--encoding`` option to CLI tool that overrides default file encoding (fixes #47) 1.2.1 ----- * expose ``SchemaStrategy.__eq__()`` for extension * add support for Python 3.8 * update Trove classifiers * **Bugfix**: ``SchemaBuilder.__eq__()`` wasn't matching the ``$schema`` keyword correctly * **Bugfix**: only activate empty ``required`` option when ``required`` is actually empty 1.2.0 ----- * ``SchemaStrategies`` are now extendable, enabling custom ``SchemaBuilder`` classes. * optimize ``__eq__`` logic 1.1.0 ----- * add support for Python 3.7 * drop support for Python 3.3 * drop support for JSON-Schema Draft 4 (because it doesn't allow empty ``required`` arrays) * **Bugfix**: preserve empty ``required`` arrays (fixes #25) * **Bugfix**: handle nested ``anyOf`` keywords (fixes #35) 1.0.2 ----- * add support for ``long`` integers in Python 2.7 * update test-skipping decorator to use standard version requirement strings 1.0.1 ----- * **Bugfix**: seeding an object schema with a ``"required"`` keyword caused an error * **Docs**: fix mislabeled method 1.0.0 ----- This version was a total overhaul. The main change was to split Schema into three separate classes, making it simpler to add more complicated functionality by having different generator types to handle the different schema types. 1. ``SchemaNode`` to manage the tree structure 2. 
``SchemaGenerator`` for the schema generation logic 3. ``SchemaBuilder`` to manage the public API Interface Changes +++++++++++++++++ * ``SchemaBuilder`` is the new ``Schema`` * ``to_dict()`` is now called ``to_schema()`` To make the transition easier, there is still a ``Schema`` class that wraps ``SchemaBuilder`` with a backwards-compatibility layer, but you will trigger a ``PendingDeprecationWarning`` if you use it. Seed Schemas ++++++++++++ The ``merge_arrays`` option has been removed in favor of seed schemas. You can now seed specific nodes as list or tuple instead of setting a global option for every node in the schema tree. You can also now seed object nodes with ``patternProperties``, which was a highly requested feature. Other Changes +++++++++++++ * include ``"$schema"`` keyword * accept schemas without ``"type"`` keyword * use ``"anyOf"`` keyword to help combine schemas * add ``SchemaGenerationError`` for better error handling * empty ``"properties"`` and ``"items"`` are not included in generated schemas * ``genson`` executable * new ``--schema-uri`` option * auto-detect object boundaries by default 0.2.3 ----- * **Docs**: add installation instructions 0.2.2 ----- * **Docs**: Python 3.6 is now explicitly tested and listed as compatible. 0.2.1 ----- * **Bugfix**: ``add_schema`` failed when adding list-style array schemas * **Bugfix**: typo in readme 0.2.0 ----- * **Bugfix**: Options were not propagated down to subschemas. * **Bugfix**: Empty arrays resulted in invalid schemas because it still included an ``items`` property. * **Bugfix**: ``items`` was being set to a list even when ``merge_arrays`` was set to ``True``. This resulted in overly permissive schemas because ``items`` are matched optionally by default. * **Improvement**: Positional Array Matching - In order to be more consistent with the way JSON Schema works, the alternate to ``merge_arrays`` is no longer never to merge list items, but instead to merge them based on their position in the list. 
* **Improvement**: Schema Incompatibility Warning - A schema incompatibility used to cause a fatal error with a nondescript warning. The message has been improved and it has been reduced to a warning. 0.1.0 (2014-11-29) ------------------ * Initial release Credits ======= **GenSON** is written and maintained by `Jon Wolverton `_. Contributors ------------ - `David Kay `_ - `KOLANICH `_ - `YehudaCorsia `_ - `Brad Sokol `_ - `John Vandenberg `_ - `shtutzim `_ - `Mike Ralphson `_ ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1715810318.0 genson-1.3.0/genson.egg-info/SOURCES.txt0000644000076500000240000000165714621230016016404 0ustar00jrwstaff.gitignore .travis.yml AUTHORS.rst HISTORY.rst LICENSE MANIFEST.in README.rst pyproject.toml setup.cfg setup.py tox.ini genson/__init__.py genson/__main__.py genson.egg-info/PKG-INFO genson.egg-info/SOURCES.txt genson.egg-info/dependency_links.txt genson.egg-info/entry_points.txt genson.egg-info/top_level.txt genson.egg-info/zip-safe genson/schema/__init__.py genson/schema/builder.py genson/schema/node.py genson/schema/strategies/__init__.py genson/schema/strategies/array.py genson/schema/strategies/base.py genson/schema/strategies/object.py genson/schema/strategies/scalar.py test/__init__.py test/base.py test/test_add_multi.py test/test_add_single.py test/test_bin.py test/test_builder.py test/test_custom.py test/test_gen_multi.py test/test_gen_single.py test/test_misuse.py test/test_seed_schema.py test/fixtures/base_schema.json test/fixtures/cp1252.json test/fixtures/empty.json test/fixtures/not_json.txt test/fixtures/utf-8.json././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1715810318.0 genson-1.3.0/genson.egg-info/dependency_links.txt0000644000076500000240000000000114621230016020555 0ustar00jrwstaff ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1715810318.0 
genson-1.3.0/genson.egg-info/entry_points.txt0000644000076500000240000000006014621230016020001 0ustar00jrwstaff[console_scripts] genson = genson.__main__:main ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1715810318.0 genson-1.3.0/genson.egg-info/top_level.txt0000644000076500000240000000000714621230016017236 0ustar00jrwstaffgenson ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1618759817.0 genson-1.3.0/genson.egg-info/zip-safe0000644000076500000240000000000114037050211016135 0ustar00jrwstaff ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 genson-1.3.0/pyproject.toml0000644000076500000240000000013614407651467014463 0ustar00jrwstaff[build-system] requires = ["setuptools>=44", "wheel"] build-backend = "setuptools.build_meta" ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1715810318.929251 genson-1.3.0/setup.cfg0000644000076500000240000000265314621230017013354 0ustar00jrwstaff[metadata] name = genson version = attr: genson.__version__ author = Jon Wolverton author_email = wolverton.jr@gmail.com license = MIT description = GenSON is a powerful, user-friendly JSON Schema generator. 
long_description = file: README.rst, HISTORY.rst, AUTHORS.rst long_description_content_type = text/x-rst keywords = json, schema, json-schema, jsonschema, object, generate generator, builder, merge, draft 7, validate, validation url = https://github.com/wolverdude/genson/ download_url = https://github.com/wolverdude/GenSON/tarball/v0.2s.0 classifiers = Development Status :: 5 - Production/Stable Environment :: Console Intended Audience :: Developers Natural Language :: English License :: OSI Approved :: MIT License Operating System :: OS Independent Programming Language :: Python Programming Language :: Python :: 3 Programming Language :: Python :: 3.7 Programming Language :: Python :: 3.8 Programming Language :: Python :: 3.9 Programming Language :: Python :: 3.10 Programming Language :: Python :: 3.11 Programming Language :: Python :: 3.12 Topic :: Software Development :: Code Generators Topic :: Software Development :: Libraries :: Python Modules Topic :: Utilities [options] packages = genson zip_safe = True include_package_data = True [options.entry_points] console_scripts = genson = genson.__main__:main [bdist_wheel] universal = 0 [global] verbose = False [egg_info] tag_build = tag_date = 0 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 genson-1.3.0/setup.py0000644000076500000240000000010514407651467013255 0ustar00jrwstaffimport setuptools if __name__ == "__main__": setuptools.setup() ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1715810318.927324 genson-1.3.0/test/0000755000076500000240000000000014621230017012504 5ustar00jrwstaff././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1618759799.0 genson-1.3.0/test/__init__.py0000644000076500000240000000000014037050167014612 0ustar00jrwstaff././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 
genson-1.3.0/test/base.py0000644000076500000240000000335114407651467014014 0ustar00jrwstaffimport sys import unittest import jsonschema from genson import SchemaNode, SchemaBuilder PYTHON_VERSION = sys.version[:sys.version.find(' ')] class BaseTestCase(unittest.TestCase): def setUp(self): self.builder = self.CLASS() self._objects = [] self._schemas = [] def set_schema_options(self, **options): self.builder = SchemaNode(**options) def add_object(self, obj): self.builder.add_object(obj) self._objects.append(obj) def add_schema(self, schema): self.builder.add_schema(schema) self._schemas.append(schema) def assertObjectValidates(self, obj): jsonschema.Draft7Validator(self.builder.to_schema()).validate(obj) def assertObjectDoesNotValidate(self, obj): with self.assertRaises(jsonschema.exceptions.ValidationError): jsonschema.Draft7Validator(self.builder.to_schema()).validate(obj) def assertResult(self, expected, enforceUserContract=True): self.assertEqual( expected, self.builder.to_schema(), 'Generated schema (below) does not match expected (above)') if enforceUserContract: self.assertUserContract() def assertUserContract(self): self._assertSchemaIsValid() self._assertComponentObjectsValidate() def _assertSchemaIsValid(self): jsonschema.Draft7Validator.check_schema(self.builder.to_schema()) def _assertComponentObjectsValidate(self): compiled_schema = self.builder.to_schema() for obj in self._objects: jsonschema.Draft7Validator(compiled_schema).validate(obj) class SchemaNodeTestCase(BaseTestCase): CLASS = SchemaNode class SchemaBuilderTestCase(BaseTestCase): CLASS = SchemaBuilder ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1715810318.9279315 genson-1.3.0/test/fixtures/0000755000076500000240000000000014621230017014355 5ustar00jrwstaff././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 genson-1.3.0/test/fixtures/base_schema.json0000644000076500000240000000005514407651467017524 
0ustar00jrwstaff{"$schema": "http://json-schema.org/schema#"}././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1618759799.0 genson-1.3.0/test/fixtures/cp1252.json0000644000076500000240000000000514037050167016166 0ustar00jrwstaff"€…™"././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 genson-1.3.0/test/fixtures/empty.json0000644000076500000240000000000014407651467016416 0ustar00jrwstaff././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 genson-1.3.0/test/fixtures/not_json.txt0000644000076500000240000000002114407651467016762 0ustar00jrwstaffThis is not JSON.././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1618759799.0 genson-1.3.0/test/fixtures/utf-8.json0000644000076500000240000000001314037050167016214 0ustar00jrwstaff"€…™"././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1618759799.0 genson-1.3.0/test/test_add_multi.py0000644000076500000240000001055414037050167016073 0ustar00jrwstafffrom . 
import base class TestType(base.SchemaNodeTestCase): def test_single_type(self): schema = {'type': 'null'} self.add_schema(schema) self.add_schema(schema) self.assertResult(schema) def test_single_type_unicode(self): schema = {u'type': u'string'} self.add_schema(schema) self.assertResult(schema) def test_redundant_integer_type(self): self.add_schema({'type': 'integer'}) self.add_schema({'type': 'number'}) self.assertResult({'type': 'number'}) def test_typeless(self): schema1 = {"title": "ambiguous schema"} schema2 = {"grail": "We've already got one"} result = dict(schema1) result.update(schema2) self.add_schema(schema1) self.add_schema(schema2) self.assertResult(result) def test_typeless_incorporated(self): schema1 = {"title": "Gruyere"} schema2 = {"type": "boolean"} self.add_schema(schema1) self.add_schema(schema2) self.assertResult({"type": "boolean", "title": "Gruyere"}) def test_typeless_instantly_incorporated(self): schema1 = {"type": "boolean"} schema2 = {"title": "Gruyere"} self.add_schema(schema1) self.add_schema(schema2) self.assertResult({"type": "boolean", "title": "Gruyere"}) class TestRequired(base.SchemaNodeTestCase): def test_combines(self): schema1 = {"type": "object", "required": [ "series of statements", "definite proposition"]} schema2 = {"type": "object", "required": ["definite proposition"]} self.add_schema(schema1) self.add_schema(schema2) self.assertResult({"type": "object", "required": [ "definite proposition"]}) def test_ignores_missing(self): schema1 = {"type": "object"} schema2 = {"type": "object", "required": ["definite proposition"]} self.add_schema(schema1) self.add_schema(schema2) self.assertResult({"type": "object", "required": [ "definite proposition"]}) def test_omits_all_missing(self): schema1 = {"type": "object", "properties": {"spam": {}}} schema2 = {"type": "object", "properties": {"eggs": {}}} self.add_schema(schema1) self.add_schema(schema2) self.assertResult( {"type": "object", "properties": {"spam": {}, "eggs": {}}}) def 
test_maintains_empty(self): seed = {"required": []} schema1 = {"type": "object", "required": ["series of statements"]} schema2 = {"type": "object", "required": ["definite proposition"]} self.add_schema(seed) self.add_schema(schema1) self.add_schema(schema2) self.assertResult({"type": "object", "required": []}) class TestAnyOf(base.SchemaNodeTestCase): def test_multi_type(self): self.add_schema({'type': 'boolean'}) self.add_schema({'type': 'null'}) self.add_schema({'type': 'string'}) self.assertResult({'type': ['boolean', 'null', 'string']}) def test_anyof_generated(self): schema1 = {"type": "null", "title": "African or European Swallow?"} schema2 = {"type": "boolean", "title": "Gruyere"} self.add_schema(schema1) self.add_schema(schema2) self.assertResult({"anyOf": [ schema1, schema2 ]}) def test_anyof_seeded(self): schema1 = {"type": "null", "title": "African or European Swallow?"} schema2 = {"type": "boolean", "title": "Gruyere"} self.add_schema({"anyOf": [ {"type": "null"}, schema2 ]}) self.add_schema(schema1) self.assertResult({"anyOf": [ schema1, schema2 ]}) def test_list_plus_tuple(self): schema1 = {"type": "array", "items": {"type": "null"}} schema2 = {"type": "array", "items": [{"type": "null"}]} self.add_schema(schema1) self.add_schema(schema2) self.assertResult({"anyOf": [ schema1, schema2 ]}) def test_multi_type_and_anyof(self): schema1 = {'type': ['boolean', 'null', 'string']} schema2 = {"type": "boolean", "title": "Gruyere"} self.add_schema(schema1) self.add_schema(schema2) self.assertResult({"anyOf": [ {'type': ['null', 'string']}, schema2 ]}) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1618759799.0 genson-1.3.0/test/test_add_single.py0000644000076500000240000000642114037050167016220 0ustar00jrwstafffrom . 
import base class TestType(base.SchemaNodeTestCase): def test_single_type(self): schema = {'type': 'string'} self.add_schema(schema) self.assertResult(schema) def test_single_type_unicode(self): schema = {u'type': u'string'} self.add_schema(schema) self.assertResult(schema) def test_typeless(self): schema = {} self.add_schema(schema) self.assertResult(schema) def test_array_type_no_items(self): schema = {'type': 'array'} self.add_schema(schema) self.assertResult(schema) class TestAnyOf(base.SchemaNodeTestCase): def test_multi_type(self): schema = {'type': ['boolean', 'null', 'number', 'string']} self.add_schema(schema) self.assertResult(schema) def test_multi_type_with_extra_keywords(self): schema = {'type': ['boolean', 'null', 'number', 'string'], 'title': 'this will be duplicated'} self.add_schema(schema) self.assertResult({'anyOf': [ {'type': 'boolean', 'title': 'this will be duplicated'}, {'type': 'null', 'title': 'this will be duplicated'}, {'type': 'number', 'title': 'this will be duplicated'}, {'type': 'string', 'title': 'this will be duplicated'} ]}) def test_anyof(self): schema = {"anyOf": [ {"type": "null"}, {"type": "boolean", "title": "Gruyere"} ]} self.add_schema(schema) self.assertResult(schema) def test_recursive(self): schema = {"anyOf": [ {"type": ["integer", "string"]}, {"anyOf": [ {"type": "null"}, {"type": "boolean", "title": "Gruyere"} ]} ]} self.add_schema(schema) # recursive anyOf will be flattened self.assertResult({"anyOf": [ {"type": ["integer", "null", "string"]}, {"type": "boolean", "title": "Gruyere"} ]}) class TestRequired(base.SchemaNodeTestCase): def test_preserves_empty_required(self): schema = {'type': 'object', 'required': []} self.add_schema(schema) self.assertResult(schema) class TestPreserveExtraKeywords(base.SchemaNodeTestCase): def test_basic_type(self): schema = {'type': 'boolean', 'const': False, 'myKeyword': True} self.add_schema(schema) self.assertResult(schema) def test_number(self): schema = {'type': 'number', 'const': 
5, 'myKeyword': True} self.add_schema(schema) self.assertResult(schema) def test_list(self): schema = {'type': 'array', 'items': {"type": "null"}, 'const': [], 'myKeyword': True} self.add_schema(schema) self.assertResult(schema) def test_tuple(self): schema = {'type': 'array', 'items': [{"type": "null"}], 'const': [], 'myKeyword': True} self.add_schema(schema) self.assertResult(schema) def test_object(self): schema = {'type': 'object', 'const': {}, 'myKeyword': True} self.add_schema(schema) self.assertResult(schema) def test_typeless(self): schema = {'const': 5, 'myKeyword': True} self.add_schema(schema) self.assertResult(schema) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1715734326.0 genson-1.3.0/test/test_bin.py0000644000076500000240000001132714621003466014677 0ustar00jrwstaffimport unittest import json import os from subprocess import Popen, PIPE from genson import SchemaBuilder BASE_SCHEMA = {"$schema": SchemaBuilder.DEFAULT_URI} FIXTURE_PATH = os.path.join(os.path.dirname(__file__), 'fixtures') SHORT_USAGE = """\ usage: genson [-h] [--version] [-d DELIM] [-e ENCODING] [-i SPACES] [-s SCHEMA] [-$ SCHEMA_URI] ...""" def fixture(filename): return os.path.join(FIXTURE_PATH, filename) def stderr_message(message): return '{}\ngenson: error: {}\n'.format(SHORT_USAGE, message) def run(args=tuple(), stdin_data=None): """ Run the ``genson`` executable as a subprocess and return (stdout, stderr). 
""" full_args = ['python', '-m', 'genson'] full_args.extend(args) env = os.environ.copy() env['COLUMNS'] = '80' # set width for deterministic text wrapping genson_process = Popen( full_args, env=env, stdout=PIPE, stderr=PIPE, stdin=PIPE if stdin_data is not None else None) if stdin_data is not None: stdin_data = stdin_data.encode('utf-8') (stdout, stderr) = genson_process.communicate(stdin_data) genson_process.wait() if isinstance(stdout, bytes): stdout = stdout.decode('utf-8') if isinstance(stderr, bytes): stderr = stderr.decode('utf-8') return (stdout, stderr) class TestBasic(unittest.TestCase): def test_empty_input(self): (stdout, stderr) = run(stdin_data='') self.assertEqual(stderr, '') self.assertEqual(json.loads(stdout), BASE_SCHEMA) def test_empty_object_stdin(self): (stdout, stderr) = run(stdin_data='{}') self.assertEqual(stderr, '') self.assertEqual( json.loads(stdout), dict({"type": "object"}, **BASE_SCHEMA)) def test_empty_object_file(self): (stdout, stderr) = run([fixture('empty.json')]) self.assertEqual(stderr, '') self.assertEqual( json.loads(stdout), BASE_SCHEMA) def test_basic_schema_file(self): (stdout, stderr) = run(['-s', fixture('base_schema.json')]) self.assertEqual(stderr, '') self.assertEqual( json.loads(stdout), BASE_SCHEMA) class TestError(unittest.TestCase): maxDiff = 1000 BAD_JSON_FILE = fixture('not_json.txt') BAD_JSON_MESSAGE = stderr_message( 'invalid JSON in %s: Expecting value: line 1 column 1 (char 0)' % BAD_JSON_FILE) def test_no_input(self): (stdout, stderr) = run() self.assertEqual(stderr, stderr_message( 'noting to do - no schemas or objects given')) self.assertEqual(stdout, '') def test_object_not_json(self): (stdout, stderr) = run([self.BAD_JSON_FILE]) self.assertEqual(stderr, self.BAD_JSON_MESSAGE) self.assertEqual(stdout, '') def test_schema_not_json(self): (stdout, stderr) = run(['-s', self.BAD_JSON_FILE]) self.assertEqual(stderr, self.BAD_JSON_MESSAGE) self.assertEqual(stdout, '') class TestDelimiter(unittest.TestCase): 
def test_delim_newline(self): (stdout, stderr) = run(['-d', 'newline'], stdin_data='{"hi":"there"}\n{"hi":5}') self.assertEqual(stderr, '') self.assertEqual( json.loads(stdout), dict({"required": ["hi"], "type": "object", "properties": { "hi": {"type": ["integer", "string"]}}}, **BASE_SCHEMA)) def test_delim_auto_empty(self): (stdout, stderr) = run(['-d', ''], stdin_data='{"hi":"there"}{"hi":5}') self.assertEqual(stderr, '') self.assertEqual( json.loads(stdout), dict({"required": ["hi"], "type": "object", "properties": { "hi": {"type": ["integer", "string"]}}}, **BASE_SCHEMA)) def test_delim_auto_whitespace(self): (stdout, stderr) = run(['-d', ''], stdin_data='{"hi":"there"} \n\t{"hi":5}') self.assertEqual(stderr, '') self.assertEqual( json.loads(stdout), dict({"required": ["hi"], "type": "object", "properties": { "hi": {"type": ["integer", "string"]}}}, **BASE_SCHEMA)) class TestEncoding(unittest.TestCase): def test_encoding_unicode(self): (stdout, stderr) = run( ['-e', 'utf-8', fixture('utf-8.json')]) self.assertEqual(stderr, '') self.assertEqual( json.loads(stdout), dict({"type": "string"}, **BASE_SCHEMA)) def test_encoding_cp1252(self): (stdout, stderr) = run( ['-e', 'cp1252', fixture('cp1252.json')]) self.assertEqual(stderr, '') self.assertEqual( json.loads(stdout), dict({"type": "string"}, **BASE_SCHEMA)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1683756839.0 genson-1.3.0/test/test_builder.py0000644000076500000240000000610214427013447015554 0ustar00jrwstafffrom . 
import base from genson import SchemaBuilder SCHEMA_URI = 'https://json-schema.org/draft/2020-12/schema' class TestParams(base.SchemaBuilderTestCase): def test_uri(self): self.builder = SchemaBuilder(schema_uri=SCHEMA_URI) self.assertResult({"$schema": SCHEMA_URI}) def test_null_uri(self): self.builder = SchemaBuilder(schema_uri=None) self.assertResult({}) class TestMethods(base.SchemaBuilderTestCase): def test_add_schema(self): self.add_schema({"type": "null"}) self.assertResult({ "$schema": SchemaBuilder.DEFAULT_URI, "type": "null"}) def test_add_object(self): self.add_object(None) self.assertResult({ "$schema": SchemaBuilder.DEFAULT_URI, "type": "null"}) def test_to_json(self): self.assertEqual( self.builder.to_json(), '{"$schema": "%s"}' % SchemaBuilder.DEFAULT_URI) def test_add_schema_with_uri_default(self): self.add_schema({"$schema": SCHEMA_URI, "type": "null"}) self.assertResult({"$schema": SCHEMA_URI, "type": "null"}) def test_add_schema_with_uri_not_defuult(self): self.builder = SchemaBuilder(schema_uri=SCHEMA_URI) self.add_schema({"$schema": 'BAD_URI', "type": "null"}) self.assertResult({"$schema": SCHEMA_URI, "type": "null"}) def test_empty_falsy(self): self.assertIs(bool(self.builder), False) def test_full_truty(self): self.add_object(None) self.assertIs(bool(self.builder), True) class TestInteraction(base.SchemaBuilderTestCase): def test_add_other(self): other = SchemaBuilder(schema_uri=SCHEMA_URI) other.add_object(1) self.add_object('one') self.add_schema(other) self.assertResult({ "$schema": SCHEMA_URI, "type": ["integer", "string"]}) def test_add_other_no_uri_overwrite(self): other = SchemaBuilder() other.add_object(1) self.add_object('one') self.add_schema(other) self.add_schema({'$schema': SCHEMA_URI}) self.assertResult({ "$schema": SCHEMA_URI, "type": ["integer", "string"]}) def test_eq(self): b1 = SchemaBuilder() b1.add_object(1) b2 = SchemaBuilder() b2.add_object(1) self.assertEqual(b1, b2) def test_ne(self): b1 = SchemaBuilder() 
b1.add_object(1) b2 = SchemaBuilder() b2.add_object('one') self.assertNotEqual(b1, b2) def test_eq_after_serialization(self): b1 = SchemaBuilder() b1.add_object({"bar": 10, "foo": 20}) b2 = SchemaBuilder() b2.add_schema(b1) self.assertEqual(b1, b2) def test_eq_empty_required(self): b1 = SchemaBuilder() b1.add_schema({ "type": "object", "properties": { "bar": {"type": "integer"}, "foo": {"type": "integer"}}, "required": []}) b2 = SchemaBuilder() b2.add_schema(b1) self.assertEqual(b1, b2) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 genson-1.3.0/test/test_custom.py0000644000076500000240000000367714407651467015466 0ustar00jrwstafffrom genson import SchemaBuilder from genson.schema.strategies import SchemaStrategy, Number from . import base class MaxTenStrategy(Number): KEYWORDS = tuple(list(Number.KEYWORDS) + ['maximum']) def to_schema(self): schema = super().to_schema() schema['maximum'] = 10 return schema class FalseStrategy(SchemaStrategy): KEYWORDS = tuple(list(SchemaStrategy.KEYWORDS) + ['const']) @classmethod def match_schema(self, schema): return True @classmethod def match_object(self, obj): return True def to_schema(self): schema = super().to_schema() schema['type'] = 'boolean' schema['const'] = False return schema class MaxTenSchemaBuilder(SchemaBuilder): EXTRA_STRATEGIES = (MaxTenStrategy,) class FalseSchemaBuilder(SchemaBuilder): STRATEGIES = (FalseStrategy,) class TestExtraStrategies(base.SchemaNodeTestCase): CLASS = MaxTenSchemaBuilder def test_add_object(self): self.add_object(5) self.assertResult({ '$schema': 'http://json-schema.org/schema#', 'type': 'integer', 'maximum': 10}) def test_add_schema(self): self.add_schema({'type': 'integer'}) self.assertResult({ '$schema': 'http://json-schema.org/schema#', 'type': 'integer', 'maximum': 10}) class TestClobberStrategies(base.SchemaNodeTestCase): CLASS = FalseSchemaBuilder def test_add_object(self): self.add_object("Any Norwegian Jarlsberger?") 
self.assertResult({ '$schema': 'http://json-schema.org/schema#', 'type': 'boolean', 'const': False}, enforceUserContract=False) def test_add_schema(self): self.add_schema({'type': 'string'}) self.assertResult({ '$schema': 'http://json-schema.org/schema#', 'type': 'boolean', 'const': False}, enforceUserContract=False) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1618759799.0 genson-1.3.0/test/test_gen_multi.py0000644000076500000240000000766214037050167016122 0ustar00jrwstafffrom . import base class TestBasicTypes(base.SchemaNodeTestCase): def test_single_type(self): self.add_object("bacon") self.add_object("egg") self.add_object("spam") self.assertResult({"type": "string"}) def test_redundant_integer_type(self): self.add_object(1) self.add_object(1.1) self.assertResult({"type": "number"}) class TestAnyOf(base.SchemaNodeTestCase): def test_simple(self): self.add_object("string") self.add_object(1.1) self.add_object(True) self.add_object(None) self.assertResult({"type": ["boolean", "null", "number", "string"]}) def test_complex(self): self.add_object({}) self.add_object([None]) self.assertResult({"anyOf": [ {"type": "object"}, {"type": "array", "items": {"type": "null"}} ]}) def test_simple_and_complex(self): self.add_object(None) self.add_object([None]) self.assertResult({"anyOf": [ {"type": "null"}, {"type": "array", "items": {"type": "null"}} ]}) class TestArrayList(base.SchemaNodeTestCase): def setUp(self): base.SchemaNodeTestCase.setUp(self) def test_empty(self): self.add_object([]) self.add_object([]) self.assertResult({"type": "array"}) def test_monotype(self): self.add_object(["spam", "spam", "spam", "eggs", "spam"]) self.add_object(["spam", "bacon", "eggs", "spam"]) self.assertResult({"type": "array", "items": {"type": "string"}}) def test_multitype(self): self.add_object([1, "2", "3", None, False]) self.add_object([1, 2, "3", False]) self.assertObjectValidates([1, "2", 3, None, False]) self.assertResult({ "type": "array", 
"items": { "type": ["boolean", "integer", "null", "string"]} }) def test_nested(self): self.add_object([ ["surprise"], ["fear", "surprise"] ]) self.add_object([ ["fear", "surprise", "ruthless efficiency"], ["fear", "surprise", "ruthless efficiency", "an almost fanatical devotion to the Pope"] ]) self.assertResult({ "type": "array", "items": { "type": "array", "items": {"type": "string"}} }) class TestArrayTuple(base.SchemaNodeTestCase): def test_empty(self): self.add_schema({"type": "array", "items": []}) self.add_object([]) self.add_object([]) self.assertResult({"type": "array", "items": [{}]}) def test_multitype(self): self.add_schema({"type": "array", "items": []}) self.add_object([1, "2", "3", None, False]) self.add_object([1, 2, "3", False]) self.assertObjectDoesNotValidate([1, "2", 3, None, False]) self.assertResult({ "type": "array", "items": [ {"type": "integer"}, {"type": ["integer", "string"]}, {"type": "string"}, {"type": ["boolean", "null"]}, {"type": "boolean"}] }) def test_nested(self): self.add_schema( {"type": "array", "items": {"type": "array", "items": []}}) self.add_object([ ["surprise"], ["fear", "surprise"] ]) self.add_object([ ["fear", "surprise", "ruthless efficiency"], ["fear", "surprise", "ruthless efficiency", "an almost fanatical devotion to the Pope"] ]) self.assertResult({ "type": "array", "items": { "type": "array", "items": [ {"type": "string"}, {"type": "string"}, {"type": "string"}, {"type": "string"} ] } }) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 genson-1.3.0/test/test_gen_single.py0000644000076500000240000001375414407651467016263 0ustar00jrwstafffrom . 
import base class TestBasicTypes(base.SchemaNodeTestCase): def test_no_object(self): self.assertResult({}) def test_string(self): self.add_object("string") self.assertResult({"type": "string"}) def test_integer(self): self.add_object(1) self.assertResult({"type": "integer"}) def test_number(self): self.add_object(1.1) self.assertResult({"type": "number"}) def test_boolean(self): self.add_object(True) self.assertResult({"type": "boolean"}) def test_null(self): self.add_object(None) self.assertResult({"type": "null"}) class TestArrayList(base.SchemaNodeTestCase): def setUp(self): base.SchemaNodeTestCase.setUp(self) def test_empty(self): self.add_object([]) self.assertResult({"type": "array"}) def test_monotype(self): self.add_object(["spam", "spam", "spam", "eggs", "spam"]) self.assertResult({"type": "array", "items": {"type": "string"}}) def test_multitype(self): self.add_object([1, "2", None, False]) self.assertResult({ "type": "array", "items": { "type": ["boolean", "integer", "null", "string"]} }) self.assertObjectValidates([False, None, "2", 1]) def test_nested(self): self.add_object([ ["surprise"], ["fear", "surprise"], ["fear", "surprise", "ruthless efficiency"], ["fear", "surprise", "ruthless efficiency", "an almost fanatical devotion to the Pope"] ]) self.assertResult({ "type": "array", "items": { "type": "array", "items": {"type": "string"}} }) class TestArrayTuple(base.SchemaNodeTestCase): def setUp(self): base.SchemaNodeTestCase.setUp(self) def test_empty(self): self.add_schema({"type": "array", "items": []}) self.add_object([]) self.assertResult({"type": "array", "items": [{}]}) def test_empty_schema(self): self.add_schema({"type": "array", "items": [{}]}) self.add_object([]) self.assertResult({"type": "array", "items": [{}]}) def test_multitype(self): self.add_schema({"type": "array", "items": []}) self.add_object([1, "2", "3", None, False]) self.assertResult({ "type": "array", "items": [ {"type": "integer"}, {"type": "string"}, {"type": "string"}, 
                {"type": "null"},
                {"type": "boolean"}]
        })
        self.assertObjectDoesNotValidate([1, 2, "3", None, False])

    def test_nested(self):
        self.add_schema(
            {"type": "array", "items": {"type": "array", "items": []}})
        self.add_object([
            ["surprise"],
            ["fear", "surprise"],
            ["fear", "surprise", "ruthless efficiency"],
            ["fear", "surprise", "ruthless efficiency",
             "an almost fanatical devotion to the Pope"]
        ])
        self.assertResult({
            "type": "array",
            "items": {
                "type": "array",
                "items": [
                    {"type": "string"},
                    {"type": "string"},
                    {"type": "string"},
                    {"type": "string"}
                ]
            }
        })


class TestObject(base.SchemaNodeTestCase):

    def test_empty_object(self):
        self.add_object({})
        self.assertResult({"type": "object"})

    def test_basic_object(self):
        self.add_object({
            "Red Windsor": "Normally, but today the van broke down.",
            "Stilton": "Sorry.",
            "Gruyere": False})
        self.assertResult({
            "required": ["Gruyere", "Red Windsor", "Stilton"],
            "type": "object",
            "properties": {
                "Red Windsor": {"type": "string"},
                "Gruyere": {"type": "boolean"},
                "Stilton": {"type": "string"}
            }
        })


class TestComplex(base.SchemaNodeTestCase):

    def test_array_in_object(self):
        self.add_object({"a": "b", "c": [1, 2, 3]})
        self.assertResult({
            "required": ["a", "c"],
            "type": "object",
            "properties": {
                "a": {"type": "string"},
                "c": {
                    "type": "array",
                    "items": {"type": "integer"}
                }
            }
        })

    def test_object_in_array(self):
        self.add_object([
            {"name": "Sir Lancelot of Camelot",
             "quest": "to seek the Holy Grail",
             "favorite colour": "blue"},
            {"name": "Sir Robin of Camelot",
             "quest": "to seek the Holy Grail",
             "capitol of Assyria": None}])
        self.assertResult({
            "type": "array",
            "items": {
                "type": "object",
                "required": ["name", "quest"],
                "properties": {
                    "quest": {"type": "string"},
                    "name": {"type": "string"},
                    "favorite colour": {"type": "string"},
                    "capitol of Assyria": {"type": "null"}
                }
            }
        })

    def test_three_deep(self):
        self.add_object({"matryoshka": {"design": {"principle": "FTW!"}}})
        self.assertResult({
            "type": "object",
            "required": ["matryoshka"],
            "properties": {
                "matryoshka": {
                    "type": "object",
                    "required": ["design"],
                    "properties": {
                        "design": {
                            "type": "object",
                            "required": ["principle"],
                            "properties": {
                                "principle": {"type": "string"}
                            }
                        }
                    }
                }
            }
        })
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 genson-1.3.0/test/test_misuse.py0000644000076500000240000000202414407651467015442 0ustar00jrwstaff
from . import base
from genson import Schema, SchemaGenerationError


class TestMisuse(base.SchemaBuilderTestCase):

    def test_schema_with_bad_type_error(self):
        with self.assertRaises(SchemaGenerationError):
            self.add_schema({'type': 'african swallow'})

    def test_to_dict_pending_deprecation_warning(self):
        with self.assertWarns(PendingDeprecationWarning):
            builder = Schema()
        with self.assertWarns(PendingDeprecationWarning):
            builder.add_object('I fart in your general direction!')
            builder.to_dict()

    def test_recurse_deprecation_warning(self):
        with self.assertWarns(DeprecationWarning):
            builder = Schema()
            builder.add_object('Go away or I shall taunt you a second time!')
            builder.to_dict(recurse=True)

    def test_incompatible_schema_warning(self):
        with self.assertWarns(UserWarning):
            self.add_schema({'type': 'string', 'length': 5})
            self.add_schema({'type': 'string', 'length': 7})
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1679774519.0 genson-1.3.0/test/test_seed_schema.py0000644000076500000240000000526514407651467016407 0ustar00jrwstaff
from . import base


class TestSeedTuple(base.SchemaNodeTestCase):

    def test_tuple(self):
        self.add_schema({'type': 'array', 'items': []})
        self.add_object([None])
        self.assertResult({'type': 'array', 'items': [{'type': 'null'}]})


class TestPatternProperties(base.SchemaNodeTestCase):

    def test_single_pattern(self):
        self.add_schema({'type': 'object', 'patternProperties': {
            r'^\d$': None}})
        self.add_object({'0': 0, '1': 1, '2': 2})
        self.assertResult({'type': 'object', 'patternProperties': {
            r'^\d$': {'type': 'integer'}}})

    def test_multi_pattern(self):
        self.add_schema({'type': 'object', 'patternProperties': {
            r'^\d$': None,
            r'^[a-z]$': None}})
        self.add_object({'0': 0, '1': 1, 'a': True, 'b': False})
        self.assertResult({'type': 'object', 'patternProperties': {
            r'^\d$': {'type': 'integer'},
            r'^[a-z]$': {'type': 'boolean'}}})

    def test_multi_pattern_multi_object(self):
        self.add_schema({'type': 'object', 'patternProperties': {
            r'^\d$': None,
            r'^[a-z]$': None}})
        self.add_object({'0': 0})
        self.add_object({'1': 1})
        self.add_object({'a': True})
        self.add_object({'b': False})
        self.assertResult({'type': 'object', 'patternProperties': {
            r'^\d$': {'type': 'integer'},
            r'^[a-z]$': {'type': 'boolean'}}})

    def test_existing_schema(self):
        self.add_schema({'type': 'object', 'patternProperties': {
            r'^\d$': {'type': 'boolean'}}})
        self.add_object({'0': 0, '1': 1, '2': 2})
        self.assertResult({'type': 'object', 'patternProperties': {
            r'^\d$': {'type': ['boolean', 'integer']}}})

    def test_prefers_existing_properties(self):
        self.add_schema({'type': 'object',
                         'properties': {'0': None},
                         'patternProperties': {r'^\d$': None}})
        self.add_object({'0': 0, '1': 1, '2': 2})
        self.assertResult({'type': 'object',
                           'properties': {'0': {'type': 'integer'}},
                           'patternProperties': {r'^\d$': {'type': 'integer'}},
                           'required': ['0']})

    def test_keeps_unrecognized_properties(self):
        self.add_schema({'type': 'object',
                         'patternProperties': {r'^\d$': None}})
        self.add_object({'0': 0, '1': 1, '2': 2, 'a': True})
        self.assertResult({'type': 'object',
                           'properties': {'a': {'type': 'boolean'}},
                           'patternProperties': {r'^\d$': {'type': 'integer'}},
                           'required': ['a']})
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1715804829.0 genson-1.3.0/tox.ini0000644000076500000240000000034514621215235013047 0ustar00jrwstaff
[tox]
envlist = py3{7,8,9,10,11,12}
skip_missing_interpreters = true

[testenv]
deps =
    jsonschema>=4.0.0
    coverage
commands =
    coverage run --source=genson -m unittest
    coverage report --omit='*/__main__.py' --fail-under=90