py-ubjson-0.16.1/CHANGELOG
0.16.1
- Make recursion unit test work in PyPy also
0.16.0
- Don't use C extension under PyPy. (JIT-optimized pure-Python version is
considerably faster.)
0.15.0
- Fix decimal encoding bug/assert (reported by mgorny)
0.14.0
- Fix DecoderException not including offset when C-extension in use, expose
property for it.
- Fix decoding via C extension reading too much data from file-like objects
- Test suite exercises both pure & extension modules when extension enabled
- Expand test suite to increase C extension (line) coverage
- Remove need for pypandoc/rst (use original README.md)
- Fix spelling errors in cli util (hoenn)
0.13.0
- Make extension build more reproducible with sorted sources (bmwiedemann)
- Fix _ubjson_decode_value case statement fallthrough and use correct check
to not re-raise existing decoder DecoderException
0.12.0
- Relax floating point type test case (for 32-bit targets)
- Added Python 3.7 to classifiers list
0.11.0
- Add default parameter to encoder, allowing for unsupported types to be
encoded.
- Add object_hook parameter to decoder, to e.g. allow decoding of custom types
(which were encoded with the help of default parameter).
- Treat object_pairs_hook the same in pure Python mode as with extension
0.10.0
- Support No-Op type (decoder only)
- Allow for object keys to be interned, saving memory if repeated (PY3 only)
- Use PyUnicode_FromStringAndSize instead of PyUnicode_DecodeUTF8 (decoder)
- Open file for writing, not appending (to/from json utility)
- Used more compact json encoding (to/from json utility)
- Enable object key interning (to/from json utility)
0.9.0
- C extension re-implemented (without Cython) with major speedup (7-10x)
- object_pairs_hook now works like built-in json module
- Minor pure version improvements
- Windows build compatibility
0.8.5
- Added Python 3.5 to classifiers list
- Fix index in argv (command line utility)
0.8.4
- License information update
- Allow for lack of stdin/stdout/stderr buffer access
- Allow for extension building to be skipped via env. var
0.8.3
- Initial public release
py-ubjson-0.16.1/LICENSE
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright 2016 Iotic Labs Ltd
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://github.com/Iotic-Labs/py-ubjson/blob/master/LICENSE
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
py-ubjson-0.16.1/MANIFEST.in
include README.md NOTICE LICENSE CHANGELOG UBJSON-Specification.md
include ez_setup.py
include src/*.[ch]
py-ubjson-0.16.1/NOTICE
py-ubjson
Copyright 2019 Iotic Labs Ltd
Universal Binary JSON Specification
Copyright 2015 ubjson.org
Floating point related functions (python_funcs.c) and integer encode/decode logic
Copyright (c) 2001-2016 Python Software Foundation
six.py
Copyright (c) 2010-2015 Benjamin Peterson
ez_setup.py
Copyright (c) Python Packaging Authority
py-ubjson-0.16.1/PKG-INFO
Metadata-Version: 2.1
Name: py-ubjson
Version: 0.16.1
Summary: Universal Binary JSON encoder/decoder
Home-page: https://github.com/Iotic-Labs/py-ubjson
Author: Iotic Labs Ltd
Author-email: info@iotic-labs.com
Maintainer: Iotic Labs Ltd
Maintainer-email: vilnis.termanis@iotic-labs.com
License: Apache License 2.0
Description: # Overview
This is a Python v3.2+ (and 2.7+) [Universal Binary JSON](http://ubjson.org) encoder/decoder based on the [draft-12](UBJSON-Specification.md) specification.
# Installing / packaging
```shell
# To get from PyPI
pip3 install py-ubjson
# To only build extension modules inline (e.g. in repository)
python3 setup.py build_ext -i
# To build & install globally
python3 setup.py install
# To skip building of extensions when installing (or building)
PYUBJSON_NO_EXTENSION=1 python3 setup.py install
```
**Notes**
- The extension module is not required but provides a significant speed boost.
- The above can also be run with v2.7+
- This module is also available via [Anaconda (conda-forge)](https://anaconda.org/conda-forge/py-ubjson)
- PyPI releases are signed with the [Iotic Labs Software release signing key](https://developer.iotic-labs.com/iotic-labs.com.asc)
- At run time, one can check whether the compiled version is in use via the _ubjson.EXTENSION_ENABLED_ boolean
# Usage
It's meant to behave very much like Python's built-in [JSON module](https://docs.python.org/3/library/json.html), e.g.:
```python
import ubjson
encoded = ubjson.dumpb({u'a': 1})
decoded = ubjson.loadb(encoded)
```
**Note**: Only unicode strings in Python 2 will be encoded as strings, plain *str* will be encoded as a byte array.
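As an illustrative sketch (not the library's own code), the encoded bytes for the example above can be assembled by hand from the draft-12 rules; note the actual encoder may pick a different integer type for small values:

```python
import struct

# Hand-encode {'a': 1} per draft-12:
# [{] [i][1][a] [U][1] [}]  -- object keys omit the 'S' marker
encoded = (b'{'
           + b'i' + struct.pack('>b', 1) + b'a'   # key "a", length-prefixed
           + b'U' + struct.pack('>B', 1)          # value 1 as uint8
           + b'}')
print(encoded)  # b'{i\x01aU\x01}'
```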
# Documentation
```python
import ubjson
help(ubjson.dump)
help(ubjson.load)
```
# Command-line utility
This converts between JSON and UBJSON formats:
```shell
python3 -mubjson
USAGE: ubjson (fromjson|tojson) (INFILE|-) [OUTFILE]
```
# Tests
## Static
This library has been checked using [flake8](https://pypi.python.org/pypi/flake8) and [pylint](http://www.pylint.org), using a modified configuration - see _pylint.rc_ and _flake8.cfg_.
## Unit
```shell
python3 -mvenv py
. py/bin/activate
pip install -U pip setuptools
pip install -e .[dev]
./coverage_test.sh
```
**Note**: See `coverage_test.sh` for additional requirements.
# Limitations
- The **No-Op** type is only supported by the decoder. (This should arguably be a protocol-level rather than serialisation-level option.) Specifically, it is **only** allowed to occur at the start or between elements of a container and **only** inside un-typed containers. (In a typed container it is impossible to tell the difference between an encoded element and a No-Op.)
- Strongly-typed containers are only supported by the decoder (except for **bytes**/**bytearray**) and not for No-Op.
- Encoder/decoder extensions are not supported at this time.
# Why?
The only existing implementation I was aware of at the time of writing ([simpleubjson](https://github.com/brainwater/simpleubjson)) had the following limitations:
- Does not support efficient binary encoding
- Only supports draft-9
- Only supports individual Python types rather than anything implementing an interface (e.g. _Mapping_)
- Does not decode nested arrays or objects in expected form
- Lacks C extension speed-up
Keywords: ubjson,ubj
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: C
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.2
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Description-Content-Type: text/markdown
Provides-Extra: dev
py-ubjson-0.16.1/README.md
# Overview
This is a Python v3.2+ (and 2.7+) [Universal Binary JSON](http://ubjson.org) encoder/decoder based on the [draft-12](UBJSON-Specification.md) specification.
# Installing / packaging
```shell
# To get from PyPI
pip3 install py-ubjson
# To only build extension modules inline (e.g. in repository)
python3 setup.py build_ext -i
# To build & install globally
python3 setup.py install
# To skip building of extensions when installing (or building)
PYUBJSON_NO_EXTENSION=1 python3 setup.py install
```
**Notes**
- The extension module is not required but provides a significant speed boost.
- The above can also be run with v2.7+
- This module is also available via [Anaconda (conda-forge)](https://anaconda.org/conda-forge/py-ubjson)
- PyPI releases are signed with the [Iotic Labs Software release signing key](https://developer.iotic-labs.com/iotic-labs.com.asc)
- At run time, one can check whether the compiled version is in use via the _ubjson.EXTENSION_ENABLED_ boolean
# Usage
It's meant to behave very much like Python's built-in [JSON module](https://docs.python.org/3/library/json.html), e.g.:
```python
import ubjson
encoded = ubjson.dumpb({u'a': 1})
decoded = ubjson.loadb(encoded)
```
**Note**: Only unicode strings in Python 2 will be encoded as strings, plain *str* will be encoded as a byte array.
# Documentation
```python
import ubjson
help(ubjson.dump)
help(ubjson.load)
```
# Command-line utility
This converts between JSON and UBJSON formats:
```shell
python3 -mubjson
USAGE: ubjson (fromjson|tojson) (INFILE|-) [OUTFILE]
```
# Tests
## Static
This library has been checked using [flake8](https://pypi.python.org/pypi/flake8) and [pylint](http://www.pylint.org), using a modified configuration - see _pylint.rc_ and _flake8.cfg_.
## Unit
```shell
python3 -mvenv py
. py/bin/activate
pip install -U pip setuptools
pip install -e .[dev]
./coverage_test.sh
```
**Note**: See `coverage_test.sh` for additional requirements.
# Limitations
- The **No-Op** type is only supported by the decoder. (This should arguably be a protocol-level rather than serialisation-level option.) Specifically, it is **only** allowed to occur at the start or between elements of a container and **only** inside un-typed containers. (In a typed container it is impossible to tell the difference between an encoded element and a No-Op.)
- Strongly-typed containers are only supported by the decoder (except for **bytes**/**bytearray**) and not for No-Op.
- Encoder/decoder extensions are not supported at this time.
# Why?
The only existing implementation I was aware of at the time of writing ([simpleubjson](https://github.com/brainwater/simpleubjson)) had the following limitations:
- Does not support efficient binary encoding
- Only supports draft-9
- Only supports individual Python types rather than anything implementing an interface (e.g. _Mapping_)
- Does not decode nested arrays or objects in expected form
- Lacks C extension speed-up
py-ubjson-0.16.1/UBJSON-Specification.md
# Index
- [Introduction](#introduction)
- [License](#license)
- [Draft 12 specification](#draft12)
- [Data Format](#data_format)
- [Type overview](#type_overview)
- [Value types](#value_types)
- [Container types](#container_types)
- [Optimized format](#container_optimized)
# Introduction
For the official, most up-to-date, more verbose specification (with discussion and examples) of Universal Binary JSON please visit [ubjson.org](http://ubjson.org). Since at the time of writing (6th August 2015) neither said website nor the [community workspace git repository](https://github.com/thebuzzmedia/universal-binary-json) indicated what version the specification applies to, I made the decision to produce this minimal document to act as a reference in case the specification on ubjson.org changes. I contacted Riyad Kalla of [The Buzz Media](http://www.thebuzzmedia.com) (and maintainer of ubjson.org) and he confirmed to me that (at said point in time) the current version was indeed Draft 12.
# License
The UBJSON Specification is licensed under the [Apache 2.0 License](http://www.apache.org/licenses/LICENSE-2.0.html).
# Draft 12 specification
## Data Format
A single construct with two optional segments (length and data) is used for all types:
```
[type, 1 byte char]([integer numeric length])([data])
```
Each element in the tuple is defined as:
- **type** - A 1 byte ASCII char (**_Marker_**) used to indicate the type of the data following it.
- **length** (_optional_) - A positive, integer numeric type (uint8, int16, int32, int64) specifying the length of the following data payload.
- **data** (_optional_) - A run of bytes representing the actual binary data for this type of value.
### Notes
- Some values are simple enough that just writing the 1 byte ASCII marker into the stream is enough to represent the value (e.g. null) while others have a type that is specific enough that no length is needed as the length is implied by the type (e.g. int32) while others still require both a type and a length to communicate their value (e.g. string). Additionally some values (e.g. array) have additional (_optional_) parameters to improve decoding efficiency and/or to reduce size of the encoded value even further.
- The UBJSON specification requires that all numeric values be written in Big-Endian order.
- To store binary data, use a [strongly-typed](#container_optimized) array of uint8 values.
- _application/ubjson_ should be used as the mime type
- _.ubj_ should be used as the file extension when UBJSON-encoded data is saved to a file
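The construct above can be sketched in Python with the standard `struct` module (a hand-rolled illustration based on this document, not the py-ubjson implementation): a marker alone, a marker whose type implies the payload length, and a marker with an explicit length.

```python
import struct

# Marker only -- no length, no data:
null_value = b'Z'                                 # [Z]

# Type implies length -- int32 payload is always 4 bytes, big-endian:
int32_value = b'l' + struct.pack('>i', 100000)    # [l][100000]

# Explicit length -- string marker, integer-typed length, then UTF-8 data:
text = 'ham'.encode('utf-8')
string_value = b'S' + b'i' + struct.pack('>b', len(text)) + text  # [S][i][3][ham]
print(string_value)  # b'Si\x03ham'
```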
## Type overview
Type | Total size | ASCII Marker(s) | Length required | Data (payload)
---|---|---|---|---
[null](#value_null) | 1 byte | *Z* | No | No
[no-op](#value_noop) | 1 byte | *N* | No | No
[true](#value_bool) | 1 byte | *T* | No | No
[false](#value_bool) | 1 byte | *F* | No | No
[int8](#value_numeric) | 2 bytes | *i* | No | Yes
[uint8](#value_numeric) | 2 bytes | *U* | No | Yes
[int16](#value_numeric) | 3 bytes | *I* (upper case i) | No | Yes
[int32](#value_numeric) | 5 bytes | *l* (lower case L) | No | Yes
[int64](#value_numeric) | 9 bytes | *L* | No | Yes
[float32](#value_numeric) | 5 bytes | *d* | No | Yes
[float64](#value_numeric) | 9 bytes | *D* | No | Yes
[high-precision number](#value_numeric) | 1 byte + int num val + string byte len | *H* | Yes | Yes
[char](#value_char) | 2 bytes | *C* | No | Yes
[string](#value_string) | 1 byte + int num val + string byte len | *S* | Yes | Yes (if not empty)
[array](#container_array) | 2+ bytes | *\[* and *\]* | Optional | Yes (if not empty)
[object](#container_object) | 2+ bytes | *{* and *}* | Optional | Yes (if not empty)
## Value Types
### Null
The null value is equivalent to the null value from the JSON specification.
#### Example
In JSON:
```json
{
"passcode": null
}
```
In UBJSON (using block-notation):
```
[{]
[i][8][passcode][Z]
[}]
```
---
### No-Op
There is no equivalent to the no-op value in the original JSON specification. When decoding, No-Op values should be skipped. They can only occur as elements of a container.
---
### Boolean
A boolean type is equivalent to the boolean value from the JSON specification.
#### Example
In JSON:
```json
{
"authorized": true,
"verified": false
}
```
In UBJSON (using block-notation):
```
[{]
[i][10][authorized][T]
[i][8][verified][F]
[}]
```
---
### Numeric
Unlike JSON, which has a single _Number_ type (used for both integers and floating point numbers), UBJSON defines multiple types for integers. The minimum/maximum values (inclusive) for each integer type are as follows:
Type | Signed | Minimum | Maximum
---|---|---|---
int8 | Yes | -128 | 127
uint8 | No | 0 | 255
int16 | Yes | -32,768 | 32,767
int32 | Yes | -2,147,483,648 | 2,147,483,647
int64 | Yes | -9,223,372,036,854,775,808 | 9,223,372,036,854,775,807
float32 | Yes | See [IEEE 754 Spec](http://en.wikipedia.org/wiki/IEEE_754-1985) | See [IEEE 754 Spec](https://en.wikipedia.org/wiki/IEEE_754-1985)
float64 | Yes | See [IEEE 754 Spec](http://en.wikipedia.org/wiki/IEEE_754-1985) | See [IEEE 754 Spec](https://en.wikipedia.org/wiki/IEEE_754-1985)
high-precision number | Yes | Infinite | Infinite
**Notes**:
- Numeric values of infinity (and NaN) are to be encoded as a [null](#value_null) in all cases
- It is advisable to use the smallest applicable type when encoding a number.
#### Integer
All integer types are written in Big-Endian order.
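The big-endian requirement maps directly onto `struct` format codes (a sketch following the table above; `>b`/`>B`/`>h`/`>i`/`>q` correspond to int8/uint8/int16/int32/int64):

```python
import struct

# [I][32767] -- int16 marker followed by a 2-byte big-endian payload:
encoded = b'I' + struct.pack('>h', 32767)
print(encoded)  # b'I\x7f\xff'

# [L][...] -- int64 marker followed by an 8-byte big-endian payload:
encoded64 = b'L' + struct.pack('>q', 9223372036854775807)
```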
#### Float
- float32 values are written in [IEEE 754 single precision floating point format](http://en.wikipedia.org/wiki/IEEE_754-1985), which has the following structure:
- Bit 31 (1 bit) - sign
- Bit 30-23 (8 bits) - exponent
- Bit 22-0 (23 bits) - fraction (significant)
- float64 values are written in [IEEE 754 double precision floating point format](http://en.wikipedia.org/wiki/IEEE_754-1985), which has the following structure:
- Bit 63 (1 bit) - sign
- Bit 62-52 (11 bits) - exponent
- Bit 51-0 (52 bits) - fraction (significant)
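Both float layouts are what `struct` produces for big-endian `f` and `d` (a sketch; total sizes match the type overview table: 5 and 9 bytes including the marker):

```python
import struct

# [d][3.14] -- float32 marker plus 4-byte IEEE 754 single precision payload:
f32 = b'd' + struct.pack('>f', 3.14)

# [D][113243.7863123] -- float64 marker plus 8-byte double precision payload:
f64 = b'D' + struct.pack('>d', 113243.7863123)
print(len(f32), len(f64))  # 5 9
```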
#### High-Precision
These are encoded as a string and thus are only limited by the maximum string size. Values **must** be written out in accordance with the original [JSON number type specification](http://json.org). Infinity (and NaN) are to be encoded as a [null](#value_null) value.
#### Examples
Numeric values in JSON:
```json
{
"int8": 16,
"uint8": 255,
"int16": 32767,
"int32": 2147483647,
"int64": 9223372036854775807,
"float32": 3.14,
"float64": 113243.7863123,
"huge1": "3.14159265358979323846",
"huge2": "-1.93+E190",
"huge3": "719..."
}
```
In UBJSON (using block-notation):
```
[{]
[i][4][int8][i][16]
[i][5][uint8][U][255]
[i][5][int16][I][32767]
[i][5][int32][l][2147483647]
[i][5][int64][L][9223372036854775807]
[i][7][float32][d][3.14]
[i][7][float64][D][113243.7863123]
[i][5][huge1][H][i][22][3.14159265358979323846]
[i][5][huge2][H][i][10][-1.93+E190]
[i][5][huge3][H][U][200][719...]
[}]
```
---
### Char
The char type in UBJSON is an unsigned byte meant to represent a single printable ASCII character (decimal values 0-127). It **must not** have a decimal value larger than 127. It is functionally identical to the uint8 type, but semantically is meant to represent a character and not a numeric value.
#### Example
Char values in JSON:
```json
{
"rolecode": "a",
"delim": ";",
}
```
UBJSON (using block-notation):
```
[{]
[i][8][rolecode][C][a]
[i][5][delim][C][;]
[}]
```
---
### String
The string type in UBJSON is equivalent to the string type from the JSON specification, except that the UBJSON string value **requires** UTF-8 encoding.
#### Example
String values in JSON:
```json
{
"username": "rkalla",
"imagedata": "...huge string payload..."
}
```
UBJSON (using block-notation):
```
[{]
[i][8][username][S][i][5][rkalla]
[i][9][imagedata][S][l][2097152][...huge string payload...]
[}]
```
## Container types
See also [optimized format](#container_optimized) below.
### Array
The array type in UBJSON is equivalent to the array type from the JSON specification.
#### Example
Array in JSON:
```json
[
null,
true,
false,
4782345193,
153.132,
"ham"
]
```
UBJSON (using block-notation):
```
[[]
[Z]
[T]
[F]
[l][4782345193]
[d][153.132]
[S][i][3][ham]
[]]
```
---
### Object
The object type in UBJSON is equivalent to the object type from the JSON specification. Since value names can only be strings, the *S* (string) marker **must not** be included since it is redundant.
#### Example
Object in JSON:
```json
{
"post": {
"id": 1137,
"author": "rkalla",
"timestamp": 1364482090592,
"body": "I totally agree!"
}
}
```
UBJSON (using block-notation):
```
[{]
[i][4][post][{]
[i][2][id][I][1137]
[i][6][author][S][i][5][rkalla]
[i][9][timestamp][L][1364482090592]
[i][4][body][S][i][16][I totally agree!]
[}]
[}]
```
## Optimized Format
Both container types support optional parameters that can help optimize the container for better parsing performance and smaller size.
### Type - *$*
When a _type_ is specified, all value types stored in the container (either array or object) are considered to be of that singular _type_ and as a result, _type_ markers are omitted for each value within the container. This can be thought of as providing the ability to create a strongly typed container in UBJSON.
- If a _type_ is specified, it **must** be done so before a _count_.
- If a _type_ is specified, a _count_ **must** be specified as well. (Otherwise it is impossible to tell when a container is ending, e.g. did you just parse *]* or the int8 value of 93?)
#### Example (string type):
```
[$][S]
```
---
### Count - *\#*
When a _count_ is specified, the parser is able to know ahead of time how many child elements will be parsed. This allows the parser to pre-size any internal construct used for parsing, verify that the promised number of child values were found and avoid scanning for any terminating bytes while parsing.
- A _count_ can be specified without a type.
#### Example (count of 64):
```
[#][i][64]
```
### Additional rules
- A _count_ **must** be >= 0.
- A _count_ can be specified by itself.
- If a _count_ is specified the container **must not** specify an end-marker.
- A container that specifies a _count_ **must** contain the specified number of child elements.
- If a _type_ is specified, it **must** be done so before count.
- If a _type_ is specified, a _count_ **must** also be specified. A _type_ cannot be specified by itself.
- A container that specifies a _type_ **must not** contain any additional _type_ markers for any contained value.
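The rules above can be sketched as a small header parser. This is illustrative only (not the py-ubjson implementation) and, by assumption, handles just the int8/uint8/int16 count markers:

```python
import struct

def parse_header(data):
    """Parse an optional [$type][#count] container header.
    Returns (type_marker, count, bytes_consumed)."""
    ctype, count, pos = None, None, 0
    if data[pos:pos + 1] == b'$':
        ctype = data[pos + 1:pos + 2]
        pos += 2
    if data[pos:pos + 1] == b'#':
        marker = data[pos + 1:pos + 2]
        if marker == b'i':    # int8 count
            count = struct.unpack('>b', data[pos + 2:pos + 3])[0]
            pos += 3
        elif marker == b'U':  # uint8 count
            count = struct.unpack('>B', data[pos + 2:pos + 3])[0]
            pos += 3
        elif marker == b'I':  # int16 count
            count = struct.unpack('>h', data[pos + 2:pos + 4])[0]
            pos += 4
        else:
            raise ValueError('unhandled count type: %r' % marker)
    elif ctype is not None:
        # rule: a type cannot be specified without a count
        raise ValueError('type specified without a count')
    return ctype, count, pos
```

For example, `parse_header(b'$S#i\x40')` yields `(b'S', 64, 5)`, while a container with neither optimisation consumes zero header bytes.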
---
### Array Examples
Optimized with count
```
[[][#][i][5] // An array of 5 elements.
[d][29.97]
[d][31.13]
[d][67.0]
[d][2.113]
[d][23.8889]
// No end marker since a count was specified.
```
Optimized with type & count
```
[[][$][d][#][i][5] // An array of 5 float32 elements.
[29.97] // Value type is known, so type markers are omitted.
[31.13]
[67.0]
[2.113]
[23.8889]
// No end marker since a count was specified.
```
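To see the space saving concretely, the type-and-count array above can be assembled with the standard `struct` module (a hand-rolled sketch, not py-ubjson API): the entire five-element float32 array fits in a 6-byte header plus five raw 4-byte payloads.

```python
import struct

values = [29.97, 31.13, 67.0, 2.113, 23.8889]

# [[][$][d][#][i][5] header, then five raw float32 payloads
encoded = b'[$d#i' + struct.pack('>b', len(values))
for v in values:
    encoded += struct.pack('>f', v)  # big-endian float32, no per-value marker
```

Without the optimisations, each value would carry its own `[d]` marker and the array would need a closing `]`, costing six extra bytes here.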
---
### Object Examples
Optimized with count
```
[{][#][i][3] // An object of 3 name:value pairs.
[i][3][lat][d][29.976]
[i][4][long][d][31.131]
[i][3][alt][d][67.0]
// No end marker since a count was specified.
```
Optimized with type & count
```
[{][$][d][#][i][3] // An object of 3 name:float32-value pairs.
[i][3][lat][29.976] // Value type is known, so type markers are omitted.
[i][4][long][31.131]
[i][3][alt][67.0]
// No end marker since a count was specified.
```
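Decoding the type-and-count object above follows the same pattern in reverse. This minimal sketch (illustrative only; it assumes a float32-typed object with an int8 count, unlike the general py-ubjson decoder) shows why the parser never needs to scan for an end marker:

```python
import struct

# Hand-assembled bytes for [{][$][d][#][i][3] [i][3][lat]... from above
data = (b'{$d#i\x03'
        + b'i\x03lat' + struct.pack('>f', 29.976)
        + b'i\x04long' + struct.pack('>f', 31.131)
        + b'i\x03alt' + struct.pack('>f', 67.0))

def decode_typed_object(buf):
    assert buf[:4] == b'{$d#', 'sketch only handles float32-typed objects'
    assert buf[4:5] == b'i', 'sketch only handles int8 counts'
    count, pos, out = buf[5], 6, {}
    for _ in range(count):                    # exactly `count` pairs: no ]
        assert buf[pos:pos + 1] == b'i'       # key length marker
        klen = buf[pos + 1]
        key = buf[pos + 2:pos + 2 + klen].decode('utf-8')
        pos += 2 + klen
        out[key] = struct.unpack('>f', buf[pos:pos + 4])[0]  # no type marker
        pos += 4
    return out
```

Because the count is known up front, the loop runs a fixed number of times and the decoder can pre-size its dictionary.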
---
### Special case: Marker-only types (null, no-op & boolean)
If using both the _count_ and _type_ optimisations, the type marker itself represents the value, thus saving repetition (since these types do not have a payload). For example:
Strongly typed array of type true (boolean) and with a count of 512:
```
[[][$][T][#][I][512]
```
Strongly typed object of type null and with a count of 3:
```
[{][$][Z][#][i][3]
[i][4][name] // name only, no value specified.
[i][8][password]
[i][5][email]
```
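For marker-only types the container body really does contain names only. A tiny illustrative decoder (not the py-ubjson API; it assumes the null-typed object layout shown above) makes this explicit:

```python
# Bytes for [{][$][Z][#][i][3] followed only by the three names
data = b'{$Z#i\x03' + b'i\x04name' + b'i\x08password' + b'i\x05email'

def decode_null_object(buf):
    assert buf[:5] == b'{$Z#i', 'sketch only handles null-typed objects'
    count, pos, out = buf[5], 6, {}
    for _ in range(count):
        klen = buf[pos + 1]                      # after the [i] length marker
        name = buf[pos + 2:pos + 2 + klen].decode('utf-8')
        out[name] = None                         # the [Z] header *is* the value
        pos += 2 + klen
    return out
```

All three values come from the single `[Z]` in the header; not one payload byte is spent on them.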
py-ubjson-0.16.1/ez_setup.py
#!/usr/bin/env python
"""
Setuptools bootstrapping installer.
Run this script to install or upgrade setuptools.
"""
import os
import shutil
import sys
import tempfile
import zipfile
import optparse
import subprocess
import platform
import textwrap
import contextlib
import json
import codecs
from distutils import log
try:
from urllib.request import urlopen
except ImportError:
from urllib2 import urlopen
try:
from site import USER_SITE
except ImportError:
USER_SITE = None
LATEST = object()
DEFAULT_VERSION = LATEST
DEFAULT_URL = "https://pypi.python.org/packages/source/s/setuptools/"
DEFAULT_SAVE_DIR = os.curdir
def _python_cmd(*args):
"""
Execute a command.
Return True if the command succeeded.
"""
args = (sys.executable,) + args
return subprocess.call(args) == 0
def _install(archive_filename, install_args=()):
"""Install Setuptools."""
with archive_context(archive_filename):
# installing
log.warn('Installing Setuptools')
if not _python_cmd('setup.py', 'install', *install_args):
log.warn('Something went wrong during the installation.')
log.warn('See the error message above.')
# exitcode will be 2
return 2
def _build_egg(egg, archive_filename, to_dir):
"""Build Setuptools egg."""
with archive_context(archive_filename):
# building an egg
log.warn('Building a Setuptools egg in %s', to_dir)
_python_cmd('setup.py', '-q', 'bdist_egg', '--dist-dir', to_dir)
# returning the result
log.warn(egg)
if not os.path.exists(egg):
raise IOError('Could not build the egg.')
class ContextualZipFile(zipfile.ZipFile):
"""Supplement ZipFile class to support context manager for Python 2.6."""
def __enter__(self):
return self
def __exit__(self, type, value, traceback):
self.close()
def __new__(cls, *args, **kwargs):
"""Construct a ZipFile or ContextualZipFile as appropriate."""
if hasattr(zipfile.ZipFile, '__exit__'):
return zipfile.ZipFile(*args, **kwargs)
return super(ContextualZipFile, cls).__new__(cls)
@contextlib.contextmanager
def archive_context(filename):
"""
Unzip filename to a temporary directory, set to the cwd.
The unzipped target is cleaned up after.
"""
tmpdir = tempfile.mkdtemp()
log.warn('Extracting in %s', tmpdir)
old_wd = os.getcwd()
try:
os.chdir(tmpdir)
with ContextualZipFile(filename) as archive:
archive.extractall()
# going in the directory
subdir = os.path.join(tmpdir, os.listdir(tmpdir)[0])
os.chdir(subdir)
log.warn('Now working in %s', subdir)
yield
finally:
os.chdir(old_wd)
shutil.rmtree(tmpdir)
def _do_download(version, download_base, to_dir, download_delay):
"""Download Setuptools."""
egg = os.path.join(to_dir, 'setuptools-%s-py%d.%d.egg'
% (version, sys.version_info[0], sys.version_info[1]))
if not os.path.exists(egg):
archive = download_setuptools(version, download_base,
to_dir, download_delay)
_build_egg(egg, archive, to_dir)
sys.path.insert(0, egg)
# Remove previously-imported pkg_resources if present (see
# https://bitbucket.org/pypa/setuptools/pull-request/7/ for details).
if 'pkg_resources' in sys.modules:
_unload_pkg_resources()
import setuptools
setuptools.bootstrap_install_from = egg
def use_setuptools(
version=DEFAULT_VERSION, download_base=DEFAULT_URL,
to_dir=DEFAULT_SAVE_DIR, download_delay=15):
"""
Ensure that a setuptools version is installed.
Return None. Raise SystemExit if the requested version
or later cannot be installed.
"""
version = _resolve_version(version)
to_dir = os.path.abspath(to_dir)
# prior to importing, capture the module state for
# representative modules.
rep_modules = 'pkg_resources', 'setuptools'
imported = set(sys.modules).intersection(rep_modules)
try:
import pkg_resources
pkg_resources.require("setuptools>=" + version)
# a suitable version is already installed
return
except ImportError:
# pkg_resources not available; setuptools is not installed; download
pass
except pkg_resources.DistributionNotFound:
# no version of setuptools was found; allow download
pass
except pkg_resources.VersionConflict as VC_err:
if imported:
_conflict_bail(VC_err, version)
# otherwise, unload pkg_resources to allow the downloaded version to
# take precedence.
del pkg_resources
_unload_pkg_resources()
return _do_download(version, download_base, to_dir, download_delay)
def _conflict_bail(VC_err, version):
"""
Setuptools was imported prior to invocation, so it is
unsafe to unload it. Bail out.
"""
conflict_tmpl = textwrap.dedent("""
The required version of setuptools (>={version}) is not available,
and can't be installed while this script is running. Please
install a more recent version first, using
'easy_install -U setuptools'.
(Currently using {VC_err.args[0]!r})
""")
msg = conflict_tmpl.format(**locals())
sys.stderr.write(msg)
sys.exit(2)
def _unload_pkg_resources():
del_modules = [
name for name in sys.modules
if name.startswith('pkg_resources')
]
for mod_name in del_modules:
del sys.modules[mod_name]
def _clean_check(cmd, target):
"""
Run the command to download target.
If the command fails, clean up before re-raising the error.
"""
try:
subprocess.check_call(cmd)
except subprocess.CalledProcessError:
if os.access(target, os.F_OK):
os.unlink(target)
raise
def download_file_powershell(url, target):
"""
Download the file at url to target using Powershell.
Powershell will validate trust.
Raise an exception if the command cannot complete.
"""
target = os.path.abspath(target)
ps_cmd = (
"[System.Net.WebRequest]::DefaultWebProxy.Credentials = "
"[System.Net.CredentialCache]::DefaultCredentials; "
'(new-object System.Net.WebClient).DownloadFile("%(url)s", "%(target)s")'
% locals()
)
cmd = [
'powershell',
'-Command',
ps_cmd,
]
_clean_check(cmd, target)
def has_powershell():
"""Determine if Powershell is available."""
if platform.system() != 'Windows':
return False
cmd = ['powershell', '-Command', 'echo test']
with open(os.path.devnull, 'wb') as devnull:
try:
subprocess.check_call(cmd, stdout=devnull, stderr=devnull)
except Exception:
return False
return True
download_file_powershell.viable = has_powershell
def download_file_curl(url, target):
cmd = ['curl', url, '--silent', '--output', target]
_clean_check(cmd, target)
def has_curl():
cmd = ['curl', '--version']
with open(os.path.devnull, 'wb') as devnull:
try:
subprocess.check_call(cmd, stdout=devnull, stderr=devnull)
except Exception:
return False
return True
download_file_curl.viable = has_curl
def download_file_wget(url, target):
cmd = ['wget', url, '--quiet', '--output-document', target]
_clean_check(cmd, target)
def has_wget():
cmd = ['wget', '--version']
with open(os.path.devnull, 'wb') as devnull:
try:
subprocess.check_call(cmd, stdout=devnull, stderr=devnull)
except Exception:
return False
return True
download_file_wget.viable = has_wget
def download_file_insecure(url, target):
"""Use Python to download the file, without connection authentication."""
src = urlopen(url)
try:
# Read all the data in one block.
data = src.read()
finally:
src.close()
# Write all the data in one block to avoid creating a partial file.
with open(target, "wb") as dst:
dst.write(data)
download_file_insecure.viable = lambda: True
def get_best_downloader():
downloaders = (
download_file_powershell,
download_file_curl,
download_file_wget,
download_file_insecure,
)
viable_downloaders = (dl for dl in downloaders if dl.viable())
return next(viable_downloaders, None)
def download_setuptools(
version=DEFAULT_VERSION, download_base=DEFAULT_URL,
to_dir=DEFAULT_SAVE_DIR, delay=15,
downloader_factory=get_best_downloader):
"""
Download setuptools from a specified location and return its filename.
`version` should be a valid setuptools version number that is available
as an sdist for download under the `download_base` URL (which should end
with a '/'). `to_dir` is the directory where the egg will be downloaded.
`delay` is the number of seconds to pause before an actual download
attempt.
``downloader_factory`` should be a function taking no arguments and
returning a function for downloading a URL to a target.
"""
version = _resolve_version(version)
# making sure we use the absolute path
to_dir = os.path.abspath(to_dir)
zip_name = "setuptools-%s.zip" % version
url = download_base + zip_name
saveto = os.path.join(to_dir, zip_name)
if not os.path.exists(saveto): # Avoid repeated downloads
log.warn("Downloading %s", url)
downloader = downloader_factory()
downloader(url, saveto)
return os.path.realpath(saveto)
def _resolve_version(version):
"""
Resolve LATEST version
"""
if version is not LATEST:
return version
resp = urlopen('https://pypi.python.org/pypi/setuptools/json')
with contextlib.closing(resp):
try:
charset = resp.info().get_content_charset()
except Exception:
# Python 2 compat; assume UTF-8
charset = 'UTF-8'
reader = codecs.getreader(charset)
doc = json.load(reader(resp))
return str(doc['info']['version'])
def _build_install_args(options):
"""
Build the arguments to 'python setup.py install' on the setuptools package.
Returns list of command line arguments.
"""
return ['--user'] if options.user_install else []
def _parse_args():
"""Parse the command line for options."""
parser = optparse.OptionParser()
parser.add_option(
'--user', dest='user_install', action='store_true', default=False,
help='install in user site package (requires Python 2.6 or later)')
parser.add_option(
'--download-base', dest='download_base', metavar="URL",
default=DEFAULT_URL,
help='alternative URL from where to download the setuptools package')
parser.add_option(
'--insecure', dest='downloader_factory', action='store_const',
const=lambda: download_file_insecure, default=get_best_downloader,
help='Use internal, non-validating downloader'
)
parser.add_option(
'--version', help="Specify which version to download",
default=DEFAULT_VERSION,
)
parser.add_option(
'--to-dir',
help="Directory to save (and re-use) package",
default=DEFAULT_SAVE_DIR,
)
options, args = parser.parse_args()
# positional arguments are ignored
return options
def _download_args(options):
"""Return args for download_setuptools function from cmdline args."""
return dict(
version=options.version,
download_base=options.download_base,
downloader_factory=options.downloader_factory,
to_dir=options.to_dir,
)
def main():
"""Install or upgrade setuptools and EasyInstall."""
options = _parse_args()
archive = download_setuptools(**_download_args(options))
return _install(archive, _build_install_args(options))
if __name__ == '__main__':
sys.exit(main())
py-ubjson-0.16.1/py_ubjson.egg-info/PKG-INFO
Metadata-Version: 2.1
Name: py-ubjson
Version: 0.16.1
Summary: Universal Binary JSON encoder/decoder
Home-page: https://github.com/Iotic-Labs/py-ubjson
Author: Iotic Labs Ltd
Author-email: info@iotic-labs.com
Maintainer: Iotic Labs Ltd
Maintainer-email: vilnis.termanis@iotic-labs.com
License: Apache License 2.0
Description: # Overview
This is a Python v3.2+ (and 2.7+) [Universal Binary JSON](http://ubjson.org) encoder/decoder based on the [draft-12](UBJSON-Specification.md) specification.
# Installing / packaging
```shell
# To get from PyPI
pip3 install py-ubjson
# To only build extension modules inline (e.g. in repository)
python3 setup.py build_ext -i
# To build & install globally
python3 setup.py install
# To skip building of extensions when installing (or building)
PYUBJSON_NO_EXTENSION=1 python3 setup.py install
```
**Notes**
- The extension module is not required but provides a significant speed boost.
- The above can also be run with v2.7+
- This module is also available via [Anaconda (conda-forge)](https://anaconda.org/conda-forge/py-ubjson)
- PyPI releases are signed with the [Iotic Labs Software release signing key](https://developer.iotic-labs.com/iotic-labs.com.asc)
- At run time, one can check whether the compiled version is in use via the _ubjson.EXTENSION_ENABLED_ boolean
# Usage
It's meant to behave very much like Python's built-in [JSON module](https://docs.python.org/3/library/json.html), e.g.:
```python
import ubjson
encoded = ubjson.dumpb({u'a': 1})
decoded = ubjson.loadb(encoded)
```
**Note**: Only unicode strings in Python 2 will be encoded as strings, plain *str* will be encoded as a byte array.
# Documentation
```python
import ubjson
help(ubjson.dump)
help(ubjson.load)
```
# Command-line utility
This converts between JSON and UBJSON formats:
```shell
python3 -mubjson
USAGE: ubjson (fromjson|tojson) (INFILE|-) [OUTFILE]
```
# Tests
## Static
This library has been checked using [flake8](https://pypi.python.org/pypi/flake8) and [pylint](http://www.pylint.org), using a modified configuration - see _pylint.rc_ and _flake8.cfg_.
## Unit
```shell
python3 -mvenv py
. py/bin/activate
pip install -U pip setuptools
pip install -e .[dev]
./coverage_test.sh
```
**Note**: See `coverage_test.sh` for additional requirements.
# Limitations
- The **No-Op** type is only supported by the decoder. (This should arguably be a protocol-level rather than serialisation-level option.) Specifically, it is **only** allowed to occur at the start or between elements of a container and **only** inside un-typed containers. (In a typed container it is impossible to tell the difference between an encoded element and a No-Op.)
- Strongly-typed containers are only supported by the decoder (apart from **bytes**/**bytearray**) and not for No-Op.
- Encoder/decoder extensions are not supported at this time.
# Why?
The only existing implementation I was aware of at the time of writing ([simpleubjson](https://github.com/brainwater/simpleubjson)) had the following limitations:
- Does not support efficient binary encoding
- Only supports draft-9
- Only supports individual Python types rather than anything implementing an interface (e.g. _Mapping_)
- Does not decode nested arrays or objects in expected form
- Lacks C extension speed-up
Keywords: ubjson,ubj
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: C
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.2
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Description-Content-Type: text/markdown
Provides-Extra: dev
py-ubjson-0.16.1/py_ubjson.egg-info/SOURCES.txt
CHANGELOG
LICENSE
MANIFEST.in
NOTICE
README.md
UBJSON-Specification.md
ez_setup.py
setup.py
py_ubjson.egg-info/PKG-INFO
py_ubjson.egg-info/SOURCES.txt
py_ubjson.egg-info/dependency_links.txt
py_ubjson.egg-info/not-zip-safe
py_ubjson.egg-info/requires.txt
py_ubjson.egg-info/top_level.txt
src/_ubjson.c
src/common.h
src/decoder.c
src/decoder.h
src/encoder.c
src/encoder.h
src/markers.h
src/python_funcs.c
src/python_funcs.h
test/test.py
ubjson/__init__.py
ubjson/__main__.py
ubjson/compat.py
ubjson/decoder.py
ubjson/encoder.py
ubjson/markers.py
py-ubjson-0.16.1/py_ubjson.egg-info/dependency_links.txt
py-ubjson-0.16.1/py_ubjson.egg-info/not-zip-safe
py-ubjson-0.16.1/py_ubjson.egg-info/requires.txt
[dev]
Pympler<0.8,>=0.7
coverage<4.6,>=4.5.3
py-ubjson-0.16.1/py_ubjson.egg-info/top_level.txt
_ubjson
ubjson
py-ubjson-0.16.1/setup.cfg
[egg_info]
tag_build =
tag_date = 0
py-ubjson-0.16.1/setup.py
# Copyright (c) 2019 Iotic Labs Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://github.com/Iotic-Labs/py-ubjson/blob/master/LICENSE
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import print_function
import sys
import os
import warnings
from glob import glob
from platform import python_implementation
# Allow for environments without setuptools
try:
from setuptools import setup
except ImportError:
from ez_setup import use_setuptools
use_setuptools()
from setuptools import setup # pylint: disable=ungrouped-imports
from distutils.core import Extension
from distutils.command.build_ext import build_ext
from distutils.errors import CCompilerError
from distutils.errors import DistutilsPlatformError, DistutilsExecError
from ubjson import __version__ as version
def load_description(filename):
script_dir = os.path.abspath(os.path.dirname(__file__))
with open(os.path.join(script_dir, filename), 'r') as infile:
return infile.read()
# Loosely based on https://github.com/mongodb/mongo-python-driver/blob/master/setup.py
class BuildExtWarnOnFail(build_ext):
"""Allow for extension building to fail."""
def run(self):
try:
build_ext.run(self)
except DistutilsPlatformError:
ex = sys.exc_info()[1]
sys.stdout.write('%s\n' % str(ex))
warnings.warn("Extension modules: There was an issue with your platform configuration - see above.")
def build_extension(self, ext):
try:
build_ext.build_extension(self, ext)
except (CCompilerError, DistutilsExecError, DistutilsPlatformError, IOError):
ex = sys.exc_info()[1]
sys.stdout.write('%s\n' % str(ex))
warnings.warn("Extension module %s: The output above this warning shows how the compilation failed."
% ext.name)
BUILD_EXTENSIONS = 'PYUBJSON_NO_EXTENSION' not in os.environ and python_implementation() != 'PyPy'
COMPILE_ARGS = ['-std=c99']
# For testing/debug only - some of these are GCC-specific
# COMPILE_ARGS += ['-Wall', '-Wextra', '-Wundef', '-Wshadow', '-Wcast-align', '-Wcast-qual', '-Wstrict-prototypes',
# '-pedantic']
setup(
name='py-ubjson',
version=version,
description='Universal Binary JSON encoder/decoder',
long_description=load_description('README.md'),
long_description_content_type='text/markdown',
author='Iotic Labs Ltd',
author_email='info@iotic-labs.com',
maintainer='Iotic Labs Ltd',
maintainer_email='vilnis.termanis@iotic-labs.com',
url='https://github.com/Iotic-Labs/py-ubjson',
license='Apache License 2.0',
packages=['ubjson'],
extras_require={
'dev': [
'Pympler>=0.7 ,<0.8',
'coverage>=4.5.3,<4.6'
]
},
zip_safe=False,
ext_modules=([Extension(
'_ubjson',
sorted(glob('src/*.c')),
extra_compile_args=COMPILE_ARGS,
# undef_macros=['NDEBUG']
)] if BUILD_EXTENSIONS else []),
cmdclass={"build_ext": BuildExtWarnOnFail},
keywords=['ubjson', 'ubj'],
classifiers=[
'Development Status :: 5 - Production/Stable',
'License :: OSI Approved :: Apache Software License',
'Intended Audience :: Developers',
'Programming Language :: C',
'Programming Language :: Python',
'Programming Language :: Python :: 2.7',
'Programming Language :: Python :: 3.2',
'Programming Language :: Python :: 3.3',
'Programming Language :: Python :: 3.4',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
'Programming Language :: Python :: 3.7',
'Programming Language :: Python :: 3.8',
'Topic :: Software Development :: Libraries',
'Topic :: Software Development :: Libraries :: Python Modules'
]
)
py-ubjson-0.16.1/src/_ubjson.c
/*
* Copyright (c) 2019 Iotic Labs Ltd. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://github.com/Iotic-Labs/py-ubjson/blob/master/LICENSE
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <Python.h>
#include "common.h"
#include "encoder.h"
#include "decoder.h"
/******************************************************************************/
// container_count, sort_keys, no_float32
static _ubjson_encoder_prefs_t _ubjson_encoder_prefs_defaults = { NULL, 0, 0, 1 };
// no_bytes, object_pairs_hook
static _ubjson_decoder_prefs_t _ubjson_decoder_prefs_defaults = { NULL, NULL, 0, 0 };
/******************************************************************************/
PyDoc_STRVAR(_ubjson_dump__doc__, "See pure Python version (encoder.dump) for documentation.");
#define FUNC_DEF_DUMP {"dump", (PyCFunction)_ubjson_dump, METH_VARARGS | METH_KEYWORDS, _ubjson_dump__doc__}
static PyObject*
_ubjson_dump(PyObject *self, PyObject *args, PyObject *kwargs) {
static const char *format = "OO|iiiO:dump";
static char *keywords[] = {"obj", "fp", "container_count", "sort_keys", "no_float32", "default", NULL};
_ubjson_encoder_buffer_t *buffer = NULL;
_ubjson_encoder_prefs_t prefs = _ubjson_encoder_prefs_defaults;
PyObject *obj;
PyObject *fp;
PyObject *fp_write = NULL;
UNUSED(self);
if (!PyArg_ParseTupleAndKeywords(args, kwargs, format, keywords, &obj, &fp, &prefs.container_count,
&prefs.sort_keys, &prefs.no_float32, &prefs.default_func)) {
goto bail;
}
BAIL_ON_NULL(fp_write = PyObject_GetAttrString(fp, "write"));
BAIL_ON_NULL(buffer = _ubjson_encoder_buffer_create(&prefs, fp_write));
// buffer creation has added reference
Py_CLEAR(fp_write);
BAIL_ON_NONZERO(_ubjson_encode_value(obj, buffer));
BAIL_ON_NULL(obj = _ubjson_encoder_buffer_finalise(buffer));
_ubjson_encoder_buffer_free(&buffer);
return obj;
bail:
Py_XDECREF(fp_write);
_ubjson_encoder_buffer_free(&buffer);
return NULL;
}
PyDoc_STRVAR(_ubjson_dumpb__doc__, "See pure Python version (encoder.dumpb) for documentation.");
#define FUNC_DEF_DUMPB {"dumpb", (PyCFunction)_ubjson_dumpb, METH_VARARGS | METH_KEYWORDS, _ubjson_dumpb__doc__}
static PyObject*
_ubjson_dumpb(PyObject *self, PyObject *args, PyObject *kwargs) {
static const char *format = "O|iiiO:dumpb";
static char *keywords[] = {"obj", "container_count", "sort_keys", "no_float32", "default", NULL};
_ubjson_encoder_buffer_t *buffer = NULL;
_ubjson_encoder_prefs_t prefs = _ubjson_encoder_prefs_defaults;
PyObject *obj;
UNUSED(self);
if (!PyArg_ParseTupleAndKeywords(args, kwargs, format, keywords, &obj, &prefs.container_count, &prefs.sort_keys,
&prefs.no_float32, &prefs.default_func)) {
goto bail;
}
BAIL_ON_NULL(buffer = _ubjson_encoder_buffer_create(&prefs, NULL));
BAIL_ON_NONZERO(_ubjson_encode_value(obj, buffer));
BAIL_ON_NULL(obj = _ubjson_encoder_buffer_finalise(buffer));
_ubjson_encoder_buffer_free(&buffer);
return obj;
bail:
_ubjson_encoder_buffer_free(&buffer);
return NULL;
}
/******************************************************************************/
PyDoc_STRVAR(_ubjson_load__doc__, "See pure Python version (encoder.load) for documentation.");
#define FUNC_DEF_LOAD {"load", (PyCFunction)_ubjson_load, METH_VARARGS | METH_KEYWORDS, _ubjson_load__doc__}
static PyObject*
_ubjson_load(PyObject *self, PyObject *args, PyObject *kwargs) {
static const char *format = "O|iOOi:load";
static char *keywords[] = {"fp", "no_bytes", "object_hook", "object_pairs_hook", "intern_object_keys", NULL};
_ubjson_decoder_buffer_t *buffer = NULL;
_ubjson_decoder_prefs_t prefs = _ubjson_decoder_prefs_defaults;
PyObject *fp;
PyObject *fp_read = NULL;
PyObject *fp_seek = NULL;
PyObject *seekable = NULL;
PyObject *obj = NULL;
UNUSED(self);
if (!PyArg_ParseTupleAndKeywords(args, kwargs, format, keywords, &fp, &prefs.no_bytes, &prefs.object_hook,
&prefs.object_pairs_hook, &prefs.intern_object_keys)) {
goto bail;
}
BAIL_ON_NULL(fp_read = PyObject_GetAttrString(fp, "read"));
if (!PyCallable_Check(fp_read)) {
PyErr_SetString(PyExc_TypeError, "fp.read not callable");
goto bail;
}
// determine whether can seek input
if (NULL != (seekable = PyObject_CallMethod(fp, "seekable", NULL))) {
if (Py_True == seekable) {
// Could also PyCallable_Check but have already checked seekable() so will just fail later
fp_seek = PyObject_GetAttrString(fp, "seek");
}
Py_XDECREF(seekable);
}
// ignore seekable() / seek get errors
PyErr_Clear();
BAIL_ON_NULL(buffer = _ubjson_decoder_buffer_create(&prefs, fp_read, fp_seek));
// buffer creation has added references
Py_CLEAR(fp_read);
Py_CLEAR(fp_seek);
BAIL_ON_NULL(obj = _ubjson_decode_value(buffer, NULL));
BAIL_ON_NONZERO(_ubjson_decoder_buffer_free(&buffer));
return obj;
bail:
Py_XDECREF(fp_read);
Py_XDECREF(fp_seek);
Py_XDECREF(obj);
_ubjson_decoder_buffer_free(&buffer);
return NULL;
}
PyDoc_STRVAR(_ubjson_loadb__doc__, "See pure Python version (encoder.loadb) for documentation.");
#define FUNC_DEF_LOADB {"loadb", (PyCFunction)_ubjson_loadb, METH_VARARGS | METH_KEYWORDS, _ubjson_loadb__doc__}
static PyObject*
_ubjson_loadb(PyObject *self, PyObject *args, PyObject *kwargs) {
static const char *format = "O|iOOi:loadb";
static char *keywords[] = {"chars", "no_bytes", "object_hook", "object_pairs_hook", "intern_object_keys", NULL};
_ubjson_decoder_buffer_t *buffer = NULL;
_ubjson_decoder_prefs_t prefs = _ubjson_decoder_prefs_defaults;
PyObject *chars;
PyObject *obj = NULL;
UNUSED(self);
if (!PyArg_ParseTupleAndKeywords(args, kwargs, format, keywords, &chars, &prefs.no_bytes, &prefs.object_hook,
&prefs.object_pairs_hook, &prefs.intern_object_keys)) {
goto bail;
}
if (PyUnicode_Check(chars)) {
PyErr_SetString(PyExc_TypeError, "chars must be a bytes-like object, not str");
goto bail;
}
if (!PyObject_CheckBuffer(chars)) {
PyErr_SetString(PyExc_TypeError, "chars does not support buffer interface");
goto bail;
}
BAIL_ON_NULL(buffer = _ubjson_decoder_buffer_create(&prefs, chars, NULL));
BAIL_ON_NULL(obj = _ubjson_decode_value(buffer, NULL));
BAIL_ON_NONZERO(_ubjson_decoder_buffer_free(&buffer));
return obj;
bail:
Py_XDECREF(obj);
_ubjson_decoder_buffer_free(&buffer);
return NULL;
}
/******************************************************************************/
static PyMethodDef UbjsonMethods[] = {
FUNC_DEF_DUMP, FUNC_DEF_DUMPB,
FUNC_DEF_LOAD, FUNC_DEF_LOADB,
{NULL, NULL, 0, NULL}
};
#if PY_MAJOR_VERSION >= 3
static void module_free(PyObject *m)
{
UNUSED(m);
_ubjson_encoder_cleanup();
_ubjson_decoder_cleanup();
}
static struct PyModuleDef moduledef = {
PyModuleDef_HEAD_INIT, // m_base
"_ubjson", // m_name
NULL, // m_doc
-1, // m_size
UbjsonMethods, // m_methods
NULL, // m_slots
NULL, // m_traverse
NULL, // m_clear
(freefunc)module_free // m_free
};
#define INITERROR return NULL
PyObject*
PyInit__ubjson(void)
#else
#define INITERROR return
PyMODINIT_FUNC
init_ubjson(void)
#endif
{
#if PY_MAJOR_VERSION >= 3
PyObject *module = PyModule_Create(&moduledef);
#else
PyObject *module = Py_InitModule("_ubjson", UbjsonMethods);
#endif
BAIL_ON_NONZERO(_ubjson_encoder_init());
BAIL_ON_NONZERO(_ubjson_decoder_init());
#if PY_MAJOR_VERSION >= 3
return module;
#else
return;
#endif
bail:
_ubjson_encoder_cleanup();
_ubjson_decoder_cleanup();
Py_XDECREF(module);
INITERROR;
}
py-ubjson-0.16.1/src/common.h
/*
* Copyright (c) 2019 Iotic Labs Ltd. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://github.com/Iotic-Labs/py-ubjson/blob/master/LICENSE
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once
#if defined (__cplusplus)
extern "C" {
#endif
/******************************************************************************/
#define MIN(x, y) (x) <= (y) ? (x) : (y)
#define MAX(x, y) (x) >= (y) ? (x) : (y)
#define UNUSED(x) (void)(x)
#define BAIL_ON_NULL(result)\
if (NULL == (result)) {\
goto bail;\
}
#define BAIL_ON_NONZERO(result)\
if (result) {\
goto bail;\
}
#define BAIL_ON_NEGATIVE(result)\
if ((result) < 0) {\
goto bail;\
}
#if defined (__cplusplus)
}
#endif
// py-ubjson-0.16.1/src/decoder.c
/*
* Copyright (c) 2019 Iotic Labs Ltd. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://github.com/Iotic-Labs/py-ubjson/blob/master/LICENSE
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <Python.h>
#include <bytesobject.h>
#include "common.h"
#include "markers.h"
#include "decoder.h"
#include "python_funcs.h"
/******************************************************************************/
#define RECURSE_AND_RETURN_OR_BAIL(action, recurse_msg) {\
PyObject *ret;\
BAIL_ON_NONZERO(Py_EnterRecursiveCall(recurse_msg));\
ret = (action);\
Py_LeaveRecursiveCall();\
return ret;\
}
#define RAISE_DECODER_EXCEPTION(msg) {\
PyObject *num = NULL, *str = NULL, *tuple = NULL;\
if ((num = PyLong_FromSize_t(buffer->total_read)) &&\
(str = PyUnicode_FromString(msg)) &&\
(tuple = PyTuple_Pack(2, str, num))) {\
PyErr_SetObject(DecoderException, tuple);\
/* backup method in case object creation fails */\
} else {\
PyErr_Format(DecoderException, "%s (at byte [%zd])", msg, buffer->total_read);\
}\
Py_XDECREF(tuple);\
Py_XDECREF(num);\
Py_XDECREF(str);\
goto bail;\
}
// used only by READ macros below
#define ACTION_READ_ERROR(stmt, len, item_str) {\
if (NULL == (stmt)) {\
if (read > 0) {\
goto bail;\
} else if ((len > 0) || (read < len)) {\
RAISE_DECODER_EXCEPTION(("Insufficient input (" item_str ")"));\
}\
} else if (read < len) {\
RAISE_DECODER_EXCEPTION(("Insufficient (partial) input (" item_str ")"));\
}\
}
#define READ_VIA_FUNC(buffer, readptr, dst) \
buffer->read_func(buffer, readptr, dst)
#define READ_INTO_OR_BAIL(len, dst_buffer, item_str) {\
Py_ssize_t read = len;\
ACTION_READ_ERROR(READ_VIA_FUNC(buffer, &read, dst_buffer), len, item_str);\
}
#define READ_OR_BAIL(len, dst_buffer, item_str) {\
Py_ssize_t read = len;\
ACTION_READ_ERROR((dst_buffer = READ_VIA_FUNC(buffer, &read, NULL)), len, item_str);\
}
#define READ_OR_BAIL_CAST(len, dst_buffer, cast, item_str) {\
Py_ssize_t read = len;\
ACTION_READ_ERROR((dst_buffer = cast READ_VIA_FUNC(buffer, &read, NULL)), len, item_str);\
}
#define READ_CHAR_OR_BAIL(dst_char, item_str) {\
const char* tmp;\
READ_OR_BAIL(1, tmp, item_str);\
dst_char = tmp[0];\
}
#define DECODE_UNICODE_OR_BAIL(dst_obj, raw, length, item_str) {\
if (NULL == ((dst_obj) = PyUnicode_FromStringAndSize(raw, length))) {\
RAISE_DECODER_EXCEPTION(("Failed to decode utf8: " item_str));\
}\
}
#define DECODE_LENGTH_OR_BAIL(length) BAIL_ON_NEGATIVE((length) = _decode_int_non_negative(buffer, NULL))
#define DECODE_LENGTH_OR_BAIL_MARKER(length, marker) \
BAIL_ON_NEGATIVE((length) = _decode_int_non_negative(buffer, &(marker)))
// decoder buffer size when using fp (i.e. minimum number of bytes to read in one go)
#define BUFFER_FP_SIZE 256
// io.SEEK_CUR constant (for seek() function)
#define IO_SEEK_CUR 1
static PyObject *DecoderException = NULL;
static PyTypeObject *PyDec_Type = NULL;
#define PyDec_Check(v) PyObject_TypeCheck(v, PyDec_Type)
/******************************************************************************/
typedef struct {
// next marker after container parameters
char marker;
// indicates whether container has count specified
int counting;
// number of elements in container (if counting or 1 if not)
long long count;
// type of container values, if typed, otherwise TYPE_NONE
char type;
// indicates the parameter specification for the container is invalid (an exception will have been set)
int invalid;
} _container_params_t;
static const char* _decoder_buffer_read_fixed(_ubjson_decoder_buffer_t *buffer, Py_ssize_t *len, char *dst_buffer);
static const char* _decoder_buffer_read_callable(_ubjson_decoder_buffer_t *buffer, Py_ssize_t *len, char *dst_buffer);
static const char* _decoder_buffer_read_buffered(_ubjson_decoder_buffer_t *buffer, Py_ssize_t *len, char *dst_buffer);
// These functions return NULL on failure (an exception will have been set). Note that no type checking is performed!
static PyObject* _decode_int8(_ubjson_decoder_buffer_t *buffer);
static PyObject* _decode_int16_32(_ubjson_decoder_buffer_t *buffer, Py_ssize_t size);
static PyObject* _decode_int64(_ubjson_decoder_buffer_t *buffer);
static PyObject* _decode_float32(_ubjson_decoder_buffer_t *buffer);
static PyObject* _decode_float64(_ubjson_decoder_buffer_t *buffer);
static PyObject* _decode_high_prec(_ubjson_decoder_buffer_t *buffer);
static long long _decode_int_non_negative(_ubjson_decoder_buffer_t *buffer, char *given_marker);
static PyObject* _decode_char(_ubjson_decoder_buffer_t *buffer);
static PyObject* _decode_string(_ubjson_decoder_buffer_t *buffer);
static _container_params_t _get_container_params(_ubjson_decoder_buffer_t *buffer, int in_mapping);
static int _is_no_data_type(char type);
static PyObject* _no_data_type(char type);
static PyObject* _decode_array(_ubjson_decoder_buffer_t *buffer);
static PyObject* _decode_object_with_pairs_hook(_ubjson_decoder_buffer_t *buffer);
static PyObject* _decode_object(_ubjson_decoder_buffer_t *buffer);
/******************************************************************************/
/* Returns new decoder buffer or NULL on failure (an exception will be set). Input must either support buffer interface
* or be callable. Currently only increases reference count for input parameter.
*/
_ubjson_decoder_buffer_t* _ubjson_decoder_buffer_create(_ubjson_decoder_prefs_t* prefs, PyObject *input,
PyObject *seek) {
_ubjson_decoder_buffer_t *buffer;
if (NULL == (buffer = calloc(1, sizeof(_ubjson_decoder_buffer_t)))) {
PyErr_NoMemory();
return NULL;
}
buffer->prefs = *prefs;
buffer->input = input;
Py_XINCREF(input);
if (PyObject_CheckBuffer(input)) {
BAIL_ON_NONZERO(PyObject_GetBuffer(input, &buffer->view, PyBUF_SIMPLE));
buffer->read_func = _decoder_buffer_read_fixed;
buffer->view_set = 1;
} else if (PyCallable_Check(input)) {
if (NULL == seek) {
buffer->read_func = _decoder_buffer_read_callable;
} else {
buffer->read_func = _decoder_buffer_read_buffered;
buffer->seek = seek;
Py_INCREF(seek);
}
} else {
// Should have been checked a level above
PyErr_SetString(PyExc_TypeError, "Input neither supports the buffer interface nor is callable");
goto bail;
}
// treat Py_None as no argument being supplied
if (Py_None == buffer->prefs.object_hook) {
buffer->prefs.object_hook = NULL;
}
if (Py_None == buffer->prefs.object_pairs_hook) {
buffer->prefs.object_pairs_hook = NULL;
}
return buffer;
bail:
_ubjson_decoder_buffer_free(&buffer);
return NULL;
}
// Returns non-zero if buffer cleanup/finalisation failed and no other exception was set already
int _ubjson_decoder_buffer_free(_ubjson_decoder_buffer_t **buffer) {
int failed = 0;
if (NULL != buffer && NULL != *buffer) {
if ((*buffer)->view_set) {
// In buffered mode, rewind to position in stream up to which actually read (rather than buffered)
if (NULL != (*buffer)->seek && (*buffer)->view.len > (*buffer)->pos) {
PyObject *type, *value, *traceback, *seek_result;
// preserve the previous exception, if set
PyErr_Fetch(&type, &value, &traceback);
seek_result = PyObject_CallFunction((*buffer)->seek, "nn", ((*buffer)->pos - (*buffer)->view.len),
IO_SEEK_CUR);
Py_XDECREF(seek_result);
/* Blindly calling PyErr_Restore would clear any exception raised by seek call. If however already had
* an error before freeing buffer (this function), propagate that instead. (I.e. this behaves like a
* nested try-except block.)
*/
if (NULL != type) {
PyErr_Restore(type, value, traceback);
} else if (NULL == seek_result) {
failed = 1;
}
}
PyBuffer_Release(&((*buffer)->view));
(*buffer)->view_set = 0;
}
if (NULL != (*buffer)->tmp_dst) {
free((*buffer)->tmp_dst);
(*buffer)->tmp_dst = NULL;
}
Py_CLEAR((*buffer)->input);
Py_CLEAR((*buffer)->seek);
free(*buffer);
*buffer = NULL;
}
return failed;
}
/* Tries to read len bytes from input, returning read chunk. Len is updated to how many bytes were actually read.
* If not NULL, dst_buffer can be an existing buffer to output len bytes into.
* Returns NULL if either no input is left (len is set to zero) or an error occurs (len is non-zero). The caller must
* NOT modify or free the returned chunk unless they specified dst_buffer (in which case that is returned). When this
* function is called again, the previously returned output is no longer valid (unless was created by caller).
*
* This function reads from a fixed buffer (single byte array)
*/
static const char* _decoder_buffer_read_fixed(_ubjson_decoder_buffer_t *buffer, Py_ssize_t *len, char *dst_buffer) {
Py_ssize_t old_pos;
if (0 == *len) {
return NULL;
}
if (buffer->total_read < buffer->view.len) {
*len = MIN(*len, (buffer->view.len - buffer->total_read));
old_pos = buffer->total_read;
buffer->total_read += *len;
// caller has provided own destination
if (NULL != dst_buffer) {
return memcpy(dst_buffer, &((char*)buffer->view.buf)[old_pos], *len);
} else {
return &((char*)buffer->view.buf)[old_pos];
}
// no input remaining
} else {
*len = 0;
return NULL;
}
}
// See _decoder_buffer_read_fixed for behaviour details. This function is used to read from a stream
static const char* _decoder_buffer_read_callable(_ubjson_decoder_buffer_t *buffer, Py_ssize_t *len, char *dst_buffer) {
PyObject* read_result = NULL;
if (0 == *len) {
return NULL;
}
if (buffer->view_set) {
PyBuffer_Release(&buffer->view);
buffer->view_set = 0;
}
// read input and get buffer view
BAIL_ON_NULL(read_result = PyObject_CallFunction(buffer->input, "n", *len));
BAIL_ON_NONZERO(PyObject_GetBuffer(read_result, &buffer->view, PyBUF_SIMPLE));
buffer->view_set = 1;
// don't need reference since view reserves one already
Py_CLEAR(read_result);
// no input remaining
if (0 == buffer->view.len) {
*len = 0;
return NULL;
}
*len = buffer->view.len;
buffer->total_read += *len;
// caller has provided own destination
if (NULL != dst_buffer) {
return memcpy(dst_buffer, buffer->view.buf, *len);
} else {
return buffer->view.buf;
}
bail:
*len = 1;
Py_XDECREF(read_result);
return NULL;
}
// See _decoder_buffer_read_fixed for behaviour details. This function reads (buffered) from a seekable stream
static const char* _decoder_buffer_read_buffered(_ubjson_decoder_buffer_t *buffer, Py_ssize_t *len, char *dst_buffer) {
Py_ssize_t old_pos;
char *tmp_dst;
Py_ssize_t remaining_old = 0; // how many bytes remaining to be read (from old view)
PyObject* read_result = NULL;
if (0 == *len) {
return NULL;
}
// previously used temporary output no longer needed
if (NULL != buffer->tmp_dst) {
free(buffer->tmp_dst);
buffer->tmp_dst = NULL;
}
// will require additional read if remaining input smaller than requested
if (!buffer->view_set || *len > (buffer->view.len - buffer->pos)) {
// create temporary buffer if not supplied (and have some remaining input in view)
if (NULL == dst_buffer) {
if (NULL == (tmp_dst = buffer->tmp_dst = malloc(sizeof(char) * (size_t)*len))) {
PyErr_NoMemory();
goto bail;
}
} else {
tmp_dst = dst_buffer;
}
// copy remainder into buffer and release old view
if (buffer->view_set) {
remaining_old = buffer->view.len - buffer->pos;
if (remaining_old > 0) {
memcpy(tmp_dst, &((char*)buffer->view.buf)[buffer->pos], remaining_old);
buffer->pos = buffer->view.len;
buffer->total_read += remaining_old;
}
PyBuffer_Release(&buffer->view);
buffer->view_set = 0;
buffer->pos = 0;
}
// read input and get buffer view
BAIL_ON_NULL(read_result = PyObject_CallFunction(buffer->input, "n",
MAX(BUFFER_FP_SIZE, (*len - remaining_old))));
BAIL_ON_NONZERO(PyObject_GetBuffer(read_result, &buffer->view, PyBUF_SIMPLE));
buffer->view_set = 1;
// don't need reference since view reserves one already
Py_CLEAR(read_result);
// no input remaining
if (0 == remaining_old && buffer->view.len == 0) {
*len = 0;
return NULL;
}
// read rest into buffer (adjusting total length if not all available)
*len = MIN(*len, (buffer->view.len - buffer->pos) + remaining_old);
buffer->pos = *len - remaining_old;
buffer->total_read += buffer->pos;
memcpy(&tmp_dst[remaining_old], (char*)buffer->view.buf, buffer->pos);
return tmp_dst;
// enough data in existing view
} else {
old_pos = buffer->pos;
buffer->pos += *len;
buffer->total_read += *len;
// caller has provided own destination
if (NULL != dst_buffer) {
return memcpy(dst_buffer, &((char*)buffer->view.buf)[old_pos], *len);
} else {
return &((char*)buffer->view.buf)[old_pos];
}
}
bail:
*len = 1;
Py_XDECREF(read_result);
return NULL;
}
/******************************************************************************/
// These methods are partially based on Python's _struct.c
static PyObject* _decode_int8(_ubjson_decoder_buffer_t *buffer) {
char value;
READ_CHAR_OR_BAIL(value, "int8");
#if PY_MAJOR_VERSION < 3
return PyInt_FromLong((long) (signed char)value);
#else
return PyLong_FromLong((long) (signed char)value);
#endif
bail:
return NULL;
}
static PyObject* _decode_uint8(_ubjson_decoder_buffer_t *buffer) {
char value;
READ_CHAR_OR_BAIL(value, "uint8");
#if PY_MAJOR_VERSION < 3
return PyInt_FromLong((long) (unsigned char)value);
#else
return PyLong_FromLong((long) (unsigned char)value);
#endif
bail:
return NULL;
}
// NOTE: size parameter can only be 2 or 4 (bytes)
static PyObject* _decode_int16_32(_ubjson_decoder_buffer_t *buffer, Py_ssize_t size) {
const unsigned char *raw;
long value = 0;
Py_ssize_t i;
READ_OR_BAIL_CAST(size, raw, (const unsigned char *), "int16/32");
for (i = size; i > 0; i--) {
value = (value << 8) | *raw++;
}
// sign-extend the value
if (SIZEOF_LONG > size) {
value |= -(value & (1L << ((8 * size) - 1)));
}
#if PY_MAJOR_VERSION < 3
return PyInt_FromLong(value);
#else
return PyLong_FromLong(value);
#endif
bail:
return NULL;
}
static PyObject* _decode_int64(_ubjson_decoder_buffer_t *buffer) {
const unsigned char *raw;
long long value = 0;
const Py_ssize_t size = 8;
Py_ssize_t i;
READ_OR_BAIL_CAST(8, raw, (const unsigned char *), "int64");
for (i = size; i > 0; i--) {
value = (value << 8) | *raw++;
}
// sign-extend the value
if (SIZEOF_LONG_LONG > 8) {
value |= -(value & ((long long)1 << ((8 * size) - 1)));
}
if (value >= LONG_MIN && value <= LONG_MAX) {
return PyLong_FromLong(Py_SAFE_DOWNCAST(value, long long, long));
} else {
return PyLong_FromLongLong(value);
}
bail:
return NULL;
}
// returns negative on error (exception set)
static long long _decode_int_non_negative(_ubjson_decoder_buffer_t *buffer, char *given_marker) {
char marker;
PyObject *int_obj = NULL;
long long value;
if (NULL == given_marker) {
READ_CHAR_OR_BAIL(marker, "Length marker");
} else {
marker = *given_marker;
}
switch (marker) {
case TYPE_INT8:
BAIL_ON_NULL(int_obj = _decode_int8(buffer));
break;
case TYPE_UINT8:
BAIL_ON_NULL(int_obj = _decode_uint8(buffer));
break;
case TYPE_INT16:
BAIL_ON_NULL(int_obj = _decode_int16_32(buffer, 2));
break;
case TYPE_INT32:
BAIL_ON_NULL(int_obj = _decode_int16_32(buffer, 4));
break;
case TYPE_INT64:
BAIL_ON_NULL(int_obj = _decode_int64(buffer));
break;
default:
RAISE_DECODER_EXCEPTION("Integer marker expected");
}
#if PY_MAJOR_VERSION < 3
if (PyInt_Check(int_obj)) {
value = PyInt_AsLong(int_obj);
} else
#endif
{
// not expecting this to occur unless LONG_MAX (sys.maxint in Python 2) < 2^63-1
value = PyLong_AsLongLong(int_obj);
}
if (PyErr_Occurred()) {
goto bail;
}
if (value < 0) {
RAISE_DECODER_EXCEPTION("Negative count/length unexpected");
}
Py_XDECREF(int_obj);
return value;
bail:
Py_XDECREF(int_obj);
return -1;
}
static PyObject* _decode_float32(_ubjson_decoder_buffer_t *buffer) {
const char *raw;
double value;
READ_OR_BAIL(4, raw, "float32");
value = _pyfuncs_ubj_PyFloat_Unpack4((const unsigned char *)raw, 0);
if ((-1.0 == value) && PyErr_Occurred()) {
goto bail;
}
return PyFloat_FromDouble(value);
bail:
return NULL;
}
static PyObject* _decode_float64(_ubjson_decoder_buffer_t *buffer) {
const char *raw;
double value;
READ_OR_BAIL(8, raw, "float64");
value = _pyfuncs_ubj_PyFloat_Unpack8((const unsigned char *)raw, 0);
if ((-1.0 == value) && PyErr_Occurred()) {
goto bail;
}
return PyFloat_FromDouble(value);
bail:
return NULL;
}
static PyObject* _decode_high_prec(_ubjson_decoder_buffer_t *buffer) {
const char *raw;
PyObject *num_str = NULL;
PyObject *decimal;
long long length;
DECODE_LENGTH_OR_BAIL(length);
READ_OR_BAIL((Py_ssize_t)length, raw, "highprec");
DECODE_UNICODE_OR_BAIL(num_str, raw, (Py_ssize_t)length, "highprec");
BAIL_ON_NULL(decimal = PyObject_CallFunctionObjArgs((PyObject*)PyDec_Type, num_str, NULL));
Py_XDECREF(num_str);
return decimal;
bail:
Py_XDECREF(num_str);
return NULL;
}
static PyObject* _decode_char(_ubjson_decoder_buffer_t *buffer) {
char value;
PyObject *obj = NULL;
READ_CHAR_OR_BAIL(value, "char");
DECODE_UNICODE_OR_BAIL(obj, &value, 1, "char");
return obj;
bail:
Py_XDECREF(obj);
return NULL;
}
static PyObject* _decode_string(_ubjson_decoder_buffer_t *buffer) {
long long length;
const char *raw;
PyObject *obj = NULL;
DECODE_LENGTH_OR_BAIL(length);
if (length > 0) {
READ_OR_BAIL((Py_ssize_t)length, raw, "string");
DECODE_UNICODE_OR_BAIL(obj, raw, (Py_ssize_t)length, "string");
} else {
BAIL_ON_NULL(obj = PyUnicode_FromStringAndSize(NULL, 0));
}
return obj;
bail:
Py_XDECREF(obj);
return NULL;
}
static _container_params_t _get_container_params(_ubjson_decoder_buffer_t *buffer, int in_mapping) {
_container_params_t params;
char marker;
// fixed type for all values
READ_CHAR_OR_BAIL(marker, "container type, count or 1st key/value type");
if (CONTAINER_TYPE == marker) {
READ_CHAR_OR_BAIL(marker, "container type");
switch (marker) {
case TYPE_NULL: case TYPE_BOOL_TRUE: case TYPE_BOOL_FALSE: case TYPE_CHAR: case TYPE_STRING: case TYPE_INT8:
case TYPE_UINT8: case TYPE_INT16: case TYPE_INT32: case TYPE_INT64: case TYPE_FLOAT32: case TYPE_FLOAT64:
case TYPE_HIGH_PREC: case ARRAY_START: case OBJECT_START:
params.type = marker;
break;
default:
RAISE_DECODER_EXCEPTION("Invalid container type");
}
READ_CHAR_OR_BAIL(marker, "container count or 1st key/value type");
} else {
// container type not fixed
params.type = TYPE_NONE;
}
// container value count
if (CONTAINER_COUNT == marker) {
params.counting = 1;
DECODE_LENGTH_OR_BAIL(params.count);
// reading ahead just to capture type, which will not exist if type is fixed
if ((params.count > 0) && (in_mapping || (TYPE_NONE == params.type))) {
READ_CHAR_OR_BAIL(marker, "1st key/value type");
} else {
marker = params.type;
}
} else if (TYPE_NONE == params.type) {
// count not provided; record that (counting = 0)
params.count = 1;
params.counting = 0;
} else {
RAISE_DECODER_EXCEPTION("Container type without count");
}
params.marker = marker;
params.invalid = 0;
return params;
bail:
params.invalid = 1;
return params;
}
static int _is_no_data_type(char type) {
return ((TYPE_NULL == type) || (TYPE_BOOL_TRUE == type) || (TYPE_BOOL_FALSE == type));
}
// Note: Does NOT reserve a new reference
static PyObject* _no_data_type(char type) {
switch (type) {
case TYPE_NULL:
return Py_None;
case TYPE_BOOL_TRUE:
return Py_True;
case TYPE_BOOL_FALSE:
return Py_False;
default:
PyErr_SetString(PyExc_RuntimeError, "Internal error - _no_data_type");
return NULL;
}
}
static PyObject* _decode_array(_ubjson_decoder_buffer_t *buffer) {
_container_params_t params = _get_container_params(buffer, 0);
PyObject *list = NULL;
PyObject *value = NULL;
char marker;
if (params.invalid) {
goto bail;
}
marker = params.marker;
if (params.counting) {
// special case - byte array
if ((TYPE_UINT8 == params.type) && !buffer->prefs.no_bytes) {
BAIL_ON_NULL(list = PyBytes_FromStringAndSize(NULL, params.count));
READ_INTO_OR_BAIL(params.count, PyBytes_AS_STRING(list), "bytes array");
return list;
// special case - no data types
} else if (_is_no_data_type(params.type)) {
BAIL_ON_NULL(list = PyList_New(params.count));
BAIL_ON_NULL(value = _no_data_type(params.type));
while (params.count > 0) {
PyList_SET_ITEM(list, --params.count, value);
// reference stolen each time
Py_INCREF(value);
}
value = NULL;
// take advantage of faster creation/setting of list since count known
} else {
Py_ssize_t list_pos = 0; // position in list for fast setting via PyList_SET_ITEM
BAIL_ON_NULL(list = PyList_New(params.count));
while (params.count > 0) {
if (TYPE_NOOP == marker) {
READ_CHAR_OR_BAIL(marker, "array value type marker (sized, after no-op)");
continue;
}
BAIL_ON_NULL(value = _ubjson_decode_value(buffer, &marker));
PyList_SET_ITEM(list, list_pos++, value);
// reference stolen by list so no longer want to decrement on failure
value = NULL;
params.count--;
if (params.count > 0 && TYPE_NONE == params.type) {
READ_CHAR_OR_BAIL(marker, "array value type marker (sized)");
}
}
}
} else {
BAIL_ON_NULL(list = PyList_New(0));
while (ARRAY_END != marker) {
if (TYPE_NOOP == marker) {
READ_CHAR_OR_BAIL(marker, "array value type marker (after no-op)");
continue;
}
BAIL_ON_NULL(value = _ubjson_decode_value(buffer, &marker));
BAIL_ON_NONZERO(PyList_Append(list, value));
Py_CLEAR(value);
if (TYPE_NONE == params.type) {
READ_CHAR_OR_BAIL(marker, "array value type marker");
}
}
}
return list;
bail:
Py_XDECREF(value);
Py_XDECREF(list);
return NULL;
}
// same as string, except there is no 'S' marker
static PyObject* _decode_object_key(_ubjson_decoder_buffer_t *buffer, char marker, int intern) {
long long length;
const char *raw;
PyObject *key;
DECODE_LENGTH_OR_BAIL_MARKER(length, marker);
READ_OR_BAIL((Py_ssize_t)length, raw, "string");
BAIL_ON_NULL(key = PyUnicode_FromStringAndSize(raw, (Py_ssize_t)length));
// unicode string interning not supported in v2
#if PY_MAJOR_VERSION < 3
UNUSED(intern);
#else
if (intern) {
PyUnicode_InternInPlace(&key);
}
#endif
return key;
bail:
return NULL;
}
// used by _decode_object* functions
#define DECODE_OBJECT_KEY_OR_RAISE_ENCODER_EXCEPTION(context_str, intern) {\
key = _decode_object_key(buffer, marker, intern);\
if (NULL == key) {\
RAISE_DECODER_EXCEPTION("Failed to decode object key (" context_str ")");\
}\
}
static PyObject* _decode_object_with_pairs_hook(_ubjson_decoder_buffer_t *buffer) {
_container_params_t params = _get_container_params(buffer, 1);
PyObject *obj = NULL;
PyObject *list = NULL;
PyObject *key = NULL;
PyObject *value = NULL;
PyObject *item = NULL;
char *fixed_type;
char marker;
int intern = buffer->prefs.intern_object_keys;
if (params.invalid) {
goto bail;
}
marker = params.marker;
// take advantage of faster creation/setting of list since count known
if (params.counting) {
Py_ssize_t list_pos = 0; // position in list for fast setting via PyList_SET_ITEM
BAIL_ON_NULL(list = PyList_New(params.count));
// special case: no data values (keys only)
if (_is_no_data_type(params.type)) {
value = _no_data_type(params.type);
Py_INCREF(value);
while (params.count > 0) {
DECODE_OBJECT_KEY_OR_RAISE_ENCODER_EXCEPTION("sized, no data", intern);
BAIL_ON_NULL(item = PyTuple_Pack(2, key, value));
Py_CLEAR(key);
PyList_SET_ITEM(list, list_pos++, item);
// reference stolen
item = NULL;
params.count--;
if (params.count > 0) {
READ_CHAR_OR_BAIL(marker, "object key length");
}
}
} else {
fixed_type = (TYPE_NONE == params.type) ? NULL : &params.type;
while (params.count > 0) {
if (TYPE_NOOP == marker) {
READ_CHAR_OR_BAIL(marker, "object key length (sized, after no-op)");
continue;
}
DECODE_OBJECT_KEY_OR_RAISE_ENCODER_EXCEPTION("sized", intern);
BAIL_ON_NULL(value = _ubjson_decode_value(buffer, fixed_type));
BAIL_ON_NULL(item = PyTuple_Pack(2, key, value));
Py_CLEAR(key);
Py_CLEAR(value);
PyList_SET_ITEM(list, list_pos++, item);
// reference stolen
item = NULL;
params.count--;
if (params.count > 0) {
READ_CHAR_OR_BAIL(marker, "object key length (sized)");
}
}
}
} else {
BAIL_ON_NULL(list = PyList_New(0));
fixed_type = (TYPE_NONE == params.type) ? NULL : &params.type;
while (OBJECT_END != marker) {
if (TYPE_NOOP == marker) {
READ_CHAR_OR_BAIL(marker, "object key length (after no-op)");
continue;
}
DECODE_OBJECT_KEY_OR_RAISE_ENCODER_EXCEPTION("unsized", intern);
BAIL_ON_NULL(value = _ubjson_decode_value(buffer, fixed_type));
BAIL_ON_NULL(item = PyTuple_Pack(2, key, value));
Py_CLEAR(key);
Py_CLEAR(value);
BAIL_ON_NONZERO(PyList_Append(list, item));
Py_CLEAR(item);
READ_CHAR_OR_BAIL(marker, "object key length");
}
}
BAIL_ON_NULL(obj = PyObject_CallFunctionObjArgs(buffer->prefs.object_pairs_hook, list, NULL));
Py_XDECREF(list);
return obj;
bail:
Py_XDECREF(obj);
Py_XDECREF(list);
Py_XDECREF(key);
Py_XDECREF(value);
Py_XDECREF(item);
return NULL;
}
static PyObject* _decode_object(_ubjson_decoder_buffer_t *buffer) {
_container_params_t params = _get_container_params(buffer, 1);
PyObject *obj = NULL;
PyObject *newobj = NULL; // result of object_hook (if applicable)
PyObject *key = NULL;
PyObject *value = NULL;
char *fixed_type;
char marker;
int intern = buffer->prefs.intern_object_keys;
if (params.invalid) {
goto bail;
}
marker = params.marker;
BAIL_ON_NULL(obj = PyDict_New());
// special case: no data values (keys only)
if (params.counting && _is_no_data_type(params.type)) {
value = _no_data_type(params.type);
while (params.count > 0) {
DECODE_OBJECT_KEY_OR_RAISE_ENCODER_EXCEPTION("sized, no data", intern);
BAIL_ON_NONZERO(PyDict_SetItem(obj, key, value));
// reference stolen in above call, but only for value!
Py_CLEAR(key);
Py_INCREF(value);
params.count--;
if (params.count > 0) {
READ_CHAR_OR_BAIL(marker, "object key length");
}
}
} else {
fixed_type = (TYPE_NONE == params.type) ? NULL : &params.type;
while (params.count > 0 && (params.counting || (OBJECT_END != marker))) {
if (TYPE_NOOP == marker) {
READ_CHAR_OR_BAIL(marker, "object key length");
continue;
}
DECODE_OBJECT_KEY_OR_RAISE_ENCODER_EXCEPTION("sized/unsized", intern);
BAIL_ON_NULL(value = _ubjson_decode_value(buffer, fixed_type));
BAIL_ON_NONZERO(PyDict_SetItem(obj, key, value));
Py_CLEAR(key);
Py_CLEAR(value);
if (params.counting) {
params.count--;
}
if (params.count > 0) {
READ_CHAR_OR_BAIL(marker, "object key length");
}
}
}
if (NULL != buffer->prefs.object_hook) {
BAIL_ON_NULL(newobj = PyObject_CallFunctionObjArgs(buffer->prefs.object_hook, obj, NULL));
Py_CLEAR(obj);
return newobj;
}
return obj;
bail:
Py_XDECREF(key);
Py_XDECREF(value);
Py_XDECREF(obj);
Py_XDECREF(newobj);
return NULL;
}
/******************************************************************************/
// only used by _ubjson_decode_value
#define RETURN_OR_RAISE_DECODER_EXCEPTION(item, item_str) {\
obj = (item);\
if (NULL != obj) {\
return obj;\
} else if (PyErr_Occurred() && PyErr_ExceptionMatches((PyObject*)DecoderException)) {\
goto bail;\
} else {\
RAISE_DECODER_EXCEPTION("Failed to decode " item_str);\
}\
}
PyObject* _ubjson_decode_value(_ubjson_decoder_buffer_t *buffer, char *given_marker) {
char marker;
PyObject *obj;
if (NULL == given_marker) {
READ_CHAR_OR_BAIL(marker, "Type marker");
} else {
marker = *given_marker;
}
switch (marker) {
case TYPE_NULL:
Py_RETURN_NONE;
case TYPE_BOOL_TRUE:
Py_RETURN_TRUE;
case TYPE_BOOL_FALSE:
Py_RETURN_FALSE;
case TYPE_CHAR:
RETURN_OR_RAISE_DECODER_EXCEPTION(_decode_char(buffer), "char");
case TYPE_STRING:
RETURN_OR_RAISE_DECODER_EXCEPTION(_decode_string(buffer), "string");
case TYPE_INT8:
RETURN_OR_RAISE_DECODER_EXCEPTION(_decode_int8(buffer), "int8");
case TYPE_UINT8:
RETURN_OR_RAISE_DECODER_EXCEPTION(_decode_uint8(buffer), "uint8");
case TYPE_INT16:
RETURN_OR_RAISE_DECODER_EXCEPTION(_decode_int16_32(buffer, 2), "int16");
case TYPE_INT32:
RETURN_OR_RAISE_DECODER_EXCEPTION(_decode_int16_32(buffer, 4), "int32");
case TYPE_INT64:
RETURN_OR_RAISE_DECODER_EXCEPTION(_decode_int64(buffer), "int64");
case TYPE_FLOAT32:
RETURN_OR_RAISE_DECODER_EXCEPTION(_decode_float32(buffer), "float32");
case TYPE_FLOAT64:
RETURN_OR_RAISE_DECODER_EXCEPTION(_decode_float64(buffer), "float64");
case TYPE_HIGH_PREC:
RETURN_OR_RAISE_DECODER_EXCEPTION(_decode_high_prec(buffer), "highprec");
case ARRAY_START:
RECURSE_AND_RETURN_OR_BAIL(_decode_array(buffer), "whilst decoding a UBJSON array");
case OBJECT_START:
if (NULL == buffer->prefs.object_pairs_hook) {
RECURSE_AND_RETURN_OR_BAIL(_decode_object(buffer), "whilst decoding a UBJSON object");
} else {
RECURSE_AND_RETURN_OR_BAIL(_decode_object_with_pairs_hook(buffer), "whilst decoding a UBJSON object");
}
default:
RAISE_DECODER_EXCEPTION("Invalid marker");
}
bail:
return NULL;
}
/******************************************************************************/
int _ubjson_decoder_init(void) {
PyObject *tmp_module = NULL;
PyObject *tmp_obj = NULL;
// try to determine floating point format / endianess
_pyfuncs_ubj_detect_formats();
// allow decoder to access DecoderException & Decimal class
BAIL_ON_NULL(tmp_module = PyImport_ImportModule("ubjson.decoder"));
BAIL_ON_NULL(DecoderException = PyObject_GetAttrString(tmp_module, "DecoderException"));
Py_CLEAR(tmp_module);
BAIL_ON_NULL(tmp_module = PyImport_ImportModule("decimal"));
BAIL_ON_NULL(tmp_obj = PyObject_GetAttrString(tmp_module, "Decimal"));
if (!PyType_Check(tmp_obj)) {
PyErr_SetString(PyExc_ImportError, "decimal.Decimal type import failure");
goto bail;
}
PyDec_Type = (PyTypeObject*) tmp_obj;
Py_CLEAR(tmp_module);
return 0;
bail:
Py_CLEAR(DecoderException);
Py_CLEAR(PyDec_Type);
Py_XDECREF(tmp_obj);
Py_XDECREF(tmp_module);
return 1;
}
void _ubjson_decoder_cleanup(void) {
Py_CLEAR(DecoderException);
Py_CLEAR(PyDec_Type);
}
// py-ubjson-0.16.1/src/decoder.h
/*
* Copyright (c) 2019 Iotic Labs Ltd. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://github.com/Iotic-Labs/py-ubjson/blob/master/LICENSE
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once
#if defined (__cplusplus)
extern "C" {
#endif
#include <Python.h>
/******************************************************************************/
typedef struct {
PyObject *object_hook;
PyObject *object_pairs_hook;
// don't convert UINT8 arrays to bytes instances (and keep as an array of individual integers)
int no_bytes;
int intern_object_keys;
} _ubjson_decoder_prefs_t;
typedef struct _ubjson_decoder_buffer_t {
// either supports buffer interface or is callable returning bytes
PyObject *input;
// NULL unless input supports seeking in which case expecting callable with signature of io.IOBase.seek()
PyObject *seek;
// function used to read data from this buffer (depending on whether input is fixed, callable or seekable)
const char* (*read_func)(struct _ubjson_decoder_buffer_t *buffer, Py_ssize_t *len, char *dst_buffer);
// buffer protocol access to raw bytes of input
Py_buffer view;
// whether view will need to be released
int view_set;
// current position in view
Py_ssize_t pos;
// total bytes supplied to user (same as pos in case where callable not used)
Py_ssize_t total_read;
// temporary destination buffer if required read larger than currently available input
char *tmp_dst;
_ubjson_decoder_prefs_t prefs;
} _ubjson_decoder_buffer_t;
/******************************************************************************/
extern _ubjson_decoder_buffer_t* _ubjson_decoder_buffer_create(_ubjson_decoder_prefs_t* prefs,
PyObject *input, PyObject *seek);
extern int _ubjson_decoder_buffer_free(_ubjson_decoder_buffer_t **buffer);
extern int _ubjson_decoder_init(void);
// note: marker argument only used internally - supply NULL
extern PyObject* _ubjson_decode_value(_ubjson_decoder_buffer_t *buffer, char *given_marker);
extern void _ubjson_decoder_cleanup(void);
#if defined (__cplusplus)
}
#endif
py-ubjson-0.16.1/src/encoder.c
/*
* Copyright (c) 2019 Iotic Labs Ltd. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://github.com/Iotic-Labs/py-ubjson/blob/master/LICENSE
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <Python.h>
#include <math.h>
#include "common.h"
#include "markers.h"
#include "encoder.h"
#include "python_funcs.h"
/******************************************************************************/
static char bytes_array_prefix[] = {ARRAY_START, CONTAINER_TYPE, TYPE_UINT8, CONTAINER_COUNT};
#define POWER_TWO(x) ((long long) 1 << (x))
#if defined(_MSC_VER) && !defined(fpclassify)
# define USE__FPCLASS
#endif
// initial encoder buffer size (when not supplied with fp)
#define BUFFER_INITIAL_SIZE 64
// encoder buffer size when using fp (i.e. minimum number of bytes to buffer before writing out)
#define BUFFER_FP_SIZE 256
static PyObject *EncoderException = NULL;
static PyTypeObject *PyDec_Type = NULL;
#define PyDec_Check(v) PyObject_TypeCheck(v, PyDec_Type)
/******************************************************************************/
static int _encoder_buffer_write(_ubjson_encoder_buffer_t *buffer, const char* const chunk, size_t chunk_len);
#define RECURSE_AND_BAIL_ON_NONZERO(action, recurse_msg) {\
int ret;\
BAIL_ON_NONZERO(Py_EnterRecursiveCall(recurse_msg));\
ret = (action);\
Py_LeaveRecursiveCall();\
BAIL_ON_NONZERO(ret);\
}
#define WRITE_OR_BAIL(str, len) BAIL_ON_NONZERO(_encoder_buffer_write(buffer, (str), len))
#define WRITE_CHAR_OR_BAIL(c) {\
char ctmp = (c);\
WRITE_OR_BAIL(&ctmp, 1);\
}
/* These functions return non-zero on failure (an exception will have been set). Note that no type checking is performed
* where a Python type is mentioned in the function name!
*/
static int _encode_PyBytes(PyObject *obj, _ubjson_encoder_buffer_t *buffer);
static int _encode_PyObject_as_PyDecimal(PyObject *obj, _ubjson_encoder_buffer_t *buffer);
static int _encode_PyDecimal(PyObject *obj, _ubjson_encoder_buffer_t *buffer);
static int _encode_PyUnicode(PyObject *obj, _ubjson_encoder_buffer_t *buffer);
static int _encode_PyFloat(PyObject *obj, _ubjson_encoder_buffer_t *buffer);
static int _encode_PyLong(PyObject *obj, _ubjson_encoder_buffer_t *buffer);
static int _encode_longlong(long long num, _ubjson_encoder_buffer_t *buffer);
#if PY_MAJOR_VERSION < 3
static int _encode_PyInt(PyObject *obj, _ubjson_encoder_buffer_t *buffer);
#endif
static int _encode_PySequence(PyObject *obj, _ubjson_encoder_buffer_t *buffer);
static int _encode_mapping_key(PyObject *obj, _ubjson_encoder_buffer_t *buffer);
static int _encode_PyMapping(PyObject *obj, _ubjson_encoder_buffer_t *buffer);
/******************************************************************************/
/* fp_write, if not NULL, must be a callable which accepts a single bytes argument. On failure will set exception.
* Currently only increases reference count for fp_write parameter.
*/
_ubjson_encoder_buffer_t* _ubjson_encoder_buffer_create(_ubjson_encoder_prefs_t* prefs, PyObject *fp_write) {
_ubjson_encoder_buffer_t *buffer;
if (NULL == (buffer = calloc(1, sizeof(_ubjson_encoder_buffer_t)))) {
PyErr_NoMemory();
return NULL;
}
buffer->len = (NULL != fp_write) ? BUFFER_FP_SIZE : BUFFER_INITIAL_SIZE;
BAIL_ON_NULL(buffer->obj = PyBytes_FromStringAndSize(NULL, buffer->len));
buffer->raw = PyBytes_AS_STRING(buffer->obj);
buffer->pos = 0;
BAIL_ON_NULL(buffer->markers = PySet_New(NULL));
buffer->prefs = *prefs;
buffer->fp_write = fp_write;
Py_XINCREF(fp_write);
// treat Py_None as no default_func being supplied
if (Py_None == buffer->prefs.default_func) {
buffer->prefs.default_func = NULL;
}
return buffer;
bail:
_ubjson_encoder_buffer_free(&buffer);
return NULL;
}
void _ubjson_encoder_buffer_free(_ubjson_encoder_buffer_t **buffer) {
if (NULL != buffer && NULL != *buffer) {
Py_XDECREF((*buffer)->obj);
Py_XDECREF((*buffer)->fp_write);
Py_XDECREF((*buffer)->markers);
free(*buffer);
*buffer = NULL;
}
}
// Note: Sets python exception on failure and returns non-zero
static int _encoder_buffer_write(_ubjson_encoder_buffer_t *buffer, const char* const chunk, size_t chunk_len) {
size_t new_len;
PyObject *fp_write_ret;
if (0 == chunk_len) {
return 0;
}
// no write method, use buffer only
if (NULL == buffer->fp_write) {
// increase buffer size if too small
if (chunk_len > (buffer->len - buffer->pos)) {
for (new_len = buffer->len; new_len < (buffer->pos + chunk_len); new_len *= 2);
BAIL_ON_NONZERO(_PyBytes_Resize(&buffer->obj, new_len));
buffer->raw = PyBytes_AS_STRING(buffer->obj);
buffer->len = new_len;
}
memcpy(&(buffer->raw[buffer->pos]), chunk, sizeof(char) * chunk_len);
buffer->pos += chunk_len;
} else {
// increase buffer to fit all first
if (chunk_len > (buffer->len - buffer->pos)) {
BAIL_ON_NONZERO(_PyBytes_Resize(&buffer->obj, (buffer->pos + chunk_len)));
buffer->raw = PyBytes_AS_STRING(buffer->obj);
buffer->len = buffer->pos + chunk_len;
}
memcpy(&(buffer->raw[buffer->pos]), chunk, sizeof(char) * chunk_len);
buffer->pos += chunk_len;
// flush buffer to write method
if (buffer->pos >= buffer->len) {
BAIL_ON_NULL(fp_write_ret = PyObject_CallFunctionObjArgs(buffer->fp_write, buffer->obj, NULL));
Py_DECREF(fp_write_ret);
Py_DECREF(buffer->obj);
buffer->len = BUFFER_FP_SIZE;
BAIL_ON_NULL(buffer->obj = PyBytes_FromStringAndSize(NULL, buffer->len));
buffer->raw = PyBytes_AS_STRING(buffer->obj);
buffer->pos = 0;
}
}
return 0;
bail:
return 1;
}
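When no fp_write is set, the branch above grows the byte buffer by repeated doubling until the pending chunk fits, amortising the cost of `_PyBytes_Resize`. A minimal sketch of just that sizing rule (the function name is illustrative, not part of the extension):

```c
#include <assert.h>
#include <stddef.h>

/* Doubling growth as in _encoder_buffer_write: keep doubling the current
 * length until pos + chunk_len fits. */
static size_t grow_len(size_t cur_len, size_t pos, size_t chunk_len)
{
    size_t new_len = cur_len;
    while (new_len < pos + chunk_len)
        new_len *= 2;
    return new_len;
}
```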
// Flushes remaining bytes to writer and returns None or returns final bytes object (when no writer specified).
// Does NOT free passed in buffer struct.
PyObject* _ubjson_encoder_buffer_finalise(_ubjson_encoder_buffer_t *buffer) {
PyObject *fp_write_ret;
// shrink buffer to fit
if (buffer->pos < buffer->len) {
BAIL_ON_NONZERO(_PyBytes_Resize(&buffer->obj, buffer->pos));
buffer->len = buffer->pos;
}
if (NULL == buffer->fp_write) {
Py_INCREF(buffer->obj);
return buffer->obj;
} else {
if (buffer->pos > 0) {
BAIL_ON_NULL(fp_write_ret = PyObject_CallFunctionObjArgs(buffer->fp_write, buffer->obj, NULL));
Py_DECREF(fp_write_ret);
}
Py_RETURN_NONE;
}
bail:
return NULL;
}
/******************************************************************************/
static int _encode_PyBytes(PyObject *obj, _ubjson_encoder_buffer_t *buffer) {
const char *raw;
Py_ssize_t len;
raw = PyBytes_AS_STRING(obj);
len = PyBytes_GET_SIZE(obj);
WRITE_OR_BAIL(bytes_array_prefix, sizeof(bytes_array_prefix));
BAIL_ON_NONZERO(_encode_longlong(len, buffer));
WRITE_OR_BAIL(raw, len);
// no ARRAY_END since length was specified
return 0;
bail:
return 1;
}
static int _encode_PyByteArray(PyObject *obj, _ubjson_encoder_buffer_t *buffer) {
const char *raw;
Py_ssize_t len;
raw = PyByteArray_AS_STRING(obj);
len = PyByteArray_GET_SIZE(obj);
WRITE_OR_BAIL(bytes_array_prefix, sizeof(bytes_array_prefix));
BAIL_ON_NONZERO(_encode_longlong(len, buffer));
WRITE_OR_BAIL(raw, len);
// no ARRAY_END since length was specified
return 0;
bail:
return 1;
}
/******************************************************************************/
static int _encode_PyObject_as_PyDecimal(PyObject *obj, _ubjson_encoder_buffer_t *buffer) {
PyObject *decimal = NULL;
// Decimal class has no public C API
BAIL_ON_NULL(decimal = PyObject_CallFunctionObjArgs((PyObject*)PyDec_Type, obj, NULL));
BAIL_ON_NONZERO(_encode_PyDecimal(decimal, buffer));
Py_DECREF(decimal);
return 0;
bail:
Py_XDECREF(decimal);
return 1;
}
static int _encode_PyDecimal(PyObject *obj, _ubjson_encoder_buffer_t *buffer) {
PyObject *is_finite;
PyObject *str = NULL;
PyObject *encoded = NULL;
const char *raw;
Py_ssize_t len;
// Decimal class has no public C API
BAIL_ON_NULL(is_finite = PyObject_CallMethod(obj, "is_finite", NULL));
if (Py_True == is_finite) {
#if PY_MAJOR_VERSION >= 3
BAIL_ON_NULL(str = PyObject_Str(obj));
#else
BAIL_ON_NULL(str = PyObject_Unicode(obj));
#endif
BAIL_ON_NULL(encoded = PyUnicode_AsEncodedString(str, "utf-8", NULL));
raw = PyBytes_AS_STRING(encoded);
len = PyBytes_GET_SIZE(encoded);
WRITE_CHAR_OR_BAIL(TYPE_HIGH_PREC);
BAIL_ON_NONZERO(_encode_longlong(len, buffer));
WRITE_OR_BAIL(raw, len);
Py_DECREF(str);
Py_DECREF(encoded);
} else {
WRITE_CHAR_OR_BAIL(TYPE_NULL);
}
Py_DECREF(is_finite);
return 0;
bail:
Py_XDECREF(is_finite);
Py_XDECREF(str);
Py_XDECREF(encoded);
return 1;
}
/******************************************************************************/
static int _encode_PyUnicode(PyObject *obj, _ubjson_encoder_buffer_t *buffer) {
PyObject *str;
const char *raw;
Py_ssize_t len;
BAIL_ON_NULL(str = PyUnicode_AsEncodedString(obj, "utf-8", NULL));
raw = PyBytes_AS_STRING(str);
len = PyBytes_GET_SIZE(str);
if (1 == len) {
WRITE_CHAR_OR_BAIL(TYPE_CHAR);
} else {
WRITE_CHAR_OR_BAIL(TYPE_STRING);
BAIL_ON_NONZERO(_encode_longlong(len, buffer));
}
WRITE_OR_BAIL(raw, len);
Py_DECREF(str);
return 0;
bail:
Py_XDECREF(str);
return 1;
}
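As the function above shows, a string whose UTF-8 form is exactly one byte is written as TYPE_CHAR with no length, while longer strings get TYPE_STRING plus a length integer. A standalone sketch of that split, assuming lengths under 256 so the length fits a uint8 (helper name is illustrative):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative only: mirrors the TYPE_CHAR/TYPE_STRING decision above. */
static size_t ubjson_encode_str(const char *s, size_t len, unsigned char *out)
{
    size_t pos = 0;
    if (len == 1) {
        out[pos++] = 'C';                 /* TYPE_CHAR carries no length */
        out[pos++] = (unsigned char)s[0];
    } else {
        out[pos++] = 'S';                 /* TYPE_STRING */
        out[pos++] = 'U';                 /* uint8 length marker */
        out[pos++] = (unsigned char)len;
        memcpy(&out[pos], s, len);
        pos += len;
    }
    return pos;
}
```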
/******************************************************************************/
static int _encode_PyFloat(PyObject *obj, _ubjson_encoder_buffer_t *buffer) {
char numtmp[9]; // holds type char + float32/64
double abs;
double num = PyFloat_AsDouble(obj);
if (-1.0 == num && PyErr_Occurred()) {
goto bail;
}
#ifdef USE__FPCLASS
switch (_fpclass(num)) {
case _FPCLASS_SNAN:
case _FPCLASS_QNAN:
case _FPCLASS_NINF:
case _FPCLASS_PINF:
#else
switch (fpclassify(num)) {
case FP_NAN:
case FP_INFINITE:
#endif
WRITE_CHAR_OR_BAIL(TYPE_NULL);
return 0;
#ifdef USE__FPCLASS
case _FPCLASS_NZ:
case _FPCLASS_PZ:
#else
case FP_ZERO:
#endif
BAIL_ON_NONZERO(_pyfuncs_ubj_PyFloat_Pack4(num, (unsigned char*)&numtmp[1], 0));
numtmp[0] = TYPE_FLOAT32;
WRITE_OR_BAIL(numtmp, 5);
return 0;
#ifdef USE__FPCLASS
case _FPCLASS_ND:
case _FPCLASS_PD:
#else
case FP_SUBNORMAL:
#endif
BAIL_ON_NONZERO(_encode_PyObject_as_PyDecimal(obj, buffer));
return 0;
}
abs = fabs(num);
if (!buffer->prefs.no_float32 && 1.18e-38 <= abs && 3.4e38 >= abs) {
BAIL_ON_NONZERO(_pyfuncs_ubj_PyFloat_Pack4(num, (unsigned char*)&numtmp[1], 0));
numtmp[0] = TYPE_FLOAT32;
WRITE_OR_BAIL(numtmp, 5);
} else {
BAIL_ON_NONZERO(_pyfuncs_ubj_PyFloat_Pack8(num, (unsigned char*)&numtmp[1], 0));
numtmp[0] = TYPE_FLOAT64;
WRITE_OR_BAIL(numtmp, 9);
}
return 0;
bail:
return 1;
}
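The magnitude test above decides between float32 and float64 output; zero, subnormals, NaN and infinity are special-cased before it is reached. A sketch of just the range check (the helper name is illustrative):

```c
#include <assert.h>
#include <math.h>

/* Finite values whose magnitude lies in the IEEE-754 single precision
 * normal range are eligible for the 4-byte encoding (unless no_float32). */
static int fits_float32(double num)
{
    double a = fabs(num);
    return 1.18e-38 <= a && a <= 3.4e38;
}
```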
/******************************************************************************/
#define WRITE_TYPE_AND_INT8_OR_BAIL(c1, c2) {\
numtmp[0] = c1;\
numtmp[1] = (char)c2;\
WRITE_OR_BAIL(numtmp, 2);\
}
#define WRITE_INT_INTO_NUMTMP(num, size) {\
/* numtmp also stores type, so need one larger*/\
unsigned char i = size + 1;\
do {\
numtmp[--i] = (char)num;\
num >>= 8;\
} while (i > 1);\
}
#define WRITE_INT16_OR_BAIL(num) {\
WRITE_INT_INTO_NUMTMP(num, 2);\
numtmp[0] = TYPE_INT16;\
WRITE_OR_BAIL(numtmp, 3);\
}
#define WRITE_INT32_OR_BAIL(num) {\
WRITE_INT_INTO_NUMTMP(num, 4);\
numtmp[0] = TYPE_INT32;\
WRITE_OR_BAIL(numtmp, 5);\
}
#define WRITE_INT64_OR_BAIL(num) {\
WRITE_INT_INTO_NUMTMP(num, 8);\
numtmp[0] = TYPE_INT64;\
WRITE_OR_BAIL(numtmp, 9);\
}
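WRITE_INT_INTO_NUMTMP above fills numtmp backwards from the least significant byte, so multi-byte integers come out most-significant-byte first (UBJSON integers are big-endian). The same loop as a plain function, without the reserved type slot (name is illustrative):

```c
#include <assert.h>

/* Illustrative equivalent of WRITE_INT_INTO_NUMTMP: emit `size` bytes of
 * `num` in big-endian order. */
static void be_bytes(long long num, unsigned char *dst, int size)
{
    int i = size;
    do {
        dst[--i] = (unsigned char)(num & 0xFF);
        num >>= 8;
    } while (i > 0);
}
```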
static int _encode_longlong(long long num, _ubjson_encoder_buffer_t *buffer) {
char numtmp[9]; // large enough to hold type + maximum integer (INT64)
if (num >= 0) {
if (num < POWER_TWO(8)) {
WRITE_TYPE_AND_INT8_OR_BAIL(TYPE_UINT8, num);
} else if (num < POWER_TWO(15)) {
WRITE_INT16_OR_BAIL(num);
} else if (num < POWER_TWO(31)) {
WRITE_INT32_OR_BAIL(num);
} else {
WRITE_INT64_OR_BAIL(num);
}
} else if (num >= -(POWER_TWO(7))) {
WRITE_TYPE_AND_INT8_OR_BAIL(TYPE_INT8, num);
} else if (num >= -(POWER_TWO(15))) {
WRITE_INT16_OR_BAIL(num);
} else if (num >= -(POWER_TWO(31))) {
WRITE_INT32_OR_BAIL(num);
} else {
WRITE_INT64_OR_BAIL(num);
}
return 0;
bail:
return 1;
}
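The thresholds above mean non-negative values are promoted to the next signed width that fits, since UBJSON has no unsigned types beyond uint8. The marker selection in isolation (function name is illustrative; marker characters follow src/markers.h):

```c
#include <assert.h>

/* Which UBJSON type marker _encode_longlong would choose for a value. */
static char ubjson_int_marker(long long num)
{
    if (num >= 0) {
        if (num < (1LL << 8))  return 'U'; /* TYPE_UINT8 */
        if (num < (1LL << 15)) return 'I'; /* TYPE_INT16 */
        if (num < (1LL << 31)) return 'l'; /* TYPE_INT32 */
        return 'L';                        /* TYPE_INT64 */
    }
    if (num >= -(1LL << 7))  return 'i';   /* TYPE_INT8 */
    if (num >= -(1LL << 15)) return 'I';
    if (num >= -(1LL << 31)) return 'l';
    return 'L';
}
```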
static int _encode_PyLong(PyObject *obj, _ubjson_encoder_buffer_t *buffer) {
int overflow;
long long num = PyLong_AsLongLongAndOverflow(obj, &overflow);
if (overflow) {
BAIL_ON_NONZERO(_encode_PyObject_as_PyDecimal(obj, buffer));
return 0;
} else if (num == -1 && PyErr_Occurred()) {
// unexpected as PyLong should fit if not overflowing
goto bail;
} else {
return _encode_longlong(num, buffer);
}
bail:
return 1;
}
#if PY_MAJOR_VERSION < 3
static int _encode_PyInt(PyObject *obj, _ubjson_encoder_buffer_t *buffer) {
long num = PyInt_AsLong(obj);
if (num == -1 && PyErr_Occurred()) {
// unexpected as PyInt should fit into long
return 1;
} else {
return _encode_longlong(num, buffer);
}
}
#endif
/******************************************************************************/
static int _encode_PySequence(PyObject *obj, _ubjson_encoder_buffer_t *buffer) {
PyObject *ident; // id of sequence (for checking circular reference)
PyObject *seq = NULL; // converted sequence (via PySequence_Fast)
Py_ssize_t len;
Py_ssize_t i;
int seen;
// circular reference check
BAIL_ON_NULL(ident = PyLong_FromVoidPtr(obj));
if ((seen = PySet_Contains(buffer->markers, ident))) {
if (-1 != seen) {
PyErr_SetString(PyExc_ValueError, "Circular reference detected");
}
goto bail;
}
BAIL_ON_NONZERO(PySet_Add(buffer->markers, ident));
BAIL_ON_NULL(seq = PySequence_Fast(obj, "_encode_PySequence expects sequence"));
len = PySequence_Fast_GET_SIZE(seq);
WRITE_CHAR_OR_BAIL(ARRAY_START);
if (buffer->prefs.container_count) {
WRITE_CHAR_OR_BAIL(CONTAINER_COUNT);
BAIL_ON_NONZERO(_encode_longlong(len, buffer));
}
for (i = 0; i < len; i++) {
BAIL_ON_NONZERO(_ubjson_encode_value(PySequence_Fast_GET_ITEM(seq, i), buffer));
}
if (!buffer->prefs.container_count) {
WRITE_CHAR_OR_BAIL(ARRAY_END);
}
if (-1 == PySet_Discard(buffer->markers, ident)) {
goto bail;
}
Py_DECREF(ident);
Py_DECREF(seq);
return 0;
bail:
Py_XDECREF(ident);
Py_XDECREF(seq);
return 1;
}
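As the function above shows, when container_count is enabled the array gets a '#' count and no closing ']'; otherwise it is delimited by ARRAY_END alone. For example, the list [5] with a count prefix becomes the bytes below, hand-assembled from the markers in src/markers.h:

```c
#include <assert.h>

static const unsigned char counted_array[] = {
    '[',        /* ARRAY_START */
    '#',        /* CONTAINER_COUNT */
    'U', 1,     /* count: uint8 1 */
    'U', 5      /* single element: uint8 5 */
    /* no ARRAY_END when a count was written */
};
```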
/******************************************************************************/
static int _encode_mapping_key(PyObject *obj, _ubjson_encoder_buffer_t *buffer) {
PyObject *str = NULL;
const char *raw;
Py_ssize_t len;
if (PyUnicode_Check(obj)) {
BAIL_ON_NULL(str = PyUnicode_AsEncodedString(obj, "utf-8", NULL));
}
#if PY_MAJOR_VERSION < 3
else if (PyString_Check(obj)) {
BAIL_ON_NULL(str = PyString_AsEncodedObject(obj, "utf-8", NULL));
}
#endif
else {
PyErr_SetString(EncoderException, "Mapping keys can only be strings");
goto bail;
}
raw = PyBytes_AS_STRING(str);
len = PyBytes_GET_SIZE(str);
BAIL_ON_NONZERO(_encode_longlong(len, buffer));
WRITE_OR_BAIL(raw, len);
Py_DECREF(str);
return 0;
bail:
Py_XDECREF(str);
return 1;
}
static int _encode_PyMapping(PyObject *obj, _ubjson_encoder_buffer_t *buffer) {
PyObject *ident; // id of mapping (for checking circular reference)
PyObject *items = NULL;
PyObject *iter = NULL;
PyObject *item = NULL;
int seen;
// circular reference check
BAIL_ON_NULL(ident = PyLong_FromVoidPtr(obj));
if ((seen = PySet_Contains(buffer->markers, ident))) {
if (-1 != seen) {
PyErr_SetString(PyExc_ValueError, "Circular reference detected");
}
goto bail;
}
BAIL_ON_NONZERO(PySet_Add(buffer->markers, ident));
BAIL_ON_NULL(items = PyMapping_Items(obj));
if (buffer->prefs.sort_keys) {
BAIL_ON_NONZERO(PyList_Sort(items));
}
WRITE_CHAR_OR_BAIL(OBJECT_START);
if (buffer->prefs.container_count) {
WRITE_CHAR_OR_BAIL(CONTAINER_COUNT);
BAIL_ON_NONZERO(_encode_longlong(PyList_GET_SIZE(items), buffer));
}
BAIL_ON_NULL(iter = PyObject_GetIter(items));
while (NULL != (item = PyIter_Next(iter))) {
if (!PyTuple_Check(item) || 2 != PyTuple_GET_SIZE(item)) {
PyErr_SetString(PyExc_ValueError, "items must return 2-tuples");
goto bail;
}
BAIL_ON_NONZERO(_encode_mapping_key(PyTuple_GET_ITEM(item, 0), buffer));
BAIL_ON_NONZERO(_ubjson_encode_value(PyTuple_GET_ITEM(item, 1), buffer));
Py_CLEAR(item);
}
// for PyIter_Next
if (PyErr_Occurred()) {
goto bail;
}
if (!buffer->prefs.container_count) {
WRITE_CHAR_OR_BAIL(OBJECT_END);
}
if (-1 == PySet_Discard(buffer->markers, ident)) {
goto bail;
}
Py_DECREF(iter);
Py_DECREF(items);
Py_DECREF(ident);
return 0;
bail:
Py_XDECREF(item);
Py_XDECREF(iter);
Py_XDECREF(items);
Py_XDECREF(ident);
return 1;
}
/******************************************************************************/
int _ubjson_encode_value(PyObject *obj, _ubjson_encoder_buffer_t *buffer) {
PyObject *newobj = NULL; // result of default call (when encoding unsupported types)
if (Py_None == obj) {
WRITE_CHAR_OR_BAIL(TYPE_NULL);
} else if (Py_True == obj) {
WRITE_CHAR_OR_BAIL(TYPE_BOOL_TRUE);
} else if (Py_False == obj) {
WRITE_CHAR_OR_BAIL(TYPE_BOOL_FALSE);
} else if (PyUnicode_Check(obj)) {
BAIL_ON_NONZERO(_encode_PyUnicode(obj, buffer));
#if PY_MAJOR_VERSION < 3
} else if (PyInt_Check(obj)) {
BAIL_ON_NONZERO(_encode_PyInt(obj, buffer));
#endif
} else if (PyLong_Check(obj)) {
BAIL_ON_NONZERO(_encode_PyLong(obj, buffer));
} else if (PyFloat_Check(obj)) {
BAIL_ON_NONZERO(_encode_PyFloat(obj, buffer));
} else if (PyDec_Check(obj)) {
BAIL_ON_NONZERO(_encode_PyDecimal(obj, buffer));
} else if (PyBytes_Check(obj)) {
BAIL_ON_NONZERO(_encode_PyBytes(obj, buffer));
} else if (PyByteArray_Check(obj)) {
BAIL_ON_NONZERO(_encode_PyByteArray(obj, buffer));
// order important since Mapping could also be Sequence
} else if (PyMapping_Check(obj)
// Unfortunately PyMapping_Check is no longer enough, see https://bugs.python.org/issue5945
#if PY_MAJOR_VERSION >= 3
&& PyObject_HasAttrString(obj, "items")
#endif
) {
RECURSE_AND_BAIL_ON_NONZERO(_encode_PyMapping(obj, buffer), " while encoding a UBJSON object");
} else if (PySequence_Check(obj)) {
RECURSE_AND_BAIL_ON_NONZERO(_encode_PySequence(obj, buffer), " while encoding a UBJSON array");
} else if (NULL == obj) {
PyErr_SetString(PyExc_RuntimeError, "Internal error - _ubjson_encode_value got NULL obj");
goto bail;
} else if (NULL != buffer->prefs.default_func) {
BAIL_ON_NULL(newobj = PyObject_CallFunctionObjArgs(buffer->prefs.default_func, obj, NULL));
RECURSE_AND_BAIL_ON_NONZERO(_ubjson_encode_value(newobj, buffer), " while encoding with default function");
Py_DECREF(newobj);
} else {
PyErr_Format(EncoderException, "Cannot encode item of type %s", obj->ob_type->tp_name);
goto bail;
}
return 0;
bail:
Py_XDECREF(newobj);
return 1;
}
int _ubjson_encoder_init(void) {
PyObject *tmp_module = NULL;
PyObject *tmp_obj = NULL;
// try to determine floating point format / endianness
_pyfuncs_ubj_detect_formats();
// allow encoder to access EncoderException & Decimal class
BAIL_ON_NULL(tmp_module = PyImport_ImportModule("ubjson.encoder"));
BAIL_ON_NULL(EncoderException = PyObject_GetAttrString(tmp_module, "EncoderException"));
Py_CLEAR(tmp_module);
BAIL_ON_NULL(tmp_module = PyImport_ImportModule("decimal"));
BAIL_ON_NULL(tmp_obj = PyObject_GetAttrString(tmp_module, "Decimal"));
if (!PyType_Check(tmp_obj)) {
PyErr_SetString(PyExc_ImportError, "decimal.Decimal type import failure");
goto bail;
}
PyDec_Type = (PyTypeObject*) tmp_obj;
Py_CLEAR(tmp_module);
return 0;
bail:
Py_CLEAR(EncoderException);
Py_CLEAR(PyDec_Type);
Py_XDECREF(tmp_obj);
Py_XDECREF(tmp_module);
return 1;
}
void _ubjson_encoder_cleanup(void) {
Py_CLEAR(EncoderException);
Py_CLEAR(PyDec_Type);
}
py-ubjson-0.16.1/src/encoder.h
/*
* Copyright (c) 2019 Iotic Labs Ltd. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://github.com/Iotic-Labs/py-ubjson/blob/master/LICENSE
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once
#if defined (__cplusplus)
extern "C" {
#endif
#include <Python.h>
/******************************************************************************/
typedef struct {
PyObject *default_func;
int container_count;
int sort_keys;
int no_float32;
} _ubjson_encoder_prefs_t;
typedef struct {
// holds PyBytes instance (buffer)
PyObject *obj;
// raw access to obj, size & position
char* raw;
size_t len;
size_t pos;
// if not NULL, full buffer will be written to this method
PyObject *fp_write;
// PySet of sequences and mappings for detecting a circular reference
PyObject *markers;
_ubjson_encoder_prefs_t prefs;
} _ubjson_encoder_buffer_t;
/******************************************************************************/
extern _ubjson_encoder_buffer_t* _ubjson_encoder_buffer_create(_ubjson_encoder_prefs_t* prefs, PyObject *fp_write);
extern void _ubjson_encoder_buffer_free(_ubjson_encoder_buffer_t **buffer);
extern PyObject* _ubjson_encoder_buffer_finalise(_ubjson_encoder_buffer_t *buffer);
extern int _ubjson_encode_value(PyObject *obj, _ubjson_encoder_buffer_t *buffer);
extern int _ubjson_encoder_init(void);
extern void _ubjson_encoder_cleanup(void);
#if defined (__cplusplus)
}
#endif
py-ubjson-0.16.1/src/markers.h
/*
* Copyright (c) 2019 Iotic Labs Ltd. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://github.com/Iotic-Labs/py-ubjson/blob/master/LICENSE
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once
#if defined (__cplusplus)
extern "C" {
#endif
#define TYPE_NONE '\0' // Used internally only, not part of ubjson specification
#define TYPE_NULL 'Z'
#define TYPE_NOOP 'N'
#define TYPE_BOOL_TRUE 'T'
#define TYPE_BOOL_FALSE 'F'
#define TYPE_INT8 'i'
#define TYPE_UINT8 'U'
#define TYPE_INT16 'I'
#define TYPE_INT32 'l'
#define TYPE_INT64 'L'
#define TYPE_FLOAT32 'd'
#define TYPE_FLOAT64 'D'
#define TYPE_HIGH_PREC 'H'
#define TYPE_CHAR 'C'
#define TYPE_STRING 'S'
// Container delimiters
#define OBJECT_START '{'
#define OBJECT_END '}'
#define ARRAY_START '['
#define ARRAY_END ']'
// Optional container parameters
#define CONTAINER_TYPE '$'
#define CONTAINER_COUNT '#'
#if defined (__cplusplus)
}
#endif
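These markers compose directly. For instance, the object {"a": true} encodes (without a count prefix) as the bytes below; note that object keys carry only a length integer and the raw bytes, with no leading 'S' marker, matching _encode_mapping_key in src/encoder.c:

```c
#include <assert.h>

static const unsigned char example_obj[] = {
    '{',          /* OBJECT_START */
    'U', 1, 'a',  /* key: uint8 length 1, then "a" */
    'T',          /* value: TYPE_BOOL_TRUE */
    '}'           /* OBJECT_END */
};
```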
py-ubjson-0.16.1/src/python_funcs.c
/*
* Copyright (c) 2001-2016 Python Software Foundation; All Rights Reserved
*
* Licensed under PSF license, see: https://docs.python.org/3/license.html
*
* Herein are a selection of functions which are not part of Python's public
* C-API, including some from floatobject.c
*/
#include <Python.h>
#include <math.h>
#include <string.h>
/******************************************************************************/
/* this is for the benefit of the pack/unpack routines below */
typedef enum {
unknown_format, ieee_big_endian_format, ieee_little_endian_format
} float_format_type;
static float_format_type double_format, float_format;
/* MUST be called before using floating point related functions defined below */
void _pyfuncs_ubj_detect_formats(void) {
float_format_type detected_double_format, detected_float_format;
/* We attempt to determine if this machine is using IEEE
floating point formats by peering at the bits of some
carefully chosen values. If it looks like we are on an
IEEE platform, the float packing/unpacking routines can
just copy bits, if not they resort to arithmetic & shifts
and masks. The shifts & masks approach works on all finite
values, but what happens to infinities, NaNs and signed
zeroes on packing is an accident, and attempting to unpack
a NaN or an infinity will raise an exception.
Note that if we're on some whacked-out platform which uses
IEEE formats but isn't strictly little-endian or big-
endian, we will fall back to the portable shifts & masks
method. */
#if SIZEOF_DOUBLE == 8
{
double x = 9006104071832581.0;
if (memcmp(&x, "\x43\x3f\xff\x01\x02\x03\x04\x05", 8) == 0)
detected_double_format = ieee_big_endian_format;
else if (memcmp(&x, "\x05\x04\x03\x02\x01\xff\x3f\x43", 8) == 0)
detected_double_format = ieee_little_endian_format;
else
detected_double_format = unknown_format;
}
#else
detected_double_format = unknown_format;
#endif
#if SIZEOF_FLOAT == 4
{
float y = 16711938.0;
if (memcmp(&y, "\x4b\x7f\x01\x02", 4) == 0)
detected_float_format = ieee_big_endian_format;
else if (memcmp(&y, "\x02\x01\x7f\x4b", 4) == 0)
detected_float_format = ieee_little_endian_format;
else
detected_float_format = unknown_format;
}
#else
detected_float_format = unknown_format;
#endif
double_format = detected_double_format;
float_format = detected_float_format;
}
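The detection above relies on constants whose IEEE-754 byte patterns are known, comparing the in-memory layout against both byte orders. The float probe in isolation (function name is illustrative):

```c
#include <assert.h>
#include <string.h>

/* 1 = IEEE little-endian, 2 = IEEE big-endian, 0 = neither/unknown. */
static int float_layout(void)
{
    float y = 16711938.0f; /* IEEE-754 bytes, MSB first: 4b 7f 01 02 */
    if (memcmp(&y, "\x02\x01\x7f\x4b", 4) == 0)
        return 1;
    if (memcmp(&y, "\x4b\x7f\x01\x02", 4) == 0)
        return 2;
    return 0;
}
```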
/******************************************************************************/
int
_pyfuncs_ubj_PyFloat_Pack4(double x, unsigned char *p, int le)
{
if (float_format == unknown_format) {
unsigned char sign;
int e;
double f;
unsigned int fbits;
int incr = 1;
if (le) {
p += 3;
incr = -1;
}
if (x < 0) {
sign = 1;
x = -x;
}
else
sign = 0;
f = frexp(x, &e);
/* Normalize f to be in the range [1.0, 2.0) */
if (0.5 <= f && f < 1.0) {
f *= 2.0;
e--;
}
else if (f == 0.0)
e = 0;
else {
PyErr_SetString(PyExc_SystemError,
"frexp() result out of range");
return -1;
}
if (e >= 128)
goto Overflow;
else if (e < -126) {
/* Gradual underflow */
f = ldexp(f, 126 + e);
e = 0;
}
else if (!(e == 0 && f == 0.0)) {
e += 127;
f -= 1.0; /* Get rid of leading 1 */
}
f *= 8388608.0; /* 2**23 */
fbits = (unsigned int)(f + 0.5); /* Round */
assert(fbits <= 8388608);
if (fbits >> 23) {
/* The carry propagated out of a string of 23 1 bits. */
fbits = 0;
++e;
if (e >= 255)
goto Overflow;
}
/* First byte */
*p = (sign << 7) | (e >> 1);
p += incr;
/* Second byte */
*p = (char) (((e & 1) << 7) | (fbits >> 16));
p += incr;
/* Third byte */
*p = (fbits >> 8) & 0xFF;
p += incr;
/* Fourth byte */
*p = fbits & 0xFF;
/* Done */
return 0;
}
else {
float y = (float)x;
const unsigned char *s = (unsigned char*)&y;
int i, incr = 1;
if (Py_IS_INFINITY(y) && !Py_IS_INFINITY(x))
goto Overflow;
if ((float_format == ieee_little_endian_format && !le)
|| (float_format == ieee_big_endian_format && le)) {
p += 3;
incr = -1;
}
for (i = 0; i < 4; i++) {
*p = *s++;
p += incr;
}
return 0;
}
Overflow:
PyErr_SetString(PyExc_OverflowError,
"float too large to pack with f format");
return -1;
}
int
_pyfuncs_ubj_PyFloat_Pack8(double x, unsigned char *p, int le)
{
if (double_format == unknown_format) {
unsigned char sign;
int e;
double f;
unsigned int fhi, flo;
int incr = 1;
if (le) {
p += 7;
incr = -1;
}
if (x < 0) {
sign = 1;
x = -x;
}
else
sign = 0;
f = frexp(x, &e);
/* Normalize f to be in the range [1.0, 2.0) */
if (0.5 <= f && f < 1.0) {
f *= 2.0;
e--;
}
else if (f == 0.0)
e = 0;
else {
PyErr_SetString(PyExc_SystemError,
"frexp() result out of range");
return -1;
}
if (e >= 1024)
goto Overflow;
else if (e < -1022) {
/* Gradual underflow */
f = ldexp(f, 1022 + e);
e = 0;
}
else if (!(e == 0 && f == 0.0)) {
e += 1023;
f -= 1.0; /* Get rid of leading 1 */
}
/* fhi receives the high 28 bits; flo the low 24 bits (== 52 bits) */
f *= 268435456.0; /* 2**28 */
fhi = (unsigned int)f; /* Truncate */
assert(fhi < 268435456);
f -= (double)fhi;
f *= 16777216.0; /* 2**24 */
flo = (unsigned int)(f + 0.5); /* Round */
assert(flo <= 16777216);
if (flo >> 24) {
/* The carry propagated out of a string of 24 1 bits. */
flo = 0;
++fhi;
if (fhi >> 28) {
/* And it also propagated out of the next 28 bits. */
fhi = 0;
++e;
if (e >= 2047)
goto Overflow;
}
}
/* First byte */
*p = (sign << 7) | (e >> 4);
p += incr;
/* Second byte */
*p = (unsigned char) (((e & 0xF) << 4) | (fhi >> 24));
p += incr;
/* Third byte */
*p = (fhi >> 16) & 0xFF;
p += incr;
/* Fourth byte */
*p = (fhi >> 8) & 0xFF;
p += incr;
/* Fifth byte */
*p = fhi & 0xFF;
p += incr;
/* Sixth byte */
*p = (flo >> 16) & 0xFF;
p += incr;
/* Seventh byte */
*p = (flo >> 8) & 0xFF;
p += incr;
/* Eighth byte */
*p = flo & 0xFF;
/* p += incr; */
/* Done */
return 0;
Overflow:
PyErr_SetString(PyExc_OverflowError,
"float too large to pack with d format");
return -1;
}
else {
const unsigned char *s = (unsigned char*)&x;
int i, incr = 1;
if ((double_format == ieee_little_endian_format && !le)
|| (double_format == ieee_big_endian_format && le)) {
p += 7;
incr = -1;
}
for (i = 0; i < 8; i++) {
*p = *s++;
p += incr;
}
return 0;
}
}
/******************************************************************************/
double
_pyfuncs_ubj_PyFloat_Unpack4(const unsigned char *p, int le)
{
if (float_format == unknown_format) {
unsigned char sign;
int e;
unsigned int f;
double x;
int incr = 1;
if (le) {
p += 3;
incr = -1;
}
/* First byte */
sign = (*p >> 7) & 1;
e = (*p & 0x7F) << 1;
p += incr;
/* Second byte */
e |= (*p >> 7) & 1;
f = (*p & 0x7F) << 16;
p += incr;
if (e == 255) {
PyErr_SetString(
PyExc_ValueError,
"can't unpack IEEE 754 special value "
"on non-IEEE platform");
return -1;
}
/* Third byte */
f |= *p << 8;
p += incr;
/* Fourth byte */
f |= *p;
x = (double)f / 8388608.0;
/* XXX This sadly ignores Inf/NaN issues */
if (e == 0)
e = -126;
else {
x += 1.0;
e -= 127;
}
x = ldexp(x, e);
if (sign)
x = -x;
return x;
}
else {
float x;
if ((float_format == ieee_little_endian_format && !le)
|| (float_format == ieee_big_endian_format && le)) {
char buf[4];
char *d = &buf[3];
int i;
for (i = 0; i < 4; i++) {
*d-- = *p++;
}
memcpy(&x, buf, 4);
}
else {
memcpy(&x, p, 4);
}
return x;
}
}
double
_pyfuncs_ubj_PyFloat_Unpack8(const unsigned char *p, int le)
{
if (double_format == unknown_format) {
unsigned char sign;
int e;
unsigned int fhi, flo;
double x;
int incr = 1;
if (le) {
p += 7;
incr = -1;
}
/* First byte */
sign = (*p >> 7) & 1;
e = (*p & 0x7F) << 4;
p += incr;
/* Second byte */
e |= (*p >> 4) & 0xF;
fhi = (*p & 0xF) << 24;
p += incr;
if (e == 2047) {
PyErr_SetString(
PyExc_ValueError,
"can't unpack IEEE 754 special value "
"on non-IEEE platform");
return -1.0;
}
/* Third byte */
fhi |= *p << 16;
p += incr;
/* Fourth byte */
fhi |= *p << 8;
p += incr;
/* Fifth byte */
fhi |= *p;
p += incr;
/* Sixth byte */
flo = *p << 16;
p += incr;
/* Seventh byte */
flo |= *p << 8;
p += incr;
/* Eighth byte */
flo |= *p;
x = (double)fhi + (double)flo / 16777216.0; /* 2**24 */
x /= 268435456.0; /* 2**28 */
if (e == 0)
e = -1022;
else {
x += 1.0;
e -= 1023;
}
x = ldexp(x, e);
if (sign)
x = -x;
return x;
}
else {
double x;
if ((double_format == ieee_little_endian_format && !le)
|| (double_format == ieee_big_endian_format && le)) {
char buf[8];
char *d = &buf[7];
int i;
for (i = 0; i < 8; i++) {
*d-- = *p++;
}
memcpy(&x, buf, 8);
}
else {
memcpy(&x, p, 8);
}
return x;
}
}
py-ubjson-0.16.1/src/python_funcs.h
/*
* Copyright (c) 2001-2016 Python Software Foundation; All Rights Reserved
*
* Licensed under PSF license, see: https://docs.python.org/3/license.html
*
* Herein is a selection of functions which are not part of Python's public
* C-API, including some from floatobject.c
*/
#pragma once
#if defined (__cplusplus)
extern "C" {
#endif
/******************************************************************************/
/* MUST be called before using floating point related functions defined below */
extern void _pyfuncs_ubj_detect_formats(void);
/* _PyFloat_{Pack,Unpack}{4,8}
*
* The struct and pickle (at least) modules need an efficient platform-
* independent way to store floating-point values as byte strings.
* The Pack routines produce a string from a C double, and the Unpack
* routines produce a C double from such a string. The suffix (4 or 8)
* specifies the number of bytes in the string.
*
* On platforms that appear to use (see _PyFloat_Init()) IEEE-754 formats
these functions work by copying bits. On other platforms, the 4-byte format
is identical to the IEEE-754 single precision format, and
* the 8-byte format to the IEEE-754 double precision format, although the
* packing of INFs and NaNs (if such things exist on the platform) isn't
* handled correctly, and attempting to unpack a string containing an IEEE
* INF or NaN will raise an exception.
*
* On non-IEEE platforms with more precision, or larger dynamic range, than
* 754 supports, not all values can be packed; on non-IEEE platforms with less
* precision, or smaller dynamic range, not all values can be unpacked. What
* happens in such cases is partly accidental (alas).
*/
/* The pack routines write 4 or 8 bytes, starting at p. le is a bool
argument, true if you want the string in little-endian format (exponent
last, at p+3 or p+7), false if you want big-endian format (exponent
first, at p).
* Return value: 0 if all is OK, -1 if error (and an exception is
* set, most likely OverflowError).
* There are two problems on non-IEEE platforms:
* 1): What this does is undefined if x is a NaN or infinity.
* 2): -0.0 and +0.0 produce the same string.
*/
extern int _pyfuncs_ubj_PyFloat_Pack4(double x, unsigned char *p, int le);
extern int _pyfuncs_ubj_PyFloat_Pack8(double x, unsigned char *p, int le);
extern double _pyfuncs_ubj_PyFloat_Unpack4(const unsigned char *p, int le);
extern double _pyfuncs_ubj_PyFloat_Unpack8(const unsigned char *p, int le);
#ifdef __cplusplus
}
#endif
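The non-IEEE fallback in `_pyfuncs_ubj_PyFloat_Unpack4` above decodes a big-endian binary32 by hand: 1 sign bit, an 8-bit exponent, a 23-bit fraction, with the subnormal/normal split at `e == 0`. A minimal Python sketch of the same bit arithmetic (the function name here is ours, not part of the library; on an IEEE host `struct` can be used to cross-check the result):

```python
from math import ldexp


def unpack4_be(data):
    """Decode a big-endian IEEE-754 binary32 value by hand (sketch)."""
    sign = (data[0] >> 7) & 1
    # exponent spans the low 7 bits of byte 0 and the top bit of byte 1
    e = ((data[0] & 0x7F) << 1) | ((data[1] >> 7) & 1)
    # fraction: low 7 bits of byte 1, then bytes 2 and 3
    f = ((data[1] & 0x7F) << 16) | (data[2] << 8) | data[3]
    if e == 255:
        raise ValueError('cannot unpack IEEE 754 special value')
    x = f / 8388608.0          # 2**23: fraction scaled into [0, 1)
    if e == 0:                 # subnormal: no implicit leading bit
        e = -126
    else:                      # normal: add the implicit leading 1
        x += 1.0
        e -= 127
    x = ldexp(x, e)
    return -x if sign else x
```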
py-ubjson-0.16.1/test/test.py:
# Copyright (c) 2019 Iotic Labs Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://github.com/Iotic-Labs/py-ubjson/blob/master/LICENSE
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from sys import version_info, getrecursionlimit, setrecursionlimit
from functools import partial
from io import BytesIO, SEEK_END
from unittest import TestCase, skipUnless
from pprint import pformat
from decimal import Decimal
from struct import pack
from collections import OrderedDict
from ubjson import (dump as ubjdump, dumpb as ubjdumpb, load as ubjload, loadb as ubjloadb, EncoderException,
DecoderException, EXTENSION_ENABLED)
from ubjson.markers import (TYPE_NULL, TYPE_NOOP, TYPE_BOOL_TRUE, TYPE_BOOL_FALSE, TYPE_INT8, TYPE_UINT8, TYPE_INT16,
TYPE_INT32, TYPE_INT64, TYPE_FLOAT32, TYPE_FLOAT64, TYPE_HIGH_PREC, TYPE_CHAR, TYPE_STRING,
OBJECT_START, OBJECT_END, ARRAY_START, ARRAY_END, CONTAINER_TYPE, CONTAINER_COUNT)
from ubjson.compat import INTEGER_TYPES
# Pure Python versions
from ubjson.encoder import dump as ubjpuredump, dumpb as ubjpuredumpb
from ubjson.decoder import load as ubjpureload, loadb as ubjpureloadb
PY2 = version_info[0] < 3
if PY2: # pragma: no cover
def u(obj):
"""Casts obj to unicode string, unless already one"""
return obj if isinstance(obj, unicode) else unicode(obj) # noqa: F821 pylint: disable=undefined-variable
else: # pragma: no cover
def u(obj):
"""Casts obj to unicode string, unless already one"""
return obj if isinstance(obj, str) else str(obj)
class TestEncodeDecodePlain(TestCase): # pylint: disable=too-many-public-methods
@staticmethod
def ubjloadb(raw, *args, **kwargs):
return ubjpureloadb(raw, *args, **kwargs)
@staticmethod
def ubjdumpb(obj, *args, **kwargs):
return ubjpuredumpb(obj, *args, **kwargs)
@staticmethod
def __format_in_out(obj, encoded):
return '\nInput:\n%s\nOutput (%d):\n%s' % (pformat(obj), len(encoded), encoded)
if PY2: # pragma: no cover
def type_check(self, actual, expected):
self.assertEqual(actual, expected)
else: # pragma: no cover
def type_check(self, actual, expected):
self.assertEqual(actual, ord(expected))
# based on math.isclose, available since Python v3.5
@staticmethod
# pylint: disable=invalid-name
def numbers_close(a, b, rel_tol=1e-05, abs_tol=0.0):
return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
def check_enc_dec(self, obj,
# total length of encoded object
length=None,
# total length is at least the given number of bytes
length_greater_or_equal=False,
# approximate comparison (e.g. for float)
approximate=False,
# type marker expected at start of encoded output
expected_type=None,
# decoder params
object_hook=None,
object_pairs_hook=None,
# additional arguments to pass to encoder
**kwargs):
"""Black-box test to check whether the provided object is the same once encoded and subsequently decoded."""
encoded = self.ubjdumpb(obj, **kwargs)
if expected_type is not None:
self.type_check(encoded[0], expected_type)
if length is not None:
assert_func = self.assertGreaterEqual if length_greater_or_equal else self.assertEqual
assert_func(len(encoded), length, self.__format_in_out(obj, encoded))
if approximate:
self.assertTrue(self.numbers_close(self.ubjloadb(encoded, object_hook=object_hook,
object_pairs_hook=object_pairs_hook), obj),
msg=self.__format_in_out(obj, encoded))
else:
self.assertEqual(self.ubjloadb(encoded, object_hook=object_hook,
object_pairs_hook=object_pairs_hook), obj,
self.__format_in_out(obj, encoded))
def test_no_data(self):
with self.assertRaises(DecoderException):
self.ubjloadb(b'')
def test_invalid_data(self):
for invalid in (u('unicode'), 123):
with self.assertRaises(TypeError):
self.ubjloadb(invalid)
def test_trailing_input(self):
self.assertEqual(self.ubjloadb(TYPE_BOOL_TRUE * 10), True)
def test_invalid_marker(self):
with self.assertRaises(DecoderException) as ctx:
self.ubjloadb(b'A')
self.assertTrue(isinstance(ctx.exception.position, INTEGER_TYPES + (type(None),)))
def test_bool(self):
self.assertEqual(self.ubjdumpb(True), TYPE_BOOL_TRUE)
self.assertEqual(self.ubjdumpb(False), TYPE_BOOL_FALSE)
self.check_enc_dec(True, 1)
self.check_enc_dec(False, 1)
def test_null(self):
self.assertEqual(self.ubjdumpb(None), TYPE_NULL)
self.check_enc_dec(None, 1)
def test_char(self):
self.assertEqual(self.ubjdumpb(u('a')), TYPE_CHAR + 'a'.encode('utf-8'))
# no char, char invalid utf-8
for suffix in (b'', b'\xfe'):
with self.assertRaises(DecoderException):
self.ubjloadb(TYPE_CHAR + suffix)
for char in (u('a'), u('\0'), u('~')):
self.check_enc_dec(char, 2)
def test_string(self):
self.assertEqual(self.ubjdumpb(u('ab')), TYPE_STRING + TYPE_UINT8 + b'\x02' + 'ab'.encode('utf-8'))
self.check_enc_dec(u(''), 3)
# invalid string size, string too short, string invalid utf-8
for suffix in (b'\x81', b'\x01', b'\x01' + b'\xfe'):
with self.assertRaises(DecoderException):
self.ubjloadb(TYPE_STRING + TYPE_INT8 + suffix)
# Note: In Python 2 plain str type is encoded as byte array
for string in ('some ascii', u(r'\u00a9 with extended\u2122'), u('long string') * 100):
self.check_enc_dec(string, 4, length_greater_or_equal=True)
def test_int(self):
# insufficient length
with self.assertRaises(DecoderException):
self.ubjloadb(TYPE_INT16 + b'\x01')
for type_, value, total_size in (
(TYPE_UINT8, 0, 2),
(TYPE_UINT8, 255, 2),
(TYPE_INT8, -128, 2),
(TYPE_INT16, -32768, 3),
(TYPE_INT16, 456, 3),
(TYPE_INT16, 32767, 3),
(TYPE_INT32, -2147483648, 5),
(TYPE_INT32, 1610612735, 5),
(TYPE_INT32, 2147483647, 5),
(TYPE_INT64, -9223372036854775808, 9),
(TYPE_INT64, 6917529027641081855, 9),
(TYPE_INT64, 9223372036854775807, 9),
# HIGH_PREC (marker + length marker + length + value)
(TYPE_HIGH_PREC, 9223372036854775808, 22),
(TYPE_HIGH_PREC, -9223372036854775809, 23),
(TYPE_HIGH_PREC, 9999999999999999999999999999999999999, 40)):
self.check_enc_dec(value, total_size, expected_type=type_)
def test_high_precision(self):
self.assertEqual(self.ubjdumpb(Decimal(-1.5)),
TYPE_HIGH_PREC + TYPE_UINT8 + b'\x04' + '-1.5'.encode('utf-8'))
# insufficient length, invalid utf-8, invalid decimal value
for suffix in (b'n', b'\xfe\xfe', b'na'):
with self.assertRaises(DecoderException):
self.ubjloadb(TYPE_HIGH_PREC + TYPE_UINT8 + b'\x02' + suffix)
self.check_enc_dec('1.8e315')
for value in (
'0.0',
'2.5',
'10e30',
'-1.2345e67890'):
# minimum length because: marker + length marker + length + value
self.check_enc_dec(Decimal(value), 4, length_greater_or_equal=True)
# cannot compare for equality, so test separately (since these encode as null and decode to None)
for value in ('nan', '-inf', 'inf'):
self.assertEqual(self.ubjloadb(self.ubjdumpb(Decimal(value))), None)
def test_float(self):
# insufficient length
for float_type in (TYPE_FLOAT32, TYPE_FLOAT64):
with self.assertRaises(DecoderException):
self.ubjloadb(float_type + b'\x01')
self.check_enc_dec(0.0, 5, expected_type=TYPE_FLOAT32)
for type_, value, total_size in (
(TYPE_FLOAT32, 1.18e-37, 5),
(TYPE_FLOAT32, 3.4e37, 5),
(TYPE_FLOAT64, 2.23e-308, 9),
(TYPE_FLOAT64, 12345.44e40, 9),
(TYPE_FLOAT64, 1.8e307, 9)):
self.check_enc_dec(value,
total_size,
approximate=True,
expected_type=type_,
no_float32=False)
# using only float64 (default)
self.check_enc_dec(value,
9 if type_ == TYPE_FLOAT32 else total_size,
approximate=True,
expected_type=(TYPE_FLOAT64 if type_ == TYPE_FLOAT32 else type_))
for value in ('nan', '-inf', 'inf'):
for no_float32 in (True, False):
self.assertEqual(self.ubjloadb(self.ubjdumpb(float(value), no_float32=no_float32)), None)
# value which results in high_prec usage
for no_float32 in (True, False):
self.check_enc_dec(2.22e-308, 4, expected_type=TYPE_HIGH_PREC, length_greater_or_equal=True,
no_float32=no_float32)
def test_array(self):
# invalid length
with self.assertRaises(DecoderException):
self.ubjloadb(ARRAY_START + CONTAINER_COUNT + self.ubjdumpb(-5))
# unencodable type within
with self.assertRaises(EncoderException):
self.ubjdumpb([type(None)])
for sequence in list, tuple:
self.assertEqual(self.ubjdumpb(sequence()), ARRAY_START + ARRAY_END)
self.assertEqual(self.ubjdumpb((None,), container_count=True), (ARRAY_START + CONTAINER_COUNT + TYPE_UINT8 +
b'\x01' + TYPE_NULL))
obj = [123,
1.25,
43121609.5543,
12345.44e40,
Decimal('10e15'),
'a',
'here is a string',
None,
True,
False,
[[1, 2], 3, [4, 5, 6], 7],
{'a dict': 456}]
for opts in ({'container_count': False}, {'container_count': True}):
self.check_enc_dec(obj, **opts)
def test_bytes(self):
# insufficient length
with self.assertRaises(DecoderException):
self.ubjloadb(ARRAY_START + CONTAINER_TYPE + TYPE_UINT8 + CONTAINER_COUNT + TYPE_UINT8 + b'\x02' + b'\x01')
for cast in (bytes, bytearray):
self.check_enc_dec(cast(b''))
self.check_enc_dec(cast(b'\x01' * 4))
self.assertEqual(self.ubjloadb(self.ubjdumpb(cast(b'\x04' * 4)), no_bytes=True), [4] * 4)
self.check_enc_dec(cast(b'largebinary' * 100))
def test_array_fixed(self):
raw_start = ARRAY_START + CONTAINER_TYPE + TYPE_INT8 + CONTAINER_COUNT + TYPE_UINT8
self.assertEqual(self.ubjloadb(raw_start + b'\x00'), [])
# fixed types + count
for ubj_type, py_obj in ((TYPE_NULL, None), (TYPE_BOOL_TRUE, True), (TYPE_BOOL_FALSE, False)):
self.assertEqual(
self.ubjloadb(ARRAY_START + CONTAINER_TYPE + ubj_type + CONTAINER_COUNT + TYPE_UINT8 + b'\x05'),
[py_obj] * 5
)
self.assertEqual(self.ubjloadb(raw_start + b'\x03' + (b'\x01' * 3)), [1, 1, 1])
# invalid type
with self.assertRaises(DecoderException):
self.ubjloadb(ARRAY_START + CONTAINER_TYPE + b'\x01')
# type without count
with self.assertRaises(DecoderException):
self.ubjloadb(ARRAY_START + CONTAINER_TYPE + TYPE_INT8 + b'\x01')
# count without type
self.assertEqual(self.ubjloadb(ARRAY_START + CONTAINER_COUNT + TYPE_UINT8 + b'\x02' + TYPE_BOOL_FALSE +
TYPE_BOOL_TRUE),
[False, True])
# nested
self.assertEqual(self.ubjloadb(ARRAY_START + CONTAINER_TYPE + ARRAY_START + CONTAINER_COUNT + TYPE_UINT8 +
b'\x03' + ARRAY_END + CONTAINER_COUNT + TYPE_UINT8 + b'\x01' + TYPE_BOOL_TRUE +
TYPE_BOOL_FALSE + TYPE_BOOL_TRUE + ARRAY_END),
[[], [True], [False, True]])
def test_array_noop(self):
# only supported without type
self.assertEqual(self.ubjloadb(ARRAY_START +
TYPE_NOOP +
TYPE_UINT8 + b'\x01' +
TYPE_NOOP +
TYPE_UINT8 + b'\x02' +
TYPE_NOOP +
ARRAY_END), [1, 2])
self.assertEqual(self.ubjloadb(ARRAY_START + CONTAINER_COUNT + TYPE_UINT8 + b'\x01' +
TYPE_NOOP +
TYPE_UINT8 + b'\x01'), [1])
def test_object_invalid(self):
# negative length
with self.assertRaises(DecoderException):
self.ubjloadb(OBJECT_START + CONTAINER_COUNT + self.ubjdumpb(-1))
with self.assertRaises(EncoderException):
self.ubjdumpb({123: 'non-string key'})
with self.assertRaises(EncoderException):
self.ubjdumpb({'fish': type(list)})
# invalid key size type
with self.assertRaises(DecoderException):
self.ubjloadb(OBJECT_START + TYPE_NULL)
# invalid key size, key too short, key invalid utf-8, no value
for suffix in (b'\x81', b'\x01', b'\x01' + b'\xfe', b'\x0101'):
with self.assertRaises(DecoderException):
self.ubjloadb(OBJECT_START + TYPE_INT8 + suffix)
# invalid items() method
class BadDict(dict):
def items(self):
return super(BadDict, self).keys()
with self.assertRaises(ValueError):
self.ubjdumpb(BadDict({'a': 1, 'b': 2}))
def test_object(self):
# custom hook
with self.assertRaises(TypeError):
self.ubjloadb(self.ubjdumpb({}), object_pairs_hook=int)
# same as not specifying a custom class
self.ubjloadb(self.ubjdumpb({}), object_pairs_hook=None)
for hook in (None, OrderedDict):
check_enc_dec = partial(self.check_enc_dec, object_pairs_hook=hook)
self.assertEqual(self.ubjdumpb({}), OBJECT_START + OBJECT_END)
self.assertEqual(self.ubjdumpb({'a': None}, container_count=True),
(OBJECT_START + CONTAINER_COUNT + TYPE_UINT8 + b'\x01' + TYPE_UINT8 + b'\x01' +
'a'.encode('utf-8') + TYPE_NULL))
check_enc_dec({})
check_enc_dec({'longkey1' * 65: 1})
check_enc_dec({'longkey2' * 4096: 1})
obj = {'int': 123,
'longint': 9223372036854775807,
'float': 1.25,
'hp': Decimal('10e15'),
'char': 'a',
'str': 'here is a string',
'unicode': u(r'\u00a9 with extended\u2122'),
'': 'empty key',
u(r'\u00a9 with extended\u2122'): 'unicode-key',
'null': None,
'true': True,
'false': False,
'array': [1, 2, 3],
'bytes_array': b'1234',
'object': {'another one': 456, 'yet another': {'abc': True}}}
for opts in ({'container_count': False}, {'container_count': True}):
check_enc_dec(obj, **opts)
# dictionary key sorting
obj1 = OrderedDict.fromkeys('abcdefghijkl')
obj2 = OrderedDict.fromkeys('abcdefghijkl'[::-1])
self.assertNotEqual(self.ubjdumpb(obj1), self.ubjdumpb(obj2))
self.assertEqual(self.ubjdumpb(obj1, sort_keys=True), self.ubjdumpb(obj2, sort_keys=True))
self.assertEqual(self.ubjloadb(self.ubjdumpb(obj1), object_pairs_hook=OrderedDict), obj1)
def test_object_fixed(self):
raw_start = OBJECT_START + CONTAINER_TYPE + TYPE_INT8 + CONTAINER_COUNT + TYPE_UINT8
for hook in (None, OrderedDict):
loadb = partial(self.ubjloadb, object_pairs_hook=hook)
self.assertEqual(loadb(raw_start + b'\x00'), {})
self.assertEqual(loadb(raw_start + b'\x03' + (TYPE_UINT8 + b'\x02' + b'aa' + b'\x01' +
TYPE_UINT8 + b'\x02' + b'bb' + b'\x02' +
TYPE_UINT8 + b'\x02' + b'cc' + b'\x03')),
{'aa': 1, 'bb': 2, 'cc': 3})
# count only
self.assertEqual(loadb(OBJECT_START + CONTAINER_COUNT + TYPE_UINT8 + b'\x02' +
TYPE_UINT8 + b'\x02' + b'aa' + TYPE_NULL + TYPE_UINT8 + b'\x02' + b'bb' + TYPE_NULL),
{'aa': None, 'bb': None})
# fixed type + count
self.assertEqual(loadb(OBJECT_START + CONTAINER_TYPE + TYPE_NULL + CONTAINER_COUNT + TYPE_UINT8 + b'\x02' +
TYPE_UINT8 + b'\x02' + b'aa' + TYPE_UINT8 + b'\x02' + b'bb'),
{'aa': None, 'bb': None})
# fixed type + count (bytes)
self.assertEqual(loadb(OBJECT_START + CONTAINER_TYPE + TYPE_UINT8 + CONTAINER_COUNT + TYPE_UINT8 + b'\x02' +
TYPE_UINT8 + b'\x02' + b'aa' + b'\x04' + TYPE_UINT8 + b'\x02' + b'bb' + b'\x05'),
{'aa': 4, 'bb': 5})
def test_object_noop(self):
# only supported without type
for hook in (None, OrderedDict):
loadb = partial(self.ubjloadb, object_pairs_hook=hook)
self.assertEqual(loadb(OBJECT_START +
TYPE_NOOP +
TYPE_UINT8 + b'\x01' + 'a'.encode('utf-8') + TYPE_NULL +
TYPE_NOOP +
TYPE_UINT8 + b'\x01' + 'b'.encode('utf-8') + TYPE_BOOL_TRUE +
OBJECT_END), {'a': None, 'b': True})
self.assertEqual(loadb(OBJECT_START + CONTAINER_COUNT + TYPE_UINT8 + b'\x01' +
TYPE_NOOP +
TYPE_UINT8 + b'\x01' + 'a'.encode('utf-8') + TYPE_NULL), {'a': None})
def test_intern_object_keys(self):
encoded = self.ubjdumpb({'asdasd': 1, 'qwdwqd': 2})
mapping2 = self.ubjloadb(encoded, intern_object_keys=True)
mapping3 = self.ubjloadb(encoded, intern_object_keys=True)
for key1, key2 in zip(sorted(mapping2.keys()), sorted(mapping3.keys())):
if PY2: # pragma: no cover
# interning of unicode not supported
self.assertEqual(key1, key2)
else: # pragma: no cover
self.assertIs(key1, key2)
def test_circular(self):
sequence = [1, 2, 3]
sequence.append(sequence)
mapping = {'a': 1, 'b': 2}
mapping['c'] = mapping
for container in (sequence, mapping):
with self.assertRaises(ValueError):
self.ubjdumpb(container)
# Referring to the same container multiple times is valid, however
sequence = [1, 2, 3]
mapping = {'a': 1, 'b': 2}
self.check_enc_dec([sequence, mapping, sequence, mapping])
def test_unencodable(self):
with self.assertRaises(EncoderException):
self.ubjdumpb(type(None))
def test_decoder_fuzz(self):
for start, end, fmt in ((0, pow(2, 8), '>B'), (pow(2, 8), pow(2, 16), '>H'), (pow(2, 16), pow(2, 18), '>I')):
for i in range(start, end):
try:
self.ubjloadb(pack(fmt, i))
except DecoderException:
pass
except Exception as ex: # pragma: no cover pylint: disable=broad-except
self.fail('Unexpected failure: %s' % ex)
def assert_raises_regex(self, *args, **kwargs):
# pylint: disable=deprecated-method,no-member
return (self.assertRaisesRegexp if PY2 else self.assertRaisesRegex)(*args, **kwargs)
def test_recursion(self):
old_limit = getrecursionlimit()
setrecursionlimit(200)
try:
obj = current = []
for _ in range(getrecursionlimit() * 2):
new_list = []
current.append(new_list)
current = new_list
with self.assert_raises_regex(RuntimeError, 'recursion'):
self.ubjdumpb(obj)
raw = ARRAY_START * (getrecursionlimit() * 2)
with self.assert_raises_regex(RuntimeError, 'recursion'):
self.ubjloadb(raw)
finally:
setrecursionlimit(old_limit)
def test_encode_default(self):
def default(obj):
if isinstance(obj, set):
return sorted(obj)
raise EncoderException('__test__marker__')
dumpb_default = partial(self.ubjdumpb, default=default)
# Top-level custom type
obj1 = {1, 2, 3}
obj2 = default(obj1)
# Custom type within sequence or mapping
obj3 = OrderedDict(sorted({'a': 1, 'b': obj1, 'c': [2, obj1]}.items()))
obj4 = OrderedDict(sorted({'a': 1, 'b': obj2, 'c': [2, obj2]}.items()))
with self.assert_raises_regex(EncoderException, 'Cannot encode item'):
self.ubjdumpb(obj1)
# explicit None should behave the same as no default
with self.assert_raises_regex(EncoderException, 'Cannot encode item'):
self.ubjdumpb(obj1, default=None)
with self.assert_raises_regex(EncoderException, '__test__marker__'):
dumpb_default(self)
self.assertEqual(dumpb_default(obj1), self.ubjdumpb(obj2))
self.assertEqual(dumpb_default(obj3), self.ubjdumpb(obj4))
def test_decode_object_hook(self):
with self.assertRaises(TypeError):
self.check_enc_dec({'a': 1, 'b': 2}, object_hook=int)
def default(obj):
if isinstance(obj, set):
return {'__set__': list(obj)}
raise EncoderException('__test__marker__')
def object_hook(obj):
if '__set__' in obj:
return set(obj['__set__'])
return obj
self.check_enc_dec({'a': 1, 'b': {2, 3, 4}}, object_hook=object_hook, default=default)
class UnHandled(object):
pass
with self.assertRaises(EncoderException):
self.check_enc_dec({'a': 1, 'b': UnHandled()}, object_hook=object_hook, default=default)
@skipUnless(EXTENSION_ENABLED, 'Extension not enabled')
class TestEncodeDecodePlainExt(TestEncodeDecodePlain):
@staticmethod
def ubjloadb(raw, *args, **kwargs):
return ubjloadb(raw, *args, **kwargs)
@staticmethod
def ubjdumpb(obj, *args, **kwargs):
return ubjdumpb(obj, *args, **kwargs)
class TestEncodeDecodeFp(TestEncodeDecodePlain):
"""Performs tests via file-like objects (BytesIO) instead of bytes instances"""
@staticmethod
def ubjloadb(raw, *args, **kwargs):
return ubjpureload(BytesIO(raw), *args, **kwargs)
@staticmethod
def ubjdumpb(obj, *args, **kwargs):
out = BytesIO()
ubjpuredump(obj, out, *args, **kwargs)
return out.getvalue()
@staticmethod
def ubjload(fp, *args, **kwargs):
return ubjpureload(fp, *args, **kwargs)
@staticmethod
def ubjdump(obj, fp, *args, **kwargs):
return ubjpuredump(obj, fp, *args, **kwargs)
def test_decode_exception_position(self):
with self.assertRaises(DecoderException) as ctx:
self.ubjloadb(TYPE_STRING + TYPE_INT8 + b'\x01' + b'\xfe' + b'c0fefe' * 4)
self.assertEqual(ctx.exception.position, 4)
def test_invalid_fp_dump(self):
with self.assertRaises(AttributeError):
self.ubjdump(None, 1)
class Dummy(object):
write = 1
class Dummy2(object):
@staticmethod
def write(raw):
raise ValueError('invalid - %s' % repr(raw))
with self.assertRaises(TypeError):
self.ubjdump(b'', Dummy)
with self.assertRaises(ValueError):
self.ubjdump(b'', Dummy2)
def test_invalid_fp_load(self):
with self.assertRaises(AttributeError):
self.ubjload(1)
class Dummy(object):
read = 1
class Dummy2(object):
@staticmethod
def read(length):
raise ValueError('invalid - %d' % length)
with self.assertRaises(TypeError):
self.ubjload(Dummy)
with self.assertRaises(ValueError):
self.ubjload(Dummy2)
def test_fp(self):
obj = {'a': 123, 'b': 456}
output = BytesIO()
self.ubjdump(obj, output)
output.seek(0)
self.assertEqual(self.ubjload(output), obj)
@skipUnless(EXTENSION_ENABLED, 'Extension not enabled')
class TestEncodeDecodeFpExt(TestEncodeDecodeFp):
@staticmethod
def ubjloadb(raw, *args, **kwargs):
return ubjload(BytesIO(raw), *args, **kwargs)
@staticmethod
def ubjdumpb(obj, *args, **kwargs):
out = BytesIO()
ubjdump(obj, out, *args, **kwargs)
return out.getvalue()
@staticmethod
def ubjload(fp, *args, **kwargs):
return ubjload(fp, *args, **kwargs)
@staticmethod
def ubjdump(obj, fp, *args, **kwargs):
return ubjdump(obj, fp, *args, **kwargs)
# Seekable file-like object buffering
def test_fp_buffer(self):
output = BytesIO()
# items which fit into extension decoder-internal read buffer (BUFFER_FP_SIZE in decoder.c, extension only)
obj2 = ['fishy' * 64] * 10
output.seek(0)
self.ubjdump(obj2, output)
output.seek(0)
self.assertEqual(self.ubjload(output), obj2)
# larger than extension read buffer (extension only)
obj3 = ['fishy' * 512] * 10
output.seek(0)
self.ubjdump(obj3, output)
output.seek(0)
self.assertEqual(self.ubjload(output), obj3)
# Multiple documents in same stream (issue #9)
def test_fp_multi(self):
obj = {'a': 123, 'b': b'some raw content'}
output = BytesIO()
count = 10
# Seekable and non-seekable runs
for _ in range(2):
output.seek(0)
for i in range(count):
obj['c'] = i
self.ubjdump(obj, output)
output.seek(0)
for i in range(count):
obj['c'] = i
self.assertEqual(self.ubjload(output), obj)
output.seekable = lambda: False
# Whole "token" in decoder input unavailable (in non-seekable file-like object)
def test_fp_callable_incomplete(self):
obj = [123, b'something']
# remove whole of last token (binary data 'something', without its length)
output = BytesIO(self.ubjdumpb(obj)[:-(len(obj[1]) + 1)])
output.seekable = lambda: False
with self.assert_raises_regex(DecoderException, 'Insufficient input'):
self.ubjload(output)
def test_fp_seek_invalid(self):
output = BytesIO()
self.ubjdump({'a': 333, 'b': 444}, output)
# pad with non-ubjson data to ensure the decoder buffers more than it needs
output.write(b' ' * 16)
output.seek(0)
output.seek_org = output.seek
# seek fails
def bad_seek(*_):
raise OSError('bad seek')
output.seek = bad_seek
with self.assert_raises_regex(OSError, 'bad seek'):
self.ubjload(output)
# decoding (lack of input) and seek fail - should get decoding failure
output.seek_org(0, SEEK_END)
with self.assert_raises_regex(DecoderException, 'Insufficient input'):
self.ubjload(output)
# seek is not callable
output.seek_org(0)
output.seek = True
with self.assert_raises_regex(TypeError, 'not callable'):
self.ubjload(output)
# decoding (lack of input) and seek not callable - should get decoding failure
output.seek_org(0, SEEK_END)
with self.assert_raises_regex(DecoderException, 'Insufficient input'):
self.ubjload(output)
# def pympler_run(iterations=20):
# from unittest import main
# from pympler import tracker
# from gc import collect
# tracker = tracker.SummaryTracker()
# for i in range(iterations):
# try:
# main()
# except SystemExit:
# pass
# if i % 2:
# collect()
# tracker.print_diff()
# if __name__ == '__main__':
# pympler_run()
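The suite's central idea — encode, decode, then require the result to match the input (`check_enc_dec` above) — can be sketched standalone. Here `json` stands in for the ubjson codec purely for illustration; the suite's real wiring dispatches to the pure-Python or extension implementations instead.

```python
import json


def check_enc_dec(obj, dumps=json.dumps, loads=json.loads):
    """Round-trip obj through a codec and verify it survives unchanged."""
    encoded = dumps(obj)
    decoded = loads(encoded)
    assert decoded == obj, 'round-trip mismatch: %r != %r' % (decoded, obj)
    return encoded
```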
py-ubjson-0.16.1/ubjson/__init__.py:
# Copyright (c) 2019 Iotic Labs Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://github.com/Iotic-Labs/py-ubjson/blob/master/LICENSE
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""UBJSON (draft 12) implementation without No-Op support
Example usage:
# To encode
encoded = ubjson.dumpb({'a': 1})
# To decode
decoded = ubjson.loadb(encoded)
To use a file-like object as input/output, use dump() & load() methods instead.
"""
try:
from _ubjson import dump, dumpb, load, loadb
EXTENSION_ENABLED = True
except ImportError: # pragma: no cover
from .encoder import dump, dumpb
from .decoder import load, loadb
EXTENSION_ENABLED = False
from .encoder import EncoderException
from .decoder import DecoderException
__version__ = '0.16.1'
__all__ = ('EXTENSION_ENABLED', 'dump', 'dumpb', 'EncoderException', 'load', 'loadb', 'DecoderException')
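The try/except import at the top of this module implements a common pattern: prefer a compiled extension, fall back to pure Python, and expose a flag saying which path won. A minimal sketch of the same pattern — the extension module name here is deliberately hypothetical (so the fallback branch runs), and `json` stands in for the pure-Python implementation:

```python
# Sketch of the extension-with-fallback import pattern used by
# ubjson/__init__.py. `_hypothetical_ext` does not exist, so the
# ImportError branch is taken and the pure-Python path is used.
try:
    from _hypothetical_ext import dumps, loads  # compiled fast path
    EXTENSION_ENABLED = True
except ImportError:
    from json import dumps, loads               # pure-Python fallback
    EXTENSION_ENABLED = False
```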
py-ubjson-0.16.1/ubjson/__main__.py:
# Copyright (c) 2019 Iotic Labs Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://github.com/Iotic-Labs/py-ubjson/blob/master/LICENSE
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Converts between json & ubjson"""
from __future__ import print_function
from sys import argv, stderr, stdout, stdin, exit # pylint: disable=redefined-builtin
from json import load as jload, dump as jdump
from .compat import STDIN_RAW, STDOUT_RAW
from . import dump as ubjdump, load as ubjload, EncoderException, DecoderException
def __error(*args, **kwargs):
print(*args, file=stderr, **kwargs)
def from_json(in_stream, out_stream):
try:
obj = jload(in_stream)
except ValueError as ex:
__error('Failed to decode json: %s' % ex)
return 8
try:
ubjdump(obj, out_stream, sort_keys=True)
except EncoderException as ex:
__error('Failed to encode to ubjson: %s' % ex)
return 16
return 0
def to_json(in_stream, out_stream):
try:
obj = ubjload(in_stream, intern_object_keys=True)
except DecoderException as ex:
__error('Failed to decode ubjson: %s' % ex)
return 8
try:
jdump(obj, out_stream, sort_keys=True, separators=(',', ':'))
except TypeError as ex:
__error('Failed to encode to json: %s' % ex)
return 16
return 0
__ACTION = frozenset(('fromjson', 'tojson'))
def main():
if not (3 <= len(argv) <= 4 and argv[1] in __ACTION):
print("""USAGE: ubjson (fromjson|tojson) (INFILE|-) [OUTFILE]
Converts objects between json and ubjson formats. Input is read from INFILE
unless set to '-', in which case stdin is used. If OUTFILE is not
specified, output goes to stdout.""", file=stderr)
return 1
do_from_json = (argv[1] == 'fromjson')
in_file = out_file = None
try:
# input
if argv[2] == '-':
in_stream = stdin if do_from_json else STDIN_RAW
else:
try:
in_stream = in_file = open(argv[2], 'r' if do_from_json else 'rb')
except IOError as ex:
__error('Failed to open input file for reading: %s' % ex)
return 2
# output
if len(argv) == 3:
out_stream = STDOUT_RAW if do_from_json else stdout
else:
try:
out_stream = out_file = open(argv[3], 'wb' if do_from_json else 'w')
except IOError as ex:
__error('Failed to open output file for writing: %s' % ex)
return 4
return (from_json if do_from_json else to_json)(in_stream, out_stream)
except IOError as ex:
__error('I/O failure: %s' % ex)
finally:
if in_file:
in_file.close()
if out_file:
out_file.close()
if __name__ == "__main__":
exit(main())
py-ubjson-0.16.1/ubjson/compat.py:
# Copyright (c) 2019 Iotic Labs Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://github.com/Iotic-Labs/py-ubjson/blob/master/LICENSE
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Original six.py copyright notice, on which snippets herein are based:
#
# Copyright (c) 2010-2015 Benjamin Peterson
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
"""Python v2.7 (NOT 2.6) compatibility"""
# pylint: disable=unused-import,invalid-name,wrong-import-position,no-name-in-module
# pylint: disable=import-error
# pragma: no cover
from sys import stderr, stdout, stdin, version_info
PY2 = (version_info[0] == 2)
if PY2:
# pylint: disable=undefined-variable
INTEGER_TYPES = (int, long) # noqa: F821
UNICODE_TYPE = unicode # noqa: F821
TEXT_TYPES = (str, unicode) # noqa: F821
BYTES_TYPES = (str, bytearray)
STDIN_RAW = stdin
STDOUT_RAW = stdout
STDERR_RAW = stderr
# Interning applies to str, not unicode
def intern_unicode(obj):
return obj
else:
INTEGER_TYPES = (int,)
UNICODE_TYPE = str
TEXT_TYPES = (str,)
BYTES_TYPES = (bytes, bytearray)
STDIN_RAW = getattr(stdin, 'buffer', stdin)
STDOUT_RAW = getattr(stdout, 'buffer', stdout)
STDERR_RAW = getattr(stderr, 'buffer', stderr)
from sys import intern as intern_unicode # noqa: F401
try:
# introduced in v3.3
from collections.abc import Mapping, Sequence # noqa: F401
except ImportError:
from collections import Mapping, Sequence # noqa: F401
if version_info[:2] == (3, 2):
# pylint: disable=exec-used
exec("""def raise_from(value, from_value):
if from_value is None:
raise value
raise value from from_value
""")
elif version_info[:2] > (3, 2):
# pylint: disable=exec-used
exec("""def raise_from(value, from_value):
raise value from from_value
""")
else:
def raise_from(value, _):
raise value
py-ubjson-0.16.1/ubjson/decoder.py
# Copyright (c) 2019 Iotic Labs Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://github.com/Iotic-Labs/py-ubjson/blob/master/LICENSE
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""UBJSON draft v12 decoder"""
from io import BytesIO
from struct import Struct, pack, error as StructError
from decimal import Decimal, DecimalException
from .compat import raise_from, intern_unicode
from .markers import (TYPE_NONE, TYPE_NULL, TYPE_NOOP, TYPE_BOOL_TRUE, TYPE_BOOL_FALSE, TYPE_INT8, TYPE_UINT8,
TYPE_INT16, TYPE_INT32, TYPE_INT64, TYPE_FLOAT32, TYPE_FLOAT64, TYPE_HIGH_PREC, TYPE_CHAR,
TYPE_STRING, OBJECT_START, OBJECT_END, ARRAY_START, ARRAY_END, CONTAINER_TYPE, CONTAINER_COUNT)
__TYPES = frozenset((TYPE_NULL, TYPE_BOOL_TRUE, TYPE_BOOL_FALSE, TYPE_INT8, TYPE_UINT8, TYPE_INT16, TYPE_INT32,
TYPE_INT64, TYPE_FLOAT32, TYPE_FLOAT64, TYPE_HIGH_PREC, TYPE_CHAR, TYPE_STRING, ARRAY_START,
OBJECT_START))
__TYPES_NO_DATA = frozenset((TYPE_NULL, TYPE_BOOL_FALSE, TYPE_BOOL_TRUE))
__TYPES_INT = frozenset((TYPE_INT8, TYPE_UINT8, TYPE_INT16, TYPE_INT32, TYPE_INT64))
__SMALL_INTS_DECODED = {pack('>b', i): i for i in range(-128, 128)}
__SMALL_UINTS_DECODED = {pack('>B', i): i for i in range(256)}
__UNPACK_INT16 = Struct('>h').unpack
__UNPACK_INT32 = Struct('>i').unpack
__UNPACK_INT64 = Struct('>q').unpack
__UNPACK_FLOAT32 = Struct('>f').unpack
__UNPACK_FLOAT64 = Struct('>d').unpack
class DecoderException(ValueError):
"""Raised when decoding of a UBJSON stream fails."""
def __init__(self, message, position=None):
if position is not None:
super(DecoderException, self).__init__('%s (at byte %d)' % (message, position), position)
else:
super(DecoderException, self).__init__(str(message), None)
@property
def position(self):
"""Position in stream where decoding failed. Can be None when decoding from a
string or when the file-like object does not support tell().
"""
return self.args[1] # pylint: disable=unsubscriptable-object
# pylint: disable=unused-argument
def __decode_high_prec(fp_read, marker):
length = __decode_int_non_negative(fp_read, fp_read(1))
raw = fp_read(length)
if len(raw) < length:
raise DecoderException('High prec. too short')
try:
return Decimal(raw.decode('utf-8'))
except UnicodeError as ex:
raise_from(DecoderException('Failed to decode decimal string'), ex)
except DecimalException as ex:
raise_from(DecoderException('Failed to decode decimal'), ex)
def __decode_int_non_negative(fp_read, marker):
if marker not in __TYPES_INT:
raise DecoderException('Integer marker expected')
value = __METHOD_MAP[marker](fp_read, marker)
if value < 0:
raise DecoderException('Negative count/length unexpected')
return value
def __decode_int8(fp_read, marker):
try:
return __SMALL_INTS_DECODED[fp_read(1)]
except KeyError as ex:
raise_from(DecoderException('Failed to unpack int8'), ex)
def __decode_uint8(fp_read, marker):
try:
return __SMALL_UINTS_DECODED[fp_read(1)]
except KeyError as ex:
raise_from(DecoderException('Failed to unpack uint8'), ex)
def __decode_int16(fp_read, marker):
try:
return __UNPACK_INT16(fp_read(2))[0]
except StructError as ex:
raise_from(DecoderException('Failed to unpack int16'), ex)
def __decode_int32(fp_read, marker):
try:
return __UNPACK_INT32(fp_read(4))[0]
except StructError as ex:
raise_from(DecoderException('Failed to unpack int32'), ex)
def __decode_int64(fp_read, marker):
try:
return __UNPACK_INT64(fp_read(8))[0]
except StructError as ex:
raise_from(DecoderException('Failed to unpack int64'), ex)
def __decode_float32(fp_read, marker):
try:
return __UNPACK_FLOAT32(fp_read(4))[0]
except StructError as ex:
raise_from(DecoderException('Failed to unpack float32'), ex)
def __decode_float64(fp_read, marker):
try:
return __UNPACK_FLOAT64(fp_read(8))[0]
except StructError as ex:
raise_from(DecoderException('Failed to unpack float64'), ex)
def __decode_char(fp_read, marker):
raw = fp_read(1)
if not raw:
raise DecoderException('Char missing')
try:
return raw.decode('utf-8')
except UnicodeError as ex:
raise_from(DecoderException('Failed to decode char'), ex)
def __decode_string(fp_read, marker):
# current marker is string identifier, so read next byte which identifies integer type
length = __decode_int_non_negative(fp_read, fp_read(1))
raw = fp_read(length)
if len(raw) < length:
raise DecoderException('String too short')
try:
return raw.decode('utf-8')
except UnicodeError as ex:
raise_from(DecoderException('Failed to decode string'), ex)
# same as string, except there is no 'S' marker
def __decode_object_key(fp_read, marker, intern_object_keys):
length = __decode_int_non_negative(fp_read, marker)
raw = fp_read(length)
if len(raw) < length:
raise DecoderException('String too short')
try:
return intern_unicode(raw.decode('utf-8')) if intern_object_keys else raw.decode('utf-8')
except UnicodeError as ex:
raise_from(DecoderException('Failed to decode object key'), ex)
__METHOD_MAP = {TYPE_NULL: (lambda _, __: None),
TYPE_BOOL_TRUE: (lambda _, __: True),
TYPE_BOOL_FALSE: (lambda _, __: False),
TYPE_INT8: __decode_int8,
TYPE_UINT8: __decode_uint8,
TYPE_INT16: __decode_int16,
TYPE_INT32: __decode_int32,
TYPE_INT64: __decode_int64,
TYPE_FLOAT32: __decode_float32,
TYPE_FLOAT64: __decode_float64,
TYPE_HIGH_PREC: __decode_high_prec,
TYPE_CHAR: __decode_char,
TYPE_STRING: __decode_string}
def __get_container_params(fp_read, in_mapping, no_bytes):
marker = fp_read(1)
if marker == CONTAINER_TYPE:
marker = fp_read(1)
if marker not in __TYPES:
raise DecoderException('Invalid container type')
type_ = marker
marker = fp_read(1)
else:
type_ = TYPE_NONE
if marker == CONTAINER_COUNT:
count = __decode_int_non_negative(fp_read, fp_read(1))
counting = True
# special cases (no data (None or bool) / bytes array) will be handled in calling functions
if not (type_ in __TYPES_NO_DATA or
(type_ == TYPE_UINT8 and not in_mapping and not no_bytes)):
# Reading ahead is just to capture type, which will not exist if type is fixed
marker = fp_read(1) if (in_mapping or type_ == TYPE_NONE) else type_
elif type_ == TYPE_NONE:
# set to one to indicate that the container is not finished yet
count = 1
counting = False
else:
raise DecoderException('Container type without count')
return marker, counting, count, type_
def __decode_object(fp_read, no_bytes, object_hook, object_pairs_hook, # pylint: disable=too-many-branches
intern_object_keys):
marker, counting, count, type_ = __get_container_params(fp_read, True, no_bytes)
has_pairs_hook = object_pairs_hook is not None
obj = [] if has_pairs_hook else {}
# special case - no data (None or bool)
if type_ in __TYPES_NO_DATA:
value = __METHOD_MAP[type_](fp_read, type_)
if has_pairs_hook:
for _ in range(count):
obj.append((__decode_object_key(fp_read, fp_read(1), intern_object_keys), value))
return object_pairs_hook(obj)
for _ in range(count):
obj[__decode_object_key(fp_read, fp_read(1), intern_object_keys)] = value
return object_hook(obj)
while count > 0 and (counting or marker != OBJECT_END):
if marker == TYPE_NOOP:
marker = fp_read(1)
continue
# decode key for object
key = __decode_object_key(fp_read, marker, intern_object_keys)
marker = fp_read(1) if type_ == TYPE_NONE else type_
# decode value
try:
value = __METHOD_MAP[marker](fp_read, marker)
except KeyError:
handled = False
else:
handled = True
# handle outside the except above (on KeyError) so we do not get an unfriendly "exception within except" backtrace
if not handled:
if marker == ARRAY_START:
value = __decode_array(fp_read, no_bytes, object_hook, object_pairs_hook, intern_object_keys)
elif marker == OBJECT_START:
value = __decode_object(fp_read, no_bytes, object_hook, object_pairs_hook, intern_object_keys)
else:
raise DecoderException('Invalid marker within object')
if has_pairs_hook:
obj.append((key, value))
else:
obj[key] = value
if counting:
count -= 1
if count > 0:
marker = fp_read(1)
return object_pairs_hook(obj) if has_pairs_hook else object_hook(obj)
def __decode_array(fp_read, no_bytes, object_hook, object_pairs_hook, intern_object_keys):
marker, counting, count, type_ = __get_container_params(fp_read, False, no_bytes)
# special case - no data (None or bool)
if type_ in __TYPES_NO_DATA:
return [__METHOD_MAP[type_](fp_read, type_)] * count
# special case - bytes array
if type_ == TYPE_UINT8 and not no_bytes:
container = fp_read(count)
if len(container) < count:
raise DecoderException('Container bytes array too short')
return container
container = []
while count > 0 and (counting or marker != ARRAY_END):
if marker == TYPE_NOOP:
marker = fp_read(1)
continue
# decode value
try:
value = __METHOD_MAP[marker](fp_read, marker)
except KeyError:
handled = False
else:
handled = True
# handle outside the except above (on KeyError) so we do not get an unfriendly "exception within except" backtrace
if not handled:
if marker == ARRAY_START:
value = __decode_array(fp_read, no_bytes, object_hook, object_pairs_hook, intern_object_keys)
elif marker == OBJECT_START:
value = __decode_object(fp_read, no_bytes, object_hook, object_pairs_hook, intern_object_keys)
else:
raise DecoderException('Invalid marker within array')
container.append(value)
if counting:
count -= 1
if count and type_ == TYPE_NONE:
marker = fp_read(1)
return container
def __object_hook_noop(obj):
return obj
def load(fp, no_bytes=False, object_hook=None, object_pairs_hook=None, intern_object_keys=False):
"""Decodes and returns UBJSON from the given file-like object
Args:
fp: read([size])-able object
no_bytes (bool): If set, typed UBJSON arrays (uint8) will not be
converted to a bytes instance and instead treated like
any other array (i.e. result in a list).
object_hook (callable): Called with the result of any object literal
decoded (instead of dict).
object_pairs_hook (callable): Called with the result of any object
literal decoded with an ordered list of
pairs (instead of dict). Takes precedence
over object_hook.
intern_object_keys (bool): If set, object keys are interned which can
provide a memory saving when many repeated
keys are used. NOTE: This is not supported
in Python2 (since interning does not apply
to unicode) and will be ignored.
Returns:
Decoded object
Raises:
DecoderException: If a decoding failure occurred.
UBJSON types are mapped to Python types as follows. Numbers in brackets
denote Python version.
+----------------------------------+---------------+
| UBJSON | Python |
+==================================+===============+
| object | dict |
+----------------------------------+---------------+
| array | list |
+----------------------------------+---------------+
| string | (3) str |
| | (2) unicode |
+----------------------------------+---------------+
| uint8, int8, int16, int32, int64 | (3) int |
| | (2) int, long |
+----------------------------------+---------------+
| float32, float64 | float |
+----------------------------------+---------------+
| high_precision | Decimal |
+----------------------------------+---------------+
| array (typed, uint8) | (3) bytes |
| | (2) str |
+----------------------------------+---------------+
| true | True |
+----------------------------------+---------------+
| false | False |
+----------------------------------+---------------+
| null | None |
+----------------------------------+---------------+
"""
if object_pairs_hook is None and object_hook is None:
object_hook = __object_hook_noop
if not callable(fp.read):
raise TypeError('fp.read not callable')
fp_read = fp.read
marker = fp_read(1)
try:
try:
return __METHOD_MAP[marker](fp_read, marker)
except KeyError:
pass
if marker == ARRAY_START:
return __decode_array(fp_read, bool(no_bytes), object_hook, object_pairs_hook, intern_object_keys)
if marker == OBJECT_START:
return __decode_object(fp_read, bool(no_bytes), object_hook, object_pairs_hook, intern_object_keys)
raise DecoderException('Invalid marker')
except DecoderException as ex:
raise_from(DecoderException(ex.args[0], position=(fp.tell() if hasattr(fp, 'tell') else None)), ex)
def loadb(chars, no_bytes=False, object_hook=None, object_pairs_hook=None, intern_object_keys=False):
"""Decodes and returns UBJSON from the given bytes or bytearray object. See
load() for available arguments."""
with BytesIO(chars) as fp:
return load(fp, no_bytes=no_bytes, object_hook=object_hook, object_pairs_hook=object_pairs_hook,
intern_object_keys=intern_object_keys)
py-ubjson-0.16.1/ubjson/encoder.py
# Copyright (c) 2019 Iotic Labs Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://github.com/Iotic-Labs/py-ubjson/blob/master/LICENSE
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""UBJSON draft v12 encoder"""
from struct import pack, Struct
from decimal import Decimal
from io import BytesIO
from math import isinf, isnan
from .compat import Mapping, Sequence, INTEGER_TYPES, UNICODE_TYPE, TEXT_TYPES, BYTES_TYPES
from .markers import (TYPE_NULL, TYPE_BOOL_TRUE, TYPE_BOOL_FALSE, TYPE_INT8, TYPE_UINT8, TYPE_INT16, TYPE_INT32,
TYPE_INT64, TYPE_FLOAT32, TYPE_FLOAT64, TYPE_HIGH_PREC, TYPE_CHAR, TYPE_STRING, OBJECT_START,
OBJECT_END, ARRAY_START, ARRAY_END, CONTAINER_TYPE, CONTAINER_COUNT)
# Lookup tables for encoding small integers, pre-initialised larger integer & float packers
__SMALL_INTS_ENCODED = {i: TYPE_INT8 + pack('>b', i) for i in range(-128, 128)}
__SMALL_UINTS_ENCODED = {i: TYPE_UINT8 + pack('>B', i) for i in range(256)}
__PACK_INT16 = Struct('>h').pack
__PACK_INT32 = Struct('>i').pack
__PACK_INT64 = Struct('>q').pack
__PACK_FLOAT32 = Struct('>f').pack
__PACK_FLOAT64 = Struct('>d').pack
# Prefix applicable to specialised byte array container
__BYTES_ARRAY_PREFIX = ARRAY_START + CONTAINER_TYPE + TYPE_UINT8 + CONTAINER_COUNT
class EncoderException(TypeError):
"""Raised when encoding of an object fails."""
def __encode_decimal(fp_write, item):
if item.is_finite():
fp_write(TYPE_HIGH_PREC)
encoded_val = str(item).encode('utf-8')
__encode_int(fp_write, len(encoded_val))
fp_write(encoded_val)
else:
fp_write(TYPE_NULL)
def __encode_int(fp_write, item):
if item >= 0:
if item < 2 ** 8:
fp_write(__SMALL_UINTS_ENCODED[item])
elif item < 2 ** 15:
fp_write(TYPE_INT16)
fp_write(__PACK_INT16(item))
elif item < 2 ** 31:
fp_write(TYPE_INT32)
fp_write(__PACK_INT32(item))
elif item < 2 ** 63:
fp_write(TYPE_INT64)
fp_write(__PACK_INT64(item))
else:
__encode_decimal(fp_write, Decimal(item))
elif item >= -(2 ** 7):
fp_write(__SMALL_INTS_ENCODED[item])
elif item >= -(2 ** 15):
fp_write(TYPE_INT16)
fp_write(__PACK_INT16(item))
elif item >= -(2 ** 31):
fp_write(TYPE_INT32)
fp_write(__PACK_INT32(item))
elif item >= -(2 ** 63):
fp_write(TYPE_INT64)
fp_write(__PACK_INT64(item))
else:
__encode_decimal(fp_write, Decimal(item))
def __encode_float(fp_write, item):
if 1.18e-38 <= abs(item) <= 3.4e38 or item == 0:
fp_write(TYPE_FLOAT32)
fp_write(__PACK_FLOAT32(item))
elif 2.23e-308 <= abs(item) < 1.8e308:
fp_write(TYPE_FLOAT64)
fp_write(__PACK_FLOAT64(item))
elif isinf(item) or isnan(item):
fp_write(TYPE_NULL)
else:
__encode_decimal(fp_write, Decimal(item))
def __encode_float64(fp_write, item):
if 2.23e-308 <= abs(item) < 1.8e308:
fp_write(TYPE_FLOAT64)
fp_write(__PACK_FLOAT64(item))
elif item == 0:
fp_write(TYPE_FLOAT32)
fp_write(__PACK_FLOAT32(item))
elif isinf(item) or isnan(item):
fp_write(TYPE_NULL)
else:
__encode_decimal(fp_write, Decimal(item))
def __encode_string(fp_write, item):
encoded_val = item.encode('utf-8')
length = len(encoded_val)
if length == 1:
fp_write(TYPE_CHAR)
else:
fp_write(TYPE_STRING)
if length < 2 ** 8:
fp_write(__SMALL_UINTS_ENCODED[length])
else:
__encode_int(fp_write, length)
fp_write(encoded_val)
def __encode_bytes(fp_write, item):
fp_write(__BYTES_ARRAY_PREFIX)
length = len(item)
if length < 2 ** 8:
fp_write(__SMALL_UINTS_ENCODED[length])
else:
__encode_int(fp_write, length)
fp_write(item)
# no ARRAY_END since length was specified
def __encode_value(fp_write, item, seen_containers, container_count, sort_keys, no_float32, default):
if isinstance(item, UNICODE_TYPE):
__encode_string(fp_write, item)
elif item is None:
fp_write(TYPE_NULL)
elif item is True:
fp_write(TYPE_BOOL_TRUE)
elif item is False:
fp_write(TYPE_BOOL_FALSE)
elif isinstance(item, INTEGER_TYPES):
__encode_int(fp_write, item)
elif isinstance(item, float):
if no_float32:
__encode_float64(fp_write, item)
else:
__encode_float(fp_write, item)
elif isinstance(item, Decimal):
__encode_decimal(fp_write, item)
elif isinstance(item, BYTES_TYPES):
__encode_bytes(fp_write, item)
# order important since mappings could also be sequences
elif isinstance(item, Mapping):
__encode_object(fp_write, item, seen_containers, container_count, sort_keys, no_float32, default)
elif isinstance(item, Sequence):
__encode_array(fp_write, item, seen_containers, container_count, sort_keys, no_float32, default)
elif default is not None:
__encode_value(fp_write, default(item), seen_containers, container_count, sort_keys, no_float32, default)
else:
raise EncoderException('Cannot encode item of type %s' % type(item))
def __encode_array(fp_write, item, seen_containers, container_count, sort_keys, no_float32, default):
# circular reference check
container_id = id(item)
if container_id in seen_containers:
raise ValueError('Circular reference detected')
seen_containers[container_id] = item
fp_write(ARRAY_START)
if container_count:
fp_write(CONTAINER_COUNT)
__encode_int(fp_write, len(item))
for value in item:
__encode_value(fp_write, value, seen_containers, container_count, sort_keys, no_float32, default)
if not container_count:
fp_write(ARRAY_END)
del seen_containers[container_id]
def __encode_object(fp_write, item, seen_containers, container_count, sort_keys, no_float32, default):
# circular reference check
container_id = id(item)
if container_id in seen_containers:
raise ValueError('Circular reference detected')
seen_containers[container_id] = item
fp_write(OBJECT_START)
if container_count:
fp_write(CONTAINER_COUNT)
__encode_int(fp_write, len(item))
for key, value in sorted(item.items()) if sort_keys else item.items():
# allow both str & unicode for Python 2
if not isinstance(key, TEXT_TYPES):
raise EncoderException('Mapping keys can only be strings')
encoded_key = key.encode('utf-8')
length = len(encoded_key)
if length < 2 ** 8:
fp_write(__SMALL_UINTS_ENCODED[length])
else:
__encode_int(fp_write, length)
fp_write(encoded_key)
__encode_value(fp_write, value, seen_containers, container_count, sort_keys, no_float32, default)
if not container_count:
fp_write(OBJECT_END)
del seen_containers[container_id]
def dump(obj, fp, container_count=False, sort_keys=False, no_float32=True, default=None):
"""Writes the given object as UBJSON to the provided file-like object
Args:
obj: The object to encode
fp: write([size])-able object
container_count (bool): Specify length for container types (including
for empty ones). This can aid decoding speed
depending on implementation but requires a bit
more space and encoding speed could be reduced
if getting length of any of the containers is
expensive.
sort_keys (bool): Sort keys of mappings
no_float32 (bool): Never use float32 to store float numbers (other than
for zero). Disabling this might save space at the
loss of precision.
default (callable): Called for objects which cannot be serialised.
Should return a UBJSON-encodable version of the
object or raise an EncoderException.
Raises:
EncoderException: If an encoding failure occurred.
The following Python types and interfaces (ABCs) are supported (as are any
subclasses):
+------------------------------+-----------------------------------+
| Python | UBJSON |
+==============================+===================================+
| (3) str | string |
| (2) unicode | |
+------------------------------+-----------------------------------+
| None | null |
+------------------------------+-----------------------------------+
| bool | true, false |
+------------------------------+-----------------------------------+
| (3) int | uint8, int8, int16, int32, int64, |
| (2) int, long | high_precision |
+------------------------------+-----------------------------------+
| float | float32, float64, high_precision |
+------------------------------+-----------------------------------+
| Decimal | high_precision |
+------------------------------+-----------------------------------+
| (3) bytes, bytearray | array (type, uint8) |
| (2) str | array (type, uint8) |
+------------------------------+-----------------------------------+
| (3) collections.abc.Mapping | object |
| (2) collections.Mapping | |
+------------------------------+-----------------------------------+
| (3) collections.abc.Sequence | array |
| (2) collections.Sequence | |
+------------------------------+-----------------------------------+
Notes:
- Items are resolved in the order of this table, e.g. if the item implements
both Mapping and Sequence interfaces, it will be encoded as a mapping.
- None and bool do not use an isinstance check
- Numbers in brackets denote Python version.
- Only unicode strings in Python 2 are encoded as strings, i.e. for
compatibility with e.g. Python 3 one MUST NOT use str in Python 2 (as that
will be interpreted as a byte array).
- Mapping keys have to be strings: str for Python3 and unicode or str in
Python 2.
- float conversion rules (depending on no_float32 setting):
float32: 1.18e-38 <= abs(value) <= 3.4e38 or value == 0
float64: 2.23e-308 <= abs(value) < 1.8e308
For other values Decimal is used.
"""
if not callable(fp.write):
raise TypeError('fp.write not callable')
fp_write = fp.write
__encode_value(fp_write, obj, {}, container_count, sort_keys, no_float32, default)
def dumpb(obj, container_count=False, sort_keys=False, no_float32=True, default=None):
"""Returns the given object as UBJSON in a bytes instance. See dump() for
available arguments."""
with BytesIO() as fp:
dump(obj, fp, container_count=container_count, sort_keys=sort_keys, no_float32=no_float32, default=default)
return fp.getvalue()
py-ubjson-0.16.1/ubjson/markers.py
# Copyright (c) 2019 Iotic Labs Ltd. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://github.com/Iotic-Labs/py-ubjson/blob/master/LICENSE
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""UBJSON marker definitions"""
# Value types
TYPE_NONE = b'\x00' # Used internally only, not part of ubjson specification
TYPE_NULL = b'Z'
TYPE_NOOP = b'N'
TYPE_BOOL_TRUE = b'T'
TYPE_BOOL_FALSE = b'F'
TYPE_INT8 = b'i'
TYPE_UINT8 = b'U'
TYPE_INT16 = b'I'
TYPE_INT32 = b'l'
TYPE_INT64 = b'L'
TYPE_FLOAT32 = b'd'
TYPE_FLOAT64 = b'D'
TYPE_HIGH_PREC = b'H'
TYPE_CHAR = b'C'
TYPE_STRING = b'S'
# Container delimiters
OBJECT_START = b'{'
OBJECT_END = b'}'
ARRAY_START = b'['
ARRAY_END = b']'
# Optional container parameters
CONTAINER_TYPE = b'$'
CONTAINER_COUNT = b'#'
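These markers compose directly into wire bytes. For example, the strongly-typed uint8 array ("bytes") container used by the encoder is ARRAY_START CONTAINER_TYPE TYPE_UINT8 CONTAINER_COUNT followed by the count and the payload, with no closing ']' since the count is given up front. A hand-rolled sketch:

```python
payload = b'\x01\x02\x03'
# '[' '$' 'U' '#' then the count (itself uint8-encoded) and the raw bytes
encoded = b'[$U#' + b'U' + bytes([len(payload)]) + payload
print(encoded)  # -> b'[$U#U\x03\x01\x02\x03'
```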