demjson-2.2.4/docs/HOOKS.txt

Using callback hooks in demjson
===============================

Starting with demjson release 2.0 it is possible to hook into the
encoding and decoding engine to transform values.  There is one set of
hooks available for use during JSON decoding and another set for use
during encoding.

The complete descriptions of all the available hooks appear at the end
of this document, but in summary they are:

    Decoding Hooks      Encoding Hooks
    --------------      --------------
    decode_string       encode_value
    decode_float        encode_dict
    decode_number       encode_dict_key
    decode_array        encode_sequence
    decode_object       encode_bytes
                        encode_default

Although hooks can be quite powerful, they are not necessarily
suitable for every situation.  You may need to perform some complex
transformations outside of demjson.


Simple example: decoding dates
------------------------------

Say you have some JSON document where all dates have been encoded as
strings with the format "YYYY-MM-DD".  You could use a hook function
to automatically convert those strings into Python date objects.

    import demjson, datetime, re

    def date_converter( s ):
        match = re.match( r'^(\d{4})-(\d{2})-(\d{2})$', s )
        if match:
            y, m, d = [int(n) for n in match.groups()]
            return datetime.date( y, m, d )
        else:
            raise demjson.JSONSkipHook

    demjson.decode( '{"birthdate": "1994-01-17"}' )
        # gives => {'birthdate': '1994-01-17'}

    demjson.decode( '{"birthdate": "1994-01-17"}',
                    decode_string=date_converter )
        # gives => {'birthdate': datetime.date(1994, 1, 17)}


Simple example: encoding complex numbers
----------------------------------------

Say you are generating JSON but your Python object contains complex
numbers.
You could use an encoding hook to transform them into a 2-ary array.

    import demjson

    def complex_to_array( val ):
        if isinstance( val, complex ):
            return [ val.real, val.imag ]
        else:
            raise demjson.JSONSkipHook

    demjson.encode( {'x': complex('3+9j')} )
        # raises JSONEncodeError

    demjson.encode( {'x': complex('3+9j')},
                    encode_value=complex_to_array )
        # gives => '{"x": [3.0, 9.0]}'


Defining and setting hooks
==========================

A hook must be a callable function that takes a single argument, and
either returns the same value, or some other value after a
transformation.

    # A sample user-defined hook function
    def my_sort_array( arr ):
        arr.sort()
        return arr

You can set hook functions either by calling the set_hook() method of
the JSON class, or by passing an equivalent named argument to the
encode() and decode() functions.

If you are using the encode() or decode() functions, then your hooks
are specified with named arguments:

    demjson.decode( '[3,1,2]', decode_array=my_sort_array )
        # gives => [1, 2, 3]

If you are using the JSON class directly you need to call the
set_hook() method with both the name of the hook and your function:

    j = demjson.JSON()
    j.set_hook( 'decode_array', my_sort_array )

You can also clear a hook by specifying None as your function, or with
the clear_hook() method.

    j.set_hook( 'decode_array', None )
    j.clear_hook( 'decode_array' )   # equivalent

And to clear all hooks:

    j.clear_all_hooks()


Selective processing (Skipping)
===============================

When you specify a hook function, that function will be called for
every matching object.  Many times, though, your hook may only want to
perform a transformation on specific objects.  A hook function may
therefore indicate that it does not wish to perform a transformation
on a case-by-case basis.

To do this, any hook function may raise a JSONSkipHook exception if it
does not wish to handle a particular invocation.
This will have the effect of skipping the hook for that particular
value, as if the hook was not set.

    # another sample hook function
    def my_sum_arrays( arr ):
        try:
            return sum(arr)
        except TypeError:
            # Don't do anything
            raise demjson.JSONSkipHook

If you just 'return None', you are actually telling demjson to convert
the value into None or null.  Though some hooks allow you to return
the same value as was passed in with the same effect, be aware that
not all hooks do so.  Therefore it is always recommended that you
raise JSONSkipHook when intending to skip processing.


Other important details when using hooks
========================================

Order of processing: Decoding hooks are generally called in a
bottom-up order, whereas encoding hooks are called in a top-down
order.  This is discussed in more detail in the descriptions of the
hooks.

Modifying values: The value that is passed to a hook function is not a
deep copy.  If it is mutable and you make changes to it, and then
raise a JSONSkipHook exception, the original value may not be
preserved as expected.


Exception handling
==================

If your hook function raises an exception, it will be caught and
wrapped in a JSONDecodeHookError or JSONEncodeHookError.  Those are
correspondingly subclasses of JSONDecodeError and JSONEncodeError, so
your outermost code only needs to catch one exception type.

When running in Python 3 the standard Exception Chaining (PEP 3134)
mechanism is employed.  Under Python 2 exception chaining is
simulated, though a traceback of the original exception may not be
printed.  You can get to the original exception in the '__cause__'
member of the outer exception.

Consider the following example:

    def my_encode_fractions( val ):
        if 'numerator' in val and 'denominator' in val:
            return val['numerator'] / val['denominator']  # OOPS. DIVIDE BY ZERO
        else:
            raise demjson.JSONSkipHook

    demjson.encode( {'numerator':0, 'denominator':0},
                    encode_dict=my_encode_fractions )

For this example a divide by zero error will occur within the hook
function.  The exception that eventually gets propagated out of the
encode() call will be something like:

    # ==== Exception printed Python 3:
    Traceback (most recent call last):
      ...
      File "example.py", line 3, in my_encode_fractions
    ZeroDivisionError: division by zero

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      ...
      File "..../demjson.py", line 9999, in call_hook
    demjson.JSONEncodeHookError: Hook encode_dict raised 'ZeroDivisionError' while encoding type


List of decoding hooks
======================

Decoding hooks let you intercept values during JSON parsing and
perform additional translations.  You could, for example, recognize
strings with particular patterns and convert them into specific
Python types.

Decoding hooks are not given raw JSON, but instead fundamental Python
values that closely correspond to the original JSON text.  So you
don't need to worry, for instance, about Unicode surrogate pairs or
other such complexities.

When dealing with nested JSON data structures (objects and arrays),
the decoding hooks are called bottom-up: the innermost value first,
working outward, and then left to right.

The available hooks are:

decode_string
-------------
Called for every JSON string literal with the Python-equivalent
string value as an argument.  Expects to get a Python object in
return.

Remember that the keys to JSON objects are also strings, and so your
hook should typically either return another string or some type that
is hashable.

decode_float
------------
Called for every JSON number that looks like a float (has a ".").
The string representation of the number is passed as an argument.
Expects to get a Python object in return.

decode_number
-------------
Called for every JSON number.
The string representation of the number is passed as an argument.
Expects to get a Python object in return.

NOTE: If the number looks like a float and the 'decode_float' hook is
set, then this hook will not be called.

Warning: If you are decoding in non-strict mode, then your number
decoding hook may also encounter so-called non-numbers.  You should
be prepared to handle any of 'NaN', 'Infinity', '+Infinity', or
'-Infinity'; or raise a 'JSONSkipHook' exception to let demjson
handle those values.

decode_array
------------
Called for every JSON array.  A Python list is passed as the
argument, and expects to get a Python object back.

NOTE: this hook will get called for every array, even for nested
arrays.

decode_object
-------------
Called for every JSON object.  A Python dictionary is passed as the
argument, and expects to get a Python object back.

NOTE: this hook will get called for every object, even for nested
objects.


List of encoding hooks
======================

Encoding hooks are used during JSON generation to let you intercept
Python values in the input and perform translations on them prior to
encoding.  You could, for example, convert complex numbers into a
2-valued array, or a custom class instance into a dictionary.

Remember that these hooks are not expected to output raw JSON, but
instead a fundamental Python type which demjson already knows how to
properly encode.

When dealing with nested data structures, such as dictionaries and
lists, the encoding hooks are called top-down: the outermost value
first, working inward, and then from left to right.  This top-down
order means that encoding hooks can be dangerous in that they can
create infinite loops.

The available hooks are:

encode_value
------------
Called for every Python object which is to be encoded into JSON.
This hook gets a chance to transform every value once, regardless of
its type.  Most of the time this hook will probably raise a
'JSONSkipHook', and only selectively return a transformed value.
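As an illustrative sketch of that selective pattern, the hook below
converts datetime.date values into ISO-format strings and skips
everything else.  The function name is invented for this example, and
the import fallback exists only so the snippet can run even where
demjson is not installed:

    import datetime

    try:
        from demjson import JSONSkipHook    # normal usage
    except ImportError:
        class JSONSkipHook(Exception):      # stand-in for a standalone run
            pass

    def date_to_iso(val):
        # Transform only plain dates; skip every other value.
        if isinstance(val, datetime.date) and not isinstance(val, datetime.datetime):
            return val.isoformat()
        raise JSONSkipHook

    # With demjson installed, this would be used as:
    #   demjson.encode( {'d': datetime.date(2014, 4, 22)},
    #                   encode_value=date_to_iso )
    #   (the date is emitted as the string "2014-04-22")

Because the hook returns a plain string, demjson already knows how to
encode the result without further custom handling.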
If the hook function returns a value of a different kind, then the
encoding process will run again.  This means that your returned value
is subject to all the various hook processing again.  Careless coding
of this hook function can therefore result in infinite loops.

This hook will be called before the 'encode_dict_key' hook for
dictionary keys.

encode_dict
-----------
Called for every Python dictionary: any 'dict', 'UserDict',
'OrderedDict', or 'ChainMap'; or any subclass of those.

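For instance, a hook might transform only one particular kind of
dictionary and skip the rest.  The sketch below is hypothetical: the
Counter handling and the 'total' key are invented for illustration,
and the import fallback only lets the snippet run without demjson
installed:

    import collections

    try:
        from demjson import JSONSkipHook    # normal usage
    except ImportError:
        class JSONSkipHook(Exception):      # stand-in for a standalone run
            pass

    def counter_with_total(d):
        # Only transform Counter objects; leave other dictionaries alone.
        if isinstance(d, collections.Counter):
            out = dict(d)
            out['total'] = sum(d.values())
            return out
        raise JSONSkipHook

    # With demjson installed, this would be used as:
    #   demjson.encode( collections.Counter('aab'),
    #                   encode_dict=counter_with_total )

Since the hook returns another dictionary, the result is processed
directly and the hook is not re-invoked on it (see the recursion notes
below).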
It will also be called for anything that is dictionary-like in that
it supports either of iterkeys() or keys() as well as __getitem__(),
so be aware in your hook function that the object you receive may not
support all the varied methods that the standard dict does.

On recursion: if your hook function returns a dictionary (either the
same one, possibly modified, or anything else that looks like a
dictionary), then the result is immediately processed.  This means
that your hook will _not_ be re-invoked on the new dictionary, though
the contents of the returned dictionary will individually be subject
to their own hook processing.  However, if your hook function returns
any other kind of object, then the encoding for that new object
starts over, which means that other hook functions may get invoked.

encode_dict_key
---------------
Called for every dictionary key.  This allows you to transform
non-string keys into strings.  Note this will also be called even for
keys that are already strings.

This hook is expected to return a string value.  However, if running
in non-strict mode, it may also return a number, as ECMAScript allows
numbers to be used as object keys.

This hook will be called after the 'encode_value' hook.

encode_sequence
---------------
Called for every Python sequence-like object that is not a dictionary
or string.  This includes all 'list' and 'tuple' types or their
subtypes, or any other type that allows for basic iteration.

encode_bytes
------------
PYTHON 3 ONLY.  Called for every Python 'bytes' or 'bytearray' type.
Additionally, memory views (type 'memoryview') with an "unsigned
byte" format ('B') will be converted to a normal 'bytes' type and
passed to this hook as well.  Memory view objects with different item
formats are treated as ordinary sequences (lists of numbers).

If this hook is not set then byte types are encoded as a sequence
(e.g., a list of numbers), or according to the 'encode_sequence'
hook.
One useful example is to compress and Base-64 encode any bytes value:

    import demjson, bz2, base64

    def my_bz2_and_base64_encoder( bytes_val ):
        return base64.b64encode( bz2.compress( bytes_val ) ).decode('ascii')

    data = open( "some_binary_file.dat", 'rb' ).read()   # Returns bytes type

    demjson.encode( {'opaque': data},
                    encode_bytes=my_bz2_and_base64_encoder )
        # gives => '{"opaque": "QLEALFP ... N8WJ/tA=="}'

And if you replace 'b64encode' with 'b16encode' in the above, you
will hexadecimal-encode byte arrays.

encode_default
--------------
Called for any Python type which can not otherwise be converted into
JSON, even after applying any other encoding hooks.

A very simple catch-all would be the built-in 'repr()' or 'str()'
functions, as in:

    today = datetime.date.today()

    demjson.encode( {"today": today}, encode_default=repr )
        # gives => '{"today": "datetime.date(2014, 4, 22)"}'

    demjson.encode( {"today": today}, encode_default=str )
        # gives => '{"today": "2014-04-22"}'


demjson-2.2.4/docs/demjson.txt

Help on module demjson:

NAME
    demjson - A JSON data encoder and decoder.

FILE
    demjson.py

DESCRIPTION
    This Python module implements the JSON (http://json.org/) data
    encoding format; a subset of ECMAScript (aka JavaScript) for
    encoding primitive data types (numbers, strings, booleans, lists,
    and associative arrays) in a language-neutral simple text-based
    syntax.

    It can encode or decode between JSON formatted strings and native
    Python data types.  Normally you would use the encode() and
    decode() functions defined by this module, but if you want more
    control over the processing you can use the JSON class.

    This implementation tries to conform as completely as possible to
    all intricacies of the standards.  It can operate in strict mode
    (which only allows JSON-compliant syntax) or a non-strict mode
    (which allows much more of the whole ECMAScript permitted syntax).
    This includes complete support for Unicode strings (including
    surrogate-pairs for non-BMP characters), and all number formats
    including negative zero and IEEE 754 non-numbers such as NaN or
    Infinity.

    The JSON/ECMAScript to Python type mappings are:

        ---JSON---              ---Python---
        null                    None
        undefined               undefined  (note 1)
        Boolean (true,false)    bool  (True or False)
        Integer                 int or long  (note 2)
        Float                   float
        String                  str or unicode  ( "..." or u"..." )
        Array [a, ...]          list  ( [...] )
        Object {a:b, ...}       dict  ( {...} )

    -- Note 1. an 'undefined' object is declared in this module which
       represents the native Python value for this type when in
       non-strict mode.

    -- Note 2. some ECMAScript integers may be up-converted to Python
       floats, such as 1e+40.  Also integer -0 is converted to float
       -0, so as to preserve the sign (which ECMAScript requires).

    -- Note 3. numbers requiring more significant digits than can be
       represented by the Python float type will be converted into a
       Python Decimal type, from the standard 'decimal' module.

    In addition, when operating in non-strict mode, several IEEE 754
    non-numbers are also handled, and are mapped to specific Python
    objects declared in this module:

        NaN (not a number)      nan  ( float('nan') )
        Infinity, +Infinity     inf  ( float('inf') )
        -Infinity               neginf  ( float('-inf') )

    When encoding Python objects into JSON, you may use types other
    than native lists or dictionaries, as long as they support the
    minimal interfaces required of all sequences or mappings.  This
    means you can use generators and iterators, tuples, UserDict
    subclasses, etc.

    To make it easier to produce JSON encoded representations of user
    defined classes, if the object has a method named
    json_equivalent(), then it will call that method and attempt to
    encode the object returned from it instead.  It will do this
    recursively as needed and before any attempt to encode the object
    using its default strategies.
    Note that any json_equivalent() method should return "equivalent"
    Python objects to be encoded, not an already-encoded
    JSON-formatted string.

    There is no such aid provided to decode JSON back into
    user-defined classes as that would dramatically complicate the
    interface.

    When decoding strings with this module it may operate in either
    strict or non-strict mode.  The strict mode only allows syntax
    which is conforming to RFC 7159 (JSON), while the non-strict
    allows much more of the permissible ECMAScript syntax.

    The following are permitted when processing in NON-STRICT mode:

       * Unicode format control characters are allowed anywhere in
         the input.
       * All Unicode line terminator characters are recognized.
       * All Unicode white space characters are recognized.
       * The 'undefined' keyword is recognized.
       * Hexadecimal number literals are recognized (e.g., 0xA6,
         0177).
       * String literals may use either single or double quote marks.
       * Strings may contain \x (hexadecimal) escape sequences, as
         well as the \v and \0 escape sequences.
       * Lists may have omitted (elided) elements, e.g., [,,,,,],
         with missing elements interpreted as 'undefined' values.
       * Object properties (dictionary keys) can be of any of the
         types: string literals, numbers, or identifiers (the latter
         of which are treated as if they are string literals)---as
         permitted by ECMAScript.  JSON only permits string literals
         as keys.

    Concerning non-strict and non-ECMAScript allowances:

       * Octal numbers: If you allow the 'octal_numbers' behavior
         (which is never enabled by default), then you can use octal
         integers and octal character escape sequences (per the
         ECMAScript standard Annex B.1.2).  This behavior is allowed,
         if enabled, because it was valid JavaScript at one time.

       * Multi-line string literals: Strings which are more than one
         line long (contain embedded raw newline characters) are
         never permitted.  This is neither valid JSON nor ECMAScript.
         Some other JSON implementations may allow this, but this
         module considers that behavior to be a mistake.

    References:
       * JSON (JavaScript Object Notation)
       * RFC 7159. The application/json Media Type for JavaScript
         Object Notation (JSON)
       * ECMA-262 3rd edition (1999)
       * IEEE 754-1985: Standard for Binary Floating-Point Arithmetic.

CLASSES
    __builtin__.long(__builtin__.object)
        json_int
    __builtin__.object
        JSON
        buffered_stream
        decode_state
        decode_statistics
        encode_state
        helpers
        json_options
        jsonlint
        position_marker
    codecs.CodecInfo(__builtin__.tuple)
        utf32
    exceptions.Exception(exceptions.BaseException)
        JSONException
            JSONAbort
            JSONError
                JSONDecodeError
                    JSONDecodeHookError
                JSONEncodeError
                    JSONEncodeHookError
            JSONSkipHook
            JSONStopProcessing

class JSON(__builtin__.object)
 |  An encoder/decoder for JSON data streams.
 |
 |  Usually you will call the encode() or decode() methods.  The other
 |  methods are for lower-level processing.
 |
 |  Whether the JSON parser runs in strict mode (which enforces exact
 |  compliance with the JSON spec) or the more forgiving non-strict
 |  mode can be affected by setting the 'strict' argument in the
 |  object's initialization; or by assigning True or False to the
 |  'strict' property of the object.
 |
 |  You can also adjust a finer-grained control over strictness by
 |  allowing or forbidding specific behaviors.  You can get a list of
 |  all the available behaviors by accessing the 'behaviors' property.
 |  Likewise the 'allowed_behaviors' and 'forbidden_behaviors' list
 |  which behaviors will be allowed and which will not.  Call the
 |  allow() or forbid() methods to adjust these.
 |
 |  Methods defined here:
 |
 |  __init__(self, **kwargs)
 |      Creates a JSON encoder/decoder object.
 |
 |      You may pass encoding and decoding options either by passing
 |      an argument named 'json_options' with an instance of a
 |      json_options class; or with individual keyword/values that
 |      will be used to initialize a new json_options object.
 |
 |      You can also set hooks by using keyword arguments using the
 |      hook name; e.g., encode_dict=my_hook_func.
 |
 |  call_hook(self, hook_name, input_object, position=None, *args, **kwargs)
 |      Wrapper function to invoke a user-supplied hook function.
 |
 |      This will capture any exceptions raised by the hook and do
 |      something appropriate with it.
 |
 |  clear_all_hooks(self)
 |      Unsets all hook callbacks, as previously set with set_hook().
 |
 |  clear_hook(self, hookname)
 |      Unsets a hook callback, as previously set with set_hook().
 |
 |  decode(self, txt, encoding=None, return_errors=False, return_stats=False)
 |      Decodes a JSON-encoded string into a Python object.
 |
 |      The 'return_errors' parameter controls what happens if the
 |      input JSON has errors in it.
 |
 |          * False: the first error will be raised as a Python
 |            exception.  If there are no errors then the
 |            corresponding Python object will be returned.
 |
 |          * True: the return value is always a 2-tuple:
 |            (object, error_list)
 |
 |  decode_boolean(self, state)
 |      Intermediate-level decode for JSON boolean literals.
 |
 |      Takes a string and a starting index, and returns a Python bool
 |      (True or False) and the index of the next unparsed character.
 |
 |  decode_composite(self, state)
 |      Intermediate-level JSON decoder for composite literal types
 |      (array and object).
 |
 |  decode_identifier(self, state, identifier_as_string=False)
 |      Decodes an identifier/keyword.
 |
 |  decode_javascript_identifier(self, name)
 |      Convert a JavaScript identifier into a Python string object.
 |
 |      This method can be overridden by a subclass to redefine how
 |      JavaScript identifiers are turned into Python objects.  By
 |      default this just converts them into strings.
 |
 |  decode_null(self, state)
 |      Intermediate-level decoder for ECMAScript 'null' keyword.
 |
 |      Takes a string and a starting index, and returns a Python
 |      None object and the index of the next unparsed character.
 |
 |  decode_number(self, state)
 |      Intermediate-level decoder for JSON numeric literals.
 |
 |      Takes a string and a starting index, and returns a Python
 |      suitable numeric type and the index of the next unparsed
 |      character.
 |
 |      The returned numeric type can be either of a Python int,
 |      long, or float.  In addition some special non-numbers may
 |      also be returned such as nan, inf, and neginf (technically
 |      which are Python floats, but have no numeric value.)
 |
 |      Ref. ECMAScript section 8.5.
 |
 |  decode_string(self, state)
 |      Intermediate-level decoder for JSON string literals.
 |
 |      Takes a string and a starting index, and returns a Python
 |      string (or unicode string) and the index of the next unparsed
 |      character.
 |
 |  decodeobj(self, state, identifier_as_string=False, at_document_start=False)
 |      Intermediate-level JSON decoder.
 |
 |      Takes a string and a starting index, and returns a two-tuple
 |      consisting of a Python object and the index of the next
 |      unparsed character.
 |
 |      If there is no value at all (empty string, etc), then None is
 |      returned instead of a tuple.
 |
 |  encode(self, obj, encoding=None)
 |      Encodes the Python object into a JSON string representation.
 |
 |      This method will first attempt to encode an object by seeing
 |      if it has a json_equivalent() method.  If so then it will
 |      call that method and then recursively attempt to encode
 |      the object resulting from that call.
 |
 |      Next it will attempt to determine if the object is a native
 |      type or acts like a sequence or dictionary.  If so it will
 |      encode that object directly.
 |
 |      Finally, if no other strategy for encoding the object of that
 |      type exists, it will call the encode_default() method.  That
 |      method currently raises an error, but it could be overridden
 |      by subclasses to provide a hook for extending the types which
 |      can be encoded.
 |
 |  encode_boolean(self, bval, state)
 |      Encodes the Python boolean into a JSON Boolean literal.
 |
 |  encode_composite(self, obj, state, obj_classification=None)
 |      Encodes just composite objects: dictionaries, lists, or
 |      sequences.
 |
 |      Basically handles any python type for which iter() can create
 |      an iterator object.
 |
 |      This method is not intended to be called directly.  Use the
 |      encode() method instead.
 |
 |  encode_date(self, dt, state)
 |
 |  encode_datetime(self, dt, state)
 |
 |  encode_enum(self, val, state)
 |      Encode a Python Enum value into JSON.
 |
 |  encode_equivalent(self, obj, state)
 |      This method is used to encode user-defined class objects.
 |
 |      The object being encoded should have a json_equivalent()
 |      method defined which returns another equivalent object which
 |      is easily JSON-encoded.  If the object in question has no
 |      json_equivalent() method available then None is returned
 |      instead of a string so that the encoding will attempt the next
 |      strategy.
 |
 |      If a caller wishes to disable the calling of json_equivalent()
 |      methods, then subclass this class and override this method
 |      to just return None.
 |
 |  encode_null(self, state)
 |      Produces the JSON 'null' keyword.
 |
 |  encode_number(self, n, state)
 |      Encodes a Python numeric type into a JSON numeric literal.
 |
 |      The special non-numeric values of float('nan'), float('inf')
 |      and float('-inf') are translated into appropriate JSON
 |      literals.
 |
 |      Note that Python complex types are not handled, as there is no
 |      ECMAScript equivalent type.
 |
 |  encode_string(self, s, state)
 |      Encodes a Python string into a JSON string literal.
 |
 |  encode_time(self, t, state)
 |
 |  encode_timedelta(self, td, state)
 |
 |  encode_undefined(self, state)
 |      Produces the ECMAScript 'undefined' keyword.
 |
 |  has_hook(self, hook_name)
 |
 |  islineterm(self, c)
 |      Determines if the given character is considered a line
 |      terminator.
 |
 |      Ref. ECMAScript section 7.3
 |
 |  isws(self, c)
 |      Determines if the given character is considered as white
 |      space.
 |
 |      Note that JavaScript is much more permissive on what it
 |      considers to be whitespace than does JSON.
 |
 |      Ref. ECMAScript section 7.2
 |
 |  recover_parser(self, state)
 |      Try to recover after a syntax error by locating the next
 |      "known" position.
 |
 |  set_hook(self, hookname, function)
 |      Sets a user-defined callback function used during encoding or
 |      decoding.
 |
 |      The 'hookname' argument must be a string containing the name
 |      of one of the available hooks, listed below.
 |
 |      The 'function' argument must either be None, which disables
 |      the hook, or a callable function.  Hooks do not stack; if you
 |      set a hook it will undo any previously set hook.
 |
 |      Nested values.  When decoding JSON that has nested objects or
 |      arrays, the decoding hooks will be called once for every
 |      corresponding value, even if nested.  Generally the decoding
 |      hooks will be called from the inner-most value outward, and
 |      then left to right.
 |
 |      Skipping.  Any hook function may raise a JSONSkipHook
 |      exception if it does not wish to handle the particular
 |      invocation.  This will have the effect of skipping the hook
 |      for that particular value, as if the hook was not set.
 |
 |      AVAILABLE HOOKS:
 |
 |      * decode_string
 |          Called for every JSON string literal with the
 |          Python-equivalent string value as an argument.  Expects
 |          to get a Python object in return.
 |
 |      * decode_float:
 |          Called for every JSON number that looks like a float (has
 |          a ".").  The string representation of the number is
 |          passed as an argument.  Expects to get a Python object in
 |          return.
 |
 |      * decode_number:
 |          Called for every JSON number.  The string representation
 |          of the number is passed as an argument.  Expects to get a
 |          Python object in return.  NOTE: If the number looks like
 |          a float and the 'decode_float' hook is set, then this
 |          hook will not be called.
 |
 |      * decode_array:
 |          Called for every JSON array.  A Python list is passed as
 |          the argument, and expects to get a Python object back.
 |          NOTE: this hook will get called for every array, even
 |          for nested arrays.
 |
 |      * decode_object:
 |          Called for every JSON object.
 |          A Python dictionary is passed as the argument, and
 |          expects to get a Python object back.  NOTE: this hook
 |          will get called for every object, even for nested
 |          objects.
 |
 |      * encode_value:
 |          Called for every Python object which is to be encoded
 |          into JSON.
 |
 |      * encode_dict:
 |          Called for every Python dictionary or anything that looks
 |          like a dictionary.
 |
 |      * encode_dict_key:
 |          Called for every dictionary key.
 |
 |      * encode_sequence:
 |          Called for every Python sequence-like object that is not
 |          a dictionary or string.  This includes lists and tuples.
 |
 |      * encode_bytes:
 |          Called for every Python bytes or bytearray type; or for
 |          any memoryview with a byte ('B') item type.  (Python 3
 |          only)
 |
 |      * encode_default:
 |          Called for any Python type which can not otherwise be
 |          converted into JSON, even after applying any other
 |          encoding hooks.
 |
 |  skip_comment(self, state)
 |      Skips an ECMAScript comment, either // or /* style.
 |
 |      The contents of the comment are returned as a string, as well
 |      as the index of the character immediately after the comment.
 |
 |  skipws(self, state)
 |      Skips all whitespace, including comments and unicode
 |      whitespace
 |
 |      Takes a string and a starting index, and returns the index of
 |      the next non-whitespace character.
 |
 |      If the 'skip_comments' behavior is True and not running in
 |      strict JSON mode, then comments will be skipped over just
 |      like whitespace.
 |
 |  skipws_nocomments(self, state)
 |      Skips whitespace (will not allow comments).
 |
 |  try_encode_default(self, obj, state)
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  __dict__
 |      dictionary for instance variables (if defined)
 |
 |  __weakref__
 |      list of weak references to the object (if defined)
 |
 |  options
 |      The optional behaviors used, e.g., the JSON conformance
 |      strictness.  Returns an instance of json_options.
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |
 |  all_hook_names = ('decode_number', 'decode_float', 'decode_object', 'd...
 |
 |  json_syntax_characters = u'{}[]"\\,:0123456789.-+abcdefghijklmnopqrstu...

class JSONAbort(JSONException)
 |  Method resolution order:
 |      JSONAbort
 |      JSONException
 |      exceptions.Exception
 |      exceptions.BaseException
 |      __builtin__.object
 |
 |  Data descriptors inherited from JSONException:
 |
 |  __weakref__
 |      list of weak references to the object (if defined)
 |
 |  ----------------------------------------------------------------------
 |  Methods inherited from exceptions.Exception:
 |
 |  __init__(...)
 |      x.__init__(...) initializes x; see help(type(x)) for signature
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes inherited from exceptions.Exception:
 |
 |  __new__ =
 |      T.__new__(S, ...) -> a new object with type S, a subtype of T
 |
 |  ----------------------------------------------------------------------
 |  Methods inherited from exceptions.BaseException:
 |
 |  __delattr__(...)
 |      x.__delattr__('name') <==> del x.name
 |
 |  __getattribute__(...)
 |      x.__getattribute__('name') <==> x.name
 |
 |  __getitem__(...)
 |      x.__getitem__(y) <==> x[y]
 |
 |  __getslice__(...)
 |      x.__getslice__(i, j) <==> x[i:j]
 |
 |      Use of negative indices is not supported.
 |
 |  __reduce__(...)
 |
 |  __repr__(...)
 |      x.__repr__() <==> repr(x)
 |
 |  __setattr__(...)
 |      x.__setattr__('name', value) <==> x.name = value
 |
 |  __setstate__(...)
 |
 |  __str__(...)
 |      x.__str__() <==> str(x)
 |
 |  __unicode__(...)
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from exceptions.BaseException:
 |
 |  __dict__
 |
 |  args
 |
 |  message

class JSONDecodeError(JSONError)
 |  An exception class raised when a JSON decoding error (syntax
 |  error) occurs.
 |
 |  Method resolution order:
 |      JSONDecodeError
 |      JSONError
 |      JSONException
 |      exceptions.Exception
 |      exceptions.BaseException
 |      __builtin__.object
 |
 |  Methods inherited from JSONError:
 |
 |  __init__(self, message, *args, **kwargs)
 |
 |  __repr__(self)
 |
 |  pretty_description(self, show_positions=True, filename=None)
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from JSONError:
 |
 |  position
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes inherited from JSONError:
 |
 |  severities = frozenset(['error', 'fatal', 'info', 'warning'])
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from JSONException:
 |
 |  __weakref__
 |      list of weak references to the object (if defined)
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes inherited from exceptions.Exception:
 |
 |  __new__ =
 |      T.__new__(S, ...) -> a new object with type S, a subtype of T
 |
 |  ----------------------------------------------------------------------
 |  Methods inherited from exceptions.BaseException:
 |
 |  __delattr__(...)
 |      x.__delattr__('name') <==> del x.name
 |
 |  __getattribute__(...)
 |      x.__getattribute__('name') <==> x.name
 |
 |  __getitem__(...)
 |      x.__getitem__(y) <==> x[y]
 |
 |  __getslice__(...)
 |      x.__getslice__(i, j) <==> x[i:j]
 |
 |      Use of negative indices is not supported.
 |
 |  __reduce__(...)
 |
 |  __setattr__(...)
 |      x.__setattr__('name', value) <==> x.name = value
 |
 |  __setstate__(...)
 |
 |  __str__(...)
 |      x.__str__() <==> str(x)
 |
 |  __unicode__(...)
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from exceptions.BaseException:
 |
 |  __dict__
 |
 |  args
 |
 |  message

class JSONDecodeHookError(JSONDecodeError)
 |  An exception that occurred within a decoder hook.
 |
 |  The original exception is available in the 'hook_exception'
 |  attribute.
| | Method resolution order: | JSONDecodeHookError | JSONDecodeError | JSONError | JSONException | exceptions.Exception | exceptions.BaseException | __builtin__.object | | Methods defined here: | | __init__(self, hook_name, exc_info, encoded_obj, *args, **kwargs) | | ---------------------------------------------------------------------- | Methods inherited from JSONError: | | __repr__(self) | | pretty_description(self, show_positions=True, filename=None) | | ---------------------------------------------------------------------- | Data descriptors inherited from JSONError: | | position | | ---------------------------------------------------------------------- | Data and other attributes inherited from JSONError: | | severities = frozenset(['error', 'fatal', 'info', 'warning']) | | ---------------------------------------------------------------------- | Data descriptors inherited from JSONException: | | __weakref__ | list of weak references to the object (if defined) | | ---------------------------------------------------------------------- | Data and other attributes inherited from exceptions.Exception: | | __new__ = | T.__new__(S, ...) -> a new object with type S, a subtype of T | | ---------------------------------------------------------------------- | Methods inherited from exceptions.BaseException: | | __delattr__(...) | x.__delattr__('name') <==> del x.name | | __getattribute__(...) | x.__getattribute__('name') <==> x.name | | __getitem__(...) | x.__getitem__(y) <==> x[y] | | __getslice__(...) | x.__getslice__(i, j) <==> x[i:j] | | Use of negative indices is not supported. | | __reduce__(...) | | __setattr__(...) | x.__setattr__('name', value) <==> x.name = value | | __setstate__(...) | | __str__(...) | x.__str__() <==> str(x) | | __unicode__(...) 
| | ---------------------------------------------------------------------- | Data descriptors inherited from exceptions.BaseException: | | __dict__ | | args | | message class JSONEncodeError(JSONError) | An exception class raised when a python object can not be encoded as a JSON string. | | Method resolution order: | JSONEncodeError | JSONError | JSONException | exceptions.Exception | exceptions.BaseException | __builtin__.object | | Methods inherited from JSONError: | | __init__(self, message, *args, **kwargs) | | __repr__(self) | | pretty_description(self, show_positions=True, filename=None) | | ---------------------------------------------------------------------- | Data descriptors inherited from JSONError: | | position | | ---------------------------------------------------------------------- | Data and other attributes inherited from JSONError: | | severities = frozenset(['error', 'fatal', 'info', 'warning']) | | ---------------------------------------------------------------------- | Data descriptors inherited from JSONException: | | __weakref__ | list of weak references to the object (if defined) | | ---------------------------------------------------------------------- | Data and other attributes inherited from exceptions.Exception: | | __new__ = | T.__new__(S, ...) -> a new object with type S, a subtype of T | | ---------------------------------------------------------------------- | Methods inherited from exceptions.BaseException: | | __delattr__(...) | x.__delattr__('name') <==> del x.name | | __getattribute__(...) | x.__getattribute__('name') <==> x.name | | __getitem__(...) | x.__getitem__(y) <==> x[y] | | __getslice__(...) | x.__getslice__(i, j) <==> x[i:j] | | Use of negative indices is not supported. | | __reduce__(...) | | __setattr__(...) | x.__setattr__('name', value) <==> x.name = value | | __setstate__(...) | | __str__(...) | x.__str__() <==> str(x) | | __unicode__(...) 
| | ---------------------------------------------------------------------- | Data descriptors inherited from exceptions.BaseException: | | __dict__ | | args | | message class JSONEncodeHookError(JSONEncodeError) | An exception that occurred within an encoder hook. | | The original exception is available in the 'hook_exception' attribute. | | Method resolution order: | JSONEncodeHookError | JSONEncodeError | JSONError | JSONException | exceptions.Exception | exceptions.BaseException | __builtin__.object | | Methods defined here: | | __init__(self, hook_name, exc_info, encoded_obj, *args, **kwargs) | | ---------------------------------------------------------------------- | Methods inherited from JSONError: | | __repr__(self) | | pretty_description(self, show_positions=True, filename=None) | | ---------------------------------------------------------------------- | Data descriptors inherited from JSONError: | | position | | ---------------------------------------------------------------------- | Data and other attributes inherited from JSONError: | | severities = frozenset(['error', 'fatal', 'info', 'warning']) | | ---------------------------------------------------------------------- | Data descriptors inherited from JSONException: | | __weakref__ | list of weak references to the object (if defined) | | ---------------------------------------------------------------------- | Data and other attributes inherited from exceptions.Exception: | | __new__ = | T.__new__(S, ...) -> a new object with type S, a subtype of T | | ---------------------------------------------------------------------- | Methods inherited from exceptions.BaseException: | | __delattr__(...) | x.__delattr__('name') <==> del x.name | | __getattribute__(...) | x.__getattribute__('name') <==> x.name | | __getitem__(...) | x.__getitem__(y) <==> x[y] | | __getslice__(...) | x.__getslice__(i, j) <==> x[i:j] | | Use of negative indices is not supported. | | __reduce__(...) | | __setattr__(...) 
| x.__setattr__('name', value) <==> x.name = value | | __setstate__(...) | | __str__(...) | x.__str__() <==> str(x) | | __unicode__(...) | | ---------------------------------------------------------------------- | Data descriptors inherited from exceptions.BaseException: | | __dict__ | | args | | message class JSONError(JSONException) | Base class for all JSON-related errors. | | In addition to standard Python exceptions, these exceptions may | also have additional properties: | | * severity - One of: 'fatal', 'error', 'warning', 'info' | * position - An indication of the position in the input where the error occurred. | * outer_position - A secondary position (optional) that gives | the location of the outer data item in which the error | occurred, such as the beginning of a string or an array. | * context_description - A string that identifies the context | in which the error occurred. Default is "Context". | | Method resolution order: | JSONError | JSONException | exceptions.Exception | exceptions.BaseException | __builtin__.object | | Methods defined here: | | __init__(self, message, *args, **kwargs) | | __repr__(self) | | pretty_description(self, show_positions=True, filename=None) | | ---------------------------------------------------------------------- | Data descriptors defined here: | | position | | ---------------------------------------------------------------------- | Data and other attributes defined here: | | severities = frozenset(['error', 'fatal', 'info', 'warning']) | | ---------------------------------------------------------------------- | Data descriptors inherited from JSONException: | | __weakref__ | list of weak references to the object (if defined) | | ---------------------------------------------------------------------- | Data and other attributes inherited from exceptions.Exception: | | __new__ = | T.__new__(S, ...) 
-> a new object with type S, a subtype of T | | ---------------------------------------------------------------------- | Methods inherited from exceptions.BaseException: | | __delattr__(...) | x.__delattr__('name') <==> del x.name | | __getattribute__(...) | x.__getattribute__('name') <==> x.name | | __getitem__(...) | x.__getitem__(y) <==> x[y] | | __getslice__(...) | x.__getslice__(i, j) <==> x[i:j] | | Use of negative indices is not supported. | | __reduce__(...) | | __setattr__(...) | x.__setattr__('name', value) <==> x.name = value | | __setstate__(...) | | __str__(...) | x.__str__() <==> str(x) | | __unicode__(...) | | ---------------------------------------------------------------------- | Data descriptors inherited from exceptions.BaseException: | | __dict__ | | args | | message class JSONException(exceptions.Exception) | Base class for all JSON-related exceptions. | | Method resolution order: | JSONException | exceptions.Exception | exceptions.BaseException | __builtin__.object | | Data descriptors defined here: | | __weakref__ | list of weak references to the object (if defined) | | ---------------------------------------------------------------------- | Methods inherited from exceptions.Exception: | | __init__(...) | x.__init__(...) initializes x; see help(type(x)) for signature | | ---------------------------------------------------------------------- | Data and other attributes inherited from exceptions.Exception: | | __new__ = | T.__new__(S, ...) -> a new object with type S, a subtype of T | | ---------------------------------------------------------------------- | Methods inherited from exceptions.BaseException: | | __delattr__(...) | x.__delattr__('name') <==> del x.name | | __getattribute__(...) | x.__getattribute__('name') <==> x.name | | __getitem__(...) | x.__getitem__(y) <==> x[y] | | __getslice__(...) | x.__getslice__(i, j) <==> x[i:j] | | Use of negative indices is not supported. | | __reduce__(...) | | __repr__(...) 
| x.__repr__() <==> repr(x) | | __setattr__(...) | x.__setattr__('name', value) <==> x.name = value | | __setstate__(...) | | __str__(...) | x.__str__() <==> str(x) | | __unicode__(...) | | ---------------------------------------------------------------------- | Data descriptors inherited from exceptions.BaseException: | | __dict__ | | args | | message class JSONSkipHook(JSONException) | An exception to be raised by user-defined code within hook | callbacks to indicate the callback does not want to handle the | situation. | | Method resolution order: | JSONSkipHook | JSONException | exceptions.Exception | exceptions.BaseException | __builtin__.object | | Data descriptors inherited from JSONException: | | __weakref__ | list of weak references to the object (if defined) | | ---------------------------------------------------------------------- | Methods inherited from exceptions.Exception: | | __init__(...) | x.__init__(...) initializes x; see help(type(x)) for signature | | ---------------------------------------------------------------------- | Data and other attributes inherited from exceptions.Exception: | | __new__ = | T.__new__(S, ...) -> a new object with type S, a subtype of T | | ---------------------------------------------------------------------- | Methods inherited from exceptions.BaseException: | | __delattr__(...) | x.__delattr__('name') <==> del x.name | | __getattribute__(...) | x.__getattribute__('name') <==> x.name | | __getitem__(...) | x.__getitem__(y) <==> x[y] | | __getslice__(...) | x.__getslice__(i, j) <==> x[i:j] | | Use of negative indices is not supported. | | __reduce__(...) | | __repr__(...) | x.__repr__() <==> repr(x) | | __setattr__(...) | x.__setattr__('name', value) <==> x.name = value | | __setstate__(...) | | __str__(...) | x.__str__() <==> str(x) | | __unicode__(...) 
| | ---------------------------------------------------------------------- | Data descriptors inherited from exceptions.BaseException: | | __dict__ | | args | | message class JSONStopProcessing(JSONException) | Can be raised by anyplace, including inside a hook function, to | cause the entire encode or decode process to immediately stop | with an error. | | Method resolution order: | JSONStopProcessing | JSONException | exceptions.Exception | exceptions.BaseException | __builtin__.object | | Data descriptors inherited from JSONException: | | __weakref__ | list of weak references to the object (if defined) | | ---------------------------------------------------------------------- | Methods inherited from exceptions.Exception: | | __init__(...) | x.__init__(...) initializes x; see help(type(x)) for signature | | ---------------------------------------------------------------------- | Data and other attributes inherited from exceptions.Exception: | | __new__ = | T.__new__(S, ...) -> a new object with type S, a subtype of T | | ---------------------------------------------------------------------- | Methods inherited from exceptions.BaseException: | | __delattr__(...) | x.__delattr__('name') <==> del x.name | | __getattribute__(...) | x.__getattribute__('name') <==> x.name | | __getitem__(...) | x.__getitem__(y) <==> x[y] | | __getslice__(...) | x.__getslice__(i, j) <==> x[i:j] | | Use of negative indices is not supported. | | __reduce__(...) | | __repr__(...) | x.__repr__() <==> repr(x) | | __setattr__(...) | x.__setattr__('name', value) <==> x.name = value | | __setstate__(...) | | __str__(...) | x.__str__() <==> str(x) | | __unicode__(...) | | ---------------------------------------------------------------------- | Data descriptors inherited from exceptions.BaseException: | | __dict__ | | args | | message class buffered_stream(__builtin__.object) | A helper class for the JSON parser. 
| | It allows for reading an input document, while handling some | low-level Unicode issues as well as tracking the current position | in terms of line and column position. | | Methods defined here: | | __getitem__(self, index) | Returns the character at the given index relative to the current position. | | If the index goes beyond the end of the input, or prior to the | start when negative, then '' is returned. | | If the index provided is a slice object, then that range of | characters is returned as a string. Note that a stride value other | than 1 is not supported in the slice. To use a slice, do: | | s = my_stream[ 1:4 ] | | __init__(self, txt='', encoding=None) | | __repr__(self) | | at_eol(self, allow_unicode_eol=True) | Returns True if the current position contains an | end-of-line control character. | | at_ws(self, allow_unicode_whitespace=True) | Returns True if the current position contains a white-space | character. | | clear_saved_position(self) | | peek(self, offset=0) | Returns the character at the current position, or at a | given offset away from the current position. If the position | is beyond the limits of the document size, then an empty | string '' is returned. | | peekstr(self, span=1, offset=0) | Returns one or more characters starting at the current | position, or at a given offset away from the current position, | and continuing for the given span length. If the offset and | span go outside the limit of the current document size, then | the returned string may be shorter than the requested span | length. | | pop(self) | Returns the character at the current position and advances | the position to the next character. At the end of the | document this function returns an empty string. | | pop_identifier(self, match=None) | Pops the sequence of characters at the current position | that match the syntax for a JavaScript identifier. | | pop_if_startswith(self, s) | Pops the sequence of characters if they match the given string. 
| | See also method: startswith() | | pop_while_in(self, chars) | Pops a sequence of characters at the current position | as long as each of them is in the given set of characters. | | popif(self, testfn) | Just like the pop() function, but only returns the | character if the given predicate test function succeeds. | | popstr(self, span=1, offset=0) | Returns a string of one or more characters starting at the | current position, and advances the position to the following | character after the span. Will not go beyond the end of the | document, so the returned string may be shorter than the | requested span. | | popuntil(self, testfn, maxchars=None) | Just like the popwhile() method except the predicate function | should return True to stop the sequence rather than False. | | See also methods: skipuntil() and popwhile() | | popwhile(self, testfn, maxchars=None) | Pops all the characters starting at the current position as | long as each character passes the given predicate function | test. If maxchars is a numeric value instead of None then | no more than that number of characters will be popped | regardless of the predicate test. | | See also methods: skipwhile() and popuntil() | | reset(self) | Clears the state to nothing. | | restore_position(self) | | rewind(self) | Resets the position back to the start of the input text. | | save_position(self) | | set_text(self, txt, encoding=None) | Changes the input text document and rewinds the position to | the start of the new document. | | skip(self, span=1) | Advances the current position by one (or the given number) | of characters. Will not advance beyond the end of the | document. Returns the number of characters skipped. | | skip_to_next_line(self, allow_unicode_eol=True) | Advances the current position to the start of the next | line. Will not advance beyond the end of the file. Note that | the two-character sequence CR+LF is recognized as being just a | single end-of-line marker. 
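A rough feel for how these cursor-style peek/pop methods behave can be had from a tiny stand-in class. This is a simplified illustrative sketch only, not demjson's buffered_stream, which additionally handles Unicode decoding, saved positions, and line/column tracking.

```python
class TinyStream(object):
    """Minimal sketch of a peek/pop cursor over a text document."""

    def __init__(self, txt=''):
        self.txt = txt
        self.cpos = 0   # current character offset, like buffered_stream.cpos

    def peek(self, offset=0):
        # Return the character at the current position (plus offset),
        # or '' when out of bounds -- matching the documented convention.
        i = self.cpos + offset
        return self.txt[i] if 0 <= i < len(self.txt) else ''

    def pop(self):
        # Return the current character and advance; '' at end of document.
        c = self.peek()
        if c:
            self.cpos += 1
        return c

    def popwhile(self, testfn, maxchars=None):
        # Pop characters while the predicate passes, up to maxchars.
        out = []
        while self.peek() and testfn(self.peek()):
            if maxchars is not None and len(out) >= maxchars:
                break
            out.append(self.pop())
        return ''.join(out)

    def skipws(self):
        # Advance past whitespace (ASCII-only in this sketch).
        self.popwhile(str.isspace)
```

For example, on the input `'  123abc'`, calling `skipws()` then `popwhile(str.isdigit)` yields `'123'` and leaves the cursor on `'a'`.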
| | skipuntil(self, testfn) | Advances the current position until a given predicate test | function succeeds, or the end of the document is reached. | | Returns the actual number of characters skipped. | | The provided test function should take a single unicode | character and return a boolean value, such as: | | lambda c : c == '.' # Skip to next period | | See also methods: skipwhile() and popuntil() | | skipwhile(self, testfn) | Advances the current position until a given predicate test | function fails, or the end of the document is reached. | | Returns the actual number of characters skipped. | | The provided test function should take a single unicode | character and return a boolean value, such as: | | lambda c : c.isdigit() # Skip all digits | | See also methods: skipuntil() and popwhile() | | skipws(self, allow_unicode_whitespace=True) | Advances the current position past all whitespace, or until | the end of the document is reached. | | startswith(self, s) | Determines if the text at the current position starts with | the given string. | | See also method: pop_if_startswith() | | ---------------------------------------------------------------------- | Data descriptors defined here: | | __dict__ | dictionary for instance variables (if defined) | | __weakref__ | list of weak references to the object (if defined) | | at_end | Returns True if the position is currently at the end of the | document, or False otherwise. | | at_start | Returns True if the position is currently at the start of | the document, or False otherwise. | | bom | The Unicode Byte-Order Mark (BOM), if any, that was present | at the start of the input text. The returned BOM is a string | of the raw bytes, and is not Unicode-decoded. | | codec | The codec object used to perform Unicode decoding, or None. | | cpos | The current character offset from the start of the document. | | position | The current position (as a position_marker object). | Returns a copy. 
| | text_context | A short human-readable textual excerpt of the document at | the current position, in English. class decode_state(__builtin__.object) | An internal transient object used during JSON decoding to | record the current parsing state and error messages. | | Methods defined here: | | __init__(self, options=None) | | push_cond(self, behavior_value, message, *args, **kwargs) | Creates a conditional error or warning message. | | The behavior value (from json_options) controls whether | a message will be pushed and whether it is an error | or warning message. | | push_error(self, message, *args, **kwargs) | Create an error. | | push_exception(self, exc) | Add an already-built exception to the error list. | | push_fatal(self, message, *args, **kwargs) | Create a fatal error. | | push_info(self, message, *args, **kwargs) | Create an informational message. | | push_warning(self, message, *args, **kwargs) | Create a warning. | | reset(self) | Clears all errors, statistics, and input text. | | set_input(self, txt, encoding=None) | Initialize the state by setting the input document text. | | update_depth_stats(self, **kwargs) | | update_float_stats(self, float_value, **kwargs) | | update_integer_stats(self, int_value, **kwargs) | | update_negzero_float_stats(self, **kwargs) | | update_negzero_int_stats(self, **kwargs) | | update_string_stats(self, s, **kwargs) | | ---------------------------------------------------------------------- | Data descriptors defined here: | | __dict__ | dictionary for instance variables (if defined) | | __weakref__ | list of weak references to the object (if defined) | | has_errors | Have any errors been seen already? | | has_fatal | Have any fatal errors been seen already? | | should_stop class decode_statistics(__builtin__.object) | An object that records various statistics about a decoded JSON document. 
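The sort of statistics such an object records (counts of value types, nesting depth, and so on) can be approximated by a short walk over an already-decoded value. The helper below is purely an illustrative sketch; its names are invented and it is not part of demjson's API.

```python
def collect_stats(value, stats=None, depth=1):
    """Count value types and maximum nesting depth in a decoded JSON value."""
    if stats is None:
        stats = {'strings': 0, 'ints': 0, 'floats': 0, 'max_depth': 0}
    stats['max_depth'] = max(stats['max_depth'], depth)
    if isinstance(value, str):
        stats['strings'] += 1
    elif isinstance(value, bool):
        pass  # bool is a subclass of int in Python; don't count it as a number
    elif isinstance(value, int):
        stats['ints'] += 1
    elif isinstance(value, float):
        stats['floats'] += 1
    elif isinstance(value, list):
        for item in value:
            collect_stats(item, stats, depth + 1)
    elif isinstance(value, dict):
        for k, v in value.items():
            collect_stats(k, stats, depth + 1)
            collect_stats(v, stats, depth + 1)
    return stats
```

For instance, `collect_stats({'a': [1, 2.5, 'x']})` reports two strings, one int, one float, and a maximum depth of 3.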
| | Methods defined here: | | __init__(self) | | pretty_description(self, prefix='') | | ---------------------------------------------------------------------- | Data descriptors defined here: | | __dict__ | dictionary for instance variables (if defined) | | __weakref__ | list of weak references to the object (if defined) | | num_infinites | Misspelled 'num_infinities' for backwards compatibility | | ---------------------------------------------------------------------- | Data and other attributes defined here: | | double_int_max = 9007199254740991 | | double_int_min = -9007199254740991 | | int16_max = 32767 | | int16_min = -32768 | | int32_max = 2147483647 | | int32_min = -2147483648 | | int64_max = 9223372036854775807 | | int64_min = -9223372036854775808 | | int8_max = 127 | | int8_min = -128 class encode_state(__builtin__.object) | An internal transient object used during JSON encoding to | record the current construction state. | | Methods defined here: | | __eq__(self, other_state) | | __init__(self, jsopts=None, parent=None) | | __lt__(self, other_state) | | append(self, s) | Adds a string to the end of the current JSON document | | combine(self) | Returns the accumulated string and resets the state to empty | | join_substate(self, other_state) | | make_substate(self) | | ---------------------------------------------------------------------- | Data descriptors defined here: | | __dict__ | dictionary for instance variables (if defined) | | __weakref__ | list of weak references to the object (if defined) class helpers(__builtin__.object) | A set of utility functions. | | Static methods defined here: | | auto_detect_encoding(s) | Takes a string (or byte array) and tries to determine the Unicode encoding it is in. | | Returns the encoding name, as a string. | | char_is_identifier_leader(c) | Determines if the character may be the first character of a | JavaScript identifier. 
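ECMAScript identifiers may begin with a letter, '$', or '_' (the full grammar also admits Unicode escape sequences, and uses the Unicode letter categories). A simplified approximation of such a leading-character predicate, not demjson's actual implementation, might look like:

```python
import unicodedata

def char_is_identifier_leader(c):
    """Approximate test: may character c begin a JavaScript identifier?

    Simplified sketch -- real ECMAScript also allows \\u escape sequences,
    which this predicate ignores.
    """
    if c in '$_':
        return True
    # Unicode letter categories permitted by the ECMAScript grammar.
    return unicodedata.category(c) in ('Lu', 'Ll', 'Lt', 'Lm', 'Lo', 'Nl')
```

Digits fail this test (they are category 'Nd'), which is why they may appear only in the tail of an identifier, not at its start.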
| | char_is_identifier_tail(c) | Determines if the character may be part of a JavaScript | identifier. | | char_is_json_eol(c) | Determines if the given character is a JSON line separator | | char_is_json_ws(c) | Determines if the given character is a JSON white-space character | | char_is_unicode_eol(c) | Determines if the given character is a Unicode line or | paragraph separator. These correspond to CR and LF as well as | Unicode characters in the Zl or Zp categories. | | char_is_unicode_ws(c) | Determines if the given character is a Unicode space character | | decode_binary(binarystring) | Decodes a binary string into its integer value. | | decode_hex(hexstring) | Decodes a hexadecimal string into its integer value. | | decode_octal(octalstring) | Decodes an octal string into its integer value. | | extend_and_flatten_list_with_sep(orig_seq, extension_seq, separator='') | | format_timedelta_iso(td) | Encodes a datetime.timedelta into ISO-8601 Time Period format. | | is_binary_digit(c) | Determines if the given character is a valid binary digit (0 or 1). | | is_hex_digit(c) | Determines if the given character is a valid hexadecimal digit (0-9, a-f, A-F). | | is_infinite(n) | Is the number infinite? | | is_nan(n) | Is the number a NaN (not-a-number)? | | is_negzero(n) | Is the number value a negative zero? | | is_octal_digit(c) | Determines if the given character is a valid octal digit (0-7). | | isnumbertype(obj) | Is the object of a Python number type (excluding complex)? | | isstringtype(obj) | Is the object of a Python string type? | | lookup_codec(encoding) | Wrapper around codecs.lookup(). | | Returns None if codec not found, rather than raising a LookupError. | | make_raw_bytes(byte_list) | Constructs a byte array (bytes in Python 3, str in Python 2) from a list of byte values (0-255). | | make_surrogate_pair(codepoint) | Given a Unicode codepoint (int) returns a 2-tuple of surrogate codepoints. 
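The surrogate-pair arithmetic behind make_surrogate_pair and surrogate_pair_as_unicode is fixed by the Unicode standard: a supplementary-plane codepoint has 0x10000 subtracted, and the remaining 20 bits are split across a high (U+D800..U+DBFF) and low (U+DC00..U+DFFF) surrogate. A self-contained sketch of both directions follows; note that it works on integer codepoints for clarity, whereas demjson's helpers take and return character values.

```python
def make_surrogate_pair(codepoint):
    """Split a supplementary-plane codepoint (>= U+10000) into UTF-16 surrogates."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError('not a supplementary-plane codepoint')
    v = codepoint - 0x10000          # 20 bits remain
    return (0xD800 + (v >> 10),      # high surrogate: top 10 bits
            0xDC00 + (v & 0x3FF))    # low surrogate: bottom 10 bits

def surrogate_pair_as_codepoint(hi, lo):
    """Combine a high/low surrogate pair back into a single codepoint."""
    if not (0xD800 <= hi <= 0xDBFF and 0xDC00 <= lo <= 0xDFFF):
        raise ValueError('not a surrogate pair')
    return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)
```

For example, U+1F600 splits into the pair (0xD83D, 0xDE00), and combining that pair recovers 0x1F600.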
| | safe_unichr(codepoint) | Just like Python's unichr() but works in narrow-Unicode Pythons. | | strip_format_control_chars(txt) | Filters out all Unicode format control characters from the string. | | ECMAScript permits any Unicode "format control characters" to | appear at any place in the source code. They are to be | ignored as if they are not there before any other lexical | tokenization occurs. Note that JSON does not allow them, | except within string literals. | | * Ref. ECMAScript section 7.1. | * http://en.wikipedia.org/wiki/Unicode_control_characters | | There are dozens of Format Control Characters, for example: | U+00AD SOFT HYPHEN | U+200B ZERO WIDTH SPACE | U+2060 WORD JOINER | | surrogate_pair_as_unicode(c1, c2) | Takes a pair of unicode surrogates and returns the equivalent unicode character. | | The input pair must be a surrogate pair, with c1 in the range | U+D800 to U+DBFF and c2 in the range U+DC00 to U+DFFF. | | unicode_as_surrogate_pair(c) | Takes a single unicode character and returns a sequence of surrogate pairs. | | The output of this function is a tuple consisting of one or two unicode | characters, such that if the input character is outside the BMP range | then the output is a two-character surrogate pair representing that character. | | If the input character is inside the BMP then the output tuple will have | just a single character...the same one. | | unicode_decode(txt, encoding=None) | Takes a string (or byte array) and tries to convert it to a Unicode string. | | Returns a named tuple: (string, codec, bom) | | The 'encoding' argument, if supplied, should be either the name of | a character encoding, or an instance of codecs.CodecInfo. If | the encoding argument is None or "auto" then the encoding is | automatically determined, if possible. | | Any BOM (Byte Order Mark) that is found at the beginning of the | input will be stripped off and placed in the 'bom' portion of | the returned value. 
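BOM-based auto-detection, as described for unicode_decode, amounts to comparing the first few bytes of the input against the known BOM sequences. One subtlety: the UTF-32 marks must be tested before the UTF-16 ones, because the UTF-16-LE BOM is a prefix of the UTF-32-LE BOM. The sketch below (names invented; demjson's auto_detect_encoding also applies non-BOM heuristics) shows only the BOM-stripping part:

```python
import codecs

# Longest BOMs first: the UTF-16-LE BOM (FF FE) is a prefix of the
# UTF-32-LE BOM (FF FE 00 00), so UTF-32 must be tested first.
_BOMS = [
    (codecs.BOM_UTF32_LE, 'utf-32-le'),
    (codecs.BOM_UTF32_BE, 'utf-32-be'),
    (codecs.BOM_UTF8,     'utf-8'),
    (codecs.BOM_UTF16_LE, 'utf-16-le'),
    (codecs.BOM_UTF16_BE, 'utf-16-be'),
]

def split_bom(data):
    """Return (encoding_name_or_None, bom_bytes, remainder) for a byte string."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, bom, data[len(bom):]
    return None, b'', data
```

As in unicode_decode, the detected BOM is returned separately rather than being passed on to the decoder as document text.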
| | ---------------------------------------------------------------------- | Data descriptors defined here: | | __dict__ | dictionary for instance variables (if defined) | | __weakref__ | list of weak references to the object (if defined) | | ---------------------------------------------------------------------- | Data and other attributes defined here: | | always_use_custom_codecs = False | | hexdigits = '0123456789ABCDEFabcdef' | | javascript_reserved_words = frozenset(['break', 'case', 'catch', 'clas... | | maxunicode = 1114111 | | octaldigits = '01234567' | | sys = | | unsafe_string_chars = u'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x... class json_int(__builtin__.long) | A subclass of the Python int/long that remembers its format (hex,octal,etc). | | Initialize it the same as an int, but also accepts an additional keyword | argument 'number_format' which should be one of the NUMBER_FORMAT_* values. | | n = json_int( x[, base, number_format=NUMBER_FORMAT_DECIMAL] ) | | Method resolution order: | json_int | __builtin__.long | __builtin__.object | | Methods defined here: | | json_format(self) | Returns the integer value formatted as a JSON literal | | ---------------------------------------------------------------------- | Static methods defined here: | | __new__(cls, *args, **kwargs) | | ---------------------------------------------------------------------- | Data descriptors defined here: | | __dict__ | dictionary for instance variables (if defined) | | number_format | The original radix format of the number | | ---------------------------------------------------------------------- | Methods inherited from __builtin__.long: | | __abs__(...) | x.__abs__() <==> abs(x) | | __add__(...) | x.__add__(y) <==> x+y | | __and__(...) | x.__and__(y) <==> x&y | | __cmp__(...) | x.__cmp__(y) <==> cmp(x,y) | | __coerce__(...) | x.__coerce__(y) <==> coerce(x, y) | | __div__(...) | x.__div__(y) <==> x/y | | __divmod__(...) 
| x.__divmod__(y) <==> divmod(x, y) | | __float__(...) | x.__float__() <==> float(x) | | __floordiv__(...) | x.__floordiv__(y) <==> x//y | | __format__(...) | | __getattribute__(...) | x.__getattribute__('name') <==> x.name | | __getnewargs__(...) | | __hash__(...) | x.__hash__() <==> hash(x) | | __hex__(...) | x.__hex__() <==> hex(x) | | __index__(...) | x[y:z] <==> x[y.__index__():z.__index__()] | | __int__(...) | x.__int__() <==> int(x) | | __invert__(...) | x.__invert__() <==> ~x | | __long__(...) | x.__long__() <==> long(x) | | __lshift__(...) | x.__lshift__(y) <==> x<<y | | __mod__(...) | x.__mod__(y) <==> x%y | | __mul__(...) | x.__mul__(y) <==> x*y | | __neg__(...) | x.__neg__() <==> -x | | __nonzero__(...) | x.__nonzero__() <==> x != 0 | | __oct__(...) | x.__oct__() <==> oct(x) | | __or__(...) | x.__or__(y) <==> x|y | | __pos__(...) | x.__pos__() <==> +x | | __pow__(...) | x.__pow__(y[, z]) <==> pow(x, y[, z]) | | __radd__(...) | x.__radd__(y) <==> y+x | | __rand__(...) | x.__rand__(y) <==> y&x | | __rdiv__(...) | x.__rdiv__(y) <==> y/x | | __rdivmod__(...) | x.__rdivmod__(y) <==> divmod(y, x) | | __repr__(...) | x.__repr__() <==> repr(x) | | __rfloordiv__(...) | x.__rfloordiv__(y) <==> y//x | | __rlshift__(...) | x.__rlshift__(y) <==> y<<x | | __rmod__(...) | x.__rmod__(y) <==> y%x | | __rmul__(...) | x.__rmul__(y) <==> y*x | | __ror__(...) | x.__ror__(y) <==> y|x | | __rpow__(...) | y.__rpow__(x[, z]) <==> pow(x, y[, z]) | | __rrshift__(...) | x.__rrshift__(y) <==> y>>x | | __rshift__(...) | x.__rshift__(y) <==> x>>y | | __rsub__(...) | x.__rsub__(y) <==> y-x | | __rtruediv__(...) | x.__rtruediv__(y) <==> y/x | | __rxor__(...) | x.__rxor__(y) <==> y^x | | __sizeof__(...) | Returns size in memory, in bytes | | __str__(...) | x.__str__() <==> str(x) | | __sub__(...) | x.__sub__(y) <==> x-y | | __truediv__(...) | x.__truediv__(y) <==> x/y | | __trunc__(...) | Truncating an Integral returns itself. | | __xor__(...) | x.__xor__(y) <==> x^y | | bit_length(...) 
| long.bit_length() -> int or long | | Number of bits necessary to represent self in binary. | >>> bin(37L) | '0b100101' | >>> (37L).bit_length() | 6 | | conjugate(...) | Returns self, the complex conjugate of any long. | | ---------------------------------------------------------------------- | Data descriptors inherited from __builtin__.long: | | denominator | the denominator of a rational number in lowest terms | | imag | the imaginary part of a complex number | | numerator | the numerator of a rational number in lowest terms | | real | the real part of a complex number class json_options(__builtin__.object) | Options to determine how strict the decoder or encoder should be. | | Methods defined here: | | __eq__ = behaviors_eq(self, other) | Determines if two options objects are equivalent. | | __init__(self, **kwargs) | Set JSON encoding and decoding options. | | If 'strict' is set to True, then only strictly-conforming JSON | output will be produced. Note that this means that some types | of values may not be convertible and will result in a | JSONEncodeError exception. | | If 'compactly' is set to True, then the resulting string will | have all extraneous white space removed; if False then the | string will be "pretty printed" with whitespace and indentation | added to make it more readable. | | If 'escape_unicode' is set to True, then all non-ASCII characters | will be represented as a unicode escape sequence; if False then | the actual real unicode character will be inserted if possible. | | The 'escape_unicode' can also be a function, which when called | with a single argument of a unicode character will return True | if the character should be escaped or False if it should not. | | allow_all_numeric_signs(self, _name='all_numeric_signs', _value='allow') | Set behavior all_numeric_signs to allow. | | allow_any_type_at_start(self, _name='any_type_at_start', _value='allow') | Set behavior any_type_at_start to allow. 
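The long families of allow_*/forbid_* methods listed here all follow one mechanical pattern; notice the `_name='...'`/`_value='allow'` default arguments that pydoc exposes in each signature, which freeze the behavior name into a shared setter. The technique can be sketched by generating such methods from a list of behavior names. This is an illustrative toy, not demjson's json_options, which has many more behaviors and richer semantics (warn values, option groups, and so on).

```python
class Options(object):
    """Sketch: generate allow_/forbid_ setter methods from behavior names."""

    behaviors = ('comments', 'trailing_comma', 'hex_numbers')

    def __init__(self):
        # Everything starts out forbidden in this toy example.
        self.values = {name: 'forbid' for name in self.behaviors}

    def set_behavior(self, name, value):
        self.values[name] = value

# Attach one allow_* and one forbid_* method per behavior.  Default
# arguments freeze the behavior name into each generated function --
# the same trick the pydoc signatures above reveal.
for _name in Options.behaviors:
    def _allow(self, _name=_name, _value='allow'):
        self.set_behavior(_name, _value)
    setattr(Options, 'allow_' + _name, _allow)

    def _forbid(self, _name=_name, _value='forbid'):
        self.set_behavior(_name, _value)
    setattr(Options, 'forbid_' + _name, _forbid)
```

Generating the setters this way keeps every allow_*/forbid_* pair consistent with a single underlying set_behavior() implementation, which is why the pydoc entries above read so uniformly.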
| | allow_binary_numbers(self, _name='binary_numbers', _value='allow') | Set behavior binary_numbers to allow. | | allow_bom(self, _name='bom', _value='allow') | Set behavior bom to allow. | | allow_comments(self, _name='comments', _value='allow') | Set behavior comments to allow. | | allow_control_char_in_string(self, _name='control_char_in_string', _value='allow') | Set behavior control_char_in_string to allow. | | allow_duplicate_keys(self, _name='duplicate_keys', _value='allow') | Set behavior duplicate_keys to allow. | | allow_extended_unicode_escapes(self, _name='extended_unicode_escapes', _value='allow') | Set behavior extended_unicode_escapes to allow. | | allow_format_control_chars(self, _name='format_control_chars', _value='allow') | Set behavior format_control_chars to allow. | | allow_hex_numbers(self, _name='hex_numbers', _value='allow') | Set behavior hex_numbers to allow. | | allow_identifier_keys(self, _name='identifier_keys', _value='allow') | Set behavior identifier_keys to allow. | | allow_initial_decimal_point(self, _name='initial_decimal_point', _value='allow') | Set behavior initial_decimal_point to allow. | | allow_js_string_escapes(self, _name='js_string_escapes', _value='allow') | Set behavior js_string_escapes to allow. | | allow_leading_zeros(self, _name='leading_zeros', _value='allow') | Set behavior leading_zeros to allow. | | allow_non_numbers(self, _name='non_numbers', _value='allow') | Set behavior non_numbers to allow. | | allow_non_portable(self, _name='non_portable', _value='allow') | Set behavior non_portable to allow. | | allow_nonescape_characters(self, _name='nonescape_characters', _value='allow') | Set behavior nonescape_characters to allow. | | allow_nonstring_keys(self, _name='nonstring_keys', _value='allow') | Set behavior nonstring_keys to allow. | | allow_octal_numbers(self, _name='octal_numbers', _value='allow') | Set behavior octal_numbers to allow. 
| | allow_omitted_array_elements(self, _name='omitted_array_elements', _value='allow') | Set behavior omitted_array_elements to allow. | | allow_single_quoted_strings(self, _name='single_quoted_strings', _value='allow') | Set behavior single_quoted_strings to allow. | | allow_trailing_comma(self, _name='trailing_comma', _value='allow') | Set behavior trailing_comma to allow. | | allow_trailing_decimal_point(self, _name='trailing_decimal_point', _value='allow') | Set behavior trailing_decimal_point to allow. | | allow_undefined_values(self, _name='undefined_values', _value='allow') | Set behavior undefined_values to allow. | | allow_unicode_whitespace(self, _name='unicode_whitespace', _value='allow') | Set behavior unicode_whitespace to allow. | | allow_zero_byte(self, _name='zero_byte', _value='allow') | Set behavior zero_byte to allow. | | copy(self) | | copy_from(self, other) | | describe_behavior(self, name) | Returns documentation about a given behavior. | | forbid_all_numeric_signs(self, _name='all_numeric_signs', _value='forbid') | Set behavior all_numeric_signs to forbid. | | forbid_any_type_at_start(self, _name='any_type_at_start', _value='forbid') | Set behavior any_type_at_start to forbid. | | forbid_binary_numbers(self, _name='binary_numbers', _value='forbid') | Set behavior binary_numbers to forbid. | | forbid_bom(self, _name='bom', _value='forbid') | Set behavior bom to forbid. | | forbid_comments(self, _name='comments', _value='forbid') | Set behavior comments to forbid. | | forbid_control_char_in_string(self, _name='control_char_in_string', _value='forbid') | Set behavior control_char_in_string to forbid. | | forbid_duplicate_keys(self, _name='duplicate_keys', _value='forbid') | Set behavior duplicate_keys to forbid. | | forbid_extended_unicode_escapes(self, _name='extended_unicode_escapes', _value='forbid') | Set behavior extended_unicode_escapes to forbid. 
| | forbid_format_control_chars(self, _name='format_control_chars', _value='forbid') | Set behavior format_control_chars to forbid. | | forbid_hex_numbers(self, _name='hex_numbers', _value='forbid') | Set behavior hex_numbers to forbid. | | forbid_identifier_keys(self, _name='identifier_keys', _value='forbid') | Set behavior identifier_keys to forbid. | | forbid_initial_decimal_point(self, _name='initial_decimal_point', _value='forbid') | Set behavior initial_decimal_point to forbid. | | forbid_js_string_escapes(self, _name='js_string_escapes', _value='forbid') | Set behavior js_string_escapes to forbid. | | forbid_leading_zeros(self, _name='leading_zeros', _value='forbid') | Set behavior leading_zeros to forbid. | | forbid_non_numbers(self, _name='non_numbers', _value='forbid') | Set behavior non_numbers to forbid. | | forbid_non_portable(self, _name='non_portable', _value='forbid') | Set behavior non_portable to forbid. | | forbid_nonescape_characters(self, _name='nonescape_characters', _value='forbid') | Set behavior nonescape_characters to forbid. | | forbid_nonstring_keys(self, _name='nonstring_keys', _value='forbid') | Set behavior nonstring_keys to forbid. | | forbid_octal_numbers(self, _name='octal_numbers', _value='forbid') | Set behavior octal_numbers to forbid. | | forbid_omitted_array_elements(self, _name='omitted_array_elements', _value='forbid') | Set behavior omitted_array_elements to forbid. | | forbid_single_quoted_strings(self, _name='single_quoted_strings', _value='forbid') | Set behavior single_quoted_strings to forbid. | | forbid_trailing_comma(self, _name='trailing_comma', _value='forbid') | Set behavior trailing_comma to forbid. | | forbid_trailing_decimal_point(self, _name='trailing_decimal_point', _value='forbid') | Set behavior trailing_decimal_point to forbid. | | forbid_undefined_values(self, _name='undefined_values', _value='forbid') | Set behavior undefined_values to forbid. 
| | forbid_unicode_whitespace(self, _name='unicode_whitespace', _value='forbid') | Set behavior unicode_whitespace to forbid. | | forbid_zero_byte(self, _name='zero_byte', _value='forbid') | Set behavior zero_byte to forbid. | | get_behavior(self, name) | Returns the value for a given behavior | | indentation_for_level(self, level=0) | Returns a whitespace string used for indenting. | | is_all(self, value) | Determines if all the behaviors have the given value. | | make_decimal(self, s, sign='+') | Converts a string into a decimal or float value. | | make_float(self, s, sign='+') | Converts a string into a float or decimal value. | | make_int(self, s, sign=None, number_format='decimal') | Makes an integer value according to the current options. | | First argument should be a string representation of the number, | or an integer. | | Returns a number value, which could be an int, float, or decimal. | | reset_to_defaults(self) | | set_all(self, value) | Changes all behaviors to have the given value. | | set_all_allow(self, _value='allow') | Set all behaviors to value allow. | | set_all_forbid(self, _value='forbid') | Set all behaviors to value forbid. | | set_all_warn(self, _value='warn') | Set all behaviors to value warn. | | set_behavior(self, name, value) | Changes the value for a given behavior | | set_indent(self, num_spaces, tab_width=0, limit=None) | Changes the indentation properties when outputting JSON in non-compact mode. | | 'num_spaces' is the number of spaces to insert for each level | of indentation, which defaults to 2. | | 'tab_width', if not 0, is the number of spaces which is equivalent | to one tab character. Tabs will be output where possible rather | than runs of spaces. | | 'limit', if not None, is the maximum indentation level after | which no further indentation will be output. 
| | spaces_to_next_indent_level(self, min_spaces=1, subtract=0) | | suppress_warnings(self) | | warn_all_numeric_signs(self, _name='all_numeric_signs', _value='warn') | Set behavior all_numeric_signs to warn. | | warn_any_type_at_start(self, _name='any_type_at_start', _value='warn') | Set behavior any_type_at_start to warn. | | warn_binary_numbers(self, _name='binary_numbers', _value='warn') | Set behavior binary_numbers to warn. | | warn_bom(self, _name='bom', _value='warn') | Set behavior bom to warn. | | warn_comments(self, _name='comments', _value='warn') | Set behavior comments to warn. | | warn_control_char_in_string(self, _name='control_char_in_string', _value='warn') | Set behavior control_char_in_string to warn. | | warn_duplicate_keys(self, _name='duplicate_keys', _value='warn') | Set behavior duplicate_keys to warn. | | warn_extended_unicode_escapes(self, _name='extended_unicode_escapes', _value='warn') | Set behavior extended_unicode_escapes to warn. | | warn_format_control_chars(self, _name='format_control_chars', _value='warn') | Set behavior format_control_chars to warn. | | warn_hex_numbers(self, _name='hex_numbers', _value='warn') | Set behavior hex_numbers to warn. | | warn_identifier_keys(self, _name='identifier_keys', _value='warn') | Set behavior identifier_keys to warn. | | warn_initial_decimal_point(self, _name='initial_decimal_point', _value='warn') | Set behavior initial_decimal_point to warn. | | warn_js_string_escapes(self, _name='js_string_escapes', _value='warn') | Set behavior js_string_escapes to warn. | | warn_leading_zeros(self, _name='leading_zeros', _value='warn') | Set behavior leading_zeros to warn. | | warn_non_numbers(self, _name='non_numbers', _value='warn') | Set behavior non_numbers to warn. | | warn_non_portable(self, _name='non_portable', _value='warn') | Set behavior non_portable to warn. | | warn_nonescape_characters(self, _name='nonescape_characters', _value='warn') | Set behavior nonescape_characters to warn. 
| | warn_nonstring_keys(self, _name='nonstring_keys', _value='warn') | Set behavior nonstring_keys to warn. | | warn_octal_numbers(self, _name='octal_numbers', _value='warn') | Set behavior octal_numbers to warn. | | warn_omitted_array_elements(self, _name='omitted_array_elements', _value='warn') | Set behavior omitted_array_elements to warn. | | warn_single_quoted_strings(self, _name='single_quoted_strings', _value='warn') | Set behavior single_quoted_strings to warn. | | warn_trailing_comma(self, _name='trailing_comma', _value='warn') | Set behavior trailing_comma to warn. | | warn_trailing_decimal_point(self, _name='trailing_decimal_point', _value='warn') | Set behavior trailing_decimal_point to warn. | | warn_undefined_values(self, _name='undefined_values', _value='warn') | Set behavior undefined_values to warn. | | warn_unicode_whitespace(self, _name='unicode_whitespace', _value='warn') | Set behavior unicode_whitespace to warn. | | warn_zero_byte(self, _name='zero_byte', _value='warn') | Set behavior zero_byte to warn. | | ---------------------------------------------------------------------- | Data descriptors defined here: | | __dict__ | dictionary for instance variables (if defined) | | __weakref__ | list of weak references to the object (if defined) | | all_behaviors | Returns the names of all known behaviors. | | all_numeric_signs | Numbers may be prefixed by any '+' and '-', e.g., +4, -+-+77 | | allow_behaviors | Return the set of behaviors with the value allow. | | allow_or_warn_behaviors | Returns the set of all behaviors that are not forbidden (i.e., are allowed or warned). | | any_type_at_start | A JSON document may start with any type, not just arrays or objects | | binary_numbers | Binary numbers, e.g., 0b1001 | | bom | A JSON document may start with a Unicode BOM (Byte Order Mark) | | comments | JavaScript comments, both /*...*/ and //... 
styles | | control_char_in_string | Strings may contain raw control characters without \u-escaping | | duplicate_keys | Objects may have repeated keys | | encode_enum_as | The strategy for encoding Python Enum values. | | extended_unicode_escapes | Extended Unicode escape sequence \u{..} for non-BMP characters | | forbid_behaviors | Return the set of behaviors with the value forbid. | | format_control_chars | Unicode "format control characters" may appear in the input | | hex_numbers | Hexadecimal numbers, e.g., 0x1f | | identifier_keys | JavaScript identifiers are converted to strings when used as object keys | | inf | The numeric value Infinity, either a float or a decimal. | | initial_decimal_point | Floating-point numbers may start with a decimal point (no units digit) | | is_all_allow | Determines if all the behaviors have the value allow. | | is_all_forbid | Determines if all the behaviors have the value forbid. | | is_all_warn | Determines if all the behaviors have the value warn. | | is_allow_all_numeric_signs | Allow Numbers may be prefixed by any '+' and '-', e.g., +4, -+-+77 | | is_allow_any_type_at_start | Allow A JSON document may start with any type, not just arrays or objects | | is_allow_binary_numbers | Allow Binary numbers, e.g., 0b1001 | | is_allow_bom | Allow A JSON document may start with a Unicode BOM (Byte Order Mark) | | is_allow_comments | Allow JavaScript comments, both /*...*/ and //... 
styles | | is_allow_control_char_in_string | Allow Strings may contain raw control characters without \u-escaping | | is_allow_duplicate_keys | Allow Objects may have repeated keys | | is_allow_extended_unicode_escapes | Allow Extended Unicode escape sequence \u{..} for non-BMP characters | | is_allow_format_control_chars | Allow Unicode "format control characters" may appear in the input | | is_allow_hex_numbers | Allow Hexadecimal numbers, e.g., 0x1f | | is_allow_identifier_keys | Allow JavaScript identifiers are converted to strings when used as object keys | | is_allow_initial_decimal_point | Allow Floating-point numbers may start with a decimal point (no units digit) | | is_allow_js_string_escapes | Allow All JavaScript character \-escape sequences may be in strings | | is_allow_leading_zeros | Allow Numbers may have leading zeros | | is_allow_non_numbers | Allow Non-numbers may be used, such as NaN or Infinity | | is_allow_non_portable | Allow Anything technically valid but likely to cause data portability issues | | is_allow_nonescape_characters | Allow Unknown character \-escape sequences stand for that character (\Q -> 'Q') | | is_allow_nonstring_keys | Allow Value types other than strings (or identifiers) may be used as object keys | | is_allow_octal_numbers | Allow New-style octal numbers, e.g., 0o731 (see leading-zeros for legacy octals) | | is_allow_omitted_array_elements | Allow Arrays may have omitted/elided elements, e.g., [1,,3] == [1,undefined,3] | | is_allow_single_quoted_strings | Allow Strings may be delimited with both double (") and single (') quotation marks | | is_allow_trailing_comma | Allow A final comma may end the list of array or object members | | is_allow_trailing_decimal_point | Allow Floating-point number may end with a decimal point and no following fractional digits | | is_allow_undefined_values | Allow The JavaScript 'undefined' value may be used | | is_allow_unicode_whitespace | Allow Treat any Unicode whitespace character as 
valid whitespace | | is_allow_zero_byte | Allow Strings may contain U+0000, which may not be safe for C-based programs | | is_forbid_all_numeric_signs | Forbid Numbers may be prefixed by any '+' and '-', e.g., +4, -+-+77 | | is_forbid_any_type_at_start | Forbid A JSON document may start with any type, not just arrays or objects | | is_forbid_binary_numbers | Forbid Binary numbers, e.g., 0b1001 | | is_forbid_bom | Forbid A JSON document may start with a Unicode BOM (Byte Order Mark) | | is_forbid_comments | Forbid JavaScript comments, both /*...*/ and //... styles | | is_forbid_control_char_in_string | Forbid Strings may contain raw control characters without \u-escaping | | is_forbid_duplicate_keys | Forbid Objects may have repeated keys | | is_forbid_extended_unicode_escapes | Forbid Extended Unicode escape sequence \u{..} for non-BMP characters | | is_forbid_format_control_chars | Forbid Unicode "format control characters" may appear in the input | | is_forbid_hex_numbers | Forbid Hexadecimal numbers, e.g., 0x1f | | is_forbid_identifier_keys | Forbid JavaScript identifiers are converted to strings when used as object keys | | is_forbid_initial_decimal_point | Forbid Floating-point numbers may start with a decimal point (no units digit) | | is_forbid_js_string_escapes | Forbid All JavaScript character \-escape sequences may be in strings | | is_forbid_leading_zeros | Forbid Numbers may have leading zeros | | is_forbid_non_numbers | Forbid Non-numbers may be used, such as NaN or Infinity | | is_forbid_non_portable | Forbid Anything technically valid but likely to cause data portability issues | | is_forbid_nonescape_characters | Forbid Unknown character \-escape sequences stand for that character (\Q -> 'Q') | | is_forbid_nonstring_keys | Forbid Value types other than strings (or identifiers) may be used as object keys | | is_forbid_octal_numbers | Forbid New-style octal numbers, e.g., 0o731 (see leading-zeros for legacy octals) | | 
is_forbid_omitted_array_elements | Forbid Arrays may have omitted/elided elements, e.g., [1,,3] == [1,undefined,3] | | is_forbid_single_quoted_strings | Forbid Strings may be delimited with both double (") and single (') quotation marks | | is_forbid_trailing_comma | Forbid A final comma may end the list of array or object members | | is_forbid_trailing_decimal_point | Forbid Floating-point number may end with a decimal point and no following fractional digits | | is_forbid_undefined_values | Forbid The JavaScript 'undefined' value may be used | | is_forbid_unicode_whitespace | Forbid Treat any Unicode whitespace character as valid whitespace | | is_forbid_zero_byte | Forbid Strings may contain U+0000, which may not be safe for C-based programs | | is_warn_all_numeric_signs | Warn Numbers may be prefixed by any '+' and '-', e.g., +4, -+-+77 | | is_warn_any_type_at_start | Warn A JSON document may start with any type, not just arrays or objects | | is_warn_binary_numbers | Warn Binary numbers, e.g., 0b1001 | | is_warn_bom | Warn A JSON document may start with a Unicode BOM (Byte Order Mark) | | is_warn_comments | Warn JavaScript comments, both /*...*/ and //... 
styles | | is_warn_control_char_in_string | Warn Strings may contain raw control characters without \u-escaping | | is_warn_duplicate_keys | Warn Objects may have repeated keys | | is_warn_extended_unicode_escapes | Warn Extended Unicode escape sequence \u{..} for non-BMP characters | | is_warn_format_control_chars | Warn Unicode "format control characters" may appear in the input | | is_warn_hex_numbers | Warn Hexadecimal numbers, e.g., 0x1f | | is_warn_identifier_keys | Warn JavaScript identifiers are converted to strings when used as object keys | | is_warn_initial_decimal_point | Warn Floating-point numbers may start with a decimal point (no units digit) | | is_warn_js_string_escapes | Warn All JavaScript character \-escape sequences may be in strings | | is_warn_leading_zeros | Warn Numbers may have leading zeros | | is_warn_non_numbers | Warn Non-numbers may be used, such as NaN or Infinity | | is_warn_non_portable | Warn Anything technically valid but likely to cause data portability issues | | is_warn_nonescape_characters | Warn Unknown character \-escape sequences stand for that character (\Q -> 'Q') | | is_warn_nonstring_keys | Warn Value types other than strings (or identifiers) may be used as object keys | | is_warn_octal_numbers | Warn New-style octal numbers, e.g., 0o731 (see leading-zeros for legacy octals) | | is_warn_omitted_array_elements | Warn Arrays may have omitted/elided elements, e.g., [1,,3] == [1,undefined,3] | | is_warn_single_quoted_strings | Warn Strings may be delimited with both double (") and single (') quotation marks | | is_warn_trailing_comma | Warn A final comma may end the list of array or object members | | is_warn_trailing_decimal_point | Warn Floating-point number may end with a decimal point and no following fractional digits | | is_warn_undefined_values | Warn The JavaScript 'undefined' value may be used | | is_warn_unicode_whitespace | Warn Treat any Unicode whitespace character as valid whitespace | | is_warn_zero_byte | 
Warn Strings may contain U+0000, which may not be safe for C-based programs | | js_string_escapes | All JavaScript character \-escape sequences may be in strings | | leading_zero_radix | The radix to be used for numbers with leading zeros. 8 or 10 | | leading_zero_radix_as_word | | leading_zeros | Numbers may have leading zeros | | nan | The numeric value NaN, either a float or a decimal. | | neginf | The numeric value -Infinity, either a float or a decimal. | | negzero_float | The numeric value -0.0, either a float or a decimal. | | non_numbers | Non-numbers may be used, such as NaN or Infinity | | non_portable | Anything technically valid but likely to cause data portability issues | | nonescape_characters | Unknown character \-escape sequences stand for that character (\Q -> 'Q') | | nonstring_keys | Value types other than strings (or identifiers) may be used as object keys | | octal_numbers | New-style octal numbers, e.g., 0o731 (see leading-zeros for legacy octals) | | omitted_array_elements | Arrays may have omitted/elided elements, e.g., [1,,3] == [1,undefined,3] | | single_quoted_strings | Strings may be delimited with both double (") and single (') quotation marks | | sort_keys | The method used to sort dictionary keys when encoding JSON | | strictness | | trailing_comma | A final comma may end the list of array or object members | | trailing_decimal_point | Floating-point number may end with a decimal point and no following fractional digits | | undefined_values | The JavaScript 'undefined' value may be used | | unicode_whitespace | Treat any Unicode whitespace character as valid whitespace | | values | Set of possible behavior values | | warn_behaviors | Return the set of behaviors with the value warn. | | zero_byte | Strings may contain U+0000, which may not be safe for C-based programs | | zero_float | The numeric value 0.0, either a float or a decimal. 
| | ---------------------------------------------------------------------- | Data and other attributes defined here: | | __metaclass__ = | Meta class used to establish a set of "behavior" options. | | Classes that use this meta class must define a class-level | variable called '_behaviors' that is a list of tuples, each of | which describes one behavior and is like: (behavior_name, | documentation). Also define a second class-level variable called | '_behavior_values' which is a list of the permitted values for | each behavior, each being strings. | | For each behavior (e.g., pretty), and for each value (e.g., | yes) the following methods/properties will be created: | | * pretty - value of 'pretty' behavior (read-write) | * ispretty_yes - returns True if 'pretty' is 'yes' | | For each value (e.g., pink) the following methods/properties | will be created: | | * all_behaviors - set of all behaviors (read-only) | * pink_behaviors - set of behaviors with value of 'pink' (read-only) | * set_all('pink') | * set_all_pink() - set all behaviors to value of 'pink' class jsonlint(__builtin__.object) | This class contains most of the logic for the "jsonlint" command. | | You generally create an instance of this class, to define the | program's environment, and then call the main() method. A simple | wrapper to turn this into a script might be: | | import sys, demjson | if __name__ == '__main__': | lint = demjson.jsonlint( sys.argv[0] ) | sys.exit( lint.main( sys.argv[1:] ) ) | | Methods defined here: | | __init__(self, program_name='jsonlint', stdin=None, stdout=None, stderr=None) | Create an instance of a "jsonlint" program. 
| | You can optionally pass options to define the program's environment: | | * program_name - the name of the program, usually sys.argv[0] | * stdin - the file object to use for input, default sys.stdin | * stdout - the file object to use for output, default sys.stdout | * stderr - the file object to use for error output, default sys.stderr | | After creating an instance, you typically call the main() method. | | main(self, argv) | The main routine for program "jsonlint". | | Should be called with sys.argv[1:] as its sole argument. | | Note sys.argv[0] which normally contains the program name | should not be passed to main(); instead this class itself | is initialized with sys.argv[0]. | | Use "--help" for usage syntax, or consult the 'usage' member. | | ---------------------------------------------------------------------- | Data descriptors defined here: | | __dict__ | dictionary for instance variables (if defined) | | __weakref__ | list of weak references to the object (if defined) | | usage | A multi-line string containing the program usage instructions. | | ---------------------------------------------------------------------- | Data and other attributes defined here: | | SUCCESS_FAIL = 'E' | | SUCCESS_OK = 'OK' | | SUCCESS_WARNING = 'W' class position_marker(__builtin__.object) | A position marks a specific place in a text document. | It consists of the following attributes: | | * line - The line number, starting at 1 | * column - The column on the line, starting at 0 | * char_position - The number of characters from the start of | the document, starting at 0 | * text_after - (optional) a short excerpt of the text of the | document starting at the current position | | Lines are separated by any Unicode line separator character. As an | exception a CR+LF character pair is treated as being a single line | separator demarcation. | | Columns are simply a measure of the number of characters after the | start of a new line, starting at 0. 
Visual effects caused by | Unicode characters such as combining characters, bidirectional | text, zero-width characters and so on do not affect the | computation of the column regardless of visual appearance. | | The char_position is a count of the number of characters since the | beginning of the document, starting at 0. As used within the | buffered_stream class, if the document starts with a Unicode Byte | Order Mark (BOM), the BOM prefix is NOT INCLUDED in the count. | | Methods defined here: | | __init__(self, offset=0, line=1, column=0, text_after=None) | | __repr__(self) | | __str__(self) | Same as the describe() function. | | advance(self, s) | Advance the position from its current place according to | the given string of characters. | | copy(self) | Create a copy of the position object. | | describe(self, show_text=True) | Returns a human-readable description of the position, in English. | | rewind(self) | Set the position to the start of the document. | | ---------------------------------------------------------------------- | Data descriptors defined here: | | __dict__ | dictionary for instance variables (if defined) | | __weakref__ | list of weak references to the object (if defined) | | at_end | Returns True if the position is at the end of the document. | | This property must be set by the user. | | at_start | Returns True if the position is at the start of the document. | | char_position | The current character offset from the beginning of the | document, starts at 0. | | column | The current character column from the beginning of the | document, starts at 0. | | line | The current line within the document, starts at 1. | | text_after | Returns a textual excerpt starting at the current position. | | This property must be set by the user. class utf32(codecs.CodecInfo) | Unicode UTF-32 and UCS4 encoding/decoding support. | | This is for older Pythons which did not have UTF-32 codecs. 
| | JSON requires that all JSON implementations must support the | UTF-32 encoding (as well as UTF-8 and UTF-16). But earlier | versions of Python did not provide a UTF-32 codec, so we must | implement UTF-32 ourselves in case we need it. | | See http://en.wikipedia.org/wiki/UTF-32 | | Method resolution order: | utf32 | codecs.CodecInfo | __builtin__.tuple | __builtin__.object | | Static methods defined here: | | decode(obj, errors='strict', endianness=None) | Decodes a UTF-32 byte string into a Unicode string. | | Returns tuple (bytearray, num_bytes) | | The errors argument should be one of 'strict', 'ignore', | 'replace', 'backslashreplace', or 'xmlcharrefreplace'. | | The endianness should either be None (for auto-guessing), or a | word that starts with 'B' (big) or 'L' (little). | | Will detect a Byte-Order Mark. If a BOM is found and endianness | is also set, then the two must match. | | If neither a BOM is found nor endianness is set, then big | endian order is assumed. | | encode(obj, errors='strict', endianness=None, include_bom=True) | Encodes a Unicode string into a UTF-32 encoded byte string. | | Returns a tuple: (bytearray, num_chars) | | The errors argument should be one of 'strict', 'ignore', or 'replace'. | | The endianness should be one of: | * 'B', '>', or 'big' -- Big endian | * 'L', '<', or 'little' -- Little endian | * None -- Default, from sys.byteorder | | If include_bom is true a Byte-Order Mark will be written to | the beginning of the string, otherwise it will be omitted. | | lookup(name) | A standard Python codec lookup function for UCS4/UTF32. | | If it recognizes an encoding name it returns a CodecInfo | structure which contains the various encoder and decoder | functions to use. | | utf32be_decode(obj, errors='strict') | Decodes a UTF-32BE (big endian) byte string into a Unicode string. | | utf32be_encode(obj, errors='strict', include_bom=False) | Encodes a Unicode string into a UTF-32BE (big endian) encoded byte string. 
| | utf32le_decode(obj, errors='strict') | Decodes a UTF-32LE (little endian) byte string into a Unicode string. | | utf32le_encode(obj, errors='strict', include_bom=False) | Encodes a Unicode string into a UTF-32LE (little endian) encoded byte string. | | ---------------------------------------------------------------------- | Data and other attributes defined here: | | BOM_UTF32_BE = '\x00\x00\xfe\xff' | | BOM_UTF32_LE = '\xff\xfe\x00\x00' | | ---------------------------------------------------------------------- | Methods inherited from codecs.CodecInfo: | | __repr__(self) | | ---------------------------------------------------------------------- | Static methods inherited from codecs.CodecInfo: | | __new__(cls, encode, decode, streamreader=None, streamwriter=None, incrementalencoder=None, incrementaldecoder=None, name=None) | | ---------------------------------------------------------------------- | Data descriptors inherited from codecs.CodecInfo: | | __dict__ | dictionary for instance variables (if defined) | | ---------------------------------------------------------------------- | Methods inherited from __builtin__.tuple: | | __add__(...) | x.__add__(y) <==> x+y | | __contains__(...) | x.__contains__(y) <==> y in x | | __eq__(...) | x.__eq__(y) <==> x==y | | __ge__(...) | x.__ge__(y) <==> x>=y | | __getattribute__(...) | x.__getattribute__('name') <==> x.name | | __getitem__(...) | x.__getitem__(y) <==> x[y] | | __getnewargs__(...) | | __getslice__(...) | x.__getslice__(i, j) <==> x[i:j] | | Use of negative indices is not supported. | | __gt__(...) | x.__gt__(y) <==> x>y | | __hash__(...) | x.__hash__() <==> hash(x) | | __iter__(...) | x.__iter__() <==> iter(x) | | __le__(...) | x.__le__(y) <==> x<=y | | __len__(...) | x.__len__() <==> len(x) | | __lt__(...) | x.__lt__(y) <==> x<y | | __mul__(...) | x.__mul__(n) <==> x*n | | __ne__(...) | x.__ne__(y) <==> x!=y | | __rmul__(...) | x.__rmul__(n) <==> n*x | | __sizeof__(...) | T.__sizeof__() -- size of T in memory, in bytes | | count(...) 
| T.count(value) -> integer -- return number of occurrences of value | | index(...) | T.index(value, [start, [stop]]) -> integer -- return first index of value. | Raises ValueError if the value is not present. FUNCTIONS decode(txt, encoding=None, **kwargs) Decodes a JSON-encoded string into a Python object. == Optional arguments == * 'encoding' (string, default None) This argument provides a hint regarding the character encoding that the input text is assumed to be in (if it is not already a unicode string type). If set to None then autodetection of the encoding is attempted (see discussion above). Otherwise this argument should be the name of a registered codec (see the standard 'codecs' module). * 'strict' (Boolean, default False) If 'strict' is set to True, then those strings that are not entirely strictly conforming to JSON will result in a JSONDecodeError exception. * 'return_errors' (Boolean, default False) Controls the return value from this function. If False, then only the Python equivalent object is returned on success, or an error will be raised as an exception. If True then a 2-tuple is returned: (object, error_list). The error_list will be an empty list [] if the decoding was successful, otherwise it will be a list of all the errors encountered. Note that it is possible for an object to be returned even if errors were encountered. * 'return_stats' (Boolean, default False) Controls whether statistics about the decoded JSON document are returned (an instance of decode_statistics). If True, then the stats object will be added to the end of the tuple returned. If return_errors is also set then a 3-tuple is returned, otherwise a 2-tuple is returned. * 'write_errors' (Boolean OR File-like object, default False) Controls what to do with errors. - If False, then the first decoding error is raised as an exception. - If True, then errors will be printed out to sys.stderr. - If a File-like object, then errors will be printed to that file. 
The write_errors and return_errors arguments can be set independently. * 'filename_for_errors' (string or None) Provides a filename to be used when writing error messages. * 'allow_xxx', 'warn_xxx', and 'forbid_xxx' (Booleans) These arguments allow for fine adjustments to be made to the 'strict' argument, by allowing or forbidding specific syntaxes. There are many of these arguments, named by replacing the "xxx" with any number of possible behavior names (See the JSON class for more details). Each of these will allow (or forbid) the specific behavior, after the evaluation of the 'strict' argument. For example, if strict=True then by also passing 'allow_comments=True' then comments will be allowed. If strict=False then forbid_comments=True will allow everything except comments. Unicode decoding: ----------------- The input string can be either a python string or a python unicode string (or a byte array in Python 3). If it is already a unicode string, then it is assumed that no character set decoding is required. However, if you pass in a non-Unicode text string (a Python 2 'str' type or a Python 3 'bytes' or 'bytearray') then an attempt will be made to auto-detect and decode the character encoding. This will be successful if the input was encoded in any of UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE), and of course plain ASCII works too. Note though that if you know the character encoding, then you should convert to a unicode string yourself, or pass in the name of the 'encoding' to avoid the guessing made by the auto detection, as with python_object = demjson.decode( input_bytes, encoding='utf8' ) Callback hooks: --------------- You may supply callback hooks by using the hook name as the named argument, such as: decode_float=decimal.Decimal See the hooks documentation on the JSON.set_hook() method. decode_file(filename, encoding=None, **kwargs) Decodes JSON found in the given file. See the decode() function for a description of other possible options. 
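The decode_float hook mentioned above (decode_float=decimal.Decimal) has a rough standard-library analogue; if demjson itself is not at hand, Python's built-in json module offers a similar, though narrower, hook via its parse_float argument. This is a comparison sketch using only the stdlib, not demjson's own API:

```python
import json
from decimal import Decimal

# Stdlib analogue of demjson's decode_float hook: every floating-point
# literal in the document is passed through the supplied callable.
doc = '{"price": 19.99, "qty": 3}'
data = json.loads(doc, parse_float=Decimal)

print(type(data["price"]).__name__)  # Decimal -- no binary float rounding
print(data["qty"])                   # 3 (integer literals are unaffected)
```

Unlike demjson's decode_float, the stdlib hook applies only to floating-point literals and cannot skip values on a case-by-case basis.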
determine_float_limits(number_type=<type 'float'>) Determines the precision and range of the given float type. The passed in 'number_type' argument should refer to the type of floating-point number. It should be either the built-in 'float', or a decimal context or constructor; i.e., one of: # 1. FLOAT TYPE determine_float_limits( float ) # 2. DEFAULT DECIMAL CONTEXT determine_float_limits( decimal.Decimal ) # 3. CUSTOM DECIMAL CONTEXT ctx = decimal.Context( prec=75 ) determine_float_limits( ctx ) Returns a named tuple with components: ( significant_digits, max_exponent, min_exponent ) Where: * significant_digits -- maximum number of *decimal* digits that can be represented without any loss of precision. This is conservative, so if there are 16 1/2 digits, it will return 16, not 17. * max_exponent -- The maximum exponent (power of 10) that can be represented before an overflow (or rounding to infinity) occurs. * min_exponent -- The minimum exponent (negative power of 10) that can be represented before either an underflow (rounding to zero) or a subnormal result (loss of precision) occurs. Note this is conservative, as subnormal numbers are excluded. determine_float_precision() # For backwards compatibility with older demjson versions: encode(obj, encoding=None, **kwargs) Encodes a Python object into a JSON-encoded string. * 'strict' (Boolean, default False) If 'strict' is set to True, then only strictly-conforming JSON output will be produced. Note that this means that some types of values may not be convertible and will result in a JSONEncodeError exception. * 'compactly' (Boolean, default True) If 'compactly' is set to True, then the resulting string will have all extraneous white space removed; if False then the string will be "pretty printed" with whitespace and indentation added to make it more readable. 
* 'encode_namedtuple_as_object' (Boolean or callable, default True) If True, then objects of type namedtuple, or subclasses of 'tuple' that have an _asdict() method, will be encoded as an object rather than an array. It can also be a predicate function that takes a namedtuple object as an argument and returns True or False. * 'indent_amount' (Integer, default 2) The number of spaces to output for each indentation level. If 'compactly' is True then indentation is ignored. * 'indent_limit' (Integer or None, default None) If not None, then this is the maximum limit of indentation levels, after which further indentation spaces are not inserted. If None, then there is no limit. CONCERNING CHARACTER ENCODING: The 'encoding' argument should be one of: * None - The return will be a Unicode string. * encoding_name - A string which is the name of a known encoding, such as 'UTF-8' or 'ascii'. * codec - A CodecInfo object, such as found by codecs.lookup(). This allows you to use a custom codec as well as those built into Python. If an encoding is given (either by name or by codec), then the returned value will be a byte array (Python 3), or a 'str' string (Python 2); which represents the raw set of bytes. Otherwise, if encoding is None, then the returned value will be a Unicode string. The 'escape_unicode' argument is used to determine which characters in string literals must be \u escaped. Should be one of: * True -- All non-ASCII characters are always \u escaped. * False -- Try to insert actual Unicode characters if possible. * function -- A user-supplied function that accepts a single unicode character and returns True or False; where True means to \u escape that character. Regardless of escape_unicode, certain characters will always be \u escaped. Additionally any characters not in the output encoding repertoire for the encoding codec will be \u escaped as well. 
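The escape_unicode=True/False behavior described above has a close stdlib counterpart in the json module's ensure_ascii flag, which can be used to illustrate the difference without demjson installed:

```python
import json

# demjson's escape_unicode=True behaves much like the stdlib's
# ensure_ascii=True: every non-ASCII character becomes a \u escape.
s = "caf\u00e9 \u2014 ok"
escaped = json.dumps(s)                      # ensure_ascii defaults to True
literal = json.dumps(s, ensure_ascii=False)  # keep the raw characters

print(escaped)  # "caf\u00e9 \u2014 ok"
print(literal)  # "café — ok"
```

The stdlib flag is all-or-nothing, whereas demjson's escape_unicode additionally accepts a per-character predicate function.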
encode_to_file(filename, obj, encoding='utf-8', overwrite=False, **kwargs) Encodes a Python object into JSON and writes into the given file. If no encoding is given, then UTF-8 will be used. See the encode() function for a description of other possible options. If the file already exists and the 'overwrite' option is not set to True, then the existing file will not be overwritten. (Note, there is a subtle race condition in the check so there are possible conditions in which a file may be overwritten) extend_and_flatten_list_with_sep(orig_seq, extension_seq, separator='') extend_list_with_sep(orig_seq, extension_seq, sepchar='') skipstringsafe(s, start=0, end=None) skipstringsafe_slow(s, start=0, end=None) smart_sort_transform(key) DATA ALLOW = 'allow' FORBID = 'forbid' NUMBER_AUTO = 'auto' NUMBER_DECIMAL = 'decimal' NUMBER_FLOAT = 'float' NUMBER_FORMAT_BINARY = 'binary' NUMBER_FORMAT_DECIMAL = 'decimal' NUMBER_FORMAT_HEX = 'hex' NUMBER_FORMAT_LEGACYOCTAL = 'legacyoctal' NUMBER_FORMAT_OCTAL = 'octal' SORT_ALPHA = 'alpha' SORT_ALPHA_CI = 'alpha_ci' SORT_NONE = 'none' SORT_PRESERVE = 'preserve' SORT_SMART = 'smart' STRICTNESS_STRICT = 'strict' STRICTNESS_TOLERANT = 'tolerant' STRICTNESS_WARN = 'warn' WARN = 'warn' __author__ = 'Deron Meranda ' __credits__ = 'Copyright (c) 2006-2015 Deron E. Meranda CREDITS Copyright (c) 2006-2015 Deron E. Meranda Licensed under GNU LGPL (GNU Lesser General Public License) version 3.0 or later. See LICENSE.txt included with this software. This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. 
You should have received a copy of the GNU Lesser General Public License along with this program. If not, see or . demjson-2.2.4/docs/jsonlint.txt0000664000076400007640000001444712636325541016076 0ustar demdem00000000000000Usage: jsonlint [<options> ...] [--] inputfile.json ... With no input filename, or "-", it will read from standard input. The return status will be 0 if the file is conforming JSON (per the RFC 7159 specification), or non-zero otherwise. GENERAL OPTIONS: -v | --verbose Show details of lint checking -q | --quiet Don't show any output (except for reformatting) STRICTNESS OPTIONS (WARNINGS AND ERRORS): -W | --tolerant Be tolerant, but warn about non-conformance (default) -s | --strict Be strict in what is considered conforming JSON -S | --nonstrict Be tolerant in what is considered conforming JSON --allow=... -\ --warn=... |-- These options let you pick specific behaviors. --forbid=... -/ Use --help-behaviors for more STATISTICS OPTIONS: --stats Show statistics about JSON document REFORMATTING OPTIONS: -f | --format Reformat the JSON text (if conforming) to stdout -F | --format-compactly Reformat the JSON similar to -f, but do so compactly by removing all unnecessary whitespace -o filename | --output filename The filename to which reformatted JSON is to be written. Without this option the standard output is used. --[no-]keep-format Try to preserve numeric radix, e.g., hex, octal, etc. --html-safe Escape characters that are not safe to embed in HTML/XML. --sort How to sort object/dictionary keys, is one of: alpha - Sort strictly alphabetically alpha_ci - Sort alphabetically case-insensitive preserve - Preserve original order when reformatting smart - Sort alphabetically and numerically (DEFAULT) --indent tabs | <n> Number of spaces to use per indentation level, or use tab characters if "tabs" given. 
UNICODE OPTIONS: -e codec | --encoding=codec Set both input and output encodings --input-encoding=codec Set the input encoding --output-encoding=codec Set the output encoding These options set the character encoding codec (e.g., "ascii", "utf-8", "utf-16"). The -e will set both the input and output encodings to the same thing. The output encoding is used when reformatting with the -f or -F options. Unless set, the input encoding is guessed and the output encoding will be "utf-8". OTHER OPTIONS: --recursion-limit=nnn Set the Python recursion limit to the given number --leading-zero-radix=8|10 The radix to use for numbers with leading zeros. 8=octal, 10=decimal. REFORMATTING / PRETTY-PRINTING: When reformatting JSON with -f or -F, output is only produced if the input passed validation. By default the reformatted JSON will be written to standard output, unless the -o option was given. The default output codec is UTF-8, unless an encoding option is provided. Any Unicode characters will be output as literal characters if the encoding permits, otherwise they will be \u-escaped. You can use "--output-encoding ascii" to force all Unicode characters to be escaped. MORE INFORMATION: Use 'jsonlint --version [-v]' to see versioning information. Use 'jsonlint --copyright' to see author and copyright details. Use 'jsonlint [-W|-s|-S] --help-behaviors' for help on specific checks. jsonlint is distributed as part of the "demjson" Python module. See http://deron.meranda.us/python/demjson/ BEHAVIOR OPTIONS: This set of options lets you control which checks are to be performed. 
They may be turned on or off by listing them as arguments to one of the options --allow, --warn, or --forbid ; for example: jsonlint --allow comments,hex-numbers --forbid duplicate-keys The default shown is for strict mode. Default Behavior_name Description ------- ------------------------- -------------------------------------------------- forbid all-numeric-signs Numbers may be prefixed by any '+' and '-', e.g., +4, -+-+77 allow any-type-at-start A JSON document may start with any type, not just arrays or objects forbid binary-numbers Binary numbers, e.g., 0b1001 warn bom A JSON document may start with a Unicode BOM (Byte Order Mark) forbid comments JavaScript comments, both /*...*/ and //... styles forbid control-char-in-string Strings may contain raw control characters without \u-escaping warn duplicate-keys Objects may have repeated keys forbid extended-unicode-escapes Extended Unicode escape sequence \u{..} for non-BMP characters forbid format-control-chars Unicode "format control characters" may appear in the input forbid hex-numbers Hexadecimal numbers, e.g., 0x1f forbid identifier-keys JavaScript identifiers are converted to strings when used as object keys forbid initial-decimal-point Floating-point numbers may start with a decimal point (no units digit) forbid js-string-escapes All JavaScript character \-escape sequences may be in strings forbid leading-zeros Numbers may have extra leading zeros (see --leading-zero-radix option) forbid non-numbers Non-numbers may be used, such as NaN or Infinity warn non-portable Anything technically valid but likely to cause data portability issues forbid nonescape-characters Unknown character \-escape sequences stand for that character (\Q -> 'Q') forbid nonstring-keys Value types other than strings (or identifiers) may be used as object keys forbid octal-numbers New-style octal numbers, e.g., 0o731 (see leading-zeros for legacy octals) forbid omitted-array-elements Arrays may have omitted/elided elements, e.g., [1,,3] 
== [1,undefined,3] forbid single-quoted-strings Strings may be delimited with both double (") and single (') quotation marks forbid trailing-comma A final comma may end the list of array or object members forbid trailing-decimal-point Floating-point number may end with a decimal point and no following fractional digits forbid undefined-values The JavaScript 'undefined' value may be used forbid unicode-whitespace Treat any Unicode whitespace character as valid whitespace warn zero-byte Strings may contain U+0000, which may not be safe for C-based programs demjson-2.2.4/docs/NEWS.txt0000664000076400007640000000203312636325442014776 0ustar demdem00000000000000News announcements regarding the demjson python module. See the file CHANGES.txt for more details. 2015-12-22 Release 2.2.4, jsonlint -f stdout bugfix under Python 3. 2014-11-12 Release 2.2.3, jsonlint return value bugfix, unit test fixes. 2014-06-25 Release 2.2.2, Python 3 installation fixes. 2014-06-24 Release 2.2.1, Minor bugfix and html-safe option. 2014-06-20 Release 2.2, Python 2.6, narrow-Unicode support, number enhancements. 2014-05-26 Release 2.0.1, No changes. Bumping version because of confused checksums. 2014-05-21 Release 2.0, major enhancements, Python 3, callback hooks, etc. 2011-04-01 Release 1.6, bug fix in jsonlint 2010-10-10 Release 1.5, bug fix with code point U+00FF 2008-12-17 Release 1.4, license changed to LGPL 3 (no code changes). 2008-03-19 Release 1.3, bug fixes, numeric improvements, Decimal, more collections types. 2007-11-08 Release 1.2, bug fixes, speed improvements, jsonlint, license now GPL 3. 2006-11-06 Release 1.1, bug fix and minor enhancements. 2006-08-10 Release 1.0, initial release. 
demjson-2.2.4/docs/PYTHON3.txt0000664000076400007640000001356412337144342015335 0ustar demdem00000000000000Using demjson with Python 3 =========================== Starting with release 2.0, demjson and jsonlint can support either Python 2 or Python 3 -- though it must be installed as described below to work with Python 3. Be aware that the API will have slightly different behavior in Python 3, mainly because it uses the 'bytes' type in a few places. Installing for Python 3 ======================= The source for the demjson module is written for Python 2.x. However, since release 2.0, it has been designed to be converted to a Python 3 equivalent form by using the standard "2to3" Python conversion utility. If you have installed demjson with a standard PyPI package distribution mechanism; such as pip, easy_install, or just typing "python3 setup.py install"; then the 2to3 conversion will be performed automatically as part of the installation process. Running self-tests: if you install using a PyPI distribution mechanism then the test program "tests/test_demjson.py" may either not be installed or may not have been converted to Python 3. You can do the conversion and run the tests manually if needed: cd test 2to3 -w test_demjson.py PYTHONPATH=.. python3 test_demjson.py Bytes versus strings ==================== When calling demjson functions and classes from a Python 3 environment, be aware that there are a few differences from what is documented for Python 2. Most of these differences involve Python's byte-oriented types ('bytes', 'bytearray', and 'memoryview'). Decoding JSON into Python values -------------------------------- When you decode a JSON document you can pass either a string or a bytes type. If you pass a string, then it is assumed to already be a sequence of Unicode characters. So demjson's own Unicode decoding step will be skipped. 
When you pass a byte-oriented type the decode() function will attempt to detect the Unicode encoding and appropriately convert the bytes into a Unicode string first. You can override the guessed encoding by specifying the appropriate codec name, or codec object. For example, the following are equivalent and have the same result: demjson.decode( '"\u2014"' ) demjson.decode( b'"\xe2\x80\x94"' ) demjson.decode( bytes([ 0x22, 0xE2, 0x80, 0x94, 0x22 ]) ) Notice that with the last two examples the decode() function has automatically detected that the byte array was UTF-8 encoded. You can of course pass in an 'encoding' argument to force the Unicode decoding codec to use -- though if you get this wrong a UnicodeDecodeError may be raised. Reading JSON from a file ------------------------ When reading from a file the bytes it contains must be converted into Unicode characters. If you want demjson to do that be sure to open the file in binary mode: json_data = open("myfile.json", "rb").read() # => json_data is a bytes array py_data = demjson.decode( json_data, encoding="utf8" ) But if you read the file in text mode then the Unicode decoding is done by Python's own IO core, and demjson will parse the already-Unicode string without doing any decoding: json_data = open("myfile.json", "r", encoding="utf8").read() # => json_data is a (unicode) string py_data = demjson.decode( json_data ) Encoding Python values to JSON ------------------------------ When encoding a Python value into a JSON document, you will generally get a string result (which is a sequence of Unicode characters). However if you specify a particular encoding, then you will instead get a byte array as a result. demjson.encode( "\u2012" ) # => Returns a string of length 3 demjson.encode( "\u2012", encoding="utf-8" ) # => Returns 5 bytes b'"\xe2\x80\x94"' Writing JSON to a file ---------------------- When generating JSON and writing it to a file all the Unicode characters must be encoded into bytes. 
You can let demjson do that by specifying an encoding, though be sure that you open the output file in binary mode: json_data = demjson.encode( py_data, encoding="utf-8" ) # json_data will be a bytes array open("myfile.json", "wb" ).write( json_data ) The above has the advantage that demjson can automatically adjust the \u-escaping depending on the output encoding. But if you don't ask for any encoding you'll get the JSON output as a Unicode string, in which case you need to open your output file in text mode with a specific encoding. You must choose a suitable encoding or you could get a UnicodeEncodeError. json_data = demjson.encode( py_data ) # json_data will be a (unicode) string open("myfile.json", "w", encoding="utf-8" ).write( json_data ) Encoding byte types ------------------- If you are encoding into JSON and the Python value you pass is, or contains, any byte-oriented type ('bytes', 'bytearray', or 'memoryview') value; then the bytes must be converted into a different value that can be represented in JSON. The default is to convert bytes into an array of integers, each with a value from 0 to 255 representing a single byte. For example: py_data = b'\x55\xff' demjson.encode( py_data ) # Gives => '[85,255]' You can supply a function to the 'encode_bytes' hook to change how bytes get encoded. def to_hex( bytes_val ): return ":".join([ "%02x" % b for b in bytes_val ]) demjson.encode( py_data, encode_bytes=to_hex ) # Gives => '"55:ff"' See the 'encode_bytes' hook description in HOOKS.txt for further details. Other Python 3 specifics ======================== Data types ---------- When encoding JSON, most of the new data types introduced with Python 3 will be encoded. Not only does this include the byte-oriented types, but also Enum and ChainMap.
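The to_hex hook from the "Encoding byte types" example above is ordinary Python and can be exercised standalone, without demjson installed, to confirm both the default integer-array representation and the hook's custom output:

```python
# to_hex maps each byte of its argument to two lowercase hex digits,
# joined with ":" -- exactly the function shown in the text above.
def to_hex(bytes_val):
    return ":".join(["%02x" % b for b in bytes_val])

py_data = b'\x55\xff'
print(list(py_data))    # [85, 255] -- demjson's default bytes representation
print(to_hex(py_data))  # 55:ff     -- the hook's custom representation
```

The list() call mirrors what demjson produces by default: iterating a Python 3 bytes object yields its integer byte values.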

Chained exceptions ------------------ Any errors that are incidentally raised during JSON encoding or decoding will be wrapped in a 'JSONError' (or subclass). In Python 3, this wrapping uses the standard Exception Chaining (PEP 3134) mechanism.
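PEP 3134 chaining, as referred to above, preserves the original exception on the wrapper's __cause__ attribute. A minimal sketch of the mechanism follows; "AppJSONError" and "decode_strict" are illustrative stand-in names, not demjson's actual classes or functions:

```python
# Hedged sketch of PEP 3134 exception chaining, the mechanism the
# text describes for demjson's JSONError wrapping in Python 3.
class AppJSONError(Exception):
    pass

def decode_strict(txt):
    try:
        return int(txt)
    except ValueError as exc:
        # "raise ... from exc" records exc as the wrapper's __cause__
        raise AppJSONError("not a valid number: %r" % txt) from exc

try:
    decode_strict("abc")
except AppJSONError as err:
    print(type(err.__cause__).__name__)  # ValueError -- original error kept
```

When such a chained exception goes unhandled, Python prints both tracebacks, separated by "The above exception was the direct cause of the following exception".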

See the Exception Handling example in the file HOOKS.txt demjson-2.2.4/docs/INSTALL.txt0000664000076400007640000000360512337144342015332 0ustar demdem00000000000000Requirements for demjson ======================== demjson is a pure Python module; it does not contain any C code extensions. It also does not have any dependencies on any third-party modules; it only uses the standard Python library. It will work with both Python 2 and Python 3; though at least Python 2.4 is recommended. If you plan to use it with Python 3, also read the "docs/PYTHON3.txt" file included with this release. Note: for full Unicode support of non-BMP (Basic Multilingual Plane) characters, your Python interpreter must have been compiled for UCS-4 support. You can check this with: import sys sys.maxunicode > 0xffff # If True you have UCS-4 support Installation of demjson ======================= This software is published in the Python Package Index (PYPI), at , which may make it easy to install. If your system has either "pip" or "easy_install", then you may try one of the following commands to install this module: pip install demjson pip-python install demjson easy_install demjson Otherwise, you can install it by downloading the distribution and unpacking it in some temporary directory. Then inside that directory, type: python setup.py install Optionally, for a minimal installation, you can also just copy the "demjson.py" file into your own project's directory. jsonlint command ================ The installation should have installed the script file "jsonlint". If it did not, you can simply copy the file "jsonlint" to any directory in your executable PATH, such as into /usr/local/bin under Unix/Linux. Make sure the script is set as being executable, and if needed adjust the first "#!" line to point to your python interpreter. Running self tests ================== Self tests are included which conform to the Python unittest framework. To run these tests, do cd test PYTHONPATH=.. 
python test_demjson.py demjson-2.2.4/docs/CHANGES.txt0000664000076400007640000007732412636326424015302 0ustar demdem00000000000000Change history for demjson python module. Version 2.2.4 released 2015-12-22 ================================= * Fix problem with jsonlint under Python 3 when trying to reformat JSON (-f or -F options) and writing the output to standard output. Version 2.2.3 released 2014-11-12 ================================= * Fix return value of "jsonlint" command. It should return a non-zero value when an error is reported. GitHub Issue 12: https://github.com/dmeranda/demjson/issues/12 * Fix unit test failure in 32-bit Python 2.x environment. This bug only affected the unit tests, and was not a problem in the demjson module code. GitHub Issue 13: https://github.com/dmeranda/demjson/issues/13 Version 2.2.2 released 2014-06-25 ================================= This minor release only fixes installation issues in older Python 3 environments (< 3.4). If you are using Python 2 or Python 3.4, then there is nothing new. Once installed, there were no other changes to the API or any aspect of demjson operation. * Workarounds for bugs in Python's '2to3' conversion tool in Python versions prior to 3.3.3. This was Python bug 18037: * The setup.py will install correctly in Python 3 environments that do not have the 'setuptools' module installed. It can now make use of the more limited 'distutils' module instead. * The unit tests will now work without generating DeprecationWarning messages under certain Python 3 versions. Version 2.2.1 released 2014-06-24 ================================= Minor changes. * HTML use: A new encoding option, 'html_safe', is available when encoding to JSON to force any characters which are not considered to be HTML-safe (or XML-safe) to be encoded. This includes '<', '>', '&', and '/' -- among other characters which are always escaped regardless of this new option. 
This is useful when applications attempt to embed JSON into HTML and are not prepared to do the proper escaping. For jsonlint use '--html-safe'. $ echo '"h&quot;ello</script>world]]>"' | jsonlint -f --html-safe "h\u0026quot;ello\u003c\/script\u003eworld]]\u003e" See also CVE-2009-4924 for a similar report in another JSON package for needing a way to do HTML-safe escaping. * Bug fix: If you created an instance of the 'json_options' class to store any options, and then attempted to make a copy with its copy() method, not all of the options stored within it were copied. This bug is very unlikely to occur, but it was fixed regardless. * Tests: The included self-test scripts for demjson should now pass all tests when running under a narrow-Unicode version of Python. It should be noted that the statistics for strings (max length, max codepoint, etc.) may not be correct under a narrow-Unicode Python. This is a known issue that is likely not to be fixed. Version 2.2 released 2014-06-20 ================================= [Note, version 2.1 was never released.] This release fixes compatibility with Python 2.6 and any Python compiled with narrow-Unicode; fixes bugs with statistics; adds new file I/O convenience functions; as well as adding many new enhancements concerning the treatment of numbers and floating-point values. * Python 2.6 support is now fixed (tested with 2.6.9). * Narrow-Unicode: demjson now works correctly when used with a Python that was compiled with narrow-Unicode (BMP only) support; i.e., when sys.maxunicode == 0xFFFF. Note that narrow-Pythons simulate non-BMP characters as a UTF-16 surrogate-pair encoding; so string lengths, ord(), and other such operations on Python strings may be surprising: len( u"\U00010030" ) # => 1 for wide-Pythons len( u"\U00010030" ) # => 2 for narrow-Pythons With this release you may encode and decode any Unicode character, including non-BMP characters, with demjson just as if Python was compiled for wide-Unicode. 
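The narrow-versus-wide distinction above can be checked directly. Note that since PEP 393 (Python 3.3), every CPython build behaves as "wide": sys.maxunicode is 0x10FFFF and a non-BMP character counts as a single code point, with surrogate pairs appearing only in UTF-16 byte output:

```python
import sys

# On any Python 3.3+ interpreter these hold regardless of platform,
# matching the "wide-Python" behavior described in the text.
print(hex(sys.maxunicode))                    # 0x10ffff
print(len("\U00010030"))                      # 1 -- one code point
print(len("\U00010030".encode("utf-16-le")))  # 4 -- surrogate pair, bytes only
```

So the "len == 2 for narrow-Pythons" caveat applies only to the old Python 2 narrow builds this changelog entry was written for.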
* Statistics bug: In certain cases some of the decoding statistics results -- obtained by passing 'return_stats=True' to the decode() function -- were not getting set with the correct count. For example the 'num_bools' item may not have reflected the total number of 'true' or 'false' identifiers appearing in the JSON document. This has now been fixed, and more thorough test cases added to the test suite. * Negative NaN bug: Fixed a bug when decoding the JavaScript literal "-NaN" (a negative NaN) that caused a decoding error. Now it correctly produces a Python equivalent NaN (not-a-number) value. Since the sign of a NaN is insignificant, encountering a "-NaN" which would have triggered the bug was quite unlikely. * decode_file: A convenience function 'decode_file()' has been added which wraps the 'decode()' function and which reads the JSON document from a file. It will correctly open the file in binary mode and insure the file is closed. All other options supported by decode() can be passed. data = decode_file( "sample.json", allow_comments=True ) * encode_to_file: A convenience function 'encode_to_file()' has been added which wraps the 'encode()' function and which writes the resultant JSON document into a file. It will correctly open the file in binary mode and insure it is properly closed. By default encode the JSON will be encoded as UTF-8 unless otherwise specified with the 'encoding' option. All other options supported by encode() can be passed. This function will also refuse to overwrite any existing file unless the 'overwrite' option is set to True. encode_to_file( "sample.json", data, overwrite=True ) * Number radix: When reformatting with jsonlint, in non-strict mode, the original radix of numbers will be preserved (controlled with the '--[no]-keep-format' option). If a number in a JSON/JavaScript text was hex (0x1C), octal (0o177, 0177), or binary (0b1011); then it will stay in that format rather than being converted to decimal. 
$ echo '[10, 0xA, 012, 0b1010]' | jsonlint -f --nonstrict [ 10, 0xa, 012, 0b1010 ] Correspondingly, in the decode() function there is a new option 'keep_format' that when True will return non-decimal integer values as a type 'json_int'; which is a subclass of the standard int, but that additionally remembers the original radix format (hex,etc.). * Integer as Float: There is a new option, int_as_float, that allows you to decode all numbers as floating point rather than distinguishing integers from floats. This allows you to parse JSON exactly as JavaScript would do, as it lacks an integer type. demjson.decode( '123', int_as_float=True ) # => 123.0 * Float vs Decimal: You can now control the promotion of 'float' to 'decimal.Decimal' numbers. Normally demjson will try to keep floating-point numbers as the Python 'float' type, unless there is an overflow or loss of precision, in which case it will use 'decimal.Decimal' instead. The new option 'float_type' can control this type selection: float_type = demjson.NUMBER_AUTO # The default float_type = demjson.NUMBER_DECIMAL # Always use decimal float_type = demjson.NUMBER_FLOAT # Always use float Do note that if using NUMBER_FLOAT -- which disables the promotion to the decimal.Decimal type -- that besides possible loss of precision (significant digits) that numeric underflow or overflow can also occur. So very large numbers may result in 'inf' (infinity) and small numbers either in subnormal values or even just zero. Normally when demjson encounters the JavaScript keywords 'NaN', 'Infinity', and '-Infinity' it will decode them as the Python float equivalent (demjson.nan, demjson.inf, and demjson.neginf). However if you use NUMBER_DECIMAL then these will be converted to decimal equivalents instead: Decimal('NaN'), Decimal('Infinity'), and Decimal('-Infinity'). * Significant digits: When reformatting JSON, the jsonlint command will now try to preserve all significant digits present in floating-point numbers when possible. 
$ echo "3.141592653589793238462643383279502884197169399375105820974944592307816406286" | \ jsonlint -f --quiet 3.141592653589793238462643383279502884197169399375105820974944592307816406286 * Decimal contexts: The Python 'decimal' module allows the user to establish different "contexts", which among other things can change the number of significant digits kept, the maximum exponent value, and so on. If the default context is not sufficient (which allows 28 significant digits), you can tell demjson to use a different context by setting the option 'decimal_context'. It may take several values: 'default' -- Use Python's default: decimal.DefaultContext 'basic' -- Use Python's basic: decimal.BasicContext 'extended' -- Use Python's extended: decimal.ExtendedContext 123 -- Creates a context with that number of significant digits. <context> -- Any instance of the class decimal.Context. This option is only available in the programming interface and is not directly exposed by the jsonlint command. import decimal, demjson myctx = decimal.Context( prec=50, rounding=decimal.ROUND_DOWN ) data = demjson.decode_file( "data.json", decimal_context=myctx ) Note that Python's Decimal class will try to "store" all the significant digits originally present, including excess trailing zeros. However any excess digits beyond the context's configuration will be lost as soon as any operation is performed on the value. 
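The note above about digits being lost "as soon as any operation is performed" can be demonstrated with the stdlib decimal module alone, independently of demjson:

```python
import decimal

# A Decimal stores every digit it is constructed with, but the first
# arithmetic operation rounds the result to the context's precision.
ctx = decimal.Context(prec=5)
d = decimal.Decimal("3.14159265")
print(d)            # 3.14159265 -- stored verbatim, no rounding yet
print(ctx.plus(d))  # 3.1416     -- rounded once an operation is applied
```

Context.plus() is the identity operation (unary +) performed under the given context, which is a convenient way to force the rounding without otherwise changing the value. Python's default context precision is 28 significant digits.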
Where possible these incompatibilities were kept to a minimum, however
it is highly recommended that you read these change notes thoroughly.

Major changes
-------------

* Python 2.6 minimum: Support has been dropped for very old versions
  of Python.  At least Python 2.6 is now required.

* Python 3: This version works with both Python 2 and Python 3.
  Support for Python 3 is achieved with the 2to3 conversion program.
  When installing with setup.py or a PyPI distribution mechanism such
  as pip or easy_install, this conversion should happen automatically.

  Note that the API under Python 3 is slightly different: mainly, new
  Python types are supported, and there are some cases in which byte
  array types are used or returned rather than strings.  Read the
  file "docs/PYTHON3.txt" for complete information.

* RFC 7159 conformance: The latest RFC 7159 (published March 2014,
  superseding RFCs 4627 and 7158) relaxes the constraint that a JSON
  document must start with an object or array.  This also brings it
  into alignment with the ECMA-404 standard.  Now any JSON value type
  is a legal JSON document.

* Improved lint checking: A new JSON parsing engine has many
  improvements that make demjson better at "lint checking":

  - Generation of warnings as well as errors.
  - The position (line and column) of errors is reported, in a
    standardized format.
  - Detection of potential data portability problems even when there
    are no errors.
  - Parser recovery from many errors, allowing multiple
    errors/warnings to be reported in one shot.

* Statistics: The decoder can record and supply statistics on the
  input JSON document.  This includes such things as the length of
  the longest string; the range of Unicode characters encountered;
  whether any integers are larger than 32 bits or 64 bits; and much
  more.  Use the --stats option of the jsonlint command.
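The kinds of statistics described above can be approximated with a
stdlib-only walk over a parsed document.  This is an illustrative
sketch using the standard json module, not demjson's actual --stats
machinery (which gathers its figures during parsing and reports much
more):

```python
import json

def json_stats(text):
    """Collect a few demjson-like statistics from a JSON document
    (illustrative stdlib-only sketch, not demjson's --stats output)."""
    stats = {"longest_string": 0, "ints_over_32bit": 0, "ints_over_64bit": 0}

    def visit(v):
        if isinstance(v, str):
            stats["longest_string"] = max(stats["longest_string"], len(v))
        elif isinstance(v, bool):
            pass  # bool is an int subclass; skip it before the int check
        elif isinstance(v, int):
            if v.bit_length() > 64:
                stats["ints_over_64bit"] += 1
            elif v.bit_length() > 32:
                stats["ints_over_32bit"] += 1
        elif isinstance(v, list):
            for item in v:
                visit(item)
        elif isinstance(v, dict):
            for k, item in v.items():
                visit(k)
                visit(item)

    visit(json.loads(text))
    return stats

# 9007199254740993 is 2**53 + 1, which needs more than 32 bits.
s = json_stats('{"name": "abcdef", "big": 9007199254740993}')
# s["longest_string"] == 6 and s["ints_over_32bit"] == 1
```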
* Callback hooks: This version allows the user to provide a number of
  different callback functions, or hooks, which can do special
  processing.  For example, when parsing JSON you could detect strings
  that look like dates and automatically convert them into Python
  datetime objects instead.  Read the file "docs/HOOKS.txt" for
  complete information.

* Subclassing: Subclassing the demjson.JSON class is now highly
  discouraged, as this version and future ones may alter method names
  and parameters.  In particular, support for overriding the
  encode_default() method has been dropped; it will no longer work.

  The new callback hooks (see above) should provide a better way to
  achieve most needs that previously would have been done with
  subclassing.

Data type support
-----------------

* Python 3 types: Many new types introduced with Python 3 are
  directly supported, when running in a Python 3 environment.  This
  includes 'bytes', 'bytearray', 'memoryview', 'Enum', and 'ChainMap'.
  Read the file "docs/PYTHON3.txt" for complete information.

* Dates and times: When encoding to JSON, most of Python's standard
  date and time types (module 'datetime') will be automatically
  converted into the most universally-portable format; usually one of
  the formats specified by ISO 8601, and when possible the stricter
  syntax specified by the RFC 3339 subset.

      datetime.date
          Example output is "2014-02-17".

      datetime.datetime
          Example output is "2014-02-17T03:58:07.692005-05:00".
          The microseconds portion will not be included if it is
          zero.  The timezone offset will not be present for naive
          datetimes, or will be the letter "Z" if UTC.

      datetime.time
          Example output is "T03:58:07.692005-05:00".  Just as for
          datetime, the microseconds portion will not be included if
          it is zero.  The timezone offset will not be present for
          naive times, or will be the letter "Z" if UTC.

      datetime.timedelta
          Example output is "P2DT6H17M23.873S", which is the ISO 8601
          standard format for time durations.
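The timedelta-to-ISO-8601 conversion shown above can be approximated
with only the standard library.  This is an illustrative sketch (not
demjson's actual code), assuming a non-negative duration:

```python
import datetime

def timedelta_to_iso8601(td):
    """Render a non-negative timedelta as an ISO 8601 duration,
    e.g. P2DT6H17M23.873S (illustrative sketch only)."""
    hours, rem = divmod(td.seconds, 3600)
    minutes, whole = divmod(rem, 60)
    # Format the seconds, trimming insignificant trailing zeros.
    secs = ("%.6f" % (whole + td.microseconds / 1e6)).rstrip("0").rstrip(".")
    out = "P"
    if td.days:
        out += "%dD" % td.days
    return out + "T%dH%dM%sS" % (hours, minutes, secs)

td = datetime.timedelta(days=2, hours=6, minutes=17,
                        seconds=23, milliseconds=873)
# timedelta_to_iso8601(td) == "P2DT6H17M23.873S"
```

Note that timedelta normalizes its value into days, seconds, and
microseconds, which is why only those three attributes are consulted.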
It is possible to override the formats used with options, which all
default to "iso".  Generally you may provide a format string
compatible with the strftime() method.  For timedelta the only
choices are "iso" or "hms".

    import demjson, datetime

    demjson.encode( datetime.date.today(), date_format="%m/%d/%Y" )
    # gives => "02/17/2014"

    demjson.encode( datetime.datetime.now(), datetime_format="%a %I:%M %p" )
    # gives => "Mon 08:24 AM"

    demjson.encode( datetime.datetime.now().time(), time_format="%H hours %M min" )
    # gives => "08 hours 24 min"

    demjson.encode( datetime.timedelta(1,13000), timedelta_format="hms" )
    # gives => "1 day, 3:36:40"

* Named tuples: When encoding to JSON, all named tuples (objects of
  Python's standard 'collections.namedtuple' type) are now encoded
  into JSON as objects rather than as arrays.  This behavior can be
  changed by setting the 'encode_namedtuple_as_object' argument to
  False, in which case they will be treated as a normal tuple.

      from collections import namedtuple
      Point = namedtuple('Point', ['x','y'])
      p = Point(5, 8)

      demjson.encode( p )
      # gives => {"x":5, "y":8}

      demjson.encode( p, encode_namedtuple_as_object=False )
      # gives => [5, 8]

  This behavior also applies to any object that follows the
  namedtuple protocol; i.e., any subclass of 'tuple' that has an
  "_asdict()" method.

  Note that the order of keys is not necessarily preserved, but
  instead will appear in the JSON output alphabetically.

* Enums: When encoding to JSON, all enumeration values (objects
  derived from Python's standard 'enum.Enum' type, introduced in
  Python 3.4) can be encoded in several ways.  The default is to
  encode the name as a string, though the 'encode_enum_as' option can
  change this.
    import demjson, enum

    class Fruit(enum.Enum):
        apple = 1
        bananna = 2

    demjson.encode( Fruit.bananna, encode_enum_as='name' )  # Default
    # gives => "bananna"

    demjson.encode( Fruit.bananna, encode_enum_as='qname' )
    # gives => "Fruit.bananna"

    demjson.encode( Fruit.bananna, encode_enum_as='value' )
    # gives => 2

* Mutable strings: Support for the old Python mutable strings (the
  UserDict.MutableString type) has been dropped.  That experimental
  type had already been deprecated since Python 2.6 and removed
  entirely from Python 3.  If you have code that passes a
  MutableString to a JSON encoding function then either do not
  upgrade to this release, or first convert such types to standard
  strings before JSON encoding them.

Unicode and codec support
-------------------------

* Extended Unicode escapes: When reading JSON in non-strict mode, any
  extended Unicode escape sequence, such as "\u{102E3C}", will be
  processed correctly.  This new escape sequence syntax was
  introduced in the latest versions of ECMAScript to make it easier
  to encode non-BMP characters into source code; they are not,
  however, allowed in strict JSON.

* Codecs: The 'encoding' argument to the decode() and encode()
  functions will now accept a codec object as well as an encoding
  name; i.e., any subclass of 'codecs.CodecInfo'.  All \u-escaping in
  string literals will be automatically adjusted based on your custom
  codec's repertoire of characters.

* UTF-32: The included functions for UTF-32/UCS-4 support (missing
  from older versions of Python) are now presented as a full-blown
  codec class: 'demjson.utf32'.  It is completely compatible with the
  standard codecs module.  It is normally unregistered, but you may
  register it with the Python codecs system by:

      import demjson, codecs
      codecs.register( demjson.utf32.lookup )

* Unicode errors: During reading or writing JSON as raw bytes (when
  an encoding is specified), any Unicode errors are now wrapped in a
  JSON error instead:
  - UnicodeDecodeError is transformed into JSONDecodeError
  - UnicodeEncodeError is transformed into JSONEncodeError

  The original exception is made available inside the top-most error
  using Python's Exception Chaining mechanism (described in the
  Errors and Warnings change notes).

* Generating Unicode escapes: When outputting JSON, certain
  additional characters in strings will now always be \u-escaped to
  increase compatibility with JavaScript.  This includes line
  terminators (which are forbidden in JavaScript string literals) as
  well as format control characters (which any JavaScript
  implementation is allowed to ignore if it chooses, per the
  ECMAScript standard).

  This essentially means that characters in any of the Unicode
  categories of Cc, Cf, Zl, and Zp will always be \u-escaped; which
  includes for example:

  - U+007F DELETE (Category Cc)
  - U+00AD SOFT HYPHEN (Category Cf)
  - U+200F RIGHT-TO-LEFT MARK (Category Cf)
  - U+2028 LINE SEPARATOR (Category Zl)
  - U+2029 PARAGRAPH SEPARATOR (Category Zp)
  - U+E007F CANCEL TAG (Category Cf)

Exceptions (Errors)
-------------------

* Substitutions: During JSON decoding the parser can recover from
  some errors.  When this happens you may get back a Python
  representation of the JSON document that has had certain
  substitutions made:

  - Bad Unicode characters (escapes, etc.) in strings will be
    substituted with the character U+FFFD, which is reserved by the
    Unicode standard specifically for this type of use.

  - Failure to decode a particular value, usually the result of
    syntax errors, will generally be represented in the Python result
    as the 'demjson.undefined' singleton object.

* Error base type: The base error type 'JSONError' is now a subclass
  of Python's standard 'Exception' class rather than 'ValueError'.
  The new exception hierarchy is:

      Exception
      . demjson.JSONException
      . . demjson.JSONSkipHook
      . . demjson.JSONStopProcessing
      . . demjson.JSONError
      . . . demjson.JSONDecodeError
      . . . . demjson.JSONDecodeHookError
      . . . demjson.JSONEncodeError
      . . . . demjson.JSONEncodeHookError

  If any code had been using 'try...except' blocks with 'ValueError'
  then you will need to change; preferably to catch 'JSONError'.

* Exception chaining: Any errors that are incidentally raised during
  JSON encoding or decoding, such as UnicodeDecodeError or anything
  raised by user-supplied hook functions, will now be wrapped inside
  a standard JSONError (or subclass).

  When running in Python 3 the standard Exception Chaining (PEP 3134)
  mechanism is employed.  Under Python 2 exception chaining is
  simulated, but the traceback of the original exception may not be
  printed.  The original exception is in the __cause__ member of the
  outer exception and its traceback in the __traceback__ member.

The jsonlint command
--------------------

* The "jsonlint" command script will now be installed by default.

* Error message format: All error messages, including warnings and
  such, now have a standardized output format.  This includes the
  file position (line and column), and any other context.

  The first line of each message begins with some colon-separated
  fields: filename, line number, column number, and severity.
  Subsequent lines of a message are indented.  A sample message might
  be:

      sample.json:6:0: Warning: Object contains duplicate key: 'title'
         |  At line 6, column 0, offset 72
         |  Object started at line 1, column 0, offset 0 (AT-START)

  This format is compatible with many developer tools, such as the
  emacs 'compile-mode' syntax, which can parse the error messages and
  place your cursor directly at the point of the error.

* jsonlint class: Almost all the logic of the jsonlint script is now
  available as a new class, demjson.jsonlint, should you want to call
  it programmatically.  The included "jsonlint" script file is now
  just a very small wrapper around that class.
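The standardized colon-separated message format described above is
easy to consume from other tools.  As an illustrative sketch (this
regex is my own, not part of demjson), the first line of a message can
be split apart like this:

```python
import re

# Matches the first line of a jsonlint-style message:
#   filename:line:column: Severity: description
MSG = re.compile(
    r"^(?P<file>[^:]+):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<severity>\w+): (?P<text>.*)$"
)

m = MSG.match("sample.json:6:0: Warning: Object contains duplicate key: 'title'")
# m.group("file") == "sample.json", m.group("severity") == "Warning"
```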
* Other jsonlint improvements:

  - New -o option to specify the output filename
  - Verbosity is on by default; new --quiet option
  - Output formatting is cleaner, and has options to control
    indenting
  - Better help text

Other changes
-------------

* Sorting of object keys: When generating JSON there is now an
  option, 'sort_keys', to specify how the items within an object
  should be sorted.  The equivalent option is '--sort' for the
  jsonlint command.  The new default is to do a 'smart'
  alphabetical-and-numeric sort, so for example keys would be sorted
  like:

      { "item-1":1, "ITEM-2":2, "Item-003":3, "item-10":10 }

  You can sort by any of:

      SORT_SMART:     Smart alpha-numeric
      SORT_ALPHA:     Alphabetical, case-sensitive (in string Unicode
                      order)
      SORT_ALPHA_CI:  Alphabetical, case-insensitive
      SORT_NONE:      None (random, by hash table key)
      SORT_PRESERVE:  Preserve original order if possible.  This
                      requires the Python OrderedDict type, which was
                      introduced in Python 2.7.  For all normal
                      un-ordered dictionary types the sort order
                      reverts to SORT_ALPHA.
      function:       Any user-defined ordering function.

* New JavaScript literals: The latest versions of JavaScript
  (ECMAScript) have introduced new literal syntax.  When not in
  strict mode, demjson will now recognize several of these:

  - Octal numbers, e.g., 0o731
  - Binary numbers, e.g., 0b1011
  - Extended unicode escapes, e.g., \u{13F0C}

* Octal/decimal radix: Though not permitted in JSON, when in
  non-strict mode the decoder will allow numbers that begin with a
  leading zero digit.  Traditionally this has always been interpreted
  as an octal numeral.  However recent versions of JavaScript
  (ECMAScript 5) have changed the language syntax to interpret
  numbers with leading zeros as decimal.  Therefore demjson allows
  you to specify which radix should be used with the
  'leading_zero_radix' option.  Only radix values of 8 (octal) or 10
  (decimal) are permitted, where octal remains the default.
    demjson.decode( '023', strict=False )
    # gives => 19  (octal)

    demjson.decode( '023', strict=False, leading_zero_radix=10 )
    # gives => 23  (decimal)

The equivalent option for jsonlint is '--leading-zero-radix':

    $ echo "023" | jsonlint --quiet --format --leading-zero-radix=10
    23

* version_info: In addition to the 'demjson.__version__' string,
  there is a new 'demjson.version_info' object that allows more
  specific version testing, such as by major version number.

Version 1.6 released 2011-04-01
===============================

* Bug fix.  The jsonlint tool failed to accept a JSON document from
  standard input (stdin).  Also added --version and --copyright
  option support to jsonlint.  Thanks to Brian Bloniarz for reporting
  this bug.

* No changes to the core demjson library/module were made, other than
  a version number bump.

Version 1.5 released 2010-10-10
===============================

* Bug fix.  When encoding Python strings to JSON, occurrences of the
  character U+00FF (ASCII 255 or 0xFF) may result in an error.
  Thanks to Tom Kho and Yanxin Shi for reporting this bug.

Version 1.4 released 2008-12-17
===============================

* Changed license to LGPL 3 (GNU Lesser General Public License) or
  later.  Older versions still retain their original licenses.

* No changes other than relicensing were made.

Version 1.3 released 2008-03-19
===============================

* Change the default value of escape_unicode to False rather than
  True.

* Parsing JSON strings was not strict enough.  Prohibit multi-line
  string literals in strict mode.  Also prohibit control characters
  U+0000 through U+001F inside string literals unless they are
  \u-escaped.

* When in non-strict mode where object keys may be JavaScript
  identifiers, allow those identifiers to contain '$' and '_'.  Also
  introduce a method, decode_javascript_identifier(), which converts
  a JavaScript identifier into a Python string type, or can be
  overridden by a subclass to do something different.
* Use the Python decimal module if available for representing numbers
  that can not fit into a float without losing precision.  Also
  encode decimal numbers into JSON and use them as a source for NaN
  and Infinity values if necessary.

* Allow Python complex types to be encoded into JSON if their
  imaginary part is zero.

* When parsing JSON numbers try to keep whole numbers as integers
  rather than floats; e.g., '1e+3' will be 1000 rather than 1000.0.
  Also make sure that overflows and underflows (even for the larger
  Decimal type) always result in Infinity or -Infinity values.

* Handle more Python collection types when creating JSON; such as
  deque, set, array, and defaultdict.  Also fix a bug where UserDict
  was not properly handled because of its unusual iter() behavior.

Version 1.2 released 2007-11-08
===============================

* Changed license to GPL 3 or later.  Older versions still retain
  their original licenses.

* Lint Validator: Added a "jsonlint" command-line utility for
  validating JSON data files, and/or reformatting them.

* Major performance enhancements.  The most significant of the many
  changes was to use a new strategy during encoding to use lists and
  fast list operations rather than slow string concatenation.

* Floating-Point Precision: Fixed a bug which could cause loss of
  precision (e.g., number of significant digits) when encoding
  floating-point numbers into their JSON representation.  Also, the
  bundled test suite now properly tests floating-point encoding,
  allowing for slight rounding errors which may naturally occur on
  some platforms.

* Very Large Hex Numbers: Fixed a bug when decoding very large
  hexadecimal integers which could result in the wrong value for
  numbers larger than 0xffffffff.  Note that the language syntax
  allows such huge numbers, and since Python supports them too this
  module will decode such numbers.  However in practice very few
  other JSON or JavaScript implementations support arbitrary-size
  integers.
  Also, hex numbers are not valid when in strict mode.

* According to the JSON specification a document must start with
  either an object or an array type.  When in strict mode, if the
  first non-whitespace object is any other type it should be
  considered an invalid document.  The previous version erroneously
  decoded any JSON value (e.g., it considered the document "1" to be
  valid when it should not have done so).  Non-strict mode still
  allows any type, which can also be enabled by setting the new
  behavior flag 'allow_any_type_at_start'.

* Exception Handling: Minor improvements in exception handling by
  removing most cases where unbounded catching was performed (i.e.,
  an "except:" with no specified exception types), except during
  module initialization.  This makes the module more caller-friendly,
  for instance by not catching and "hiding" KeyboardInterrupt or
  other asynchronous exceptions.

* Identifier Parsing: The parser allows a more expanded syntax for
  JavaScript identifiers which is more compliant with the ECMAScript
  standard.  This will allow, for example, underscores and dollar
  signs to appear in identifiers.  Also, to provide further
  information to the caller, rather than converting identifiers into
  Python strings they are converted to a special string subclass.
  Thus they will look just like strings (and pass the
  "isinstance(x,basestring)" test), but the caller can do a type test
  to see whether the value originated from a JavaScript identifier or
  a string literal.  Note this only affects the non-strict (non-JSON)
  mode.

* Fixed a liberal parsing bug which would successfully decode the
  JSON ["a" "b"] into Python ['a', 'b'], rather than raising a syntax
  error for the missing comma.

* Fixed a bug in the encode_default() method which raised the wrong
  kind of error.  Thanks to Nicolas Bonardelle.

* Added more test cases to the bundled self-test program (see
  test/test_demjson.py).  There are now over 180 individual test
  cases being checked.
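The identifier-as-string-subclass idea described under "Identifier
Parsing" can be sketched in a few lines.  The class name below is
hypothetical, for illustration only; demjson's actual subclass
differs:

```python
class JSIdentifier(str):
    """A str subclass tagging values that came from JavaScript
    identifiers rather than string literals (illustrative sketch)."""
    pass

key = JSIdentifier("offsetWidth")

# It passes ordinary string tests...
is_string = isinstance(key, str)               # True
# ...but its origin is still detectable by a type test.
is_identifier = isinstance(key, JSIdentifier)  # True
```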
Version 1.1 released 2006-11-06
===============================

* Extensive self-testing code is now included, conforming to the
  standard Python unittest framework.  See the INSTALL.txt file for
  instructions.

* Corrected the character encoding sanity check, which would
  erroneously complain if the input contained a newline or tab
  character within the first two characters.

* The decode() and encode() top-level functions now allow additional
  keyword arguments to turn specific behaviors on or off that
  previously could only be done using the JSON class directly.  The
  keyword arguments look like 'allow_comments=True'.  Read the
  function docstrings for more information on this enhancement.

* The decoding of supplementary Unicode character escape sequences
  (such as "\ud801\udc02") was broken on some versions of Python.
  These are now decoded explicitly without relying on Python, so they
  always work.

* Some Unicode encoding and decoding with UCS-4 or UTF-32 was not
  handled correctly.

* Pseudo-string types from the UserString module are now correctly
  encoded as if they were real strings.

* Improved the simulation of the nan, inf, and neginf classes used if
  the Python interpreter doesn't support IEEE 754 floating point
  math.

* Updated the documentation to describe why this module does not
  permit multi-line string literals.
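Decoding a supplementary escape such as "\ud801\udc02" (mentioned
above) amounts to combining a UTF-16 surrogate pair into one code
point.  The arithmetic can be sketched as follows (illustrative only,
not demjson's code):

```python
def combine_surrogates(high, low):
    """Combine a UTF-16 high/low surrogate pair into a single
    supplementary-plane code point (illustrative sketch)."""
    assert 0xD800 <= high <= 0xDBFF, "not a high surrogate"
    assert 0xDC00 <= low <= 0xDFFF, "not a low surrogate"
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

# "\ud801\udc02" decodes to the single code point U+10402.
cp = combine_surrogates(0xD801, 0xDC02)
# cp == 0x10402
```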
Version 1.0 released 2006-08-10
===============================

* Initial public release

demjson-2.2.4/PKG-INFO

Metadata-Version: 1.1
Name: demjson
Version: 2.2.4
Summary: encoder, decoder, and lint/validator for JSON (JavaScript Object Notation) compliant with RFC 7159
Home-page: http://deron.meranda.us/python/demjson/
Author: Deron Meranda
Author-email: deron.meranda@gmail.com
License: GNU LGPL 3.0
Download-URL: http://deron.meranda.us/python/demjson/dist/demjson-2.2.4.tar.gz
Description: The "demjson" module, and the included "jsonlint" script,
        provide methods for encoding and decoding JSON formatted data, as
        well as checking JSON data for errors and/or portability issues.
        The jsonlint command/script can be used from the command line
        without needing any programming.

        Although the standard Python library now includes basic JSON
        support (which it did not when demjson was first written), this
        module provides a much more comprehensive implementation with many
        features not found elsewhere.  It is especially useful for error
        checking or for parsing JavaScript data which may not strictly be
        valid JSON data.
Keywords: JSON,jsonlint,JavaScript,UTF-32
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Internet :: WWW/HTTP :: Dynamic Content

demjson-2.2.4/README.md

demjson
=======

demjson is a [Python language](http://python.org/) module for
encoding, decoding, and syntax-checking [JSON](http://json.org/)
data.
It works under both Python 2 and Python 3.

It comes with a jsonlint script which can be used to validate your
JSON documents for strict conformance to the JSON specification, and
to detect potential data portability issues.  It can also reformat or
pretty-print JSON documents, either by re-indenting or by removing
unnecessary whitespace.

What's new
==========

Version 2.2.4 fixes a problem with jsonlint under Python 3 when trying
to reformat JSON (-f or -F options) and writing the output to standard
output.

Version 2.2.3 fixes incorrect return values from the "jsonlint"
command.  Also fixes a minor problem with the included unit tests in
certain Python versions.

Version 2.2.2 fixes installation problems with certain Python 3
versions prior to Python 3.4.  No other changes.

Version 2.2.1 adds an enhancement for HTML safety, and a few obscure
bug fixes.

Version 2.2 fixes compatibility with Python 2.6 and narrow-Unicode
Pythons, fixes bugs with statistics, and adds many enhancements to the
treatment of numbers and floating-point values.

Version 2.0.1 is a re-packaging of 2.0, after discovering problems
with incorrect checksums in the PyPI distribution of 2.0.  No changes
were made from 2.0.

Version 2.0, released 2014-05-21, is a MAJOR new version with many
changes and improvements.

Visit http://deron.meranda.us/python/demjson/ for complete details and
documentation.  Additional documentation may also be found under the
"docs/" folder of the source.

The biggest changes in 2.0 include:

* Now works in Python 3; minimum version supported is Python 2.6
* Much improved reporting of errors and warnings
* Extensible with user-supplied hooks
* Handles many additional Python data types automatically
* Statistics

There are many more changes, as well as a small number of backwards
incompatibilities.  Where possible these incompatibilities were kept
to a minimum, however it is highly recommended that you read the
change notes thoroughly.
Example use
===========

To use demjson from within your Python programs:

```python
>>> import demjson
>>> demjson.encode( ['one',42,True,None] )    # From Python to JSON
'["one",42,true,null]'
>>> demjson.decode( '["one",42,true,null]' )  # From JSON to Python
['one', 42, True, None]
```

To check a JSON data file for errors or problems:

```bash
$ jsonlint my.json
my.json:1:8: Error: Numbers may not have extra leading zeros: '017'
   |  At line 1, column 8, offset 8
my.json:4:10: Warning: Object contains same key more than once: 'Name'
   |  At line 4, column 10, offset 49
   |  Object started at line 1, column 0, offset 0 (AT-START)
my.json:9:11: Warning: Integers larger than 53-bits are not portable
   |  At line 9, column 11, offset 142
my.json: has errors
```

Why use demjson?
================

I wrote demjson before Python had any JSON support in its standard
library.  If all you need is to be able to read or write JSON data,
then you may wish to just use what's built into Python.  However
demjson is extremely feature rich and is quite useful in certain
applications.  It is especially good at error checking JSON data and
for being able to parse more of the JavaScript syntax than is
permitted by strict JSON.

A few advantages of demjson are:

* It works in old Python versions that don't have JSON built in;

* It generally has better error handling and "lint" checking
  capabilities;

* It will automatically use the Python Decimal (bigfloat) class
  instead of a floating-point number whenever there might be an
  overflow or loss of precision otherwise.

* It can correctly deal with different Unicode encodings, including
  ASCII.  It will automatically adapt when to use \u-escapes based on
  the encoding.

* It generates more conservative JSON, such as escaping Unicode
  format control characters or line terminators, which should improve
  data portability.

* In non-strict mode it can also deal with slightly non-conforming
  input that is more JavaScript than JSON (such as allowing
  comments).
* It supports a broader set of Python types during conversion.

Installation
============

To install, type:

```bash
python setup.py install
```

or optionally just copy the file "demjson.py" to wherever you want.
See "docs/INSTALL.txt" for more detailed instructions, including how
to run the self-tests.

More information
================

See the files under the "docs" subdirectory.  The module is also
self-documented, so within the python interpreter type:

```python
import demjson
help(demjson)
```

or from a shell command line:

```bash
pydoc demjson
```

The "jsonlint" command script which gets installed as part of demjson
has built-in usage instructions as well.  Just type:

```bash
jsonlint --help
```

Complete documentation and additional information is also available on
the project homepage at http://deron.meranda.us/python/demjson/

It is also available on the Python Package Index (PyPI) at
http://pypi.python.org/pypi/demjson/

License
=======

LGPLv3 - See the included "LICENSE.txt" file.

This software is Free Software and is licensed under the terms of the
GNU LGPL (GNU Lesser General Public License).  More information is
found at the top of the demjson.py source file and the included
LICENSE.txt file.

Releases prior to 1.4 were released under a different license, be
sure to check the corresponding LICENSE.txt file included with them.

This software was written by Deron Meranda, http://deron.meranda.us/

demjson-2.2.4/LICENSE.txt

This software and all accompanying material is licensed under the
terms of the "GNU LESSER GENERAL PUBLIC LICENSE" (LGPL) version 3, or
at your discretion any later version.  The LGPL license is essentially
an addendum to the "GNU GENERAL PUBLIC LICENSE" (GPL) which grants
some additional rights.  As such the complete texts of both the LGPL
and GPL licenses are included below.
The license texts may also be found, along with additional information, on the Free Software Foundation's website, at . ====================================================================== GNU LESSER GENERAL PUBLIC LICENSE Version 3, 29 June 2007 Copyright (C) 2007 Free Software Foundation, Inc. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. This version of the GNU Lesser General Public License incorporates the terms and conditions of version 3 of the GNU General Public License, supplemented by the additional permissions listed below. 0. Additional Definitions. As used herein, "this License" refers to version 3 of the GNU Lesser General Public License, and the "GNU GPL" refers to version 3 of the GNU General Public License. "The Library" refers to a covered work governed by this License, other than an Application or a Combined Work as defined below. An "Application" is any work that makes use of an interface provided by the Library, but which is not otherwise based on the Library. Defining a subclass of a class defined by the Library is deemed a mode of using an interface provided by the Library. A "Combined Work" is a work produced by combining or linking an Application with the Library. The particular version of the Library with which the Combined Work was made is also called the "Linked Version". The "Minimal Corresponding Source" for a Combined Work means the Corresponding Source for the Combined Work, excluding any source code for portions of the Combined Work that, considered in isolation, are based on the Application, and not on the Linked Version. The "Corresponding Application Code" for a Combined Work means the object code and/or source code for the Application, including any data and utility programs needed for reproducing the Combined Work from the Application, but excluding the System Libraries of the Combined Work. 1. Exception to Section 3 of the GNU GPL. 
You may convey a covered work under sections 3 and 4 of this License without being bound by section 3 of the GNU GPL. 2. Conveying Modified Versions. If you modify a copy of the Library, and, in your modifications, a facility refers to a function or data to be supplied by an Application that uses the facility (other than as an argument passed when the facility is invoked), then you may convey a copy of the modified version: a) under this License, provided that you make a good faith effort to ensure that, in the event an Application does not supply the function or data, the facility still operates, and performs whatever part of its purpose remains meaningful, or b) under the GNU GPL, with none of the additional permissions of this License applicable to that copy. 3. Object Code Incorporating Material from Library Header Files. The object code form of an Application may incorporate material from a header file that is part of the Library. You may convey such object code under terms of your choice, provided that, if the incorporated material is not limited to numerical parameters, data structure layouts and accessors, or small macros, inline functions and templates (ten or fewer lines in length), you do both of the following: a) Give prominent notice with each copy of the object code that the Library is used in it and that the Library and its use are covered by this License. b) Accompany the object code with a copy of the GNU GPL and this license document. 4. Combined Works. You may convey a Combined Work under terms of your choice that, taken together, effectively do not restrict modification of the portions of the Library contained in the Combined Work and reverse engineering for debugging such modifications, if you also do each of the following: a) Give prominent notice with each copy of the Combined Work that the Library is used in it and that the Library and its use are covered by this License. 
 b) Accompany the Combined Work with a copy of the GNU GPL and this license document.

 c) For a Combined Work that displays copyright notices during execution, include the copyright notice for the Library among these notices, as well as a reference directing the user to the copies of the GNU GPL and this license document.

 d) Do one of the following:

    0) Convey the Minimal Corresponding Source under the terms of this License, and the Corresponding Application Code in a form suitable for, and under terms that permit, the user to recombine or relink the Application with a modified version of the Linked Version to produce a modified Combined Work, in the manner specified by section 6 of the GNU GPL for conveying Corresponding Source.

    1) Use a suitable shared library mechanism for linking with the Library. A suitable mechanism is one that (a) uses at run time a copy of the Library already present on the user's computer system, and (b) will operate properly with a modified version of the Library that is interface-compatible with the Linked Version.

 e) Provide Installation Information, but only if you would otherwise be required to provide such information under section 6 of the GNU GPL, and only to the extent that such information is necessary to install and execute a modified version of the Combined Work produced by recombining or relinking the Application with a modified version of the Linked Version. (If you use option 4d0, the Installation Information must accompany the Minimal Corresponding Source and Corresponding Application Code. If you use option 4d1, you must provide the Installation Information in the manner specified by section 6 of the GNU GPL for conveying Corresponding Source.)

5. Combined Libraries.
You may place library facilities that are a work based on the Library side by side in a single library together with other library facilities that are not Applications and are not covered by this License, and convey such a combined library under terms of your choice, if you do both of the following:

 a) Accompany the combined library with a copy of the same work based on the Library, uncombined with any other library facilities, conveyed under the terms of this License.

 b) Give prominent notice with the combined library that part of it is a work based on the Library, and explaining where to find the accompanying uncombined form of the same work.

6. Revised Versions of the GNU Lesser General Public License.

The Free Software Foundation may publish revised and/or new versions of the GNU Lesser General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.

Each version is given a distinguishing version number. If the Library as you received it specifies that a certain numbered version of the GNU Lesser General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that published version or of any later version published by the Free Software Foundation. If the Library as you received it does not specify a version number of the GNU Lesser General Public License, you may choose any version of the GNU Lesser General Public License ever published by the Free Software Foundation.

If the Library as you received it specifies that a proxy can decide whether future versions of the GNU Lesser General Public License shall apply, that proxy's public statement of acceptance of any version is permanent authorization for you to choose that version for the Library.
======================================================================

GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007

Copyright (C) 2007 Free Software Foundation, Inc.
Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

Preamble

The GNU General Public License is a free, copyleft license for software and other kinds of works.

The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users. We, the Free Software Foundation, use the GNU General Public License for most of our software; it applies also to any other work released this way by its authors. You can apply it to your programs, too.

When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things.

To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights. Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others.

For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.
Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it.

For the developers' and authors' protection, the GPL clearly explains that there is no warranty for this free software. For both users' and authors' sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions.

Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users' freedom to change the software. The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users.

Finally, every program is threatened constantly by software patents. States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary. To prevent this, the GPL assures that patents cannot be used to render the program non-free.

The precise terms and conditions for copying, distribution and modification follow.

TERMS AND CONDITIONS

0. Definitions.

"This License" refers to version 3 of the GNU General Public License.

"Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks.

"The Program" refers to any copyrightable work licensed under this License.
Each licensee is addressed as "you". "Licensees" and "recipients" may be individuals or organizations.

To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work.

A "covered work" means either the unmodified Program or a work based on the Program.

To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well.

To "convey" a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying.

An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion.

1. Source Code.

The "source code" for a work means the preferred form of the work for making modifications to it. "Object code" means any non-source form of a work.
A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language.

The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it.

The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work.

The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source.

The Corresponding Source for a work in source code form is that same work.

2. Basic Permissions.
All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law.

You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you.

Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary.

3. Protecting Users' Legal Rights From Anti-Circumvention Law.

No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures.
When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures.

4. Conveying Verbatim Copies.

You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program.

You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee.

5. Conveying Modified Source Versions.

You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions:

 a) The work must carry prominent notices stating that you modified it, and giving a relevant date.

 b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices".

 c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged.
This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it.

 d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so.

A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate.

6. Conveying Non-Source Forms.

You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways:

 a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange.
 b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge.

 c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b.

 d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements.

 e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d.
A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work.

A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product.

"Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made.

If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information.
But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM).

The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network.

Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying.

7. Additional Terms.

"Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions.

When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms:

 a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or

 b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or

 c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or

 d) Limiting the use for publicity purposes of names of licensors or authors of the material; or

 e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or

 f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors.

All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms.

Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way.

8. Termination.

You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11).

However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation.

Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice.

Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10.

9. Acceptance Not Required for Having Copies.

You are not required to accept this License in order to receive or run a copy of the Program.
Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so.

10. Automatic Licensing of Downstream Recipients.

Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License.

An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts.

You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it.

11. Patents.

A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based.
The work thus licensed is called the contributor's "contributor version".

A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License.

Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version.

In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party.

If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients.
"Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid.

If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it.

A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007.

Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law.

12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program.

13. Use with the GNU Affero General Public License.

Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such.

14. Revised Versions of this License.

The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.

Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation.
If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program.

Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version.

15. Disclaimer of Warranty.

THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

16. Limitation of Liability.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee.

END OF TERMS AND CONDITIONS

How to Apply These Terms to Your New Programs

If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.

To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found.

    Copyright (C)

    This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

    This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

    You should have received a copy of the GNU General Public License along with this program. If not, see .

Also add information on how to contact you by electronic and paper mail.

If the program does terminal interaction, make it output a short notice like this when it starts in an interactive mode:

    Copyright (C)
    This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
    This is free software, and you are welcome to redistribute it
    under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, your program's commands might be different; for a GUI interface, you would use an "about box".

You should also get your employer (if you work as a programmer) or school, if any, to sign a "copyright disclaimer" for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see .

The GNU General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. But first, please read .

demjson-2.2.4/README.txt

demjson
=======

MORE DOCUMENTATION IS IN THE "docs" SUBDIRECTORY.

'demjson' is a Python language module for encoding, decoding, and
syntax-checking JSON data.  It works under both Python 2 and Python 3.

It comes with a "jsonlint" script which can be used to validate your
JSON documents for strict conformance to the JSON specification.  It
can also reformat or pretty-print JSON documents, either by
re-indenting or by removing unnecessary whitespace for
minimal/canonical JSON output.

demjson tries to conform as closely as possible to the JSON
specification, published as IETF RFC 7159.  It can also be used in a
non-strict mode where it is much closer to the JavaScript/ECMAScript
syntax (published as ECMA 262).  The demjson module has full Unicode
support and can deal with very large numbers.
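The "very large numbers" support comes down to falling back on Python's standard decimal module when a plain float would lose precision. A minimal stdlib-only sketch of the underlying problem (no demjson required):

```python
from decimal import Decimal

big = "123456789012345678901234567890"   # 30 significant digits

# A float keeps only ~15-17 significant decimal digits, so the value changes:
float_roundtrip = int(float(big))

# Decimal preserves every digit exactly:
decimal_roundtrip = int(Decimal(big))
```

Here `float_roundtrip` no longer equals the original integer, while `decimal_roundtrip` does; this is why demjson prefers Decimal when precision would otherwise be lost.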
Example use
===========

To use demjson from within your Python programs:

    import demjson

    # Convert a Python value into JSON
    demjson.encode( {'Happy': True, 'Other': None} )
         # returns string => {"Happy":true,"Other":null}

    # Convert a JSON document into a Python value
    demjson.decode( '{"Happy":true,"Other":null}' )
         # returns dict => {'Other': None, 'Happy': True}

To use the accompanying "jsonlint" command script:

    # To check whether a file contains valid JSON
    jsonlint sample.json

    # To pretty-print (reformat) a JSON file
    jsonlint --format sample.json

Why use demjson rather than the Python standard library?
========================================================

demjson was written before Python had any built-in JSON support in
its standard library, and there were just a handful of third-party
libraries.  None of those at that time were completely compliant with
the RFC, and the best of them required compiled C extensions rather
than being pure Python code.  So I wrote demjson to be:

 * Pure Python, requiring no compiled extension.
 * 100% RFC compliant. It should follow the JSON specification exactly.

It should be noted that Python has since added JSON to its standard
library -- which was actually an absorption of "simplejson", written
by Bob Ippolito, and which had by then been fixed to remove bugs and
improve RFC conformance.  For most uses, the standard Python JSON
library should be sufficient.

However demjson may still be useful:

 * It works in old Python versions that don't have JSON built in;
 * It generally has better error handling and "lint" checking capabilities;
 * It will automatically use the Python Decimal (bigfloat) class
   instead of a floating-point number whenever there might be an
   overflow or loss of precision otherwise.
 * It can correctly deal with different Unicode encodings, including
   ASCII.  It will automatically adapt when to use \u-escapes based
   on the encoding.
 * It generates more conservative JSON, such as escaping Unicode
   format control characters or line terminators, which should
   improve data portability.
 * In non-strict mode it can also deal with slightly non-conforming
   input that is more JavaScript than JSON (such as allowing comments).
 * It supports a broader set of types during conversion, including
   Python's Decimal or UserString.

Installation
============

To install, type:

   python setup.py install

or optionally just copy the file "demjson.py" to wherever you want.
See docs/INSTALL.txt for more detailed instructions, including how
to run the self-tests.

More information
================

See the files under the "docs" subdirectory.  The module is
self-documented, so within the python interpreter type:

    import demjson
    help(demjson)

or from a command line:

    pydoc demjson

The "jsonlint" command script which gets installed as part of demjson
has built-in usage instructions as well.  Just type:

   jsonlint --help

Complete documentation and additional information is also available
on the project homepage at http://deron.meranda.us/python/demjson/

It is also available on the Python Package Index (PyPI) at
http://pypi.python.org/pypi/demjson/

License
=======

LGPLv3 - See the included "LICENSE.txt" file.

This software is Free Software and is licensed under the terms of the
GNU LGPL (GNU Lesser General Public License).  More information is
found at the top of the demjson.py source file and the included
LICENSE.txt file.  Releases prior to 1.4 were released under a
different license; be sure to check the corresponding LICENSE.txt
file included with them.
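The escaping behavior described above is about portability: ASCII-safe output with \u-escapes survives any transport, at the cost of readability. The standard library's json module illustrates the same trade-off with its ensure_ascii flag (shown here only as a stdlib analogy; this is not demjson's API, which instead adapts automatically to the target encoding):

```python
import json

# ensure_ascii=True (the default) escapes all non-ASCII characters:
escaped = json.dumps({"café": 1})                      # '{"caf\\u00e9": 1}'

# ensure_ascii=False emits the raw non-ASCII character instead:
raw = json.dumps({"café": 1}, ensure_ascii=False)      # '{"café": 1}'
```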
This software was written by Deron Meranda, http://deron.meranda.us/

demjson-2.2.4/setup.py

# Python package setup script -*- coding: utf-8 -*-

name = 'demjson'
version = '2.2.4'

import sys
try:
    py_major = sys.version_info.major
except AttributeError:
    py_major = sys.version_info[0]

distmech = None

if py_major >= 3:
    # Python 3, use setuptools first
    try:
        from setuptools import setup
        distmech = 'setuptools'
    except ImportError:
        from distutils.core import setup
        distmech = 'distutils'
else:
    # Python 2, use distutils first
    try:
        from distutils.core import setup
        distmech = 'distutils'
    except ImportError:
        from setuptools import setup
        distmech = 'setuptools'

if False:
    sys.stdout.write("Using Python: %s\n" % sys.version.split(None,1)[0])
    sys.stdout.write("Using installer: %s\n" % distmech )

py3extra = {}
if py_major >= 3:
    # Make sure 2to3 gets run
    if distmech == 'setuptools':
        py3extra['use_2to3'] = True
        #py3extra['convert_2to3_doctests'] = ['src/your/module/README.txt']
        #py3extra['use_2to3_fixers'] = ['your.fixers']
    elif distmech == 'distutils':
        import distutils, distutils.command, distutils.command.build_py, distutils.command.build_scripts
        cmdclass = { 'build_py': distutils.command.build_py.build_py_2to3,
                     'build_scripts': distutils.command.build_scripts.build_scripts_2to3 }
        py3extra['cmdclass'] = cmdclass

setup( name=name,
       version=version,
       py_modules=[name],
       scripts=['jsonlint'],
       author='Deron Meranda',
       author_email='deron.meranda@gmail.com',
       url='http://deron.meranda.us/python/%s/'%name,
       download_url='http://deron.meranda.us/python/%(name)s/dist/%(name)s-%(version)s.tar.gz'\
           % {'name':name, 'version':version},
       description='encoder, decoder, and lint/validator for JSON (JavaScript Object Notation) compliant with RFC 7159',
       long_description="""
The "demjson" module, and the included "jsonlint" script, provide
methods for encoding and decoding JSON formatted data, as well as
checking JSON data for
errors and/or portability issues.  The jsonlint command/script can be
used from the command line without needing any programming.

Although the standard Python library now includes basic JSON support
(which it did not when demjson was first written), this module
provides a much more comprehensive implementation with many features
not found elsewhere.  It is especially useful for error checking or
for parsing JavaScript data which may not strictly be valid JSON data.
""".strip(),
       license='GNU LGPL 3.0',
       keywords=['JSON','jsonlint','JavaScript','UTF-32'],
       platforms=[],
       classifiers=["Development Status :: 5 - Production/Stable",
                    "Intended Audience :: Developers",
                    "License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)",
                    "Operating System :: OS Independent",
                    "Programming Language :: Python :: 2",
                    "Programming Language :: Python :: 3",
                    "Topic :: Software Development :: Libraries :: Python Modules",
                    "Topic :: Internet :: WWW/HTTP :: Dynamic Content"
                    ],
       **py3extra
       )

demjson-2.2.4/jsonlint

#!/usr/bin/env python
# -*- coding: utf-8 -*-
r"""A JSON syntax validator and formatter tool.

Requires demjson module.

"""
__author__ = "Deron Meranda "
__homepage__ = "http://deron.meranda.us/python/demjson/"
__date__ = "2014-12-22"
__version__ = "2.2.4"
__credits__ = """Copyright (c) 2006-2015 Deron E. Meranda

Licensed under GNU LGPL (GNU Lesser General Public License) version 3.0
or later.  See LICENSE.txt included with this software.

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.

You should have received a copy of the GNU Lesser General Public License
along with this program.  If not, see or .

"""

import sys
try:
    import demjson
except ImportError:
    sys.stderr.write("Can not import the demjson Python module.\n")
    sys.stderr.write("Try running:  pip install demjson\n")
    sys.exit(1)

if __name__ == '__main__':
    lint = demjson.jsonlint( program_name=sys.argv[0] )
    rc = lint.main( sys.argv[1:] )
    sys.exit(rc)

demjson-2.2.4/demjson.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
r"""A JSON data encoder and decoder.

This Python module implements the JSON (http://json.org/) data
encoding format; a subset of ECMAScript (aka JavaScript) for encoding
primitive data types (numbers, strings, booleans, lists, and
associative arrays) in a language-neutral simple text-based syntax.

It can encode or decode between JSON formatted strings and native
Python data types.  Normally you would use the encode() and decode()
functions defined by this module, but if you want more control over
the processing you can use the JSON class.

This implementation tries to be as completely conforming to all
intricacies of the standards as possible.  It can operate in strict
mode (which only allows JSON-compliant syntax) or a non-strict mode
(which allows much more of the whole ECMAScript permitted syntax).
This includes complete support for Unicode strings (including
surrogate-pairs for non-BMP characters), and all number formats
including negative zero and IEEE 754 non-numbers such as NaN or
Infinity.

The JSON/ECMAScript to Python type mappings are:

    ---JSON---            ---Python---
    null                  None
    undefined             undefined  (note 1)
    Boolean (true,false)  bool  (True or False)
    Integer               int or long  (note 2)
    Float                 float
    String                str or unicode  ( "..." or u"..." )
    Array [a, ...]        list  ( [...] )
    Object {a:b, ...}     dict  ( {...} )

-- Note 1.
an 'undefined' object is declared in this module which
represents the native Python value for this type when in non-strict
mode.

-- Note 2. some ECMAScript integers may be up-converted to Python
floats, such as 1e+40.  Also integer -0 is converted to float -0, so
as to preserve the sign (which ECMAScript requires).

-- Note 3. numbers requiring more significant digits than can be
represented by the Python float type will be converted into a Python
Decimal type, from the standard 'decimal' module.

In addition, when operating in non-strict mode, several IEEE 754
non-numbers are also handled, and are mapped to specific Python
objects declared in this module:

    NaN (not a number)     nan     (float('nan'))
    Infinity, +Infinity    inf     (float('inf'))
    -Infinity              neginf  (float('-inf'))

When encoding Python objects into JSON, you may use types other than
native lists or dictionaries, as long as they support the minimal
interfaces required of all sequences or mappings.  This means you can
use generators and iterators, tuples, UserDict subclasses, etc.

To make it easier to produce JSON encoded representations of user
defined classes, if the object has a method named json_equivalent(),
then it will call that method and attempt to encode the object
returned from it instead.  It will do this recursively as needed and
before any attempt to encode the object using its default strategies.
Note that any json_equivalent() method should return "equivalent"
Python objects to be encoded, not an already-encoded JSON-formatted
string.

There is no such aid provided to decode JSON back into user-defined
classes as that would dramatically complicate the interface.

When decoding strings with this module it may operate in either strict
or non-strict mode.  The strict mode only allows syntax which is
conforming to RFC 7159 (JSON), while the non-strict allows much more
of the permissible ECMAScript syntax.
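The json_equivalent() protocol described above can be sketched with a small hypothetical user class (the class name and attributes here are illustrative, not part of demjson):

```python
class Point(object):
    """A hypothetical user-defined class that demjson cannot encode directly."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def json_equivalent(self):
        # Return equivalent plain Python values for the encoder to
        # serialize -- NOT an already-encoded JSON string.
        return {"x": self.x, "y": self.y}

# demjson.encode(Point(3, 4)) would then encode the returned dict
# (exact whitespace and key order in the output may vary).
```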
The following are permitted when processing in NON-STRICT mode:

 * Unicode format control characters are allowed anywhere in the input.
 * All Unicode line terminator characters are recognized.
 * All Unicode white space characters are recognized.
 * The 'undefined' keyword is recognized.
 * Hexadecimal number literals are recognized (e.g., 0xA6, 0177).
 * String literals may use either single or double quote marks.
 * Strings may contain \x (hexadecimal) escape sequences, as well as
   the \v and \0 escape sequences.
 * Lists may have omitted (elided) elements, e.g., [,,,,,], with
   missing elements interpreted as 'undefined' values.
 * Object properties (dictionary keys) can be of any of the types:
   string literals, numbers, or identifiers (the latter of which are
   treated as if they are string literals)---as permitted by
   ECMAScript.  JSON only permits string literals as keys.

Concerning non-strict and non-ECMAScript allowances:

 * Octal numbers: If you allow the 'octal_numbers' behavior (which
   is never enabled by default), then you can use octal integers
   and octal character escape sequences (per the ECMAScript
   standard Annex B.1.2).  This behavior is allowed, if enabled,
   because it was valid JavaScript at one time.

 * Multi-line string literals:  Strings which are more than one line
   long (contain embedded raw newline characters) are never permitted.
   This is neither valid JSON nor ECMAScript.  Some other JSON
   implementations may allow this, but this module considers that
   behavior to be a mistake.

References:
 * JSON (JavaScript Object Notation)
 * RFC 7159. The application/json Media Type for JavaScript Object Notation (JSON)
 * ECMA-262 3rd edition (1999)
 * IEEE 754-1985: Standard for Binary Floating-Point Arithmetic.

"""

__author__ = "Deron Meranda "
__homepage__ = "http://deron.meranda.us/python/demjson/"
__date__ = "2015-12-22"
__version__ = "2.2.4"
__version_info__ = ( 2, 2, 4 )   # Will be converted into a namedtuple below
__credits__ = """Copyright (c) 2006-2015 Deron E.
Meranda

Licensed under GNU LGPL (GNU Lesser General Public License) version 3.0
or later.  See LICENSE.txt included with this software.

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU Lesser General Public License
along with this program.  If not, see or .

"""

# ----------------------------------------------------------------------
# Set demjson version
try:
    from collections import namedtuple as _namedtuple
    __version_info__ = _namedtuple('version_info', ['major','minor','micro'])( *__version_info__ )
except ImportError:
    raise ImportError("demjson %s requires Python 2.6 or later" % __version__ )
version, version_info = __version__, __version_info__

# Determine Python version
_py_major, _py_minor = None, None
def _get_pyver():
    global _py_major, _py_minor
    import sys
    vi = sys.version_info
    try:
        _py_major, _py_minor = vi.major, vi.minor
    except AttributeError:
        _py_major, _py_minor = vi[0], vi[1]
_get_pyver()

# ----------------------------------------------------------------------
# Useful global constants

content_type = 'application/json'
file_ext = 'json'

class _dummy_context_manager(object):
    """A context manager that does nothing on entry or exit."""
    def __enter__(self):
        pass
    def __exit__(self, exc_type, exc_val, exc_tb):
        return False
_dummy_context_manager = _dummy_context_manager()

# ----------------------------------------------------------------------
# Decimal and float types.
#
# If a JSON number can not be stored in a Python float without losing
# precision and Python has the decimal type, then we will try to
# use decimal instead of float.  To make this determination we need to
# know the limits of the float type, but Python doesn't have an easy
# way to tell what the largest floating-point number it supports.  So,
# we determine the precision and scale of the float type by testing it.

try:
    # decimal module was introduced in Python 2.4
    import decimal
except ImportError:
    decimal = None

def determine_float_limits( number_type=float ):
    """Determines the precision and range of the given float type.

    The passed in 'number_type' argument should refer to the type of
    floating-point number.  It should either be the built-in 'float',
    or a decimal context or constructor; i.e., one of:

        # 1. FLOAT TYPE
        determine_float_limits( float )

        # 2. DEFAULT DECIMAL CONTEXT
        determine_float_limits( decimal.Decimal )

        # 3. CUSTOM DECIMAL CONTEXT
        ctx = decimal.Context( prec=75 )
        determine_float_limits( ctx )

    Returns a named tuple with components:

        ( significant_digits, max_exponent, min_exponent )

    Where:
        * significant_digits -- maximum number of *decimal* digits
          that can be represented without any loss of precision.
          This is conservative, so if there are 16 1/2 digits, it
          will return 16, not 17.

        * max_exponent -- The maximum exponent (power of 10) that can
          be represented before an overflow (or rounding to infinity)
          occurs.

        * min_exponent -- The minimum exponent (negative power of 10)
          that can be represented before either an underflow (rounding
          to zero) or a subnormal result (loss of precision) occurs.
          Note this is conservative, as subnormal numbers are excluded.
""" if decimal: numeric_exceptions = (ValueError,decimal.Overflow,decimal.Underflow) else: numeric_exceptions = (ValueError,) if decimal and number_type == decimal.Decimal: number_type = decimal.DefaultContext if decimal and isinstance(number_type, decimal.Context): # Passed a decimal Context, extract the bound creator function. create_num = number_type.create_decimal decimal_ctx = decimal.localcontext(number_type) is_zero_or_subnormal = lambda n: n.is_zero() or n.is_subnormal() elif number_type == float: create_num = number_type decimal_ctx = _dummy_context_manager is_zero_or_subnormal = lambda n: n==0 else: raise TypeError("Expected a float type, e.g., float or decimal context") with decimal_ctx: zero = create_num('0.0') # Find signifianct digits by comparing floats of increasing # number of digits, differing in the last digit only, until # they numerically compare as being equal. sigdigits = None n = 0 while True: n = n + 1 pfx = '0.' + '1'*n a = create_num( pfx + '0') for sfx in '123456789': # Check all possible last digits to # avoid any partial-decimal. b = create_num( pfx + sfx ) if (a+zero) == (b+zero): sigdigits = n break if sigdigits: break # Find exponent limits. First find order of magnitude and # then use a binary search to find the exact exponent. base = '1.' + '1'*(sigdigits-1) base0 = '1.' 
+ '1'*(sigdigits-2) minexp, maxexp = None, None for expsign in ('+','-'): minv = 0; maxv = 10 # First find order of magnitude of exponent limit while True: try: s = base + 'e' + expsign + str(maxv) s0 = base0 + 'e' + expsign + str(maxv) f = create_num( s ) + zero f0 = create_num( s0 ) + zero except numeric_exceptions: f = None if not f or not str(f)[0].isdigit() or is_zero_or_subnormal(f) or f==f0: break else: minv = maxv maxv = maxv * 10 # Now do a binary search to find exact limit while True: if minv+1 == maxv: if expsign=='+': maxexp = minv else: minexp = minv break elif maxv < minv: if expsign=='+': maxexp = None else: minexp = None break m = (minv + maxv) // 2 try: s = base + 'e' + expsign + str(m) s0 = base0 + 'e' + expsign + str(m) f = create_num( s ) + zero f0 = create_num( s0 ) + zero except numeric_exceptions: f = None else: if not f or not str(f)[0].isdigit(): f = None elif is_zero_or_subnormal(f) or f==f0: f = None if not f: # infinite maxv = m else: minv = m return _namedtuple('float_limits', ['significant_digits', 'max_exponent', 'min_exponent'])( sigdigits, maxexp, -minexp ) float_sigdigits, float_maxexp, float_minexp = determine_float_limits( float ) # For backwards compatibility with older demjson versions: def determine_float_precision(): v = determine_float_limits( float ) return ( v.significant_digits, v.max_exponent ) # ---------------------------------------------------------------------- # The undefined value. # # ECMAScript has an undefined value (similar to yet distinct from null). # Neither Python or strict JSON have support undefined, but to allow # JavaScript behavior we must simulate it. 
class _undefined_class(object):
    """Represents the ECMAScript 'undefined' value."""
    __slots__ = []
    def __repr__(self):
        return self.__module__ + '.undefined'
    def __str__(self):
        return 'undefined'
    def __nonzero__(self):
        return False
undefined = _undefined_class()
syntax_error = _undefined_class()   # same as undefined, but has separate identity
del _undefined_class

# ----------------------------------------------------------------------
# Non-Numbers: NaN, Infinity, -Infinity
#
# ECMAScript has official support for non-number floats, although
# strict JSON does not.  Python doesn't either.  So to support the
# full JavaScript behavior we must try to add them into Python, which
# is unfortunately a bit of black magic.  If our python implementation
# happens to be built on top of IEEE 754 we can probably trick python
# into using real floats.  Otherwise we must simulate it with classes.

def _nonnumber_float_constants():
    """Try to return the NaN, Infinity, and -Infinity float values.

    This is necessarily complex because there is no standard
    platform-independent way to do this in Python as the language
    (opposed to some implementation of it) doesn't discuss
    non-numbers.  We try various strategies from the best to the
    worst.

    If this Python interpreter uses the IEEE 754 floating point
    standard then the returned values will probably be real instances
    of the 'float' type.  Otherwise a custom class object is returned
    which will attempt to simulate the correct behavior as much as
    possible.

    """
    try:
        # First, try (mostly portable) float constructor.  Works under
        # Linux x86 (gcc) and some Unices.
        nan = float('nan')
        inf = float('inf')
        neginf = float('-inf')
    except ValueError:
        try:
            # Try the AIX (PowerPC) float constructors
            nan = float('NaNQ')
            inf = float('INF')
            neginf = float('-INF')
        except ValueError:
            try:
                # Next, try binary unpacking.  Should work under
                # platforms using IEEE 754 floating point.
                import struct, sys
                xnan = '7ff8000000000000'.decode('hex')  # Quiet NaN
                xinf = '7ff0000000000000'.decode('hex')
                xcheck = 'bdc145651592979d'.decode('hex')  # -3.14159e-11
                # Could use float.__getformat__, but it is a new python feature,
                # so we use sys.byteorder.
                if sys.byteorder == 'big':
                    nan = struct.unpack('d', xnan)[0]
                    inf = struct.unpack('d', xinf)[0]
                    check = struct.unpack('d', xcheck)[0]
                else:
                    nan = struct.unpack('d', xnan[::-1])[0]
                    inf = struct.unpack('d', xinf[::-1])[0]
                    check = struct.unpack('d', xcheck[::-1])[0]
                neginf = - inf
                if check != -3.14159e-11:
                    raise ValueError('Unpacking raw IEEE 754 floats does not work')
            except (ValueError, TypeError):
                # Punt, make some fake classes to simulate.  These are
                # not perfect though.  For instance nan * 1.0 == nan,
                # as expected, but 1.0 * nan == 0.0, which is wrong.
                class nan(float):
                    """An approximation of the NaN (not a number) floating point number."""
                    def __repr__(self): return 'nan'
                    def __str__(self): return 'nan'
                    def __add__(self,x): return self
                    def __radd__(self,x): return self
                    def __sub__(self,x): return self
                    def __rsub__(self,x): return self
                    def __mul__(self,x): return self
                    def __rmul__(self,x): return self
                    def __div__(self,x): return self
                    def __rdiv__(self,x): return self
                    def __divmod__(self,x): return (self,self)
                    def __rdivmod__(self,x): return (self,self)
                    def __mod__(self,x): return self
                    def __rmod__(self,x): return self
                    def __pow__(self,exp): return self
                    def __rpow__(self,exp): return self
                    def __neg__(self): return self
                    def __pos__(self): return self
                    def __abs__(self): return self
                    def __lt__(self,x): return False
                    def __le__(self,x): return False
                    def __eq__(self,x): return False
                    def __neq__(self,x): return True
                    def __ge__(self,x): return False
                    def __gt__(self,x): return False
                    def __complex__(self,*a):
                        raise NotImplementedError('NaN can not be converted to a complex')
                if decimal:
                    nan = decimal.Decimal('NaN')
                else:
                    nan = nan()
                class inf(float):
                    """An approximation of the +Infinity floating point number."""
                    def __repr__(self): return 'inf'
                    def __str__(self): return 'inf'
                    def __add__(self,x): return self
                    def __radd__(self,x): return self
                    def __sub__(self,x): return self
                    def __rsub__(self,x): return self
                    def __mul__(self,x):
                        if x is neginf or x < 0:
                            return neginf
                        elif x == 0:
                            return nan
                        else:
                            return self
                    def __rmul__(self,x): return self.__mul__(x)
                    def __div__(self,x):
                        if x == 0:
                            raise ZeroDivisionError('float division')
                        elif x < 0:
                            return neginf
                        else:
                            return self
                    def __rdiv__(self,x):
                        if x is inf or x is neginf or x is nan:
                            return nan
                        return 0.0
                    def __divmod__(self,x):
                        if x == 0:
                            raise ZeroDivisionError('float divmod()')
                        elif x < 0:
                            return (nan,nan)
                        else:
                            return (self,self)
                    def __rdivmod__(self,x):
                        if x is inf or x is neginf or x is nan:
                            return (nan, nan)
                        return (0.0, x)
                    def __mod__(self,x):
                        if x == 0:
                            raise ZeroDivisionError('float modulo')
                        else:
                            return nan
                    def __rmod__(self,x):
                        if x is inf or x is neginf or x is nan:
                            return nan
                        return x
                    def __pow__(self, exp):
                        if exp == 0:
                            return 1.0
                        else:
                            return self
                    def __rpow__(self, x):
                        if -1 < x < 1: return 0.0
                        elif x == 1.0: return 1.0
                        elif x is nan or x is neginf or x < 0: return nan
                        else: return self
                    def __neg__(self): return neginf
                    def __pos__(self): return self
                    def __abs__(self): return self
                    def __lt__(self,x): return False
                    def __le__(self,x):
                        if x is self: return True
                        else: return False
                    def __eq__(self,x):
                        if x is self: return True
                        else: return False
                    def __neq__(self,x):
                        if x is self: return False
                        else: return True
                    def __ge__(self,x): return True
                    def __gt__(self,x): return True
                    def __complex__(self,*a):
                        raise NotImplementedError('Infinity can not be converted to a complex')
                if decimal:
                    inf = decimal.Decimal('Infinity')
                else:
                    inf = inf()
                class neginf(float):
                    """An approximation of the -Infinity floating point number."""
                    def __repr__(self): return '-inf'
                    def __str__(self): return '-inf'
                    def __add__(self,x): return self
                    def __radd__(self,x): return self
                    def __sub__(self,x): return self
                    def __rsub__(self,x): return self
                    def __mul__(self,x):
                        if x is self or x < 0:
                            return inf
                        elif x == 0:
                            return nan
                        else:
                            return self
                    def __rmul__(self,x): return self.__mul__(x)  # was __mul__(self), an apparent typo
                    def __div__(self,x):
                        if x == 0:
                            raise ZeroDivisionError('float division')
                        elif x < 0:
                            return inf
                        else:
                            return self
                    def __rdiv__(self,x):
                        if x is inf or x is neginf or x is nan:
                            return nan
                        return -0.0
                    def __divmod__(self,x):
                        if x == 0:
                            raise ZeroDivisionError('float divmod()')
                        elif x < 0:
                            return (nan,nan)
                        else:
                            return (self,self)
                    def __rdivmod__(self,x):
                        if x is inf or x is neginf or x is nan:
                            return (nan, nan)
                        return (-0.0, x)
                    def __mod__(self,x):
                        if x == 0:
                            raise ZeroDivisionError('float modulo')
                        else:
                            return nan
                    def __rmod__(self,x):
                        if x is inf or x is neginf or x is nan:
                            return nan
                        return x
                    def __pow__(self,exp):
                        if exp == 0:
                            return 1.0
                        else:
                            return self
                    def __rpow__(self, x):
                        if x is nan or x is inf or x is neginf:  # original repeated "x is inf"
                            return nan
                        return 0.0
                    def __neg__(self): return inf
                    def __pos__(self): return self
                    def __abs__(self): return inf
                    def __lt__(self,x): return True
                    def __le__(self,x): return True
                    def __eq__(self,x):
                        if x is self: return True
                        else: return False
                    def __neq__(self,x):
                        if x is self: return False
                        else: return True
                    def __ge__(self,x):
                        if x is self: return True
                        else: return False
                    def __gt__(self,x): return False
                    def __complex__(self,*a):
                        raise NotImplementedError('-Infinity can not be converted to a complex')
                if decimal:
                    neginf = decimal.Decimal('-Infinity')
                else:
                    neginf = neginf(0)
    return nan, inf, neginf

nan, inf, neginf = _nonnumber_float_constants()
del _nonnumber_float_constants

# ----------------------------------------------------------------------
# Integers

class json_int( (1L).__class__ ):   # Have to specify base this way to satisfy 2to3
    """A subclass of the Python int/long that remembers its format (hex,octal,etc).

    Initialize it the same as an int, but also accepts an additional keyword
    argument 'number_format' which should be one of the NUMBER_FORMAT_* values.
n = json_int( x[, base, number_format=NUMBER_FORMAT_DECIMAL] ) """ def __new__(cls, *args, **kwargs): if 'number_format' in kwargs: number_format = kwargs['number_format'] del kwargs['number_format'] if number_format not in (NUMBER_FORMAT_DECIMAL, NUMBER_FORMAT_HEX, NUMBER_FORMAT_OCTAL, NUMBER_FORMAT_LEGACYOCTAL, NUMBER_FORMAT_BINARY): raise TypeError("json_int(): Invalid value for number_format argument") else: number_format = NUMBER_FORMAT_DECIMAL obj = super(json_int,cls).__new__(cls,*args,**kwargs) obj._jsonfmt = number_format return obj @property def number_format(self): """The original radix format of the number""" return self._jsonfmt def json_format(self): """Returns the integer value formatted as a JSON literal""" fmt = self._jsonfmt if fmt == NUMBER_FORMAT_HEX: return format(self, '#x') elif fmt == NUMBER_FORMAT_OCTAL: return format(self, '#o') elif fmt == NUMBER_FORMAT_BINARY: return format(self, '#b') elif fmt == NUMBER_FORMAT_LEGACYOCTAL: if self==0: return '0' # For some reason Python's int doesn't do '00' elif self < 0: return '-0%o' % (-self) else: return '0%o' % self else: return str(self) # ---------------------------------------------------------------------- # String processing helpers def skipstringsafe( s, start=0, end=None ): i = start #if end is None: # end = len(s) unsafe = helpers.unsafe_string_chars while i < end and s[i] not in unsafe: #c = s[i] #if c in unsafe_string_chars: # break i += 1 return i def skipstringsafe_slow( s, start=0, end=None ): i = start if end is None: end = len(s) while i < end: c = s[i] if c == '"' or c == "'" or c == '\\' or ord(c) <= 0x1f: break i += 1 return i def extend_list_with_sep( orig_seq, extension_seq, sepchar='' ): if not sepchar: orig_seq.extend( extension_seq ) else: for i, x in enumerate(extension_seq): if i > 0: orig_seq.append( sepchar ) orig_seq.append( x ) def extend_and_flatten_list_with_sep( orig_seq, extension_seq, separator='' ): for i, part in enumerate(extension_seq): if i > 0 and separator: 
orig_seq.append( separator )
        orig_seq.extend( part )

# ----------------------------------------------------------------------
# Unicode UTF-32
# ----------------------------------------------------------------------

def _make_raw_bytes( byte_list ):
    """Takes a list of byte values (numbers) and returns a bytes (Python 3) or string (Python 2)
    """
    if _py_major >= 3:
        b = bytes( byte_list )
    else:
        b = ''.join(chr(n) for n in byte_list)
    return b

import codecs

class utf32(codecs.CodecInfo):
    """Unicode UTF-32 and UCS4 encoding/decoding support.

    This is for older Pythons which did not have UTF-32 codecs.

    JSON requires that all JSON implementations must support the
    UTF-32 encoding (as well as UTF-8 and UTF-16).  But earlier
    versions of Python did not provide a UTF-32 codec, so we must
    implement UTF-32 ourselves in case we need it.

    See http://en.wikipedia.org/wiki/UTF-32

    """
    BOM_UTF32_BE = _make_raw_bytes([ 0, 0, 0xFE, 0xFF ])  #'\x00\x00\xfe\xff'
    BOM_UTF32_LE = _make_raw_bytes([ 0xFF, 0xFE, 0, 0 ])  #'\xff\xfe\x00\x00'

    @staticmethod
    def lookup( name ):
        """A standard Python codec lookup function for UCS4/UTF32.

        If it recognizes an encoding name it returns a CodecInfo
        structure which contains the various encode and decoder
        functions to use.

        """
        ci = None
        name = name.upper()
        if name in ('UCS4BE','UCS-4BE','UCS-4-BE','UTF32BE','UTF-32BE','UTF-32-BE'):
            ci = codecs.CodecInfo( utf32.utf32be_encode, utf32.utf32be_decode, name='utf-32be')
        elif name in ('UCS4LE','UCS-4LE','UCS-4-LE','UTF32LE','UTF-32LE','UTF-32-LE'):
            ci = codecs.CodecInfo( utf32.utf32le_encode, utf32.utf32le_decode, name='utf-32le')
        elif name in ('UCS4','UCS-4','UTF32','UTF-32'):
            ci = codecs.CodecInfo( utf32.encode, utf32.decode, name='utf-32')
        return ci

    @staticmethod
    def encode( obj, errors='strict', endianness=None, include_bom=True ):
        """Encodes a Unicode string into a UTF-32 encoded byte string.

        Returns a tuple: (bytearray, num_chars)

        The errors argument should be one of 'strict', 'ignore', or 'replace'.

        The endianness should be one of:
            * 'B', '>', or 'big'     -- Big endian
            * 'L', '<', or 'little'  -- Little endian
            * None -- Default, from sys.byteorder

        If include_bom is true a Byte-Order Mark will be written to
        the beginning of the string, otherwise it will be omitted.

        """
        import sys, struct
        # Make a container that can store bytes
        if _py_major >= 3:
            f = bytearray()
            write = f.extend
            def tobytes():
                return bytes(f)
        else:
            try:
                import cStringIO as sio
            except ImportError:
                import StringIO as sio
            f = sio.StringIO()
            write = f.write
            tobytes = f.getvalue
        if not endianness:
            endianness = sys.byteorder
        if endianness.upper()[0] in ('B>'):
            big_endian = True
        elif endianness.upper()[0] in ('L<'):
            big_endian = False
        else:
            raise ValueError("Invalid endianness %r: expected 'big', 'little', or None" % endianness)
        pack = struct.pack
        packspec = '>L' if big_endian else '<L'
        num_chars = 0
        if include_bom:
            if big_endian:
                write( utf32.BOM_UTF32_BE )
            else:
                write( utf32.BOM_UTF32_LE )
        for pos, c in enumerate(obj):
            n = ord(c)
            if 0xD800 <= n <= 0xDFFF:
                # Surrogate code points can not appear in valid UTF-32
                if errors == 'ignore':
                    continue
                elif errors == 'replace':
                    n = 0xFFFD
                else:
                    raise UnicodeEncodeError('utf32',obj,pos,pos+1,'surrogate code points are not allowed in UTF-32')
            write( pack( packspec, n ) )
            num_chars += 1
        return (tobytes(), num_chars)

    @staticmethod
    def utf32le_encode( obj, errors='strict', include_bom=False ):
        """Encodes a Unicode string into a UTF-32LE (little endian) encoded byte string."""
        return utf32.encode( obj, errors=errors, endianness='L', include_bom=include_bom )

    @staticmethod
    def utf32be_encode( obj, errors='strict', include_bom=False ):
        """Encodes a Unicode string into a UTF-32BE (big endian) encoded byte string."""
        return utf32.encode( obj, errors=errors, endianness='B', include_bom=include_bom )

    @staticmethod
    def decode( obj, errors='strict', endianness=None ):
        """Decodes a UTF-32 byte string into a Unicode string.

        Returns a tuple: (string, num_bytes)

        The errors argument should be one of 'strict', 'ignore',
        'replace', 'backslashreplace', or 'xmlcharrefreplace'.

        """
        import sys, struct
        maxunicode = sys.maxunicode
        unpack = struct.unpack
        # Check for a BOM at the start of the input, which also
        # determines the byte order if one was not explicitly given.
        num_bytes = 0
        if obj.startswith( utf32.BOM_UTF32_BE ):
            obj = obj[4:]
            num_bytes += 4
            big_endian = True
        elif obj.startswith( utf32.BOM_UTF32_LE ):
            obj = obj[4:]
            num_bytes += 4
            big_endian = False
        elif endianness and endianness.upper()[0] in ('B>'):
            big_endian = True
        elif endianness and endianness.upper()[0] in ('L<'):
            big_endian = False
        else:
            big_endian = (sys.byteorder == 'big')
        unpackspec = '>L' if big_endian else '<L'
        chars = []
        for i in range(0, len(obj), 4):
            seq = obj[i:i+4]
            if len(seq) < 4:
                raise UnicodeDecodeError('utf32',obj,i,i+len(seq),'Truncated character')
            n = unpack( unpackspec, seq )[0]
            num_bytes += 4
            if n > maxunicode or (0xD800 <= n <= 0xDFFF):
                if errors == 'strict':
                    raise UnicodeDecodeError('utf32',obj,i,i+4,'Invalid code point U+%04X' % n)
                elif errors == 'replace':
                    chars.append( unichr(0xFFFD) )
                elif errors == 'backslashreplace':
                    if n > 0xffff:
                        esc = "\\U%08x" % (n,)
                    else:
                        esc = "\\u%04x" % (n,)
                    for esc_c in esc:
                        chars.append( esc_c )
                elif errors == 'xmlcharrefreplace':
                    esc = "&#%d;" % (n,)
                    for esc_c in esc:
                        chars.append( esc_c )
                else: # ignore
                    pass
            else:
                chars.append( helpers.safe_unichr(n) )
        return (u''.join( chars ), num_bytes)

    @staticmethod
    def utf32le_decode( obj, errors='strict' ):
        """Decodes a UTF-32LE (little endian) byte string into a Unicode string."""
        return utf32.decode( obj, errors=errors, endianness='L' )

    @staticmethod
    def utf32be_decode( obj, errors='strict' ):
        """Decodes a UTF-32BE (big endian) byte string into a Unicode string."""
        return utf32.decode( obj, errors=errors, endianness='B' )

# ----------------------------------------------------------------------
# Helper functions
# ----------------------------------------------------------------------

def _make_unsafe_string_chars():
    import unicodedata
    unsafe = 
[] for c in [unichr(i) for i in range(0x100)]: if c == u'"' or c == u'\\' \ or unicodedata.category( c ) in ['Cc','Cf','Zl','Zp']: unsafe.append( c ) return u''.join( unsafe ) class helpers(object): """A set of utility functions.""" hexdigits = '0123456789ABCDEFabcdef' octaldigits = '01234567' unsafe_string_chars = _make_unsafe_string_chars() import sys maxunicode = sys.maxunicode always_use_custom_codecs = False # If True use demjson's codecs # before system codecs. This # is mainly here for testing. javascript_reserved_words = frozenset([ # Keywords (plus "let") (ECMAScript 6 section 11.6.2.1) 'break','case','catch','class','const','continue', 'debugger','default','delete','do','else','export', 'extends','finally','for','function','if','import', 'in','instanceof','let','new','return','super', 'switch','this','throw','try','typeof','var','void', 'while','with','yield', # Future reserved words (ECMAScript 6 section 11.6.2.2) 'enum','implements','interface','package', 'private','protected','public','static', # null/boolean literals 'null','true','false' ]) @staticmethod def make_raw_bytes( byte_list ): """Constructs a byte array (bytes in Python 3, str in Python 2) from a list of byte values (0-255). 
""" return _make_raw_bytes( byte_list ) @staticmethod def is_hex_digit( c ): """Determines if the given character is a valid hexadecimal digit (0-9, a-f, A-F).""" return (c in helpers.hexdigits) @staticmethod def is_octal_digit( c ): """Determines if the given character is a valid octal digit (0-7).""" return (c in helpers.octaldigits) @staticmethod def is_binary_digit( c ): """Determines if the given character is a valid binary digit (0 or 1).""" return (c == '0' or c == '1') @staticmethod def char_is_json_ws( c ): """Determines if the given character is a JSON white-space character""" return c in ' \t\n\r' @staticmethod def safe_unichr( codepoint ): """Just like Python's unichr() but works in narrow-Unicode Pythons.""" if codepoint >= 0x10000 and codepoint > helpers.maxunicode: # Narrow-Unicode python, construct a UTF-16 surrogate pair. w1, w2 = helpers.make_surrogate_pair( codepoint ) if w2 is None: c = unichr(w1) else: c = unichr(w1) + unichr(w2) else: c = unichr(codepoint) return c @staticmethod def char_is_unicode_ws( c ): """Determines if the given character is a Unicode space character""" if not isinstance(c,unicode): c = unicode(c) if c in u' \t\n\r\f\v': return True import unicodedata return unicodedata.category(c) == 'Zs' @staticmethod def char_is_json_eol( c ): """Determines if the given character is a JSON line separator""" return c in '\n\r' @staticmethod def char_is_unicode_eol( c ): """Determines if the given character is a Unicode line or paragraph separator. These correspond to CR and LF as well as Unicode characters in the Zl or Zp categories. """ return c in u'\r\n\u2028\u2029' @staticmethod def char_is_identifier_leader( c ): """Determines if the character may be the first character of a JavaScript identifier. """ return c.isalpha() or c in '_$' @staticmethod def char_is_identifier_tail( c ): """Determines if the character may be part of a JavaScript identifier. 
""" return c.isalnum() or c in u'_$\u200c\u200d' @staticmethod def extend_and_flatten_list_with_sep( orig_seq, extension_seq, separator='' ): for i, part in enumerate(extension_seq): if i > 0 and separator: orig_seq.append( separator ) orig_seq.extend( part ) @staticmethod def strip_format_control_chars( txt ): """Filters out all Unicode format control characters from the string. ECMAScript permits any Unicode "format control characters" to appear at any place in the source code. They are to be ignored as if they are not there before any other lexical tokenization occurs. Note that JSON does not allow them, except within string literals. * Ref. ECMAScript section 7.1. * http://en.wikipedia.org/wiki/Unicode_control_characters There are dozens of Format Control Characters, for example: U+00AD SOFT HYPHEN U+200B ZERO WIDTH SPACE U+2060 WORD JOINER """ import unicodedata txt2 = filter( lambda c: unicodedata.category(unicode(c)) != 'Cf', txt ) # 2to3 NOTE: The following is needed to work around a broken # Python3 conversion in which filter() will be transformed # into a list rather than a string. if not isinstance(txt2,basestring): txt2 = u''.join(txt2) return txt2 @staticmethod def lookup_codec( encoding ): """Wrapper around codecs.lookup(). Returns None if codec not found, rather than raising a LookupError. """ import codecs if isinstance( encoding, codecs.CodecInfo ): return encoding encoding = encoding.lower() import codecs if helpers.always_use_custom_codecs: # Try custom utf32 first, then standard python codecs cdk = utf32.lookup(encoding) if not cdk: try: cdk = codecs.lookup( encoding ) except LookupError: cdk = None else: # Try standard python codecs first, then custom utf32 try: cdk = codecs.lookup( encoding ) except LookupError: cdk = utf32.lookup( encoding ) return cdk @staticmethod def auto_detect_encoding( s ): """Takes a string (or byte array) and tries to determine the Unicode encoding it is in. Returns the encoding name, as a string. 
""" if not s or len(s)==0: return "utf-8" # Get the byte values of up to the first 4 bytes ords = [] for i in range(0, min(len(s),4)): x = s[i] if isinstance(x, basestring): x = ord(x) ords.append( x ) # Look for BOM marker import sys, codecs bom2, bom3, bom4 = None, None, None if len(s) >= 2: bom2 = s[:2] if len(s) >= 3: bom3 = s[:3] if len(s) >= 4: bom4 = s[:4] # Assign values of first four bytes to: a, b, c, d; and last byte to: z a, b, c, d, z = None, None, None, None, None if len(s) >= 1: a = ords[0] if len(s) >= 2: b = ords[1] if len(s) >= 3: c = ords[2] if len(s) >= 4: d = ords[3] z = s[-1] if isinstance(z, basestring): z = ord(z) if bom4 and ( (hasattr(codecs,'BOM_UTF32_LE') and bom4 == codecs.BOM_UTF32_LE) or (bom4 == utf32.BOM_UTF32_LE) ): encoding = 'utf-32le' s = s[4:] elif bom4 and ( (hasattr(codecs,'BOM_UTF32_BE') and bom4 == codecs.BOM_UTF32_BE) or (bom4 == utf32.BOM_UTF32_BE) ): encoding = 'utf-32be' s = s[4:] elif bom2 and bom2 == codecs.BOM_UTF16_LE: encoding = 'utf-16le' s = s[2:] elif bom2 and bom2 == codecs.BOM_UTF16_BE: encoding = 'utf-16be' s = s[2:] elif bom3 and bom3 == codecs.BOM_UTF8: encoding = 'utf-8' s = s[3:] # No BOM, so autodetect encoding used by looking at first four # bytes according to RFC 4627 section 3. The first and last bytes # in a JSON document will be ASCII. The second byte will be ASCII # unless the first byte was a quotation mark. elif len(s)>=4 and a==0 and b==0 and c==0 and d!=0: # UTF-32BE (0 0 0 x) encoding = 'utf-32be' elif len(s)>=4 and a!=0 and b==0 and c==0 and d==0 and z==0: # UTF-32LE (x 0 0 0 [... 0]) encoding = 'utf-32le' elif len(s)>=2 and a==0 and b!=0: # UTF-16BE (0 x) encoding = 'utf-16be' elif len(s)>=2 and a!=0 and b==0 and z==0: # UTF-16LE (x 0 [... 0]) encoding = 'utf-16le' elif ord('\t') <= a <= 127: # First byte appears to be ASCII, so guess UTF-8. 
            encoding = 'utf8'
        else:
            raise ValueError("Can not determine the Unicode encoding for byte stream")
        return encoding

    @staticmethod
    def unicode_decode( txt, encoding=None ):
        """Takes a string (or byte array) and tries to convert it to a Unicode string.

        Returns a named tuple:  (string, codec, bom)

        The 'encoding' argument, if supplied, should be either the
        name of a character encoding, or an instance of
        codecs.CodecInfo.  If the encoding argument is None or "auto"
        then the encoding is automatically determined, if possible.

        Any BOM (Byte Order Mark) that is found at the beginning of
        the input will be stripped off and placed in the 'bom'
        portion of the returned value.

        """
        if isinstance(txt, unicode):
            res = _namedtuple('DecodedString',['string','codec','bom'])( txt, None, None )
        else:
            if encoding is None or encoding == 'auto':
                encoding = helpers.auto_detect_encoding( txt )
            cdk = helpers.lookup_codec( encoding )
            if not cdk:
                raise LookupError("Can not find codec for encoding %r" % encoding)
            try:
                # Determine if codec takes arguments; try a decode of nothing
                cdk.decode( helpers.make_raw_bytes([]), errors='strict' )
            except TypeError:
                cdk_kw = {}  # This codec doesn't like the errors argument
            else:
                cdk_kw = {'errors': 'strict'}
            unitxt, numbytes = cdk.decode( txt, **cdk_kw )  # DO THE DECODE HERE!
            # Remove BOM if present
            if len(unitxt) > 0 and unitxt[0] == u'\uFEFF':
                bom = cdk.encode(unitxt[0])[0]
                unitxt = unitxt[1:]
            elif len(unitxt) > 0 and unitxt[0] == u'\uFFFE':  # Reversed BOM
                raise UnicodeDecodeError(cdk.name,txt,0,0,"Wrong byte order, found reversed BOM U+FFFE")
            else:
                bom = None
            res = _namedtuple('DecodedString',['string','codec','bom'])( unitxt, cdk, bom )
        return res

    @staticmethod
    def surrogate_pair_as_unicode( c1, c2 ):
        """Takes a pair of unicode surrogates and returns the equivalent unicode character.

        The input pair must be a surrogate pair, with c1 in the range
        U+D800 to U+DBFF and c2 in the range U+DC00 to U+DFFF.
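        As an illustrative example, U+1F600 is represented by the
        surrogate pair (U+D83D, U+DE00), which this function
        reassembles using the arithmetic:

            (((0xD83D - 0xD800) << 10) | (0xDE00 - 0xDC00)) + 0x10000  =>  0x1F600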
""" n1, n2 = ord(c1), ord(c2) if n1 < 0xD800 or n1 > 0xDBFF or n2 < 0xDC00 or n2 > 0xDFFF: raise JSONDecodeError('illegal Unicode surrogate pair',(c1,c2)) a = n1 - 0xD800 b = n2 - 0xDC00 v = (a << 10) | b v += 0x10000 return helpers.safe_unichr(v) @staticmethod def unicode_as_surrogate_pair( c ): """Takes a single unicode character and returns a sequence of surrogate pairs. The output of this function is a tuple consisting of one or two unicode characters, such that if the input character is outside the BMP range then the output is a two-character surrogate pair representing that character. If the input character is inside the BMP then the output tuple will have just a single character...the same one. """ n = ord(c) w1, w2 = helpers.make_surrogate_pair(n) if w2 is None: return (unichr(w1),) else: return (unichr(w1), unichr(w2)) @staticmethod def make_surrogate_pair( codepoint ): """Given a Unicode codepoint (int) returns a 2-tuple of surrogate codepoints.""" if codepoint < 0x10000: return (codepoint,None) # in BMP, surrogate pair not required v = codepoint - 0x10000 vh = (v >> 10) & 0x3ff # highest 10 bits vl = v & 0x3ff # lowest 10 bits w1 = 0xD800 | vh w2 = 0xDC00 | vl return (w1, w2) @staticmethod def isnumbertype( obj ): """Is the object of a Python number type (excluding complex)?""" return isinstance(obj, (int,long,float)) \ and not isinstance(obj, bool) \ or obj is nan or obj is inf or obj is neginf \ or (decimal and isinstance(obj, decimal.Decimal)) @staticmethod def is_negzero( n ): """Is the number value a negative zero?""" if isinstance( n, float ): return n == 0.0 and repr(n).startswith('-') elif decimal and isinstance( n, decimal.Decimal ): return n.is_zero() and n.is_signed() else: return False @staticmethod def is_nan( n ): """Is the number a NaN (not-a-number)?""" if isinstance( n, float ): return n is nan or n.hex() == 'nan' or n != n elif decimal and isinstance( n, decimal.Decimal ): return n.is_nan() else: return False @staticmethod def 
is_infinite( n ):
        """Is the number infinite?"""
        if isinstance( n, float ):
            return n is inf or n is neginf or n.hex() in ('inf','-inf')
        elif decimal and isinstance( n, decimal.Decimal ):
            return n.is_infinite()
        else:
            return False

    @staticmethod
    def isstringtype( obj ):
        """Is the object of a Python string type?"""
        if isinstance(obj, basestring):
            return True
        # Must also check for some other pseudo-string types
        import types, UserString
        return isinstance(obj, types.StringTypes) \
               or isinstance(obj, UserString.UserString)
        ## or isinstance(obj, UserString.MutableString)

    @staticmethod
    def decode_hex( hexstring ):
        """Decodes a hexadecimal string into its integer value."""
        # We don't use the builtin 'hex' codec in python since it can
        # not handle odd numbers of digits, nor raise the same type
        # of exceptions we want to.
        n = 0
        for c in hexstring:
            if '0' <= c <= '9':
                d = ord(c) - ord('0')
            elif 'a' <= c <= 'f':
                d = ord(c) - ord('a') + 10
            elif 'A' <= c <= 'F':
                d = ord(c) - ord('A') + 10
            else:
                raise ValueError('Not a hexadecimal number', hexstring)
            # Could use ((n << 4 ) | d), but python 2.3 issues a FutureWarning.
            n = (n * 16) + d
        return n

    @staticmethod
    def decode_octal( octalstring ):
        """Decodes an octal string into its integer value."""
        n = 0
        for c in octalstring:
            if '0' <= c <= '7':
                d = ord(c) - ord('0')
            else:
                raise ValueError('Not an octal number', octalstring)
            # Could use ((n << 3 ) | d), but python 2.3 issues a FutureWarning.
            n = (n * 8) + d
        return n

    @staticmethod
    def decode_binary( binarystring ):
        """Decodes a binary string into its integer value."""
        n = 0
        for c in binarystring:
            if c == '0':
                d = 0
            elif c == '1':
                d = 1
            else:
                raise ValueError('Not a binary number', binarystring)
            # Could use ((n << 1 ) | d), but python 2.3 issues a FutureWarning.
            n = (n * 2) + d
        return n

    @staticmethod
    def format_timedelta_iso( td ):
        """Encodes a datetime.timedelta into ISO-8601 Time Period format.
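        As an illustrative example, a timedelta of 1 day, 2 hours,
        and 5 seconds would be encoded as:

            format_timedelta_iso( datetime.timedelta(days=1, hours=2, seconds=5) )  =>  'P1DT2H5S'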
""" d = td.days s = td.seconds ms = td.microseconds m, s = divmod(s,60) h, m = divmod(m,60) a = ['P'] if d: a.append( '%dD' % d ) if h or m or s or ms: a.append( 'T' ) if h: a.append( '%dH' % h ) if m: a.append( '%dM' % m ) if s or ms: if ms: a.append( '%d.%06d' % (s,ms) ) else: a.append( '%d' % s ) if len(a)==1: a.append('T0S') return ''.join(a) # ---------------------------------------------------------------------- # File position indicator # ---------------------------------------------------------------------- class position_marker(object): """A position marks a specific place in a text document. It consists of the following attributes: * line - The line number, starting at 1 * column - The column on the line, starting at 0 * char_position - The number of characters from the start of the document, starting at 0 * text_after - (optional) a short excerpt of the text of document starting at the current position Lines are separated by any Unicode line separator character. As an exception a CR+LF character pair is treated as being a single line separator demarcation. Columns are simply a measure of the number of characters after the start of a new line, starting at 0. Visual effects caused by Unicode characters such as combining characters, bidirectional text, zero-width characters and so on do not affect the computation of the column regardless of visual appearance. The char_position is a count of the number of characters since the beginning of the document, starting at 0. As used within the buffered_stream class, if the document starts with a Unicode Byte Order Mark (BOM), the BOM prefix is NOT INCLUDED in the count. 
""" def __init__(self, offset=0, line=1, column=0, text_after=None): self.__char_position = offset self.__line = line self.__column = column self.__text_after = text_after self.__at_end = False self.__last_was_cr = False @property def line(self): """The current line within the document, starts at 1.""" return self.__line @property def column(self): """The current character column from the beginning of the document, starts at 0. """ return self.__column @property def char_position(self): """The current character offset from the beginning of the document, starts at 0. """ return self.__char_position @property def at_start(self): """Returns True if the position is at the start of the document.""" return (self.char_position == 0) @property def at_end(self): """Returns True if the position is at the end of the document. This property must be set by the user. """ return self.__at_end @at_end.setter def at_end(self, b): """Sets the at_end property to True or False. """ self.__at_end = bool(b) @property def text_after(self): """Returns a textual excerpt starting at the current position. This property must be set by the user. """ return self.__at_end @text_after.setter def text_after(self, value): """Sets the text_after property to a given string. 
""" self.__text_after = value def __repr__(self): s = "%s(offset=%r,line=%r,column=%r" \ % (self.__class__.__name__, self.__char_position, self.__line, self.__column) if self.text_after: s += ",text_after=%r" % (self.text_after,) s += ")" return s def describe(self, show_text=True): """Returns a human-readable description of the position, in English.""" s = "line %d, column %d, offset %d" % (self.__line, self.__column, self.__char_position) if self.at_start: s += " (AT-START)" elif self.at_end: s += " (AT-END)" if show_text and self.text_after: s += ", text %r" % (self.text_after) return s def __str__(self): """Same as the describe() function.""" return self.describe( show_text=True ) def copy( self ): """Create a copy of the position object.""" p = self.__class__() p.__char_position = self.__char_position p.__line = self.__line p.__column = self.__column p.text_after = self.__text_after p.at_end = self.at_end p.__last_was_cr = self.__last_was_cr return p def rewind( self ): """Set the position to the start of the document.""" if not self.at_start: self.text_after = None self.at_end = False self.__char_position = 0 self.__line = 1 self.__column = 0 self.__last_was_cr = False def advance( self, s ): """Advance the position from its current place according to the given string of characters. """ if s: self.text_after = None for c in s: self.__char_position += 1 if c == '\n' and self.__last_was_cr: self.__last_was_cr = False elif helpers.char_is_unicode_eol(c): self.__line += 1 self.__column = 0 self.__last_was_cr = (c == '\r') else: self.__column += 1 self.__last_was_cr = False # ---------------------------------------------------------------------- # Buffered Stream Reader # ---------------------------------------------------------------------- class buffered_stream(object): """A helper class for the JSON parser. 
It allows for reading an input document, while handling some low-level Unicode issues as well as tracking the current position in terms of line and column position. """ def __init__(self, txt='', encoding=None): self.reset() self.set_text( txt, encoding ) def reset(self): """Clears the state to nothing.""" self.__pos = position_marker() self.__saved_pos = [] # Stack of saved positions self.__bom = helpers.make_raw_bytes([]) # contains copy of byte-order mark, if any self.__codec = None # The CodecInfo self.__encoding = None # The name of the codec's encoding self.__input_is_bytes = False self.__rawbuf = None self.__raw_bytes = None self.__cmax = 0 self.num_ws_skipped = 0 def save_position(self): self.__saved_pos.append( self.__pos.copy() ) return True def clear_saved_position(self): if self.__saved_pos: self.__saved_pos.pop() return True else: return False def restore_position(self): try: old_pos = self.__saved_pos.pop() # Can raise IndexError except IndexError, err: raise IndexError("Attempt to restore buffer position that was never saved") else: self.__pos = old_pos return True def _find_codec(self, encoding): if encoding is None: self.__codec = None self.__encoding = None elif isinstance(encoding, codecs.CodecInfo): self.__codec = encoding self.__encoding = self.__codec.name else: self.__encoding = encoding self.__codec = helpers.lookup_codec( encoding ) if not self.__codec: raise JSONDecodeError('no codec available for character encoding',encoding) return self.__codec def set_text( self, txt, encoding=None ): """Changes the input text document and rewinds the position to the start of the new document. 
""" import sys self.rewind() self.__codec = None self.__bom = None self.__rawbuf = u'' self.__cmax = 0 # max number of chars in input try: decoded = helpers.unicode_decode( txt, encoding ) except JSONError: raise except Exception, err: # Re-raise as a JSONDecodeError e2 = sys.exc_info() newerr = JSONDecodeError("a Unicode decoding error occurred") # Simulate Python 3's: "raise X from Y" exception chaining newerr.__cause__ = err newerr.__traceback__ = e2[2] raise newerr else: self.__codec = decoded.codec self.__bom = decoded.bom self.__rawbuf = decoded.string self.__cmax = len(self.__rawbuf) def __repr__(self): return '<%s at %r text %r>' % (self.__class__.__name__, self.__pos, self.text_context) def rewind(self): """Resets the position back to the start of the input text.""" self.__pos.rewind() @property def codec(self): """The codec object used to perform Unicode decoding, or None.""" return self.__codec @property def bom(self): """The Unicode Byte-Order Mark (BOM), if any, that was present at the start of the input text. The returned BOM is a string of the raw bytes, and is not Unicode-decoded. """ return self.__bom @property def cpos(self): """The current character offset from the start of the document.""" return self.__pos.char_position @property def position(self): """The current position (as a position_marker object). Returns a copy. """ p = self.__pos.copy() p.text_after = self.text_context p.at_end = self.at_end return p @property def at_start(self): """Returns True if the position is currently at the start of the document, or False otherwise. """ return self.__pos.at_start @property def at_end(self): """Returns True if the position is currently at the end of the document, of False otherwise. """ c = self.peek() return (not c) def at_ws(self, allow_unicode_whitespace=True): """Returns True if the current position contains a white-space character. 
""" c = self.peek() if not c: return False elif allow_unicode_whitespace: return helpers.char_is_unicode_ws(c) else: return helpers.char_is_json_ws(c) def at_eol(self, allow_unicode_eol=True): """Returns True if the current position contains an end-of-line control character. """ c = self.peek() if not c: return True # End of file is treated as end of line elif allow_unicode_eol: return helpers.char_is_unicode_eol(c) else: return helpers.char_is_json_eol(c) def peek( self, offset=0 ): """Returns the character at the current position, or at a given offset away from the current position. If the position is beyond the limits of the document size, then an empty string '' is returned. """ i = self.cpos + offset if i < 0 or i >= self.__cmax: return '' return self.__rawbuf[i] def peekstr( self, span=1, offset=0 ): """Returns one or more characters starting at the current position, or at a given offset away from the current position, and continuing for the given span length. If the offset and span go outside the limit of the current document size, then the returned string may be shorter than the requested span length. """ i = self.cpos + offset j = i + span if i < 0 or i >= self.__cmax: return '' return self.__rawbuf[i : j] @property def text_context( self, context_size = 20 ): """A short human-readable textual excerpt of the document at the current position, in English. """ context_size = max( context_size, 4 ) s = self.peekstr(context_size + 1) if not s: return '' if len(s) > context_size: s = s[:context_size - 3] + "..." return s def startswith( self, s ): """Determines if the text at the current position starts with the given string. See also method: pop_if_startswith() """ s2 = self.peekstr( len(s) ) return s == s2 def skip( self, span=1 ): """Advances the current position by one (or the given number) of characters. Will not advance beyond the end of the document. Returns the number of characters skipped. 
""" i = self.cpos self.__pos.advance( self.peekstr(span) ) return self.cpos - i def skipuntil( self, testfn ): """Advances the current position until a given predicate test function succeeds, or the end of the document is reached. Returns the actual number of characters skipped. The provided test function should take a single unicode character and return a boolean value, such as: lambda c : c == '.' # Skip to next period See also methods: skipwhile() and popuntil() """ i = self.cpos while True: c = self.peek() if not c or testfn(c): break else: self.__pos.advance(c) return self.cpos - i def skipwhile( self, testfn ): """Advances the current position until a given predicate test function fails, or the end of the document is reached. Returns the actual number of characters skipped. The provided test function should take a single unicode character and return a boolean value, such as: lambda c : c.isdigit() # Skip all digits See also methods: skipuntil() and popwhile() """ return self.skipuntil( lambda c: not testfn(c) ) def skip_to_next_line( self, allow_unicode_eol=True ): """Advances the current position to the start of the next line. Will not advance beyond the end of the file. Note that the two-character sequence CR+LF is recognized as being just a single end-of-line marker. """ ln = self.__pos.line while True: c = self.pop() if not c or self.__pos.line > ln: if c == '\r' and self.peek() == '\n': self.skip() break def skipws( self, allow_unicode_whitespace=True ): """Advances the current position past all whitespace, or until the end of the document is reached. """ if allow_unicode_whitespace: n = self.skipwhile( helpers.char_is_unicode_ws ) else: n = self.skipwhile( helpers.char_is_json_ws ) self.num_ws_skipped += n return n def pop( self ): """Returns the character at the current position and advances the position to the next character. At the end of the document this function returns an empty string. 
""" c = self.peek() if c: self.__pos.advance( c ) return c def popstr( self, span=1, offset=0 ): """Returns a string of one or more characters starting at the current position, and advances the position to the following character after the span. Will not go beyond the end of the document, so the returned string may be shorter than the requested span. """ s = self.peekstr(span) if s: self.__pos.advance( s ) return s def popif( self, testfn ): """Just like the pop() function, but only returns the character if the given predicate test function succeeds. """ c = self.peek() if c and testfn(c): self.__pos.advance( c ) return c return '' def pop_while_in( self, chars ): """Pops a sequence of characters at the current position as long as each of them is in the given set of characters. """ if not isinstance( chars, (set,frozenset)): cset = set( chars ) c = self.peek() if c and c in cset: s = self.popwhile( lambda c: c and c in cset ) return s return None def pop_identifier( self, match=None ): """Pops the sequence of characters at the current position that match the syntax for a JavaScript identifier. """ c = self.peek() if c and helpers.char_is_identifier_leader(c): s = self.popwhile( helpers.char_is_identifier_tail ) return s return None def pop_if_startswith( self, s ): """Pops the sequence of characters if they match the given string. See also method: startswith() """ s2 = self.peekstr( len(s) ) if s2 != s: return NULL self.__pos.advance( s2 ) return s2 def popwhile( self, testfn, maxchars=None ): """Pops all the characters starting at the current position as long as each character passes the given predicate function test. If maxchars a numeric value instead of None then then no more than that number of characters will be popped regardless of the predicate test. 
        See also methods: skipwhile() and popuntil()

        """
        s = []
        i = 0
        while maxchars is None or i < maxchars:
            c = self.popif( testfn )
            if not c:
                break
            s.append( c )
            i += 1
        return ''.join(s)

    def popuntil( self, testfn, maxchars=None ):
        """Just like the popwhile() method except the predicate function
        should return True to stop the sequence rather than False.

        See also methods: skipuntil() and popwhile()

        """
        return self.popwhile( lambda c: not testfn(c), maxchars=maxchars )   # was an unqualified 'popwhile(...)'

    def __getitem__( self, index ):
        """Returns the character at the given index relative to the current position.

        If the index goes beyond the end of the input, or prior to the
        start when negative, then '' is returned.

        If the index provided is a slice object, then that range of
        characters is returned as a string.  Note that a stride value
        other than 1 is not supported in the slice.  To use a slice, do:

            s = my_stream[ 1:4 ]

        """
        if isinstance( index, slice ):
            return self.peekstr( index.stop - index.start, index.start )
        else:
            return self.peek( index )


# ----------------------------------------------------------------------
# Exception classes.
# ----------------------------------------------------------------------

class JSONException(Exception):
    """Base class for all JSON-related exceptions.
    """
    pass

class JSONSkipHook(JSONException):
    """An exception to be raised by user-defined code within hook
    callbacks to indicate the callback does not want to handle the
    situation.

    """
    pass

class JSONStopProcessing(JSONException):
    """Can be raised from anywhere, including inside a hook function,
    to cause the entire encode or decode process to immediately stop
    with an error.

    """
    pass

class JSONAbort(JSONException):
    pass

class JSONError(JSONException):
    """Base class for all JSON-related errors.

    In addition to standard Python exceptions, these exceptions may
    also have additional properties:

        * severity - One of: 'fatal', 'error', 'warning', 'info'

        * position - An indication of the position in the input
          where the error occurred.
        * outer_position - A secondary position (optional) that
          gives the location of the outer data item in which the
          error occurred, such as the beginning of a string or an
          array.

        * context_description - A string that identifies the context
          in which the error occurred.  Default is "Context".

    """
    severities = frozenset(['fatal','error','warning','info'])

    def __init__(self, message, *args, **kwargs ):
        self.severity = 'error'
        self._position = None
        self.outer_position = None
        self.context_description = None
        for kw,val in kwargs.items():
            if kw == 'severity':
                if val not in self.severities:
                    raise TypeError("%s given invalid severity %r" % (self.__class__.__name__, val))
                self.severity = val
            elif kw == 'position':
                self.position = val
            elif kw == 'outer_position':
                self.outer_position = val
            elif kw == 'context_description' or kw=='context':
                self.context_description = val
            else:
                raise TypeError("%s does not accept %r keyword argument" % (self.__class__.__name__, kw))
        super( JSONError, self ).__init__( message, *args )
        self.message = message

    @property
    def position(self):
        return self._position

    @position.setter
    def position(self, pos):
        if pos == 0:
            self._position = 0   # position_marker() # start of input
        else:
            self._position = pos

    def __repr__(self):
        s = "%s(%r" % (self.__class__.__name__, self.message)
        for a in self.args[1:]:
            s += ", %r" % (a,)
        if self.position:
            s += ", position=%r" % (self.position,)
        if self.outer_position:
            s += ", outer_position=%r" % (self.outer_position,)
        s += ", severity=%r)" % (self.severity,)
        return s

    def pretty_description(self, show_positions=True, filename=None):
        if filename:
            pfx = filename.rstrip().rstrip(':') + ':'
        else:
            pfx = ''
        # Print file position as numeric abbreviation
        err = pfx
        if self.position == 0:
            err += '0:0:'
        elif self.position:
            err += '%d:%d:' % (self.position.line, self.position.column)
        else:
            err += ' '
        # Print severity and main error message
        err += " %s: %s" % (self.severity.capitalize(), self.message)
        if len(self.args) > 1:
            err += ': '
            for anum, a in enumerate(self.args[1:]):
                if anum > 0:   # was 'anum > 1', which ran the first two args together
                    err += ', '
                astr = repr(a)
                if len(astr) > 30:
                    astr = astr[:30] + '...'
                err += astr
        # Print out exception chain
        e2 = self
        while e2:
            if hasattr(e2,'__cause__') and isinstance(e2.__cause__, Exception):
                e2 = e2.__cause__
                e2desc = str(e2).strip()
                if not e2desc:
                    e2desc = repr(e2).strip()
                err += "\n | Cause: %s" % e2desc.strip().replace('\n','\n | ')
            else:
                e2 = None
        # Show file position
        if show_positions and self.position is not None:
            if self.position == 0:
                err += "\n | At start of input"
            else:
                err += "\n | At %s" % (self.position.describe(show_text=False),)
                if self.position.text_after:
                    err += "\n | near text: %r" % (self.position.text_after,)
        # Show context
        if show_positions and self.outer_position:
            if self.context_description:
                cdesc = self.context_description.capitalize()
            else:
                cdesc = "Context"
            err += "\n | %s started at %s" % (cdesc, self.outer_position.describe(show_text=False),)
            if self.outer_position.text_after:
                err += "\n | with text: %r" % (self.outer_position.text_after,)
        return err


class JSONDecodeError(JSONError):
    """An exception class raised when a JSON decoding error (syntax error) occurs."""
    pass


class JSONDecodeHookError(JSONDecodeError):
    """An exception that occurred within a decoder hook.

    The original exception is available in the 'hook_exception' attribute.

    """
    def __init__(self, hook_name, exc_info, encoded_obj, *args, **kwargs):
        self.hook_name = hook_name
        if not exc_info:
            exc_info = (None, None, None)
        exc_type, self.hook_exception, self.hook_traceback = exc_info
        self.object_type = type(encoded_obj)
        msg = "Hook %s raised %r while decoding type <%s>" % (hook_name, self.hook_exception.__class__.__name__, self.object_type.__name__)
        if len(args) >= 1:
            msg += ": " + args[0]
            args = args[1:]
        super(JSONDecodeHookError,self).__init__(msg, *args, **kwargs)


class JSONEncodeError(JSONError):
    """An exception class raised when a python object can not be encoded as a JSON string."""
    pass


class JSONEncodeHookError(JSONEncodeError):
    """An exception that occurred within an encoder hook.

    The original exception is available in the 'hook_exception' attribute.

    """
    def __init__(self, hook_name, exc_info, encoded_obj, *args, **kwargs):
        self.hook_name = hook_name
        if not exc_info:
            exc_info = (None, None, None)
        exc_type, self.hook_exception, self.hook_traceback = exc_info
        self.object_type = type(encoded_obj)
        msg = "Hook %s raised %r while encoding type <%s>" % (self.hook_name, self.hook_exception.__class__.__name__, self.object_type.__name__)
        if len(args) >= 1:
            msg += ": " + args[0]
            args = args[1:]
        super(JSONEncodeHookError,self).__init__(msg, *args, **kwargs)


#----------------------------------------------------------------------
# Encoder state object
#----------------------------------------------------------------------

class encode_state(object):
    """An internal transient object used during JSON encoding to
    record the current construction state.
""" def __init__(self, jsopts=None, parent=None ): import sys self.chunks = [] if not parent: self.parent = None self.nest_level = 0 self.options = jsopts self.escape_unicode_test = False # or a function f(unichar)=>True/False else: self.parent = parent self.nest_level = parent.nest_level + 1 self.escape_unicode_test = parent.escape_unicode_test self.options = parent.options def make_substate(self): return encode_state( parent=self ) def join_substate(self, other_state): self.chunks.extend( other_state.chunks ) other_state.chunks = [] def append(self, s): """Adds a string to the end of the current JSON document""" self.chunks.append(s) def combine(self): """Returns the accumulated string and resets the state to empty""" s = ''.join( self.chunks ) self.chunks = [] return s def __eq__(self, other_state): return self.nest_level == other_state.nest_level and \ self.chunks == other_state.chunks def __lt__(self, other_state): if self.nest_level != other_state.nest_level: return self.nest_level < other_state.nest_level return self.chunks < other_state.chunks #---------------------------------------------------------------------- # Decoder statistics #---------------------------------------------------------------------- class decode_statistics(object): """An object that records various statistics about a decoded JSON document. 
""" int8_max = 0x7f int8_min = - 0x7f - 1 int16_max = 0x7fff int16_min = - 0x7fff - 1 int32_max = 0x7fffffff int32_min = - 0x7fffffff - 1 int64_max = 0x7fffffffffffffff int64_min = - 0x7fffffffffffffff - 1 double_int_max = 2**53 - 1 double_int_min = - (2**53 - 1) def __init__(self): # Nesting self.max_depth = 0 self.max_items_in_array = 0 self.max_items_in_object = 0 # Integer stats self.num_ints = 0 self.num_ints_8bit = 0 self.num_ints_16bit = 0 self.num_ints_32bit = 0 self.num_ints_53bit = 0 # ints which will overflow IEEE doubles self.num_ints_64bit = 0 self.num_ints_long = 0 self.num_negative_zero_ints = 0 # Floating-point stats self.num_negative_zero_floats = 0 self.num_floats = 0 self.num_floats_decimal = 0 # overflowed 'float' # String stats self.num_strings = 0 self.max_string_length = 0 self.total_string_length = 0 self.min_codepoint = None self.max_codepoint = None # Other data type stats self.num_arrays = 0 self.num_objects = 0 self.num_bools = 0 self.num_nulls = 0 self.num_undefineds = 0 self.num_nans = 0 self.num_infinities = 0 self.num_comments = 0 self.num_identifiers = 0 # JavaScript identifiers self.num_excess_whitespace = 0 @property def num_infinites(self): """Misspelled 'num_infinities' for backwards compatibility""" return self.num_infinities def pretty_description(self, prefix=''): import unicodedata lines = [ "Number of integers:", " 8-bit: %5d (%d to %d)" % (self.num_ints_8bit, self.int8_min, self.int8_max), " 16-bit: %5d (%d to %d)" % (self.num_ints_16bit, self.int16_min, self.int16_max), " 32-bit: %5d (%d to %d)" % (self.num_ints_32bit, self.int32_min, self.int32_max), " > 53-bit: %5d (%d to %d - overflows JavaScript)" % (self.num_ints_53bit, self.double_int_min, self.double_int_max), " 64-bit: %5d (%d to %d)" % (self.num_ints_64bit, self.int64_min, self.int64_max), " > 64 bit: %5d (not portable, may require a \"Big Num\" package)" % self.num_ints_long, " total ints: %5d" % self.num_ints, " Num -0: %5d (negative-zero integers are not 
portable)" % self.num_negative_zero_ints, "Number of floats:", " doubles: %5d" % self.num_floats, " > doubles: %5d (will overflow IEEE doubles)" % self.num_floats_decimal, " total flts: %5d" % (self.num_floats + self.num_floats_decimal), " Num -0.0: %5d (negative-zero floats are usually portable)" % self.num_negative_zero_floats, "Number of:", " nulls: %5d" % self.num_nulls, " booleans: %5d" % self.num_bools, " arrays: %5d" % self.num_arrays, " objects: %5d" % self.num_objects, "Strings:", " number: %5d strings" % self.num_strings, " max length: %5d characters" % self.max_string_length, " total chars: %5d across all strings" % self.total_string_length, ] if self.min_codepoint is not None: cp = 'U+%04X' % self.min_codepoint try: charname = unicodedata.name(unichr(self.min_codepoint)) except ValueError: charname = '? UNKNOWN CHARACTER' lines.append(" min codepoint: %6s (%s)" % (cp, charname)) else: lines.append(" min codepoint: %6s" % ('n/a',)) if self.max_codepoint is not None: cp = 'U+%04X' % self.max_codepoint try: charname = unicodedata.name(unichr(self.max_codepoint)) except ValueError: charname = '? 
UNKNOWN CHARACTER' lines.append(" max codepoint: %6s (%s)" % (cp, charname)) else: lines.append(" max codepoint: %6s" % ('n/a',)) lines.extend([ "Other JavaScript items:", " NaN: %5d" % self.num_nans, " Infinite: %5d" % self.num_infinities, " undefined: %5d" % self.num_undefineds, " Comments: %5d" % self.num_comments, " Identifiers: %5d" % self.num_identifiers, "Max items in any array: %5d" % self.max_items_in_array, "Max keys in any object: %5d" % self.max_items_in_object, "Max nesting depth: %5d" % self.max_depth, ]) if self.total_chars == 0: lines.append("Unnecessary whitespace: 0 of 0 characters") else: lines.append( "Unnecessary whitespace: %5d of %d characters (%.2f%%)" \ % (self.num_excess_whitespace, self.total_chars, self.num_excess_whitespace * 100.0 / self.total_chars) ) if prefix: return '\n'.join([ prefix+s for s in lines ]) + '\n' else: return '\n'.join( lines ) + '\n' #---------------------------------------------------------------------- # Decoder state object #---------------------------------------------------------------------- class decode_state(object): """An internal transient object used during JSON decoding to record the current parsing state and error messages. """ def __init__(self, options=None): self.reset() self.options = options def reset(self): """Clears all errors, statistics, and input text.""" self.buf = None self.errors = [] self.obj = None self.cur_depth = 0 # how deep in nested structures are we? 
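The signed-width bucketing that decode_statistics reports (and which update_integer_stats performs further below) can be sketched on its own. A hedged, hypothetical helper, using the same thresholds as the class constants above:

```python
# Sketch of the width classification used for the integer statistics:
# an integer is counted in the smallest signed two's-complement width
# (8/16/32/64 bits) that can represent it, else as an arbitrary-size "long".
def int_bucket(n):
    for bits in (8, 16, 32, 64):
        lo = -(1 << (bits - 1))        # e.g. -128 for 8 bits
        hi = (1 << (bits - 1)) - 1     # e.g.  127 for 8 bits
        if lo <= n <= hi:
            return bits
    return 'long'

# The separate 53-bit check mirrors double_int_max = 2**53 - 1: integers
# beyond that lose precision in an IEEE-754 double (i.e., in JavaScript).
def overflows_double(n):
    return not (-(2**53 - 1) <= n <= 2**53 - 1)

assert int_bucket(100) == 8
assert int_bucket(40000) == 32
assert overflows_double(2**53)
```

Note the two tests are independent: a value can fit in 64 bits yet still overflow a double, which is why `num_ints_53bit` is tallied separately from the width buckets.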
        self.stats = decode_statistics()
        self._have_warned_nonbmp = False
        self._have_warned_long_string = False
        self._have_warned_max_depth = False

    @property
    def should_stop(self):
        if self.has_fatal:
            return True
        return False

    @property
    def has_errors(self):
        """Have any errors been seen already?"""
        return len([err for err in self.errors if err.severity in ('fatal','error')]) > 0

    @property
    def has_fatal(self):
        """Have any fatal errors been seen already?"""
        return len([err for err in self.errors if err.severity in ('fatal',)]) > 0

    def set_input( self, txt, encoding=None ):
        """Initialize the state by setting the input document text."""
        import sys
        self.reset()
        try:
            self.buf = buffered_stream( txt, encoding=encoding )
        except JSONError as err:
            err.position = 0   # set position to start of file
            err.severity = 'fatal'
            self.push_exception( err )
        except Exception as err:
            # Re-raise as JSONDecodeError
            e2 = sys.exc_info()
            newerr = JSONDecodeError("Error while reading input", position=0, severity='fatal')
            self.push_exception( newerr )   # was push_exception(err), discarding the JSONDecodeError wrapper
            self.buf = None
        else:
            if self.buf.bom:
                self.push_cond( self.options.bom,
                                "JSON document was prefixed by a BOM (Byte Order Mark)",
                                self.buf.bom )
        if not self.buf:
            self.push_fatal( "Aborting, can not read JSON document.", position=0 )

    def push_exception(self, exc):
        """Add an already-built exception to the error list."""
        self.errors.append(exc)

    def push_fatal(self, message, *args, **kwargs):
        """Create a fatal error."""
        kwargs['severity'] = 'fatal'
        self.__push_err( message, *args, **kwargs )

    def push_error(self, message, *args, **kwargs):
        """Create an error."""
        kwargs['severity'] = 'error'
        self.__push_err( message, *args, **kwargs )

    def push_warning(self, message, *args, **kwargs):
        """Create a warning."""
        kwargs['severity'] = 'warning'
        self.__push_err( message, *args, **kwargs )

    def push_info(self, message, *args, **kwargs):
        """Create an informational message."""
        kwargs['severity'] = 'info'
        self.__push_err( message, *args, **kwargs )

    def push_cond(self, behavior_value, message, *args, **kwargs):
        """Creates a conditional error or warning message.

        The behavior value (from json_options) controls whether a
        message will be pushed and whether it is an error or warning
        message.

        """
        if behavior_value == ALLOW:
            return
        elif behavior_value == WARN:
            kwargs['severity'] = 'warning'
        else:
            kwargs['severity'] = 'error'
        self.__push_err( message, *args, **kwargs )

    def __push_err(self, message, *args, **kwargs):
        """Stores an error in the error list."""
        position = None
        outer_position = None
        severity = 'error'
        context_description = None
        for kw, val in kwargs.items():
            if kw == 'position':
                position = val
            elif kw == 'outer_position':
                outer_position = val
            elif kw == 'severity':
                severity = val
            elif kw == 'context_description' or kw == 'context':
                context_description = val
            else:
                raise TypeError('Unknown keyword argument',kw)
        if position is None and self.buf:
            position = self.buf.position   # Current position
        err = JSONDecodeError( message,
                               position=position,
                               outer_position=outer_position,
                               context_description=context_description,
                               severity=severity,
                               *args )
        self.push_exception( err )

    def update_depth_stats(self, **kwargs):
        st = self.stats
        st.max_depth = max(st.max_depth, self.cur_depth)
        if not self._have_warned_max_depth and self.cur_depth > self.options.warn_max_depth:
            self._have_warned_max_depth = True
            self.push_cond( self.options.non_portable,
                            "Arrays or objects nested deeper than %d levels may not be portable" \
                            % self.options.warn_max_depth )

    def update_string_stats(self, s, **kwargs):
        st = self.stats
        st.num_strings += 1
        st.max_string_length = max(st.max_string_length, len(s))
        st.total_string_length += len(s)
        if self.options.warn_string_length and len(s) > self.options.warn_string_length and not self._have_warned_long_string:
            self._have_warned_long_string = True
            self.push_cond( self.options.non_portable,
                            "Strings longer than %d may not be portable" % self.options.warn_string_length,
                            **kwargs )
        if len(s) > 0:
            mincp = ord(min(s))
            maxcp = ord(max(s))
            if st.min_codepoint is None:
                st.min_codepoint = mincp
                st.max_codepoint = maxcp
            else:
                st.min_codepoint = min( st.min_codepoint, mincp )
                st.max_codepoint = max( st.max_codepoint, maxcp )
            if maxcp > 0xffff and not self._have_warned_nonbmp:
                self._have_warned_nonbmp = True
                self.push_cond( self.options.non_portable,
                                "Strings containing non-BMP characters (U+%04X) may not be portable" % maxcp,
                                **kwargs )

    def update_negzero_int_stats(self, **kwargs):
        st = self.stats
        st.num_negative_zero_ints += 1
        if st.num_negative_zero_ints == 1:   # Only warn once
            self.push_cond( self.options.non_portable,
                            "Negative zero (-0) integers are usually not portable",
                            **kwargs )

    def update_negzero_float_stats(self, **kwargs):
        st = self.stats
        st.num_negative_zero_floats += 1
        if st.num_negative_zero_floats == 1:   # Only warn once
            self.push_cond( self.options.non_portable,
                            "Negative zero (-0.0) numbers may not be portable",
                            **kwargs )

    def update_float_stats(self, float_value, **kwargs):
        st = self.stats
        if 'sign' in kwargs:
            del kwargs['sign']
        if helpers.is_negzero( float_value ):
            self.update_negzero_float_stats( **kwargs )
        if helpers.is_infinite( float_value ):
            st.num_infinities += 1
        if isinstance(float_value, decimal.Decimal):
            st.num_floats_decimal += 1
            if st.num_floats_decimal == 1:   # Only warn once
                self.push_cond( self.options.non_portable,
                                "Floats larger or more precise than an IEEE \"double\" may not be portable",
                                **kwargs )
        elif isinstance(float_value, float):
            st.num_floats += 1

    def update_integer_stats(self, int_value, **kwargs ):
        sign = kwargs.get('sign', 1)
        if 'sign' in kwargs:
            del kwargs['sign']
        if int_value == 0 and sign < 0:
            self.update_negzero_int_stats( **kwargs )
        if sign < 0:
            int_value = - int_value
        st = self.stats
        st.num_ints += 1
        if st.int8_min <= int_value <= st.int8_max:
            st.num_ints_8bit += 1
        elif st.int16_min <= int_value <= st.int16_max:
            st.num_ints_16bit += 1
        elif st.int32_min <= int_value <= st.int32_max:
            st.num_ints_32bit += 1
        elif st.int64_min <= int_value <= st.int64_max:
            st.num_ints_64bit += 1
        else:
            st.num_ints_long += 1
        if int_value < st.double_int_min or st.double_int_max < int_value:
            st.num_ints_53bit += 1
            if st.num_ints_53bit == 1:   # Only warn once
                self.push_cond( self.options.non_portable,
                                "Integers larger than 53-bits are not portable",
                                **kwargs )


# ----------------------------------------------------------------------
# JSON strictness options
# ----------------------------------------------------------------------

STRICTNESS_STRICT = 'strict'
STRICTNESS_WARN = 'warn'
STRICTNESS_TOLERANT = 'tolerant'

ALLOW = 'allow'
WARN = 'warn'
FORBID = 'forbid'

# For float_type option
NUMBER_AUTO = 'auto'
NUMBER_FLOAT = 'float'
NUMBER_DECIMAL = 'decimal'

# For json_int class
NUMBER_FORMAT_DECIMAL = 'decimal'
NUMBER_FORMAT_HEX = 'hex'
NUMBER_FORMAT_LEGACYOCTAL = 'legacyoctal'
NUMBER_FORMAT_OCTAL = 'octal'
NUMBER_FORMAT_BINARY = 'binary'


class _behaviors_metaclass(type):
    """Meta class used to establish a set of "behavior" options.

    Classes that use this meta class must define a class-level
    variable called '_behaviors' that is a list of tuples, each of
    which describes one behavior and is like:

        (behavior_name, documentation)

    Also define a second class-level variable called '_behavior_values'
    which is a list of the permitted values for each behavior, each
    being strings.
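The accessor generation this docstring describes can be illustrated with a much smaller, hypothetical metaclass. The sketch below uses Python 3 `metaclass=` syntax (the real class attaches via the Python 2 `__metaclass__` attribute) and generates only plain read/write properties, not the `is_*`/`set_all_*` family:

```python
# Hedged sketch of the metaclass idea: turn a declarative tuple of behavior
# names into per-name properties backed by '_behavior_<name>' attributes.
class AutoProps(type):
    def __new__(mcls, clsname, bases, attrs):
        for behavior in attrs.get('_behaviors', ()):
            attrs['_behavior_' + behavior] = 'allow'   # default value

            # Bind the loop variable via a default argument so each
            # property closes over its own name, not the last one.
            def getter(self, _b=behavior):
                return getattr(self, '_behavior_' + _b)

            def setter(self, value, _b=behavior):
                setattr(self, '_behavior_' + _b, value)

            attrs[behavior] = property(getter, setter)
        return super().__new__(mcls, clsname, bases, attrs)

class Opts(metaclass=AutoProps):
    _behaviors = ('comments', 'trailing_comma')

o = Opts()
assert o.comments == 'allow'
o.trailing_comma = 'forbid'
assert o.trailing_comma == 'forbid'
```

The default-argument trick (`_b=behavior`) is the same late-binding workaround the real metaclass uses in its `lambda self, _name=name, _value=v: ...` helpers.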
    For each behavior (e.g., pretty), and for each value (e.g., yes)
    the following methods/properties will be created:

        * pretty - value of 'pretty' behavior (read-write)
        * ispretty_yes - returns True if 'pretty' is 'yes'

    For each value (e.g., pink) the following methods/properties will
    be created:

        * all_behaviors - set of all behaviors (read-only)
        * pink_behaviors - set of behaviors with value of 'pink' (read-only)
        * set_all('pink')
        * set_all_pink() - set all behaviors to value of 'pink'

    """
    def __new__(cls, clsname, bases, attrs):
        values = attrs.get('_behavior_values')
        attrs['values'] = property( lambda self: set(self._behavior_values),
                                    doc='Set of possible behavior values' )
        behaviors = attrs.get('_behaviors')

        def get_behavior(self, name):
            """Returns the value for a given behavior"""
            try:
                return getattr( self, '_behavior_'+name )
            except AttributeError:
                raise ValueError('Unknown behavior',name)
        attrs['get_behavior'] = get_behavior

        def set_behavior(self, name, value):
            """Changes the value for a given behavior"""
            if value not in self._behavior_values:
                raise ValueError('Unknown value for behavior',value)
            varname = '_behavior_'+name
            if hasattr(self,varname):
                setattr( self, varname, value )
            else:
                raise ValueError('Unknown behavior',name)
        attrs['set_behavior'] = set_behavior

        def describe_behavior(self,name):
            """Returns documentation about a given behavior."""
            for n, doc in self._behaviors:
                if n==name:
                    return doc
            else:
                raise AttributeError('No such behavior',name)
        attrs['describe_behavior'] = describe_behavior

        for name, doc in behaviors:
            attrs['_behavior_'+name] = True
            for v in values:
                vs = v + '_' + name
                def getx(self,name=name,forval=v):
                    return self.get_behavior(name) == forval
                attrs['is_'+v+'_'+name] = property(getx, doc=v.capitalize()+' '+doc)
                # method value_name()
                fnset = lambda self,_name=name,_value=v: self.set_behavior(_name,_value)
                fnset.__name__ = v+'_'+name
                fnset.__doc__ = 'Set behavior ' + name + ' to ' + v + "."
                attrs[fnset.__name__] = fnset
            def get_value_for_behavior(self,name=name):
                return self.get_behavior(name)
            def set_value_for_behavior(self,value,name=name):
                self.set_behavior(name,value)
            attrs[name] = property(get_value_for_behavior, set_value_for_behavior, doc=doc)

        @property
        def all_behaviors(self):
            """Returns the names of all known behaviors."""
            return set([t[0] for t in self._behaviors])
        attrs['all_behaviors'] = all_behaviors

        def set_all(self,value):
            """Changes all behaviors to have the given value."""
            if value not in self._behavior_values:
                raise ValueError('Unknown behavior',value)
            for name in self.all_behaviors:
                setattr(self, '_behavior_'+name, value)
        attrs['set_all'] = set_all

        def is_all(self,value):
            """Determines if all the behaviors have the given value."""
            if value not in self._behavior_values:
                raise ValueError('Unknown behavior',value)
            for name in self.all_behaviors:
                if getattr(self, '_behavior_'+name) != value:
                    return False
            return True
        attrs['is_all'] = is_all

        for v in values:
            # property value_behaviors
            def getbehaviorsfor(self,value=v):
                return set([name for name in self.all_behaviors if getattr(self,name)==value])
            attrs[v+'_behaviors'] = property( getbehaviorsfor,
                                              doc='Return the set of behaviors with the value '+v+'.' )
            # method set_all_value()
            setfn = lambda self,_value=v: set_all(self,_value)
            setfn.__name__ = 'set_all_'+v
            setfn.__doc__ = 'Set all behaviors to value ' + v + "."
            attrs[setfn.__name__] = setfn
            # property is_all_value
            attrs['is_all_'+v] = property( lambda self,v=v: is_all(self,v),
                                           doc='Determines if all the behaviors have the value '+v+'.' )

        def behaviors_eq(self, other):
            """Determines if two options objects are equivalent."""
            if self.all_behaviors != other.all_behaviors:
                return False
            # Compare every behavior's value.  (The original referenced an
            # undefined 'allowed_behaviors' attribute here.)
            return all( self.get_behavior(name) == other.get_behavior(name)
                        for name in self.all_behaviors )
        attrs['__eq__'] = behaviors_eq

        return super(_behaviors_metaclass, cls).__new__(cls, clsname, bases, attrs)


SORT_NONE = 'none'
SORT_PRESERVE = 'preserve'
SORT_ALPHA = 'alpha'
SORT_ALPHA_CI = 'alpha_ci'
SORT_SMART = 'smart'

sorting_methods = {
    SORT_NONE: "Do not sort, resulting order may be random",
    SORT_PRESERVE: "Preserve original order when reformatting",
    SORT_ALPHA: "Sort strictly alphabetically",
    SORT_ALPHA_CI: "Sort alphabetically case-insensitive",
    SORT_SMART: "Sort alphabetically and numerically (DEFAULT)"
    }
sorting_method_aliases = {
    'ci': SORT_ALPHA_CI
    }

def smart_sort_transform( key ):
    numfmt = '%012d'
    digits = '0123456789'
    zero = ord('0')
    if not key:
        key = ''
    elif isinstance( key, (int,long) ):
        key = numfmt % key
    elif isinstance( key, basestring ):
        keylen = len(key)
        words = []
        i = 0
        while i < keylen:
            if key[i] in digits:
                num = 0
                while i < keylen and key[i] in digits:
                    num *= 10
                    num += ord(key[i]) - zero
                    i += 1
                words.append( numfmt % num )
            else:
                words.append( key[i].upper() )
                i += 1
        key = ''.join(words)
    else:
        key = str(key)
    return key

# Find Enum type (introduced in Python 3.4)
try:
    from enum import Enum as _enum
except ImportError:
    _enum = None

# Find OrderedDict type
try:
    from collections import OrderedDict as _OrderedDict
except ImportError:
    _OrderedDict = None


class json_options(object):
    """Options to determine how strict the decoder or encoder should be."""

    __metaclass__ = _behaviors_metaclass
    _behavior_values = (ALLOW, WARN, FORBID)
    _behaviors = (
        ("all_numeric_signs", "Numbers may be prefixed by any \'+\' and \'-\', e.g., +4, -+-+77"),
        ("any_type_at_start", "A JSON document may start with any type, not just arrays or objects"),
        ("comments", "JavaScript comments, both /*...*/ and //... styles"),
        ("control_char_in_string", "Strings may contain raw control characters without \\u-escaping"),
        ("hex_numbers", "Hexadecimal numbers, e.g., 0x1f"),
        ("binary_numbers", "Binary numbers, e.g., 0b1001"),
        ("octal_numbers", "New-style octal numbers, e.g., 0o731 (see leading-zeros for legacy octals)"),
        ("initial_decimal_point", "Floating-point numbers may start with a decimal point (no units digit)"),
        ("extended_unicode_escapes", "Extended Unicode escape sequence \\u{..} for non-BMP characters"),
        ("js_string_escapes", "All JavaScript character \\-escape sequences may be in strings"),
        ("leading_zeros", "Numbers may have extra leading zeros (see --leading-zero-radix option)"),
        ("non_numbers", "Non-numbers may be used, such as NaN or Infinity"),
        ("nonescape_characters", "Unknown character \\-escape sequences stand for that character (\\Q -> 'Q')"),
        ("identifier_keys", "JavaScript identifiers are converted to strings when used as object keys"),
        ("nonstring_keys", "Value types other than strings (or identifiers) may be used as object keys"),
        ("omitted_array_elements", "Arrays may have omitted/elided elements, e.g., [1,,3] == [1,undefined,3]"),
        ("single_quoted_strings", "Strings may be delimited with both double (\") and single (\') quotation marks"),
        ("trailing_comma", "A final comma may end the list of array or object members"),
        ("trailing_decimal_point", "Floating-point number may end with a decimal point and no following fractional digits"),
        ("undefined_values", "The JavaScript 'undefined' value may be used"),
        ("format_control_chars", "Unicode \"format control characters\" may appear in the input"),
        ("unicode_whitespace", "Treat any Unicode whitespace character as valid whitespace"),
        # Never legal
        ("leading_zeros", "Numbers may have leading zeros"),
        # Normally warnings
        ("duplicate_keys", "Objects may have repeated keys"),
        ("zero_byte", "Strings may contain U+0000, which may not be safe for C-based programs"),
        ("bom", "A JSON document may start with a Unicode BOM (Byte Order Mark)"),
        ("non_portable", "Anything technically valid but likely to cause data portability issues"),
        )  # end behavior list

    def reset_to_defaults(self):
        # Plain attrs (other than above behaviors) are simply copied
        # by value, either during initialization (via keyword
        # arguments) or via the copy() method.
        self._plain_attrs = ['leading_zero_radix',
                             'encode_namedtuple_as_object',
                             'encode_enum_as',
                             'encode_compactly',
                             'escape_unicode',
                             'always_escape_chars',
                             'warn_string_length',
                             'warn_max_depth',
                             'int_as_float',
                             'decimal_context',
                             'float_type',
                             'keep_format',
                             'date_format', 'datetime_format', 'time_format', 'timedelta_format',
                             'sort_keys',
                             'indent_amount', 'indent_tab_width', 'indent_limit',
                             'max_items_per_line',
                             'py2str_encoding' ]
        self.strictness = STRICTNESS_WARN
        self._leading_zero_radix = 8    # via property: leading_zero_radix
        self._sort_keys = SORT_SMART    # via property: sort_keys
        self.int_as_float = False
        self.float_type = NUMBER_AUTO
        self.decimal_context = (decimal.DefaultContext if decimal else None)
        self.keep_format = False   # keep track of when numbers are hex, octal, etc.
        self.encode_namedtuple_as_object = True
        self._encode_enum_as = 'name'   # via property
        self.encode_compactly = True
        self.escape_unicode = False
        self.always_escape_chars = None   # None, or a set of Unicode characters to always escape
        self.warn_string_length = 0xfffd   # with 16-bit length prefix
        self.warn_max_depth = 64
        self.date_format = 'iso'       # or strftime format
        self.datetime_format = 'iso'   # or strftime format
        self.time_format = 'iso'       # or strftime format
        self.timedelta_format = 'iso'  # or 'hms'
        self.sort_keys = SORT_ALPHA
        self.indent_amount = 2
        self.indent_tab_width = 0      # 0, or number of equivalent spaces
        self.indent_limit = None
        self.max_items_per_line = 1    # When encoding, how many items per array/object
                                       # before breaking into multiple lines
        # For interpreting Python 2 'str' types:
        if _py_major == 2:
            self.py2str_encoding = 'ascii'
        else:
            self.py2str_encoding = None

    def __init__(self, **kwargs):
        """Set JSON encoding and decoding options.

        If 'strict' is set to True, then only strictly-conforming JSON
        output will be produced.  Note that this means that some types
        of values may not be convertible and will result in a
        JSONEncodeError exception.

        If 'compactly' is set to True, then the resulting string will
        have all extraneous white space removed; if False then the
        string will be "pretty printed" with whitespace and indentation
        added to make it more readable.

        If 'escape_unicode' is set to True, then all non-ASCII
        characters will be represented as a unicode escape sequence;
        if False then the actual real unicode character will be
        inserted if possible.

        The 'escape_unicode' can also be a function, which when called
        with a single argument of a unicode character will return True
        if the character should be escaped or False if it should not.
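A predicate of the kind 'escape_unicode' describes can be sketched as follows. This is a hedged, hypothetical example (the helper `apply_escapes` is for illustration only; the real encoder handles surrogate pairs and non-BMP characters, which this sketch does not):

```python
# Hypothetical 'escape_unicode'-style predicate: decide per character
# whether it should be written as a \uXXXX escape.
def escape_non_ascii(ch):
    # Escape anything outside printable ASCII.
    return not (0x20 <= ord(ch) <= 0x7e)

def apply_escapes(s, should_escape):
    # Toy driver showing how such a predicate would be consulted
    # (BMP characters only; not demjson's actual encoder).
    out = []
    for ch in s:
        if should_escape(ch):
            out.append('\\u%04x' % ord(ch))
        else:
            out.append(ch)
    return ''.join(out)

assert apply_escapes('abc', escape_non_ascii) == 'abc'
assert apply_escapes(u'caf\u00e9', escape_non_ascii) == 'caf\\u00e9'
```

Passing a callable instead of a boolean lets callers escape only a chosen subset of characters, e.g. `lambda c: c in '<>&'` for HTML-embedded output.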
""" self.reset_to_defaults() if 'strict' in kwargs: # Do this keyword first, so other keywords may override specific behaviors self.strictness = kwargs['strict'] for kw,val in kwargs.items(): if kw == 'compactly': # alias for 'encode_compactly' self.encode_compactly = val elif kw == 'strict': pass # Already handled elif kw == 'warnings': if val: self.suppress_warnings() elif kw == 'html_safe' or kw == 'xml_safe': if bool(val): if self.always_escape_chars is None: self.always_escape_chars = set(u'<>/&') else: self.always_escape_chars.update( set(u'<>/&') ) elif kw == 'always_escape': if val: if self.always_escape_chars is None: self.always_escape_chars = set(val) else: self.always_escape_chars.update( set(val) ) elif kw == 'int_as_float': self.int_as_float = bool(val) elif kw == 'keep_format': self.keep_format = bool(val) elif kw == 'float_type': if val in (NUMBER_AUTO, NUMBER_FLOAT, NUMBER_DECIMAL): self.float_type = val else: raise ValueError("Unknown option %r for argument %r to initialize %s" % (val,kw,self.__class__.__name__)) elif kw == 'decimal' or kw == 'decimal_context': if decimal: if not val or val == 'default': self.decimal_context = decimal.DefaultContext elif val == 'basic': self.decimal_context = decimal.BasicContext elif val == 'extended': self.decimal_context = decimal.ExtendedContext elif isinstance(val, decimal.Context): self.decimal_context = val elif isinstance(val,(int,long)) or val[0].isdigit: prec = int(val) self.decimal_context = decimal.Context( prec=prec ) else: raise ValueError("Option for %r should be a decimal.Context, a number of significant digits, or one of 'default','basic', or 'extended'." 
% (kw,)) elif kw in ('allow','warn','forbid','prevent','deny'): action = {'allow':ALLOW, 'warn':WARN, 'forbid':FORBID, 'prevent':FORBID, 'deny':FORBID}[ kw ] if isinstance(val,basestring): val = [b.replace('-','_') for b in val.replace(',',' ').split()] for behavior in val: self.set_behavior( behavior, action ) elif kw.startswith('allow_') or kw.startswith('forbid_') or kw.startswith('prevent_') or kw.startswith('deny_') or kw.startswith('warn_'): action, behavior = kw.split('_',1) if action == 'allow': if val: self.set_behavior( behavior, ALLOW ) else: self.set_behavior( behavior, FORBID ) elif action in ('forbid','prevent','deny'): if val: self.set_behavior( behavior, FORBID ) else: self.set_behavior( behavior, ALLOW ) elif action == 'warn': if val: self.set_behavior( behavior, WARN ) else: self.set_behavior( behavior, ALLOW ) elif kw in self._plain_attrs: setattr(self, kw, val) else: raise ValueError("Unknown keyword argument %r to initialize %s" % (kw,self.__class__.__name__)) def copy(self): other = self.__class__() other.copy_from( self ) return other def copy_from(self, other): if self is other: return # Myself! 
self.strictness = other.strictness # sets behaviors in bulk for name in self.all_behaviors: self.set_behavior( name, other.get_behavior(name) ) for name in self._plain_attrs: val = getattr(other,name) if isinstance(val, set): val = val.copy() elif decimal and isinstance(val, decimal.Decimal): val = val.copy() setattr(self, name, val) def spaces_to_next_indent_level( self, min_spaces=1, subtract=0 ): n = self.indent_amount - subtract if n < 0: n = 0 n = max( min_spaces, n ) return ' ' * n def indentation_for_level( self, level=0 ): """Returns a whitespace string used for indenting.""" if self.indent_limit is not None and level > self.indent_limit: n = self.indent_limit else: n = level n *= self.indent_amount if self.indent_tab_width: tw, sw = divmod(n, self.indent_tab_width) return '\t'*tw + ' '*sw else: return ' ' * n def set_indent( self, num_spaces, tab_width=0, limit=None ): """Changes the indentation properties when outputting JSON in non-compact mode. 'num_spaces' is the number of spaces to insert for each level of indentation, which defaults to 2. 'tab_width', if not 0, is the number of spaces which is equivalent to one tab character. Tabs will be output where possible rather than runs of spaces. 'limit', if not None, is the maximum indentation level after which no further indentation will be output. 
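As a sketch of the tab/space arithmetic described above (the function name here is illustrative, not the real method):

```python
def indent_string(level, num_spaces=2, tab_width=8, limit=None):
    # Clamp at the indent limit, then split the total column width
    # into as many tabs as fit plus leftover spaces.
    if limit is not None and level > limit:
        level = limit
    n = level * num_spaces
    if tab_width:
        tabs, spaces = divmod(n, tab_width)
        return '\t' * tabs + ' ' * spaces
    return ' ' * n

indent_string(5)               # 10 columns: one tab plus two spaces
indent_string(3, tab_width=0)  # six spaces
```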
""" n = int(num_spaces) if n < 0: raise ValueError("indentation amount can not be negative",n) self.indent_amount = n self.indent_tab_width = tab_width self.indent_limit = limit @property def sort_keys(self): """The method used to sort dictionary keys when encoding JSON """ return self._sort_keys @sort_keys.setter def sort_keys(self, method): if not method: self._sort_keys = SORT_NONE elif callable(method): self._sort_keys = method elif method in sorting_methods: self._sort_keys = method elif method in sorting_method_aliases: # alias self._sort_keys = sorting_method_aliases[method] elif method == True: self._sort_keys = SORT_ALPHA else: raise ValueError("Not a valid sorting method: %r" % method) @property def encode_enum_as(self): """The strategy for encoding Python Enum values. """ return self._encode_enum_as @encode_enum_as.setter def encode_enum_as(self, val): if val not in ('name','qname','value'): raise ValueError("encode_enum_as must be one of 'name','qname', or 'value'") self._encode_enum_as = val @property def zero_float(self): """The numeric value 0.0, either a float or a decimal.""" if decimal and self.float_type == NUMBER_DECIMAL: return self.decimal_context.create_decimal('0.0') else: return 0.0 @property def negzero_float(self): """The numeric value -0.0, either a float or a decimal.""" if decimal and self.float_type == NUMBER_DECIMAL: return self.decimal_context.create_decimal('-0.0') else: return -0.0 @property def nan(self): """The numeric value NaN, either a float or a decimal.""" if decimal and self.float_type == NUMBER_DECIMAL: return self.decimal_context.create_decimal('NaN') else: return nan @property def inf(self): """The numeric value Infinity, either a float or a decimal.""" if decimal and self.float_type == NUMBER_DECIMAL: return self.decimal_context.create_decimal('Infinity') else: return inf @property def neginf(self): """The numeric value -Infinity, either a float or a decimal.""" if decimal and self.float_type == NUMBER_DECIMAL: return 
self.decimal_context.create_decimal('-Infinity') else: return neginf def make_int( self, s, sign=None, number_format=NUMBER_FORMAT_DECIMAL ): """Makes an integer value according to the current options. First argument should be a string representation of the number, or an integer. Returns a number value, which could be an int, float, or decimal. """ if isinstance(sign, (int,long)): if sign < 0: sign = '-' else: sign = '+' if isinstance(s,basestring): if s.startswith('-') or s.startswith('+'): sign = s[0] s = s[1:] if self.int_as_float: # Making a float/decimal if isinstance(s, (int,long)): if self.float_type == NUMBER_DECIMAL: n = self.decimal_context.create_decimal( s ) if sign=='-': n = n.copy_negate() elif s == 0 and sign=='-': n = self.negzero_float elif -999999999999999 <= s <= 999999999999999: n = float(s) if sign=='-': n *= -1 else: n = float(s) if (n == inf or int(n) != s) and self.float_type != NUMBER_FLOAT: n = self.decimal_context.create_decimal( s ) if sign=='-': n = n.copy_negate() elif sign=='-': n *= -1 else: # not already an int n = self.make_float( s, sign ) n2 = self.make_float( s[:-1] + ('9' if s[-1]<='5' else '0'), sign ) if (n==inf or n==n2) and self.float_type != NUMBER_FLOAT: n = self.make_decimal( s, sign ) elif isinstance( s, (int,long) ): # already an integer n = s if sign=='-': if n == 0: n = self.negzero_float else: n *= -1 else: # Making an actual integer try: n = int( s ) except ValueError: n = self.nan else: if sign=='-': if n==0: n = self.negzero_float else: n *= -1 if isinstance(n,(int,long)) and self.keep_format: n = json_int(n, number_format=number_format) return n def make_decimal( self, s, sign='+' ): """Converts a string into a decimal or float value.""" if not decimal or self.float_type == NUMBER_FLOAT: return self.make_float( s, sign ) if s.startswith('-') or s.startswith('+'): sign = s[0] s = s[1:] elif isinstance(sign, (int,long)): if sign < 0: sign = '-' else: sign = '+' try: f = self.decimal_context.create_decimal( s ) 
except decimal.InvalidOperation: f = self.decimal_context.create_decimal( 'NaN' ) except decimal.Overflow: if sign=='-': f = self.decimal_context.create_decimal( '-Infinity' ) else: f = self.decimal_context.create_decimal( 'Infinity' ) else: if sign=='-': f = f.copy_negate() return f def make_float( self, s, sign='+' ): """Converts a string into a float or decimal value.""" if decimal and self.float_type == NUMBER_DECIMAL: return self.make_decimal( s, sign ) if s.startswith('-') or s.startswith('+'): sign = s[0] s = s[1:] elif isinstance(sign, (int,long)): if sign < 0: sign = '-' else: sign = '+' try: f = float(s) except ValueError: f = nan else: if sign=='-': f *= -1 return f @property def leading_zero_radix(self): """The radix to be used for numbers with leading zeros. 8 or 10 """ return self._leading_zero_radix @leading_zero_radix.setter def leading_zero_radix(self, radix): if isinstance(radix,basestring): try: radix = int(radix) except ValueError: radix = radix.lower() if radix=='octal' or radix=='oct' or radix=='8': radix = 8 elif radix=='decimal' or radix=='dec': radix = 10 if radix not in (8,10): raise ValueError("Radix must either be 8 (octal) or 10 (decimal)") self._leading_zero_radix = radix @property def leading_zero_radix_as_word(self): return {8:'octal', 10:'decimal'}[ self._leading_zero_radix ] def suppress_warnings(self): for name in self.warn_behaviors: self.set_behavior(name, 'allow') @property def allow_or_warn_behaviors(self): """Returns the set of all behaviors that are not forbidden (i.e., are allowed or warned).""" return self.allow_behaviors.union( self.warn_behaviors ) @property def strictness(self): return self._strictness @strictness.setter def strictness(self, strict): """Changes whether the options should be re-configured for strict JSON conformance.""" if strict == STRICTNESS_WARN: self._strictness = STRICTNESS_WARN self.set_all_warn() elif strict == STRICTNESS_STRICT or strict is True: self._strictness = STRICTNESS_STRICT 
self.keep_format = False self.set_all_forbid() self.warn_duplicate_keys() self.warn_zero_byte() self.warn_bom() self.warn_non_portable() elif strict == STRICTNESS_TOLERANT or strict is False: self._strictness = STRICTNESS_TOLERANT self.set_all_allow() self.warn_duplicate_keys() self.warn_zero_byte() self.warn_leading_zeros() self.leading_zero_radix = 8 self.warn_bom() self.allow_non_portable() else: raise ValueError("Unknown strictness option %r" % strict) self.allow_any_type_at_start() # ---------------------------------------------------------------------- # The main JSON encoder/decoder class. # ---------------------------------------------------------------------- class JSON(object): """An encoder/decoder for JSON data streams. Usually you will call the encode() or decode() methods. The other methods are for lower-level processing. Whether the JSON parser runs in strict mode (which enforces exact compliance with the JSON spec) or the more forgiving non-strict mode can be affected by setting the 'strict' argument in the object's initialization; or by assigning True or False to the 'strict' property of the object. You can also exercise finer-grained control over strictness by allowing or forbidding specific behaviors. You can get a list of all the available behaviors by accessing the 'behaviors' property. Likewise the 'allowed_behaviors' and 'forbidden_behaviors' list which behaviors will be allowed and which will not. Call the allow() or forbid() methods to adjust these. """ _string_quotes = '"\'' _escapes_json = { # character escapes in JSON '"': '"', '/': '/', '\\': '\\', 'b': '\b', 'f': '\f', 'n': '\n', 'r': '\r', 't': '\t', } _escapes_js = { # character escapes in Javascript '"': '"', '\'': '\'', '\\': '\\', 'b': '\b', 'f': '\f', 'n': '\n', 'r': '\r', 't': '\t', 'v': '\v', '0': '\x00' } # Following is a reverse mapping of escape characters, used when we # output JSON. Only those escapes which are always safe (e.g., in JSON) # are here. 
It won't hurt if we leave questionable ones out. _rev_escapes = {'\n': '\\n', '\t': '\\t', '\b': '\\b', '\r': '\\r', '\f': '\\f', '"': '\\"', '\\': '\\\\' } _optional_rev_escapes = { '/': '\\/' } # only escaped if forced to do so json_syntax_characters = u"{}[]\"\\,:0123456789.-+abcdefghijklmnopqrstuvwxyz \t\n\r" all_hook_names = ('decode_number', 'decode_float', 'decode_object', 'decode_array', 'decode_string', 'encode_value', 'encode_dict', 'encode_dict_key', 'encode_sequence', 'encode_bytes', 'encode_default') def __init__(self, **kwargs): """Creates a JSON encoder/decoder object. You may pass encoding and decoding options either by passing an argument named 'json_options' with an instance of a json_options class; or with individual keyword/values that will be used to initialize a new json_options object. You can also set hooks by using keyword arguments using the hook name; e.g., encode_dict=my_hook_func. """ import sys, unicodedata, re kwargs = kwargs.copy() # Initialize hooks for hookname in self.all_hook_names: if hookname in kwargs: self.set_hook( hookname, kwargs[hookname] ) del kwargs[hookname] else: self.set_hook( hookname, None ) # Set options if 'json_options' in kwargs: self._options = kwargs['json_options'] else: self._options = json_options(**kwargs) # The following is a boolean map of the first 256 characters # which will quickly tell us which of those characters never # need to be escaped. self._asciiencodable = \ [32 <= c < 128 \ and not self._rev_escapes.has_key(chr(c)) \ and not unicodedata.category(unichr(c)) in ['Cc','Cf','Zl','Zp'] for c in range(0,256)] @property def options(self): """The optional behaviors used, e.g., the JSON conformance strictness. Returns an instance of json_options. 
""" return self._options def clear_hook(self, hookname): """Unsets a hook callback, as previously set with set_hook().""" self.set_hook( hookname, None ) def clear_all_hooks(self): """Unsets all hook callbacks, as previously set with set_hook().""" for hookname in self.all_hook_names: self.clear_hook( hookname ) def set_hook(self, hookname, function): """Sets a user-defined callback function used during encoding or decoding. The 'hookname' argument must be a string containing the name of one of the available hooks, listed below. The 'function' argument must either be None, which disables the hook, or a callable function. Hooks do not stack, if you set a hook it will undo any previously set hook. Netsted values. When decoding JSON that has nested objects or arrays, the decoding hooks will be called once for every corresponding value, even if nested. Generally the decoding hooks will be called from the inner-most value outward, and then left to right. Skipping. Any hook function may raise a JSONSkipHook exception if it does not wish to handle the particular invocation. This will have the effect of skipping the hook for that particular value, as if the hook was net set. AVAILABLE HOOKS: * decode_string Called for every JSON string literal with the Python-equivalent string value as an argument. Expects to get a Python object in return. * decode_float: Called for every JSON number that looks like a float (has a "."). The string representation of the number is passed as an argument. Expects to get a Python object in return. * decode_number: Called for every JSON number. The string representation of the number is passed as an argument. Expects to get a Python object in return. NOTE: If the number looks like a float and the 'decode_float' hook is set, then this hook will not be called. * decode_array: Called for every JSON array. A Python list is passed as the argument, and expects to get a Python object back. 
NOTE: this hook will get called for every array, even for nested arrays. * decode_object: Called for every JSON object. A Python dictionary is passed as the argument, and expects to get a Python object back. NOTE: this hook will get called for every object, even for nested objects. * encode_value: Called for every Python object which is to be encoded into JSON. * encode_dict: Called for every Python dictionary or anything that looks like a dictionary. * encode_dict_key: Called for every dictionary key. * encode_sequence: Called for every Python sequence-like object that is not a dictionary or string. This includes lists and tuples. * encode_bytes: Called for every Python bytes or bytearray type; or for any memoryview with a byte ('B') item type. (Python 3 only) * encode_default: Called for any Python type which cannot otherwise be converted into JSON, even after applying any other encoding hooks. """ if hookname in self.all_hook_names: att = hookname + '_hook' if function is not None and not callable(function): raise ValueError("Hook %r must be None or a callable function" % hookname) setattr( self, att, function ) else: raise ValueError("Unknown hook name %r" % hookname) def has_hook(self, hook_name): if not hook_name or hook_name not in self.all_hook_names: return False hook = getattr( self, hook_name + '_hook' ) return callable(hook) def call_hook(self, hook_name, input_object, position=None, *args, **kwargs): """Wrapper function to invoke a user-supplied hook function. This will capture any exceptions raised by the hook and do something appropriate with them. 
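The skip protocol can be pictured outside of this library with a stand-in exception (SkipHook below plays the role of JSONSkipHook; all the names in this sketch are illustrative):

```python
class SkipHook(Exception):
    """Stand-in for the skip exception in this sketch."""
    pass

def sort_short_arrays(arr):
    # A hook that only handles short arrays, declining the rest.
    if len(arr) > 3:
        raise SkipHook
    return sorted(arr)

def apply_hook(hook, value):
    # Mirrors the wrapper's handling: a skip leaves the value untouched.
    try:
        return hook(value)
    except SkipHook:
        return value

apply_hook(sort_short_arrays, [3, 1, 2])     # [1, 2, 3]
apply_hook(sort_short_arrays, [4, 3, 2, 1])  # unchanged: [4, 3, 2, 1]
```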
""" import sys if hook_name not in self.all_hook_names: raise AttributeError("No such hook %r" % hook_name) hook = getattr( self, hook_name + '_hook' ) if not callable(hook): raise TypeError("Hook is not callable: %r" % (hook,)) try: rval = hook( input_object, *args, **kwargs ) except JSONSkipHook: raise # Do nothing except Exception, err: exc_info = sys.exc_info() if hook_name.startswith('encode_'): ex_class = JSONEncodeHookError else: ex_class = JSONDecodeHookError if isinstance(err, JSONStopProcessing): severity = 'fatal' else: severity = 'error' newerr = ex_class( hook_name, exc_info, input_object, *args, position=position, severity=severity ) # Simulate Python 3's: "raise X from Y" exception chaining newerr.__cause__ = err newerr.__traceback__ = exc_info[2] raise newerr return rval def isws(self, c): """Determines if the given character is considered as white space. Note that Javscript is much more permissive on what it considers to be whitespace than does JSON. Ref. ECMAScript section 7.2 """ if not self.options.unicode_whitespace: return c in ' \t\n\r' else: if not isinstance(c,unicode): c = unicode(c) if c in u' \t\n\r\f\v': return True import unicodedata return unicodedata.category(c) == 'Zs' def islineterm(self, c): """Determines if the given character is considered a line terminator. Ref. 
ECMAScript section 7.3 """ if c == '\r' or c == '\n': return True if c == u'\u2028' or c == u'\u2029': # unicodedata.category(c) in ['Zl', 'Zp'] return True return False def recover_parser(self, state): """Try to recover after a syntax error by locating the next "known" position.""" buf = state.buf buf.skipuntil( lambda c: c in ",:[]{}\"\';" or helpers.char_is_unicode_eol(c) ) stopchar = buf.peek() self.skipws(state) if buf.at_end: state.push_info("Could not recover parsing after previous error",position=buf.position) else: state.push_info("Recovering parsing after character %r" % stopchar, position=buf.position) return stopchar def decode_null(self, state): """Intermediate-level decoder for ECMAScript 'null' keyword. Takes a string and a starting index, and returns a Python None object and the index of the next unparsed character. """ buf = state.buf start_position = buf.position kw = buf.pop_identifier() if not kw or kw != 'null': state.push_error("Expected a 'null' keyword", kw, position=start_position) else: state.stats.num_nulls += 1 return None def encode_undefined(self, state): """Produces the ECMAScript 'undefined' keyword.""" state.append('undefined') def encode_null(self, state): """Produces the JSON 'null' keyword.""" state.append('null') def decode_boolean(self, state): """Intermediate-level decoder for JSON boolean literals. Takes a string and a starting index, and returns a Python bool (True or False) and the index of the next unparsed character. """ buf = state.buf start_position = buf.position kw = buf.pop_identifier() if not kw or kw not in ('true','false'): state.push_error("Expected a 'true' or 'false' keyword", kw, position=start_position) else: state.stats.num_bools += 1 return (kw == 'true') def encode_boolean(self, bval, state): """Encodes the Python boolean into a JSON Boolean literal.""" state.append( 'true' if bool(bval) else 'false' ) def decode_number(self, state): """Intermediate-level decoder for JSON numeric literals. 
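The leading-sign handling in this decoder can be pictured as a small fold over any '+'/'-' prefix characters (an illustrative sketch, not the parser itself; strict JSON permits at most a single leading '-'):

```python
def fold_signs(text):
    # Each '-' flips the sign; '+' leaves it alone. Returns the folded
    # sign and the remaining, unsigned text.
    sign = 1
    i = 0
    while i < len(text) and text[i] in '+-':
        if text[i] == '-':
            sign = -sign
        i += 1
    return sign, text[i:]

fold_signs('-12')   # (-1, '12')
fold_signs('--12')  # (1, '12')
```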
Takes a string and a starting index, and returns a suitable Python numeric type and the index of the next unparsed character. The returned numeric type can be a Python int, long, or float. In addition some special non-numbers may also be returned such as nan, inf, and neginf (which are technically Python floats, but have no numeric value). Ref. ECMAScript section 8.5. """ buf = state.buf self.skipws(state) start_position = buf.position # Use external number parser hook if available if self.has_hook('decode_number') or self.has_hook('decode_float'): c = buf.peek() if c and c in '-+0123456789.': # First chars for a number-like value buf.save_position() nbr = buf.pop_while_in( '-+0123456789abcdefABCDEF' 'NaN' 'Infinity.' ) if '.' in nbr and self.has_hook('decode_float'): hook_name = 'decode_float' elif self.has_hook('decode_number'): hook_name = 'decode_number' else: hook_name = None if hook_name: try: val = self.call_hook( hook_name, nbr, position=start_position ) except JSONSkipHook: pass except JSONError, err: state.push_exception(err) val = undefined else: buf.clear_saved_position() return val # Hook didn't handle it, restore old position buf.restore_position() # Detect initial sign character(s) sign = +1 sign_count = 0 sign_saw_plus = False sign_saw_ws = False c = buf.peek() while c and c in '+-': if c == '-': sign = sign * -1 elif c == '+': sign_saw_plus = True sign_count += 1 buf.skip() if self.skipws_nocomments(state) > 0: sign_saw_ws = True c = buf.peek() if sign_count > 1 or sign_saw_plus: state.push_cond( self.options.all_numeric_signs, 'Numbers may only have a single "-" as a sign prefix', position=start_position) if sign_saw_ws: state.push_error('Spaces may not appear between a +/- number sign and the digits', position=start_position) # Check for ECMAScript symbolic non-numbers if not c: state.push_error('Missing numeric value after sign', position=start_position) self.recover_parser(state) state.stats.num_undefineds += 1 return undefined elif 
c.isalpha() or c in '_$': kw = buf.popwhile( lambda c: c.isalnum() or c in '_$' ) if kw == 'NaN': state.push_cond( self.options.non_numbers, 'NaN literals are not allowed in strict JSON', position=start_position) state.stats.num_nans += 1 return self.options.nan elif kw == 'Infinity': state.push_cond( self.options.non_numbers, 'Infinity literals are not allowed in strict JSON', position=start_position) state.stats.num_infinities += 1 if sign < 0: return self.options.neginf else: return self.options.inf else: state.push_error('Unknown numeric value keyword', kw, position=start_position) return undefined # Check for radix-prefixed numbers elif c == '0' and (buf.peek(1) in [u'x',u'X']): # ----- HEX NUMBERS 0x123 prefix = buf.popstr(2) digits = buf.popwhile( helpers.is_hex_digit ) state.push_cond( self.options.hex_numbers, 'Hexadecimal literals are not allowed in strict JSON', prefix+digits, position=start_position ) if len(digits)==0: state.push_error('Hexadecimal number is invalid', position=start_position) self.recover_parser(state) return undefined ival = helpers.decode_hex( digits ) state.update_integer_stats( ival, sign=sign, position=start_position ) n = state.options.make_int( ival, sign, number_format=NUMBER_FORMAT_HEX ) return n elif c == '0' and (buf.peek(1) in [u'o','O']): # ----- NEW-STYLE OCTAL NUMBERS 0o123 prefix = buf.popstr(2) digits = buf.popwhile( helpers.is_octal_digit ) state.push_cond( self.options.octal_numbers, "Octal literals are not allowed in strict JSON", prefix+digits, position=start_position ) if len(digits)==0: state.push_error("Octal number is invalid", position=start_position) self.recover_parser(state) return undefined ival = helpers.decode_octal( digits ) state.update_integer_stats( ival, sign=sign, position=start_position ) n = state.options.make_int( ival, sign, number_format=NUMBER_FORMAT_OCTAL ) return n elif c == '0' and (buf.peek(1) in [u'b','B']): # ----- NEW-STYLE BINARY NUMBERS 0b1101 prefix = buf.popstr(2) digits = 
buf.popwhile( helpers.is_binary_digit ) state.push_cond( self.options.binary_numbers, "Binary literals are not allowed in strict JSON", prefix+digits, position=start_position ) if len(digits)==0: state.push_error("Binary number is invalid", position=start_position) self.recover_parser(state) return undefined ival = helpers.decode_binary( digits ) state.update_integer_stats( ival, sign=sign, position=start_position ) n = state.options.make_int( ival, sign, number_format=NUMBER_FORMAT_BINARY ) return n else: # ----- DECIMAL OR LEGACY-OCTAL NUMBER. 123, 0123 # General syntax is: \d+[\.\d+][e[+-]?\d+] number = buf.popwhile( lambda c: c in '0123456789.+-eE' ) imax = len(number) if imax == 0: state.push_error('Missing numeric value', position=start_position) has_leading_zero = False units_digits = [] # digits making up whole number portion fraction_digits = [] # digits making up fractional portion exponent_digits = [] # digits making up exponent portion (excluding sign) esign = '+' # sign of exponent sigdigits = 0 # number of significant digits (approximate) saw_decimal_point = False saw_exponent = False # Break number into parts in a first pass...use a mini state machine in_part = 'units' for i, c in enumerate(number): if c == '.': if in_part != 'units': state.push_error('Bad number', number, position=start_position) self.recover_parser(state) return undefined in_part = 'fraction' saw_decimal_point = True elif c in 'eE': if in_part == 'exponent': state.push_error('Bad number', number, position=start_position) self.recover_parser(state) return undefined in_part = 'exponent' saw_exponent = True elif c in '+-': if in_part != 'exponent' or exponent_digits: state.push_error('Bad number', number, position=start_position) self.recover_parser(state) return undefined esign = c else: #digit if in_part == 'units': units_digits.append( c ) elif in_part == 'fraction': fraction_digits.append( c ) elif in_part == 'exponent': exponent_digits.append( c ) units_s = ''.join(units_digits) 
fraction_s = ''.join(fraction_digits) exponent_s = ''.join(exponent_digits) # Basic syntax rules checking is_integer = not (saw_decimal_point or saw_exponent) if not units_s and not fraction_s: state.push_error('Bad number', number, position=start_position) self.recover_parser(state) return undefined if saw_decimal_point and not fraction_s: state.push_cond( self.options.trailing_decimal_point, 'Bad number, decimal point must be followed by at least one digit', number, position=start_position) fraction_s = '0' if saw_exponent and not exponent_s: state.push_error('Bad number, exponent is missing', number, position=start_position) self.recover_parser(state) return undefined if not units_s: state.push_cond( self.options.initial_decimal_point, 'Bad number, decimal point must be preceded by at least one digit', number, position=start_position) units = '0' elif len(units_s) > 1 and units_s[0] == '0': has_leading_zero = True if self.options.is_forbid_leading_zeros: state.push_cond( self.options.leading_zeros, 'Numbers may not have extra leading zeros', number, position=start_position) elif self.options.is_warn_leading_zeros: state.push_cond( self.options.leading_zeros, 'Numbers may not have leading zeros; interpreting as %s' \ % self.options.leading_zero_radix_as_word, number, position=start_position) # Estimate number of significant digits sigdigits = len( (units_s + fraction_s).replace('0',' ').strip() ) # Handle legacy octal integers. 
if has_leading_zero and is_integer and self.options.leading_zero_radix == 8: # ----- LEGACY-OCTAL 0123 try: ival = helpers.decode_octal( units_s ) except ValueError: state.push_error('Bad number, not a valid octal value', number, position=start_position) self.recover_parser(state) return self.options.nan # undefined state.update_integer_stats( ival, sign=sign, position=start_position ) n = state.options.make_int( ival, sign, number_format=NUMBER_FORMAT_LEGACYOCTAL ) return n # Determine the exponential part if exponent_s: try: exponent = int(exponent_s) except ValueError: state.push_error('Bad number, bad exponent', number, position=start_position) self.recover_parser(state) return undefined if esign == '-': exponent = - exponent else: exponent = 0 # Try to make an int/long first. if not saw_decimal_point and exponent >= 0: # ----- A DECIMAL INTEGER ival = int(units_s) if exponent != 0: ival *= 10**exponent state.update_integer_stats( ival, sign=sign, position=start_position ) n = state.options.make_int( ival, sign ) else: # ----- A FLOATING-POINT NUMBER try: if exponent < float_minexp or exponent > float_maxexp or sigdigits > float_sigdigits: n = state.options.make_decimal( number, sign ) else: n = state.options.make_float( number, sign ) except ValueError as err: state.push_error('Bad number, %s' % err.message, number, position=start_position) n = undefined else: state.update_float_stats( n, sign=sign, position=start_position ) return n def encode_number(self, n, state): """Encodes a Python numeric type into a JSON numeric literal. The special non-numeric values of float('nan'), float('inf') and float('-inf') are translated into appropriate JSON literals. Note that Python complex types are not handled, as there is no ECMAScript equivalent type. 
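The repr()-based classification of non-finite floats used by this encoder can be sketched as follows (the function name is illustrative):

```python
def float_literal(n):
    # repr() reliably distinguishes nan/inf/-inf, whereas ordinary
    # comparisons involving NaN are always false.
    r = repr(n).lower()
    if 'nan' in r:
        return 'NaN'
    if 'inf' in r:
        return '-Infinity' if '-' in r else 'Infinity'
    return repr(n)

float_literal(float('nan'))   # 'NaN'
float_literal(float('-inf'))  # '-Infinity'
```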
""" if isinstance(n, complex): if n.imag: raise JSONEncodeError('Can not encode a complex number that has a non-zero imaginary part',n) n = n.real if isinstance(n, json_int): state.append( n.json_format() ) return if isinstance(n, (int,long)): state.append( str(n) ) return if decimal and isinstance(n, decimal.Decimal): if n.is_nan(): # Could be 'NaN' or 'sNaN' state.append( 'NaN' ) elif n.is_infinite(): if n.is_signed(): state.append( '-Infinity' ) else: state.append( 'Infinity' ) else: s = str(n).lower() if 'e' not in s and '.' not in s: s = s + '.0' state.append( s ) return global nan, inf, neginf if n is nan: state.append( 'NaN' ) elif n is inf: state.append( 'Infinity' ) elif n is neginf: state.append( '-Infinity' ) elif isinstance(n, float): # Check for non-numbers. # In python nan == inf == -inf, so must use repr() to distinguish reprn = repr(n).lower() if ('inf' in reprn and '-' in reprn) or n == neginf: state.append( '-Infinity' ) elif 'inf' in reprn or n is inf: state.append( 'Infinity' ) elif 'nan' in reprn or n is nan: state.append( 'NaN' ) else: # A normal float. state.append( repr(n) ) else: raise TypeError('encode_number expected an integral, float, or decimal number type',type(n)) def decode_string(self, state): """Intermediate-level decoder for JSON string literals. Takes a string and a starting index, and returns a Python string (or unicode string) and the index of the next unparsed character. 
""" buf = state.buf self.skipws(state) quote = buf.peek() if quote == '"': pass elif quote == "'": state.push_cond( self.options.single_quoted_strings, 'String literals must use double quotation marks in strict JSON' ) else: state.push_error('String literal must be properly quoted') return undefined string_position = buf.position buf.skip() if self.options.is_forbid_js_string_escapes: escapes = self._escapes_json else: escapes = self._escapes_js ccallowed = not self.options.is_forbid_control_char_in_string chunks = [] _append = chunks.append # Used to track the last seen high-surrogate character high_surrogate = None highsur_position = None # Used to track if errors occured so we don't keep reporting multiples had_lineterm_error = False # Start looping character by character until the final quotation mark saw_final_quote = False should_stop = False while not saw_final_quote and not should_stop: if buf.at_end: state.push_error("String literal is not terminated", outer_position=string_position, context='String') break c = buf.peek() # Make sure a high surrogate is immediately followed by a low surrogate if high_surrogate: if 0xdc00 <= ord(c) <= 0xdfff: low_surrogate = buf.pop() try: uc = helpers.surrogate_pair_as_unicode( high_surrogate, low_surrogate ) except ValueError as err: state.push_error( 'Illegal Unicode surrogate pair', (high_surrogate, low_surrogate), position=highsur_position, outer_position=string_position, context='String') should_stop = state.should_stop uc = u'\ufffd' # replacement char _append( uc ) high_surrogate = None highsur_position = None continue # ==== NEXT CHAR elif buf.peekstr(2) != '\\u': state.push_error('High unicode surrogate must be followed by a low surrogate', position=highsur_position, outer_position=string_position, context='String') should_stop = state.should_stop _append( u'\ufffd' ) # replacement char high_surrogate = None highsur_position = None if c == quote: buf.skip() # skip over closing quote saw_final_quote = True break 
elif c == '\\': # Escaped character escape_position = buf.position buf.skip() # skip over backslash c = buf.peek() if not c: state.push_error('Escape in string literal is incomplete', position=escape_position, outer_position=string_position, context='String') should_stop = state.should_stop break elif helpers.is_octal_digit(c): # Handle octal escape codes first so special \0 doesn't kick in yet. # Follow Annex B.1.2 of ECMAScript standard. if '0' <= c <= '3': maxdigits = 3 else: maxdigits = 2 digits = buf.popwhile( helpers.is_octal_digit, maxchars=maxdigits ) n = helpers.decode_octal(digits) if n == 0: state.push_cond( self.options.zero_byte, 'Zero-byte character (U+0000) in string may not be universally safe', "\\"+digits, position=escape_position, outer_position=string_position, context='String') else: # n != 0 state.push_cond( self.options.octal_numbers, "JSON does not allow octal character escapes other than \"\\0\"", "\\"+digits, position=escape_position, outer_position=string_position, context='String') should_stop = state.should_stop if n < 128: _append( chr(n) ) else: _append( helpers.safe_unichr(n) ) elif escapes.has_key(c): buf.skip() _append( escapes[c] ) elif c == 'u' or c == 'x': buf.skip() esc_opener = '\\' + c esc_closer = '' if c == 'u': if buf.peek() == '{': buf.skip() esc_opener += '{' esc_closer = '}' maxdigits = None state.push_cond( self.options.extended_unicode_escapes, "JSON strings do not allow \\u{...} escapes", position=escape_position, outer_position=string_position, context='String') else: maxdigits = 4 else: # c== 'x' state.push_cond( self.options.js_string_escapes, "JSON strings may not use the \\x hex-escape", position=escape_position, outer_position=string_position, context='String') should_stop = state.should_stop maxdigits = 2 digits = buf.popwhile( helpers.is_hex_digit, maxchars=maxdigits ) if esc_closer: if buf.peek() != esc_closer: state.push_error( "Unicode escape sequence is missing closing \'%s\'" % esc_closer, 
esc_opener+digits, position=escape_position, outer_position=string_position, context='String') should_stop = state.should_stop else: buf.skip() esc_sequence = esc_opener + digits + esc_closer if not digits: state.push_error('numeric character escape sequence is truncated', esc_sequence, position=escape_position, outer_position=string_position, context='String') should_stop = state.should_stop codepoint = 0xfffd # replacement char else: if maxdigits and len(digits) != maxdigits: state.push_error('escape sequence has too few hexadecimal digits', esc_sequence, position=escape_position, outer_position=string_position, context='String') codepoint = helpers.decode_hex( digits ) if codepoint > 0x10FFFF: state.push_error( 'Unicode codepoint is beyond U+10FFFF', esc_opener+digits+esc_closer, position=escape_position, outer_position=string_position, context='String') codepoint = 0xfffd # replacement char if high_surrogate: # Decode surrogate pair and clear high surrogate low_surrogate = unichr(codepoint) try: uc = helpers.surrogate_pair_as_unicode( high_surrogate, low_surrogate ) except ValueError as err: state.push_error( 'Illegal Unicode surrogate pair', (high_surrogate, low_surrogate), position=highsur_position, outer_position=string_position, context='String') should_stop = state.should_stop uc = u'\ufffd' # replacement char _append( uc ) high_surrogate = None highsur_position = None elif codepoint < 128: # ASCII chars always go in as a str if codepoint==0: state.push_cond( self.options.zero_byte, 'Zero-byte character (U+0000) in string may not be universally safe', position=escape_position, outer_position=string_position, context='String') should_stop = state.should_stop _append( chr(codepoint) ) elif 0xd800 <= codepoint <= 0xdbff: # high surrogate high_surrogate = unichr(codepoint) # remember until we get to the low surrogate highsur_position = escape_position.copy() elif 0xdc00 <= codepoint <= 0xdfff: # low surrogate state.push_error('Low unicode surrogate must be preceded by a high surrogate',
position=escape_position, outer_position=string_position, context='String') should_stop = state.should_stop _append( u'\ufffd' ) # replacement char else: # Other chars go in as a unicode char _append( helpers.safe_unichr(codepoint) ) else: # Unknown escape sequence state.push_cond( self.options.nonescape_characters, 'String escape code is not allowed in strict JSON', '\\'+c, position=escape_position, outer_position=string_position, context='String') should_stop = state.should_stop _append( c ) buf.skip() elif ord(c) <= 0x1f: # A control character if ord(c) == 0: state.push_cond( self.options.zero_byte, 'Zero-byte character (U+0000) in string may not be universally safe', position=buf.position, outer_position=string_position, context='String') should_stop = state.should_stop if self.islineterm(c): if not had_lineterm_error: state.push_error('Line terminator characters must be escaped inside string literals', 'U+%04X'%ord(c), position=buf.position, outer_position=string_position, context='String') should_stop = state.should_stop had_lineterm_error = True _append( c ) buf.skip() elif ccallowed: _append( c ) buf.skip() else: state.push_error('Control characters must be escaped inside JSON string literals', 'U+%04X'%ord(c), position=buf.position, outer_position=string_position, context='String') should_stop = state.should_stop buf.skip() elif 0xd800 <= ord(c) <= 0xdbff: # a raw high surrogate high_surrogate = buf.pop() # remember until we get to the low surrogate highsur_position = buf.position.copy() else: # A normal character; not an escape sequence or end-quote. # Find a whole sequence of "safe" characters so we can append them # all at once rather than one at a time, for speed.
chunk = buf.popwhile( lambda c: c not in helpers.unsafe_string_chars and c != quote ) if not chunk: _append( c ) buf.skip() else: _append( chunk ) # Check proper string termination if high_surrogate: state.push_error('High unicode surrogate must be followed by a low surrogate', position=highsur_position, outer_position=string_position, context='String') _append( u'\ufffd' ) # replacement char high_surrogate = None highsur_position = None if not saw_final_quote: state.push_error('String literal is not terminated with a quotation mark', position=buf.position, outer_position=string_position, context='String') if state.should_stop: return undefined # Compose the python string and update stats s = ''.join( chunks ) state.update_string_stats( s, position=string_position ) # Call string hook if self.has_hook('decode_string'): try: s = self.call_hook( 'decode_string', s, position=string_position ) except JSONSkipHook: pass except JSONError, err: state.push_exception(err) s = undefined return s def encode_string(self, s, state): """Encodes a Python string into a JSON string literal. """ # Must handle instances of UserString specially in order to be # able to use ord() on its simulated "characters". Also # convert Python2 'str' types to unicode strings first. import unicodedata, sys import UserString py2strenc = self.options.py2str_encoding if isinstance(s, UserString.UserString): def tochar(c): c2 = c.data if py2strenc and not isinstance(c2,unicode): return c2.decode( py2strenc ) else: return c2 elif py2strenc and not isinstance(s,unicode): s = s.decode( py2strenc ) tochar = None else: # Could use "lambda c:c", but that is too slow. So we set to None # and use an explicit if test inside the loop.
tochar = None chunks = [] chunks.append('"') revesc = self._rev_escapes optrevesc = self._optional_rev_escapes asciiencodable = self._asciiencodable always_escape = state.options.always_escape_chars encunicode = state.escape_unicode_test i = 0 imax = len(s) while i < imax: if tochar: c = tochar(s[i]) else: c = s[i] cord = ord(c) if cord < 256 and asciiencodable[cord] and isinstance(encunicode, bool) \ and not (always_escape and c in always_escape): # Contiguous runs of plain old printable ASCII can be copied # directly to the JSON output without worry (unless the user # has supplied a custom is-encodable function). j = i i += 1 while i < imax: if tochar: c = tochar(s[i]) else: c = s[i] cord = ord(c) if cord < 256 and asciiencodable[cord] \ and not (always_escape and c in always_escape): i += 1 else: break chunks.append( unicode(s[j:i]) ) elif revesc.has_key(c): # Has a shortcut escape sequence, like "\n" chunks.append(revesc[c]) i += 1 elif cord <= 0x1F: # Always unicode escape ASCII-control characters chunks.append(r'\u%04x' % cord) i += 1 elif 0xD800 <= cord <= 0xDFFF: # A raw surrogate character! # This should ONLY happen in "narrow" Python builds # where (sys.maxunicode == 65535) as Python itself # uses UTF-16. But for "wide" Python builds, a raw # surrogate should never happen. 
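The narrow-build branch below emits a raw surrogate pair as two \uXXXX escapes; the reverse direction (what helpers.unicode_as_surrogate_pair produces for a non-BMP character) can be sketched stand-alone. The function below is a hypothetical illustration, not the module's helper:

```python
# Hypothetical sketch: split a supplementary-plane code point into the
# UTF-16 surrogate pair used for its \uXXXX\uXXXX JSON escape.
def codepoint_to_surrogate_escape(cp):
    if cp < 0x10000 or cp > 0x10FFFF:
        raise ValueError('not a supplementary-plane code point')
    cp -= 0x10000
    hi = 0xD800 + (cp >> 10)   # high (leading) surrogate
    lo = 0xDC00 + (cp & 0x3FF) # low (trailing) surrogate
    return '\\u%04x\\u%04x' % (hi, lo)
```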
handled_raw_surrogates = False if sys.maxunicode == 0xFFFF and 0xD800 <= cord <= 0xDBFF and (i+1) < imax: # In a NARROW Python, output surrogate pair as-is hsurrogate = cord i += 1 if tochar: c = tochar(s[i]) else: c = s[i] cord = ord(c) i += 1 if 0xDC00 <= cord <= 0xDFFF: lsurrogate = cord chunks.append(r'\u%04x\u%04x' % (hsurrogate,lsurrogate)) handled_raw_surrogates = True if not handled_raw_surrogates: cname = 'U+%04X' % cord raise JSONEncodeError('can not include or escape a Unicode surrogate character',cname) elif cord <= 0xFFFF: # Other BMP Unicode character if always_escape and c in always_escape: doesc = True elif unicodedata.category( c ) in ['Cc','Cf','Zl','Zp']: doesc = True elif callable(encunicode): doesc = encunicode( c ) else: doesc = encunicode if doesc: if optrevesc.has_key(c): chunks.append(optrevesc[c]) else: chunks.append(r'\u%04x' % cord) else: chunks.append( c ) i += 1 else: # ord(c) >= 0x10000 # Non-BMP Unicode if always_escape and c in always_escape: doesc = True elif unicodedata.category( c ) in ['Cc','Cf','Zl','Zp']: doesc = True elif callable(encunicode): doesc = encunicode( c ) else: doesc = encunicode if doesc: for surrogate in helpers.unicode_as_surrogate_pair(c): chunks.append(r'\u%04x' % ord(surrogate)) else: chunks.append( c ) i += 1 chunks.append('"') state.append( ''.join( chunks ) ) def decode_identifier(self, state, identifier_as_string=False): """Decodes an identifier/keyword. 
""" buf = state.buf self.skipws(state) start_position = buf.position obj = None kw = buf.pop_identifier() if not kw: state.push_error("Expected an identifier", position=start_position) elif kw == 'null': obj = None state.stats.num_nulls += 1 elif kw == 'true': obj = True state.stats.num_bools += 1 elif kw == 'false': obj = False state.stats.num_bools += 1 elif kw == 'undefined': state.push_cond( self.options.undefined_values, "Strict JSON does not allow the 'undefined' keyword", kw, position=start_position) obj = undefined state.stats.num_undefineds += 1 elif kw == 'NaN' or kw == 'Infinity': state.push_cond( self.options.non_numbers, "%s literals are not allowed in strict JSON" % kw, kw, position=start_position) if self.has_hook('decode_float'): try: val = self.call_hook( 'decode_float', kw, position=start_position ) except JSONSkipHook: pass except JSONError, err: state.push_exception(err) return undefined else: return val elif self.has_hook('decode_number'): try: val = self.call_hook( 'decode_number', kw, position=start_position ) except JSONSkipHook: pass except JSONError, err: state.push_exception(err) return undefined else: return val if kw == 'NaN': state.stats.num_nans += 1 obj = state.options.nan else: state.stats.num_infinities += 1 obj = state.options.inf else: # Convert unknown identifiers into strings if identifier_as_string: if kw in helpers.javascript_reserved_words: state.push_warning( "Identifier is a JavaScript reserved word", kw, position=start_position) state.push_cond( self.options.identifier_keys, "JSON does not allow identifiers to be used as strings", kw, position=start_position) state.stats.num_identifiers += 1 obj = self.decode_javascript_identifier( kw ) else: state.push_error("Unknown identifier", kw, position=start_position) obj = undefined state.stats.num_identifiers += 1 return obj def skip_comment(self, state): """Skips an ECMAScript comment, either // or /* style. 
The contents of the comment are returned as a string, as well as the index of the character immediately after the comment. """ buf = state.buf uniws = self.options.unicode_whitespace s = buf.peekstr(2) if s != '//' and s != '/*': return None state.push_cond( self.options.comments, 'Comments are not allowed in strict JSON' ) start_position = buf.position buf.skip(2) multiline = (s == '/*') saw_close = False while not buf.at_end: if multiline: if buf.peekstr(2) == '*/': buf.skip(2) saw_close = True break elif buf.peekstr(2) == '/*': state.push_error('Multiline /* */ comments may not nest', outer_position=start_position, context='Comment') else: if buf.at_eol( uniws ): buf.skip_to_next_line( uniws ) saw_close = True break buf.pop() if not saw_close and multiline: state.push_error('Comment was never terminated', outer_position=start_position, context='Comment') state.stats.num_comments += 1 def skipws_nocomments(self, state): """Skips whitespace (will not allow comments). """ return state.buf.skipws( not self.options.is_forbid_unicode_whitespace ) def skipws(self, state): """Skips all whitespace, including comments and unicode whitespace Takes a string and a starting index, and returns the index of the next non-whitespace character. If the 'skip_comments' behavior is True and not running in strict JSON mode, then comments will be skipped over just like whitespace. """ buf = state.buf uniws = not self.options.unicode_whitespace while not buf.at_end: c = buf.peekstr(2) if c == '/*' or c == '//': cmt = self.skip_comment( state ) elif buf.at_ws( uniws ): buf.skipws( uniws ) else: break def decode_composite(self, state): """Intermediate-level JSON decoder for composite literal types (array and object). 
""" if state.should_stop: return None buf = state.buf self.skipws(state) opener = buf.peek() if opener not in '{[': state.push_error('Composite data must start with "[" or "{"') return None start_position = buf.position buf.skip() if opener == '[': isdict = False closer = ']' obj = [] else: isdict = True closer = '}' if state.options.sort_keys == SORT_PRESERVE and _OrderedDict: obj = _OrderedDict() else: obj = {} num_items = 0 self.skipws(state) c = buf.peek() if c == closer: # empty composite buf.skip() done = True else: saw_value = False # set to false at beginning and after commas done = False while not done and not buf.at_end and not state.should_stop: self.skipws(state) c = buf.peek() if c == '': break # will report error futher down because done==False elif c == ',': if not saw_value: # no preceeding value, an elided (omitted) element if isdict: state.push_error('Can not omit elements of an object (dictionary)', outer_position=start_position, context='Object') else: state.push_cond( self.options.omitted_array_elements, 'Can not omit elements of an array (list)', outer_position=start_position, context='Array') obj.append( undefined ) if state.stats: state.stats.num_undefineds += 1 buf.skip() # skip over comma saw_value = False continue elif c == closer: if not saw_value: if isdict: state.push_cond( self.options.trailing_comma, 'Strict JSON does not allow a final comma in an object (dictionary) literal', outer_position=start_position, context='Object') else: state.push_cond( self.options.trailing_comma, 'Strict JSON does not allow a final comma in an array (list) literal', outer_position=start_position, context='Array') buf.skip() # skip over closer done = True break elif c in ']}': if isdict: cdesc='Object' else: cdesc='Array' state.push_error("Expected a '%c' but saw '%c'" % (closer,c), outer_position=start_position, context=cdesc) done = True break if state.should_stop: break # Decode the item/value value_position = buf.position if isdict: val = 
self.decodeobj(state, identifier_as_string=True) else: val = self.decodeobj(state, identifier_as_string=False) if val is syntax_error: recover_c = self.recover_parser(state) if recover_c not in ':': continue if state.should_stop: break if saw_value: # Two values without a separating comma if isdict: cdesc='Object' else: cdesc='Array' state.push_error('Values must be separated by a comma', position=value_position, outer_position=start_position, context=cdesc) saw_value = True self.skipws(state) if state.should_stop: break if isdict: skip_item = False key = val # Ref 11.1.5 key_position = value_position if not helpers.isstringtype(key): if helpers.isnumbertype(key): state.push_cond( self.options.nonstring_keys, 'JSON only permits string literals as object properties (keys)', position=key_position, outer_position=start_position, context='Object') else: state.push_error('Object properties (keys) must be string literals, numbers, or identifiers', position=key_position, outer_position=start_position, context='Object') skip_item = True c = buf.peek() if c != ':': state.push_error('Missing value for object property, expected ":"', position=value_position, outer_position=start_position, context='Object') buf.skip() # skip over colon self.skipws(state) rval = self.decodeobj(state) self.skipws(state) if not skip_item: if key in obj: state.push_cond( self.options.duplicate_keys, 'Object contains duplicate key', key, position=key_position, outer_position=start_position, context='Object') if key == '': state.push_cond( self.options.non_portable, 'Using an empty string "" as an object key may not be portable', position=key_position, outer_position=start_position, context='Object') obj[ key ] = rval num_items += 1 else: # islist obj.append( val ) num_items += 1 # end while if state.stats: if isdict: state.stats.max_items_in_object = max(state.stats.max_items_in_object, num_items) else: state.stats.max_items_in_array = max(state.stats.max_items_in_array, num_items) if 
state.should_stop: return obj # Make sure composite value is properly terminated if not done: if isdict: state.push_error('Object literal (dictionary) is not terminated', outer_position=start_position, context='Object') else: state.push_error('Array literal (list) is not terminated', outer_position=start_position, context='Array') # Update stats and run hooks if isdict: state.stats.num_objects += 1 if self.has_hook('decode_object'): try: obj = self.call_hook( 'decode_object', obj, position=start_position ) except JSONSkipHook: pass except JSONError, err: state.push_exception(err) obj = undefined else: state.stats.num_arrays += 1 if self.has_hook('decode_array'): try: obj = self.call_hook( 'decode_array', obj, position=start_position ) except JSONSkipHook: pass except JSONError, err: state.push_exception(err) obj = undefined return obj def decode_javascript_identifier(self, name): """Convert a JavaScript identifier into a Python string object. This method can be overridden by a subclass to redefine how JavaScript identifiers are turned into Python objects. By default this just converts them into strings. """ return name def decodeobj(self, state, identifier_as_string=False, at_document_start=False): """Intermediate-level JSON decoder. Takes a string and a starting index, and returns a two-tuple consisting of a Python object and the index of the next unparsed character. If there is no value at all (empty string, etc), then None is returned instead of a tuple.
""" buf = state.buf obj = None self.skipws(state) if buf.at_end: state.push_error('Unexpected end of input') c = buf.peek() if c in '{[': state.cur_depth += 1 try: state.update_depth_stats() obj = self.decode_composite(state) finally: state.cur_depth -= 1 else: if at_document_start: state.push_cond( self.options.any_type_at_start, 'JSON document must start with an object or array type only' ) if c in self._string_quotes: obj = self.decode_string(state) elif c.isdigit() or c in '.+-': obj = self.decode_number(state) elif c.isalpha() or c in'_$': obj = self.decode_identifier(state, identifier_as_string=identifier_as_string) else: state.push_error('Can not decode value starting with character %r' % c) buf.skip() self.recover_parser(state) obj = syntax_error return obj def decode(self, txt, encoding=None, return_errors=False, return_stats=False): """Decodes a JSON-encoded string into a Python object. The 'return_errors' parameter controls what happens if the input JSON has errors in it. * False: the first error will be raised as a Python exception. If there are no errors then the corresponding Python object will be returned. * True: the return value is always a 2-tuple: (object, error_list) """ import sys state = decode_state( options=self.options ) # Prepare the input state.set_input( txt, encoding=encoding ) # Do the decoding if not state.has_errors: self.__sanity_check_start( state ) if not state.has_errors: try: self._do_decode( state ) # DECODE! 
except JSONException, err: state.push_exception( err ) except Exception, err: # Mainly here to catch maximum recursion depth exceeded e2 = sys.exc_info() newerr = JSONDecodeError("An unexpected failure occurred", severity='fatal', position=state.buf.position) newerr.__cause__ = err newerr.__traceback__ = e2[2] state.push_exception( newerr ) if return_stats and state.buf: state.stats.num_excess_whitespace = state.buf.num_ws_skipped state.stats.total_chars = state.buf.position.char_position # Handle the errors result_type = _namedtuple('json_results',['object','errors','stats']) if return_errors: if return_stats: return result_type(state.obj, state.errors, state.stats) else: return result_type(state.obj, state.errors, None) else: # Don't cause warnings to raise an error errors = [err for err in state.errors if err.severity in ('fatal','error')] if errors: raise errors[0] if return_stats: return result_type(state.obj, None, state.stats) else: return state.obj def __sanity_check_start(self, state): """Check that the document seems sane by looking at the first couple characters. Check that the decoding seems sane. Per RFC 4627 section 3: "Since the first two characters of a JSON text will always be ASCII characters [RFC0020], ..." [WAS removed from RFC 7158, but still valid via the grammar.] This check is probably not necessary, but it allows us to raise a suitably descriptive error rather than an obscure syntax error later on. Note that the RFC requirement of two ASCII characters seems to be an incorrect statement as a JSON string literal may have as its first character any unicode character. Thus the first two characters will always be ASCII, unless the first character is a quotation mark. And in non-strict mode we can also have a few other characters too.
""" is_sane = True unitxt = state.buf.peekstr(2) if len(unitxt) >= 2: first, second = unitxt[:2] if first in self._string_quotes: pass # second can be anything inside string literal else: if ((ord(first) < 0x20 or ord(first) > 0x7f) or \ (ord(second) < 0x20 or ord(second) > 0x7f)) and \ (not self.isws(first) and not self.isws(second)): # Found non-printable ascii, must check unicode # categories to see if the character is legal. # Only whitespace, line and paragraph separators, # and format control chars are legal here. import unicodedata catfirst = unicodedata.category(unicode(first)) catsecond = unicodedata.category(unicode(second)) if catfirst not in ('Zs','Zl','Zp','Cf') or \ catsecond not in ('Zs','Zl','Zp','Cf'): state.push_fatal( 'The input is gibberish, is the Unicode encoding correct?' ) return is_sane def _do_decode(self, state): """This is the internal function that does the JSON decoding. Called by the decode() method, after it has performed any Unicode decoding, etc. """ buf = state.buf self.skipws(state) if buf.at_end: state.push_error('No value to decode') else: if state.options.decimal_context: dec_ctx = decimal.localcontext( state.options.decimal_context ) else: dec_ctx = _dummy_context_manager with dec_ctx: state.obj = self.decodeobj(state, at_document_start=True ) if not state.should_stop: # Make sure there's nothing at the end self.skipws(state) if not buf.at_end: state.push_error('Unexpected text after end of JSON value') def _classify_for_encoding( self, obj ): import datetime c = 'other' if obj is None: c = 'null' elif obj is undefined: c = 'undefined' elif isinstance(obj,bool): c = 'bool' elif isinstance(obj, (int,long,float,complex)) or\ (decimal and isinstance(obj, decimal.Decimal)): c = 'number' elif isinstance(obj, basestring) or helpers.isstringtype(obj): c = 'string' else: if isinstance(obj,dict): c = 'dict' elif isinstance(obj,tuple) and hasattr(obj,'_asdict') and callable(obj._asdict): # Have a named tuple enc_nt = 
self.options.encode_namedtuple_as_object if enc_nt and (enc_nt is True or (callable(enc_nt) and enc_nt(obj))): c = 'namedtuple' else: c = 'sequence' elif isinstance(obj, (list,tuple,set,frozenset)): c = 'sequence' elif hasattr(obj,'iterkeys') or (hasattr(obj,'__getitem__') and hasattr(obj,'keys')): c = 'dict' elif isinstance(obj, datetime.datetime): # Check datetime before date because it is a subclass! c = 'datetime' elif isinstance(obj, datetime.date): c = 'date' elif isinstance(obj, datetime.time): c = 'time' elif isinstance(obj, datetime.timedelta): c = 'timedelta' elif _py_major >= 3 and isinstance(obj,(bytes,bytearray)): c = 'bytes' elif _py_major >= 3 and isinstance(obj,memoryview): c = 'memoryview' elif _enum is not None and isinstance(obj,_enum): c = 'enum' else: c = 'other' return c def encode(self, obj, encoding=None ): """Encodes the Python object into a JSON string representation. This method will first attempt to encode an object by seeing if it has a json_equivalent() method. If so then it will call that method and then recursively attempt to encode the object resulting from that call. Next it will attempt to determine if the object is a native type or acts like a sequence or dictionary. If so it will encode that object directly. Finally, if no other strategy for encoding the object of that type exists, it will call the encode_default() method. That method currently raises an error, but it could be overridden by subclasses to provide a hook for extending the types which can be encoded. """ import sys, codecs # Make a fresh encoding state state = encode_state( self.options ) # Find the codec to use. CodecInfo will be in 'cdk' and name in 'encoding'. # # Also set the state's 'escape_unicode_test' property which is used to # determine what characters to \u-escape.
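The "hard way" repertoire test set up further below (escape_unicode_hardway) decides whether a character needs \u-escaping by simply trying to encode it with the target codec. A stand-alone sketch of the same idea, assuming only the standard codecs module (the function name here is illustrative):

```python
import codecs

# Decide whether a character must be \u-escaped for a target codec by
# attempting to encode it (same idea as escape_unicode_hardway below).
def needs_u_escape(c, encoding):
    try:
        codecs.lookup(encoding).encode(c)
    except UnicodeEncodeError:
        return True
    return False
```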
if encoding is None: cdk = None elif isinstance(encoding, codecs.CodecInfo): cdk = encoding encoding = cdk.name else: cdk = helpers.lookup_codec( encoding ) if not cdk: raise JSONEncodeError('no codec available for character encoding',encoding) if self.options.escape_unicode and callable(self.options.escape_unicode): # User-supplied repertoire test function state.escape_unicode_test = self.options.escape_unicode else: if self.options.escape_unicode==True or not cdk or cdk.name.lower() == 'ascii': # ASCII or an unknown codec -- \u escape anything not ASCII state.escape_unicode_test = lambda c: ord(c) >= 0x80 elif cdk.name == 'iso8859-1': state.escape_unicode_test = lambda c: ord(c) >= 0x100 elif cdk and cdk.name.lower().startswith('utf'): # All UTF-x encodings can do the whole Unicode repertoire, so # do nothing special. state.escape_unicode_test = False else: # An unusual codec. We need to test every character # to see if it is in the codec's repertoire to determine # if we should \u escape that character. enc_func = cdk.encode def escape_unicode_hardway( c ): try: enc_func( c ) except UnicodeEncodeError: return True else: return False state.escape_unicode_test = escape_unicode_hardway # Make sure the encoding is not degenerate: it can encode the minimal # number of characters needed by the JSON syntax rules. if encoding is not None: try: output, nchars = cdk.encode( JSON.json_syntax_characters ) except UnicodeError, err: raise JSONEncodeError("Output encoding %s is not sufficient to encode JSON" % cdk.name) # Do the JSON encoding!
self._do_encode( obj, state ) if not self.options.encode_compactly: state.append('\n') unitxt = state.combine() # Do the final Unicode encoding if encoding is None: output = unitxt else: try: output, nchars = cdk.encode( unitxt ) except UnicodeEncodeError, err: # Re-raise as a JSONEncodeError e2 = sys.exc_info() newerr = JSONEncodeError("a Unicode encoding error occurred") # Simulate Python 3's: "raise X from Y" exception chaining newerr.__cause__ = err newerr.__traceback__ = e2[2] raise newerr return output def _do_encode(self, obj, state): """Internal encode function.""" obj_classification = self._classify_for_encoding( obj ) if self.has_hook('encode_value'): orig_obj = obj try: obj = self.call_hook( 'encode_value', obj ) except JSONSkipHook: pass if obj is not orig_obj: prev_cls = obj_classification obj_classification = self._classify_for_encoding( obj ) if obj_classification != prev_cls: # Got a different type of object, re-encode again self._do_encode( obj, state ) return if hasattr(obj, 'json_equivalent'): success = self.encode_equivalent( obj, state ) if success: return if obj_classification == 'null': self.encode_null( state ) elif obj_classification == 'undefined': if not self.options.is_forbid_undefined_values: self.encode_undefined( state ) else: raise JSONEncodeError('strict JSON does not permit "undefined" values') elif obj_classification == 'bool': self.encode_boolean( obj, state ) elif obj_classification == 'number': try: self.encode_number( obj, state ) except JSONEncodeError, err1: # Bad number, probably a complex with non-zero imaginary part. # Let the default encoders take a shot at encoding. try: self.try_encode_default(obj, state) except Exception, err2: # Default handlers couldn't deal with it, re-raise original exception.
raise err1 elif obj_classification == 'string': self.encode_string( obj, state ) elif obj_classification == 'enum': # Python 3.4 enum.Enum self.encode_enum( obj, state ) elif obj_classification == 'datetime': # Python datetime.datetime self.encode_datetime( obj, state ) elif obj_classification == 'date': # Python datetime.date self.encode_date( obj, state ) elif obj_classification == 'time': # Python datetime.time self.encode_time( obj, state ) elif obj_classification == 'timedelta': # Python datetime.timedelta self.encode_timedelta( obj, state ) else: # Anything left is probably composite, or an unconvertible type. self.encode_composite( obj, state ) def encode_enum(self, val, state): """Encode a Python Enum value into JSON.""" eas = self.options.encode_enum_as if eas == 'qname': self.encode_string( str(val), state ) elif eas == 'value': self._do_encode( val.value, state ) else: # eas == 'name' self.encode_string( val.name, state ) def encode_date(self, dt, state): fmt = self.options.date_format if not fmt or fmt == 'iso': fmt = '%Y-%m-%d' self.encode_string( dt.strftime(fmt), state ) def encode_datetime(self, dt, state): fmt = self.options.datetime_format is_iso = not fmt or fmt == 'iso' if is_iso: if dt.microsecond == 0: fmt = '%Y-%m-%dT%H:%M:%S%z' else: fmt = '%Y-%m-%dT%H:%M:%S.%f%z' s = dt.strftime(fmt) if is_iso and (s.endswith('-00:00') or s.endswith('+00:00')): s = s[:-6] + 'Z' # Change UTC to use 'Z' notation self.encode_string( s, state ) def encode_time(self, t, state): fmt = self.options.datetime_format is_iso = not fmt or fmt == 'iso' if is_iso: if t.microsecond == 0: fmt = 'T%H:%M:%S%z' else: fmt = 'T%H:%M:%S.%f%z' s = t.strftime(fmt) if is_iso and (s.endswith('-00:00') or s.endswith('+00:00')): s = s[:-6] + 'Z' # Change UTC to use 'Z' notation self.encode_string( s, state ) def encode_timedelta(self, td, state): fmt = self.options.timedelta_format if not fmt or fmt == 'iso': s = helpers.format_timedelta_iso( td ) elif fmt == 'hms': s = str(td) else: raise
ValueError("Unknown timedelta_format %r" % fmt) self.encode_string( s, state ) def encode_composite(self, obj, state, obj_classification=None): """Encodes just composite objects: dictionaries, lists, or sequences. Basically handles any python type for which iter() can create an iterator object. This method is not intended to be called directly. Use the encode() method instead. """ import sys if not obj_classification: obj_classification = self._classify_for_encoding(obj) # Convert namedtuples to dictionaries if obj_classification == 'namedtuple': obj = obj._asdict() obj_classification = 'dict' # Convert 'unsigned byte' memory views into plain bytes if obj_classification == 'memoryview' and obj.format == 'B': obj = obj.tobytes() obj_classification = 'bytes' # Run hooks hook_name = None if obj_classification == 'dict': hook_name = 'encode_dict' elif obj_classification == 'sequence': hook_name = 'encode_sequence' elif obj_classification == 'bytes': hook_name = 'encode_bytes' if self.has_hook(hook_name): try: new_obj = self.call_hook( hook_name, obj ) except JSONSkipHook: pass else: if new_obj is not obj: obj = new_obj prev_cls = obj_classification obj_classification = self._classify_for_encoding( obj ) if obj_classification != prev_cls: # Transformed to a different kind of object, call # back to the general encode() method. 
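The namedtuple-to-dict conversion above relies on the _asdict() method that every collections.namedtuple provides; a minimal stand-alone illustration (the Point type below is hypothetical sample data):

```python
from collections import namedtuple

# A namedtuple exposes _asdict(), which the encoder uses to re-classify
# it as a 'dict' before encoding. Point is an illustrative example type.
Point = namedtuple('Point', ['x', 'y'])
as_dict = dict(Point(x=1, y=2)._asdict())
```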
self._do_encode( obj, state ) return # Else, fall through # At this point we are dealing with either an object or an array isdict = (obj_classification == 'dict') # Get iterator it = None if isdict and hasattr(obj,'iterkeys'): try: it = obj.iterkeys() except AttributeError: pass else: try: it = iter(obj) except TypeError: pass # Convert each member to JSON if it is not None: # Try to get length, but don't fail if we can't try: numitems = len(obj) except TypeError: numitems = 0 # Output the opening bracket or brace compactly = self.options.encode_compactly if not compactly: indent0 = self.options.indentation_for_level( state.nest_level ) indent = self.options.indentation_for_level( state.nest_level+1 ) spaces_after_opener = '' if isdict: opener = '{' closer = '}' if compactly: dictcolon = ':' else: dictcolon = ' : ' else: opener = '[' closer = ']' if not compactly: #opener = opener + ' ' spaces_after_opener = self.options.spaces_to_next_indent_level(subtract=len(opener)) state.append( opener ) state.append( spaces_after_opener ) # Now iterate through all the items and collect their representations parts = [] # Collects each of the members part_keys = [] # For dictionary key sorting, tuples (key,index) try: # while not StopIteration part_idx = 0 while True: obj2 = it.next() part_idx += 1 # Note, will start counting at 1 if obj2 is obj: raise JSONEncodeError('trying to encode an infinite sequence',obj) if isdict: obj3 = obj[obj2] # Dictionary key is in obj2 and value in obj3. # Let any hooks transform the key.
                        if self.has_hook('encode_value'):
                            try:
                                newobj = self.call_hook( 'encode_value', obj2 )
                            except JSONSkipHook:
                                pass
                            else:
                                obj2 = newobj
                        if self.has_hook('encode_dict_key'):
                            try:
                                newkey = self.call_hook( 'encode_dict_key', obj2 )
                            except JSONSkipHook:
                                pass
                            else:
                                obj2 = newkey
                        # Check JSON restrictions on key types
                        if not helpers.isstringtype(obj2):
                            if helpers.isnumbertype(obj2):
                                if not self.options.is_allow_nonstring_keys:
                                    raise JSONEncodeError('object properties (dictionary keys) must be strings in strict JSON',obj2)
                            else:
                                raise JSONEncodeError('object properties (dictionary keys) can only be strings or numbers in ECMAScript',obj2)
                        part_keys.append( (obj2, part_idx-1) )
                    # Encode this item in the sequence and put into item_chunks
                    substate = state.make_substate()
                    self._do_encode( obj2, substate )
                    if isdict:
                        substate.append( dictcolon )
                        substate2 = substate.make_substate()
                        self._do_encode( obj3, substate2 )
                        substate.join_substate( substate2 )
                    parts.append( substate )
                    # Next item iteration
            except StopIteration:
                pass

            # Sort dictionary keys
            if isdict:
                srt = self.options.sort_keys
                if srt == SORT_PRESERVE:
                    if _OrderedDict and isinstance(obj,_OrderedDict):
                        srt = SORT_NONE   # Will keep order
                    else:
                        srt = SORT_SMART
                if not srt or srt in (SORT_NONE, SORT_PRESERVE):
                    srt = None
                elif callable(srt):
                    part_keys.sort( key=(lambda t: (srt(t[0]),t[0])) )
                elif srt == SORT_SMART:
                    part_keys.sort( key=(lambda t: (smart_sort_transform(t[0]),t[0])) )
                elif srt == SORT_ALPHA_CI:
                    part_keys.sort( key=(lambda t: (unicode(t[0]).upper(),t[0])) )
                elif srt or srt == SORT_ALPHA:
                    part_keys.sort( key=(lambda t: unicode(t[0])) )
                # Now make parts match the new sort order
                if srt is not None:
                    parts = [parts[pk[1]] for pk in part_keys]

            if compactly:
                sep = ','
            elif len(parts) <= self.options.max_items_per_line:
                sep = ', '
            else:
                #state.append(spaces_after_opener)
                state.append('\n' + indent)
                sep = ',\n' + indent

            for pnum, substate in enumerate(parts):
                if pnum > 0:
                    state.append( sep )
                state.join_substate( substate )
            if not compactly:
                if numitems > self.options.max_items_per_line:
                    state.append('\n' + indent0)
                else:
                    state.append(' ')
            state.append( closer )  # final '}' or ']'
        else:  # Can't create an iterator for the object
            self.try_encode_default( obj, state )

    def encode_equivalent( self, obj, state ):
        """This method is used to encode user-defined class objects.

        The object being encoded should have a json_equivalent()
        method defined which returns another equivalent object which
        is easily JSON-encoded.  If the object in question has no
        json_equivalent() method available then None is returned
        instead of a string so that the encoding will attempt the next
        strategy.

        If a caller wishes to disable the calling of json_equivalent()
        methods, then subclass this class and override this method
        to just return None.

        """
        if hasattr(obj, 'json_equivalent') \
               and callable(getattr(obj,'json_equivalent')):
            obj2 = obj.json_equivalent()
            if obj2 is obj:
                # Try to prevent careless infinite recursion
                raise JSONEncodeError('object has a json_equivalent() method that returns itself',obj)
            self._do_encode( obj2, state )
            return True
        else:
            return False

    def try_encode_default( self, obj, state ):
        orig_obj = obj
        if self.has_hook('encode_default'):
            try:
                obj = self.call_hook( 'encode_default', obj )
            except JSONSkipHook:
                pass
            else:
                if obj is not orig_obj:
                    # Hook made a transformation, re-encode it
                    return self._do_encode( obj, state )

        # End of the road.
        raise JSONEncodeError('can not encode object into a JSON representation',obj)


# ------------------------------

def encode( obj, encoding=None, **kwargs ):
    r"""Encodes a Python object into a JSON-encoded string.

    * 'strict' (Boolean, default False)

        If 'strict' is set to True, then only strictly-conforming JSON
        output will be produced.  Note that this means that some types
        of values may not be convertible and will result in a
        JSONEncodeError exception.
    * 'compactly' (Boolean, default True)

        If 'compactly' is set to True, then the resulting string will
        have all extraneous white space removed; if False then the
        string will be "pretty printed" with whitespace and
        indentation added to make it more readable.

    * 'encode_namedtuple_as_object' (Boolean or callable, default True)

        If True, then objects of type namedtuple, or subclasses of
        'tuple' that have an _asdict() method, will be encoded as an
        object rather than an array.

        It can also be a predicate function that takes a namedtuple
        object as an argument and returns True or False.

    * 'indent_amount' (Integer, default 2)

        The number of spaces to output for each indentation level.
        If 'compactly' is True then indentation is ignored.

    * 'indent_limit' (Integer or None, default None)

        If not None, then this is the maximum limit of indentation
        levels, after which further indentation spaces are not
        inserted.  If None, then there is no limit.

    CONCERNING CHARACTER ENCODING:

    The 'encoding' argument should be one of:

        * None - The return will be a Unicode string.
        * encoding_name - A string which is the name of a known
              encoding, such as 'UTF-8' or 'ascii'.
        * codec - A CodecInfo object, such as found by codecs.lookup().
              This allows you to use a custom codec as well as those
              built into Python.

    If an encoding is given (either by name or by codec), then the
    returned value will be a byte array (Python 3), or a 'str' string
    (Python 2); which represents the raw set of bytes.  Otherwise,
    if encoding is None, then the returned value will be a Unicode
    string.

    The 'escape_unicode' argument is used to determine which characters
    in string literals must be \u escaped.  Should be one of:

        * True  -- All non-ASCII characters are always \u escaped.
        * False -- Try to insert actual Unicode characters if possible.
        * function -- A user-supplied function that accepts a single
              unicode character and returns True or False; where True
              means to \u escape that character.
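    As a concrete illustration of the third form, here is a minimal
    sketch of a user-supplied escape_unicode predicate (the function
    name is hypothetical, not part of demjson); it would cause every
    character outside the Latin-1 range to be \u escaped:

```python
# Hypothetical escape_unicode predicate: demjson calls such a function
# once per character and \u-escapes those for which it returns True.
def escape_outside_latin1(ch):
    # Escape anything above U+00FF; keep ASCII and Latin-1 literal.
    return ord(ch) > 0xFF

# Exercising the predicate by itself:
print(escape_outside_latin1(u'A'))       # False
print(escape_outside_latin1(u'\u2014'))  # True
```

    It would then be passed as
    demjson.encode( obj, escape_unicode=escape_outside_latin1 ).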
    Regardless of escape_unicode, certain characters will always be
    \u escaped.  Additionally any characters not in the output encoding
    repertoire for the encoding codec will be \u escaped as well.

    """
    # Do the JSON encoding
    j = JSON( **kwargs )
    output = j.encode( obj, encoding )
    return output


def decode( txt, encoding=None, **kwargs ):
    """Decodes a JSON-encoded string into a Python object.

    == Optional arguments ==

    * 'encoding' (string, default None)

        This argument provides a hint regarding the character encoding
        that the input text is assumed to be in (if it is not already a
        unicode string type).

        If set to None then autodetection of the encoding is
        attempted (see discussion above).  Otherwise this argument
        should be the name of a registered codec (see the standard
        'codecs' module).

    * 'strict' (Boolean, default False)

        If 'strict' is set to True, then those strings that are not
        entirely strictly conforming to JSON will result in a
        JSONDecodeError exception.

    * 'return_errors' (Boolean, default False)

        Controls the return value from this function.  If False, then
        only the Python equivalent object is returned on success, or
        an error will be raised as an exception.

        If True then a 2-tuple is returned: (object, error_list).  The
        error_list will be an empty list [] if the decoding was
        successful, otherwise it will be a list of all the errors
        encountered.  Note that it is possible for an object to be
        returned even if errors were encountered.

    * 'return_stats' (Boolean, default False)

        Controls whether statistics about the decoded JSON document
        are returned (an instance of decode_statistics).

        If True, then the stats object will be added to the end of the
        tuple returned.  If return_errors is also set then a 3-tuple
        is returned, otherwise a 2-tuple is returned.

    * 'write_errors' (Boolean OR File-like object, default False)

        Controls what to do with errors.

        - If False, then the first decoding error is raised as an exception.
        - If True, then errors will be printed out to sys.stderr.
        - If a File-like object, then errors will be printed to that file.

        The write_errors and return_errors arguments can be set
        independently.

    * 'filename_for_errors' (string or None)

        Provides a filename to be used when writing error messages.

    * 'allow_xxx', 'warn_xxx', and 'forbid_xxx' (Booleans)

        These arguments allow for fine-adjustments to be made to the
        'strict' argument, by allowing or forbidding specific
        syntaxes.

        There are many of these arguments, named by replacing the
        "xxx" with any number of possible behavior names (See the JSON
        class for more details).

        Each of these will allow (or forbid) the specific behavior,
        after the evaluation of the 'strict' argument.  For example,
        if strict=True then by also passing 'allow_comments=True' then
        comments will be allowed.  If strict=False then
        forbid_comments=True will allow everything except comments.

    Unicode decoding:
    -----------------
    The input string can be either a python string or a python unicode
    string (or a byte array in Python 3).  If it is already a unicode
    string, then it is assumed that no character set decoding is
    required.

    However, if you pass in a non-Unicode text string (a Python 2
    'str' type or a Python 3 'bytes' or 'bytearray') then an attempt
    will be made to auto-detect and decode the character encoding.
    This will be successful if the input was encoded in any of UTF-8,
    UTF-16 (BE or LE), or UTF-32 (BE or LE), and of course plain ASCII
    works too.

    Note though that if you know the character encoding, then you
    should convert to a unicode string yourself, or pass it the name
    of the 'encoding' to avoid the guessing made by the auto
    detection, as with

        python_object = demjson.decode( input_bytes, encoding='utf8' )

    Callback hooks:
    ---------------
    You may supply callback hooks by using the hook name as the
    named argument, such as:

        decode_float=decimal.Decimal

    See the hooks documentation on the JSON.set_hook() method.
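    The skip-on-a-case-by-case-basis behavior of hooks can be
    sketched outside of demjson.  SkipHook and apply_hook below are
    hypothetical stand-ins for demjson.JSONSkipHook and the engine's
    internal hook dispatch, shown only to illustrate the protocol:

```python
class SkipHook(Exception):
    """Stand-in for demjson.JSONSkipHook: raised by a hook to decline."""
    pass

def apply_hook(hook, value):
    # Call the hook; if it raises SkipHook, keep the original value.
    try:
        return hook(value)
    except SkipHook:
        return value

def float_as_string_hook(s):
    # Only handle strings that look like numbers; skip everything else.
    try:
        return float(s)
    except ValueError:
        raise SkipHook

print(apply_hook(float_as_string_hook, "3.5"))   # 3.5
print(apply_hook(float_as_string_hook, "abc"))   # abc
```

    A real hook works the same way: return the transformed value, or
    raise demjson.JSONSkipHook to leave the value untouched.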
""" import sys # Initialize the JSON object return_errors = False return_stats = False write_errors = False filename_for_errors = None write_stats = False kwargs = kwargs.copy() todel = [] for kw,val in kwargs.items(): if kw == "return_errors": return_errors = bool(val) todel.append(kw) elif kw == 'return_stats': return_stats = bool(val) todel.append(kw) elif kw == "write_errors": write_errors = val todel.append(kw) elif kw == "filename_for_errors": filename_for_errors = val todel.append(kw) elif kw == "write_stats": write_stats = val todel.append(kw) # next keyword argument for kw in todel: del kwargs[kw] j = JSON( **kwargs ) # Now do the actual JSON decoding result = j.decode( txt, encoding=encoding, return_errors=(return_errors or write_errors), return_stats=(return_stats or write_stats) ) if write_errors: import sys if write_errors is True: write_errors = sys.stderr for err in result.errors: write_errors.write( err.pretty_description(filename=filename_for_errors) + "\n" ) if write_stats: import sys if write_stats is True: write_stats = sys.stderr if result.stats: write_stats.write( "%s----- Begin JSON statistics\n" % filename_for_errors ) write_stats.write( result.stats.pretty_description( prefix=" | " ) ) write_stats.write( "%s----- End of JSON statistics\n" % filename_for_errors ) return result def encode_to_file( filename, obj, encoding='utf-8', overwrite=False, **kwargs ): """Encodes a Python object into JSON and writes into the given file. If no encoding is given, then UTF-8 will be used. See the encode() function for a description of other possible options. If the file already exists and the 'overwrite' option is not set to True, then the existing file will not be overwritten. 
    (Note, there is a subtle race condition in the check so there are
    possible conditions in which a file may be overwritten)

    """
    import os, errno
    if not encoding:
        encoding = 'utf-8'

    if not isinstance(filename,basestring) or not filename:
        raise TypeError("Expected a file name")

    if not overwrite and os.path.exists(filename):
        raise IOError(errno.EEXIST, "File exists: %r" % filename)

    jsondata = encode( obj, encoding=encoding, **kwargs )

    try:
        fp = open(filename, 'wb')
    except Exception:
        raise
    else:
        try:
            fp.write( jsondata )
        finally:
            fp.close()


def decode_file( filename, encoding=None, **kwargs ):
    """Decodes JSON found in the given file.

    See the decode() function for a description of other possible options.

    """
    if isinstance(filename,basestring):
        try:
            fp = open(filename, 'rb')
        except Exception:
            raise
        else:
            try:
                jsondata = fp.read()
            finally:
                fp.close()
    else:
        raise TypeError("Expected a file name")
    return decode( jsondata, encoding=encoding, **kwargs )


# ======================================================================

class jsonlint(object):
    """This class contains most of the logic for the "jsonlint" command.

    You generally create an instance of this class, to define the
    program's environment, and then call the main() method.  A simple
    wrapper to turn this into a script might be:

        import sys, demjson
        if __name__ == '__main__':
            lint = demjson.jsonlint( sys.argv[0] )
            sys.exit( lint.main( sys.argv[1:] ) )

    """
    _jsonlint_usage = r"""Usage: %(program_name)s [<options> ...] [--] inputfile.json ...

With no input filename, or "-", it will read from standard input.

The return status will be 0 if the file is conforming JSON (per the
RFC 7159 specification), or non-zero otherwise.
GENERAL OPTIONS:

 -v | --verbose    Show details of lint checking
 -q | --quiet      Don't show any output (except for reformatting)

STRICTNESS OPTIONS (WARNINGS AND ERRORS):

 -W | --tolerant   Be tolerant, but warn about non-conformance (default)
 -s | --strict     Be strict in what is considered conforming JSON
 -S | --nonstrict  Be tolerant in what is considered conforming JSON

 --allow=...     -\
 --warn=...       |-- These options let you pick specific behaviors.
 --forbid=...    -/      Use --help-behaviors for more

STATISTICS OPTIONS:

 --stats       Show statistics about JSON document

REFORMATTING OPTIONS:

 -f | --format           Reformat the JSON text (if conforming) to stdout
 -F | --format-compactly
        Reformat the JSON similar to -f, but do so compactly by
        removing all unnecessary whitespace

 -o filename | --output filename
        The filename to which reformatted JSON is to be written.
        Without this option the standard output is used.

 --[no-]keep-format   Try to preserve numeric radix, e.g., hex, octal, etc.
 --html-safe          Escape characters that are not safe to embed in HTML/XML.

 --sort <kind>     How to sort object/dictionary keys, <kind> is one of:
%(sort_options_help)s

 --indent tabs | <nnn>    Number of spaces to use per indentation level,
                          or use tab characters if "tabs" given.

UNICODE OPTIONS:

 -e codec | --encoding=codec     Set both input and output encodings
 --input-encoding=codec          Set the input encoding
 --output-encoding=codec         Set the output encoding

 These options set the character encoding codec (e.g., "ascii",
 "utf-8", "utf-16").  The -e will set both the input and output
 encodings to the same thing.  The output encoding is used when
 reformatting with the -f or -F options.

 Unless set, the input encoding is guessed and the output
 encoding will be "utf-8".

OTHER OPTIONS:

 --recursion-limit=nnn     Set the Python recursion limit to number

 --leading-zero-radix=8|10 The radix to use for numbers with leading
                           zeros. 8=octal, 10=decimal.

REFORMATTING / PRETTY-PRINTING:

    When reformatting JSON with -f or -F, output is only produced if
    the input passed validation.
    By default the reformatted JSON will be written to standard
    output, unless the -o option was given.

    The default output codec is UTF-8, unless an encoding option is
    provided.  Any Unicode characters will be output as literal
    characters if the encoding permits, otherwise they will be
    \u-escaped.  You can use "--output-encoding ascii" to force all
    Unicode characters to be escaped.

MORE INFORMATION:

    Use '%(program_name)s --version [-v]' to see versioning information.
    Use '%(program_name)s --copyright' to see author and copyright details.
    Use '%(program_name)s [-W|-s|-S] --help-behaviors' for help on
    specific checks.

    %(program_name)s is distributed as part of the "demjson" Python module.
    See %(homepage)s
"""
    SUCCESS_FAIL = 'E'
    SUCCESS_WARNING = 'W'
    SUCCESS_OK = 'OK'

    def __init__(self, program_name='jsonlint',
                 stdin=None, stdout=None, stderr=None ):
        """Create an instance of a "jsonlint" program.

        You can optionally pass options to define the program's
        environment:

          * program_name  - the name of the program, usually sys.argv[0]
          * stdin   - the file object to use for input, default sys.stdin
          * stdout  - the file object to use for output, default sys.stdout
          * stderr  - the file object to use for error output, default sys.stderr

        After creating an instance, you typically call the main() method.

        """
        import os, sys
        self.program_path = program_name
        self.program_name = os.path.basename(program_name)
        if stdin:
            self.stdin = stdin
        else:
            self.stdin = sys.stdin
        if stdout:
            self.stdout = stdout
        else:
            self.stdout = sys.stdout
        if stderr:
            self.stderr = stderr
        else:
            self.stderr = sys.stderr

    @property
    def usage(self):
        """A multi-line string containing the program usage instructions.
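        The usage text above is an ordinary %-style template expanded
        with a dictionary, which is the same mechanism this property
        uses.  A minimal standalone sketch (the homepage URL here is a
        placeholder, not the project's real one):

```python
# The usage string is filled in with Python's %-style dict formatting.
template = ("Usage: %(program_name)s [<options> ...] [--] inputfile.json\n"
            "See %(homepage)s\n")
text = template % {'program_name': 'jsonlint',
                   'homepage': 'https://example.com/demjson'}  # placeholder
print(text)
```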
""" sorthelp = '\n'.join([ " %12s - %s" % (sm, sd) for sm, sd in sorted(sorting_methods.items()) if sm != SORT_NONE ]) return self._jsonlint_usage % {'program_name':self.program_name, 'homepage':__homepage__, 'sort_options_help': sorthelp } def _lintcheck_data( self, jsondata, verbose_fp=None, reformat=False, show_stats=False, input_encoding=None, output_encoding=None, escape_unicode=True, pfx='', jsonopts=None ): global decode, encode success = self.SUCCESS_FAIL reformatted = None if show_stats: stats_fp = verbose_fp else: stats_fp = None try: results = decode( jsondata, encoding=input_encoding, return_errors=True, return_stats=True, write_errors=verbose_fp, write_stats=stats_fp, filename_for_errors=pfx, json_options=jsonopts ) except JSONError, err: success = self.SUCCESS_FAIL if verbose_fp: verbose_fp.write('%s%s\n' % (pfx, err.pretty_description()) ) except Exception, err: success = self.SUCCESS_FAIL if verbose_fp: verbose_fp.write('%s%s\n' % (pfx, str(err) )) else: errors = [err for err in results.errors if err.severity in ('fatal','error')] warnings = [err for err in results.errors if err.severity in ('warning',)] if errors: success = self.SUCCESS_FAIL elif warnings: success = self.SUCCESS_WARNING else: success = self.SUCCESS_OK if reformat: encopts = jsonopts.copy() encopts.strictness = STRICTNESS_TOLERANT if reformat == 'compactly': encopts.encode_compactly = True else: encopts.encode_compactly = False reformatted = encode(results.object, encoding=output_encoding, json_options=encopts) return (success, reformatted) def _lintcheck( self, filename, output_filename, verbose=False, reformat=False, show_stats=False, input_encoding=None, output_encoding=None, escape_unicode=True, jsonopts=None ): import sys verbose_fp = None if not filename or filename == "-": pfx = ': ' jsondata = self.stdin.read() if verbose: verbose_fp = self.stderr else: pfx = '%s: ' % filename try: fp = open( filename, 'rb' ) jsondata = fp.read() fp.close() except IOError, err: 
self.stderr.write('%s: %s\n' % (pfx, str(err)) ) return self.SUCCESS_FAIL if verbose: verbose_fp = self.stdout success, reformatted = self._lintcheck_data( jsondata, verbose_fp=verbose_fp, reformat=reformat, show_stats=show_stats, input_encoding=input_encoding, output_encoding=output_encoding, pfx=pfx, jsonopts=jsonopts ) if success != self.SUCCESS_FAIL and reformat: if output_filename: try: fp = open( output_filename, 'wb' ) fp.write( reformatted ) except IOError, err: self.stderr.write('%s: %s\n' % (pfx, str(err)) ) success = False else: if hasattr(sys.stdout,'buffer'): # To write binary data rather than strings self.stdout.buffer.write( reformatted ) else: self.stdout.write( reformatted ) elif success == self.SUCCESS_OK and verbose_fp: verbose_fp.write('%sok\n' % pfx) elif success == self.SUCCESS_WARNING and verbose_fp: verbose_fp.write('%sok, with warnings\n' % pfx) elif verbose_fp: verbose_fp.write("%shas errors\n" % pfx) return success def main( self, argv ): """The main routine for program "jsonlint". Should be called with sys.argv[1:] as its sole argument. Note sys.argv[0] which normally contains the program name should not be passed to main(); instead this class itself is initialized with sys.argv[0]. Use "--help" for usage syntax, or consult the 'usage' member. 
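        main() drives its option handling through the standard getopt
        module.  The following standalone sketch shows the pattern
        with a simplified subset of jsonlint's real options (the
        parse() helper is illustrative, not part of demjson):

```python
import getopt

def parse(argv):
    # Recognize a small subset of jsonlint's options, for illustration.
    opts, args = getopt.getopt(argv, 'vqf',
                               ['verbose', 'quiet', 'format', 'output='])
    settings = {'verbose': False, 'reformat': False, 'output': None}
    for opt, val in opts:
        if opt in ('-v', '--verbose'):
            settings['verbose'] = True
        elif opt in ('-q', '--quiet'):
            settings['verbose'] = False
        elif opt in ('-f', '--format'):
            settings['reformat'] = True
        elif opt == '--output':
            settings['output'] = val
    return settings, args

settings, files = parse(['-v', '--output', 'out.json', 'in.json'])
print(settings['verbose'], settings['output'], files)
```

        As in main(), getopt separates recognized options (with their
        values) from the remaining positional filename arguments.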
""" import sys, os, getopt, unicodedata recursion_limit = None success = True verbose = 'auto' # one of 'auto', True, or False reformat = False show_stats = False output_filename = None input_encoding = None output_encoding = 'utf-8' kwoptions = { # Will be used to initialize json_options "sort_keys": SORT_SMART, "strict": STRICTNESS_WARN, "keep_format": True, "decimal_context": 100, } try: opts, args = getopt.getopt( argv, 'vqfFe:o:sSW', ['verbose','quiet', 'format','format-compactly', 'stats', 'output', 'strict','nonstrict','warn', 'html-safe','xml-safe', 'encoding=', 'input-encoding=','output-encoding=', 'sort=', 'recursion-limit=', 'leading-zero-radix=', 'keep-format', 'no-keep-format', 'indent=', 'indent-amount=', 'indent-limit=', 'indent-tab-width=', 'max-items-per-line=', 'allow=', 'warn=', 'forbid=', 'deny=', 'help', 'help-behaviors', 'version','copyright'] ) except getopt.GetoptError, err: self.stderr.write( "Error: %s. Use \"%s --help\" for usage information.\n" \ % (err.msg, self.program_name) ) return 1 # Set verbose before looking at any other options for opt, val in opts: if opt in ('-v', '--verbose'): verbose=True # Process all options for opt, val in opts: if opt in ('-h', '--help'): self.stdout.write( self.usage ) return 0 elif opt == '--help-behaviors': self.stdout.write(""" BEHAVIOR OPTIONS: These set of options let you control which checks are to be performed. 
They may be turned on or off by listing them as arguments to one of the options --allow, --warn, or --forbid ; for example: %(program_name)s --allow comments,hex-numbers --forbid duplicate-keys """ % {"program_name":self.program_name}) self.stdout.write("The default shown is for %s mode\n\n" % kwoptions['strict']) self.stdout.write('%-7s %-25s %s\n' % ("Default", "Behavior_name", "Description")) self.stdout.write('-'*7 + ' ' + '-'*25 + ' ' + '-'*50 + '\n') j = json_options( **kwoptions ) for behavior in sorted(j.all_behaviors): v = j.get_behavior( behavior ) desc = j.describe_behavior( behavior ) self.stdout.write('%-7s %-25s %s\n' % (v.lower(), behavior.replace('_','-'), desc)) return 0 elif opt == '--version': self.stdout.write( '%s (%s) version %s (%s)\n' \ % (self.program_name, __name__, __version__, __date__) ) if verbose == True: self.stdout.write( 'demjson from %r\n' % (__file__,) ) if verbose == True: self.stdout.write( 'Python version: %s\n' % (sys.version.replace('\n',' '),) ) self.stdout.write( 'This python implementation supports:\n' ) self.stdout.write( ' * Max unicode: U+%X\n' % (sys.maxunicode,) ) self.stdout.write( ' * Unicode version: %s\n' % (unicodedata.unidata_version,) ) self.stdout.write( ' * Floating-point significant digits: %d\n' % (float_sigdigits,) ) self.stdout.write( ' * Floating-point max 10^exponent: %d\n' % (float_maxexp,) ) if str(0.0)==str(-0.0): szero = 'No' else: szero = 'Yes' self.stdout.write( ' * Floating-point has signed-zeros: %s\n' % (szero,) ) if decimal: has_dec = 'Yes' else: has_dec = 'No' self.stdout.write( ' * Decimal (bigfloat) support: %s\n' % (has_dec,) ) return 0 elif opt == '--copyright': self.stdout.write( "%s is distributed as part of the \"demjson\" python package.\n" \ % (self.program_name,) ) self.stdout.write( "See %s\n\n\n" % (__homepage__,) ) self.stdout.write( __credits__ ) return 0 elif opt in ('-v', '--verbose'): verbose = True elif opt in ('-q', '--quiet'): verbose = False elif opt in ('-s', 
                     '--strict'):
                kwoptions['strict'] = STRICTNESS_STRICT
                kwoptions['keep_format'] = False
            elif opt in ('-S', '--nonstrict'):
                kwoptions['strict'] = STRICTNESS_TOLERANT
            elif opt in ('-W', '--tolerant'):
                kwoptions['strict'] = STRICTNESS_WARN
            elif opt in ('-f', '--format'):
                reformat = True
                kwoptions['encode_compactly'] = False
            elif opt in ('-F', '--format-compactly'):
                kwoptions['encode_compactly'] = True
                reformat = 'compactly'
            elif opt in ('--stats',):
                show_stats = True
            elif opt in ('-o', '--output'):
                output_filename = val
            elif opt in ('-e','--encoding'):
                input_encoding = val
                output_encoding = val
                escape_unicode = False
            elif opt in ('--output-encoding',):
                output_encoding = val
                escape_unicode = False
            elif opt in ('--input-encoding',):
                input_encoding = val
            elif opt in ('--html-safe','--xml-safe'):
                kwoptions['html_safe'] = True
            elif opt in ('--allow','--warn','--forbid'):
                action = opt[2:]
                if action in kwoptions:
                    kwoptions[action] += "," + val
                else:
                    kwoptions[action] = val
            elif opt in ('--keep-format',):
                kwoptions['keep_format'] = True
            elif opt in ('--no-keep-format',):
                kwoptions['keep_format'] = False
            elif opt == '--leading-zero-radix':
                kwoptions['leading_zero_radix'] = val
            elif opt in ('--indent', '--indent-amount'):
                if val in ('tab','tabs'):
                    kwoptions['indent_amount'] = 8
                    kwoptions['indent_tab_width'] = 8
                else:
                    try:
                        kwoptions['indent_amount'] = int(val)
                    except ValueError:
                        self.stderr.write("Indentation amount must be a number\n")
                        return 1
            elif opt == '--indent-tab-width':
                try:
                    kwoptions['indent_tab_width'] = int(val)
                except ValueError:
                    self.stderr.write("Indentation tab width must be a number\n")
                    return 1
            elif opt == '--max-items-per-line':
                try:
                    kwoptions['max_items_per_line'] = int(val)
                except ValueError:
                    self.stderr.write("Max items per line must be a number\n")
                    return 1
            elif opt == '--sort':
                val = val.lower()
                if val == 'alpha':
                    kwoptions['sort_keys'] = SORT_ALPHA
                elif val == 'alpha_ci':
                    kwoptions['sort_keys'] = SORT_ALPHA_CI
                elif val == 'preserve':
                    kwoptions['sort_keys'] = SORT_PRESERVE
                else:
                    kwoptions['sort_keys'] = SORT_SMART
            elif opt == '--recursion-limit':
                try:
                    recursion_limit = int(val)
                except ValueError:
                    self.stderr.write("Recursion limit must be a number: %r\n" % val)
                    return 1
                else:
                    max_limit = 100000
                    old_limit = sys.getrecursionlimit()
                    if recursion_limit > max_limit:
                        self.stderr.write("Recursion limit must be a number between %d and %d\n" % (old_limit,max_limit))
                        return 1
                    elif recursion_limit > old_limit:
                        sys.setrecursionlimit( recursion_limit )
            else:
                self.stderr.write('Unknown option %r\n' % opt)
                return 1

        # Make the JSON options
        kwoptions['decimal_context'] = 100
        jsonopts = json_options( **kwoptions )

        # Now decode each file...
        if not args:
            args = [None]
        for fn in args:
            try:
                rc = self._lintcheck( fn, output_filename=output_filename,
                                      verbose=verbose,
                                      reformat=reformat,
                                      show_stats=show_stats,
                                      input_encoding=input_encoding,
                                      output_encoding=output_encoding,
                                      jsonopts=jsonopts )
                if rc != self.SUCCESS_OK:
                    # Warnings or errors should result in failure.  If
                    # checking multiple files, do not change a
                    # previous error back to ok.
                    success = False
            except KeyboardInterrupt, err:
                sys.stderr.write("\njsonlint interrupted!\n")
                sys.exit(1)

        if not success:
            return 1
        return 0

# end file


# ---- demjson-2.2.4/test/test_demjson.py ----

#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""This module tests demjson.py using unittest.

NOTE ON PYTHON 3: If running in Python 3, you must transform this
test script with "2to3" first.

"""
import sys, os, time
import unittest
import string
import unicodedata
import codecs
import collections
import datetime

# Force PYTHONPATH to head of sys.path, as the easy_install (egg files) will
# have rudely forced itself ahead of PYTHONPATH.
for pos, name in enumerate(os.environ.get('PYTHONPATH','').split(os.pathsep)): if os.path.isdir(name): sys.path.insert(pos, name) import demjson # ------------------------------ # Python version-specific stuff... is_python3 = False is_python27_plus = False try: is_python3 = (sys.version_info.major >= 3) is_python27_plus = (sys.version_info.major > 2 or (sys.version_info.major==2 and sys.version_info.minor >= 7)) except AttributeError: is_python3 = (sys.version_info[0] >= 3) is_python27_plus = (sys.version_info[0] > 2 or (sys.version_info[0]==2 and sys.version_info[1] >= 7)) is_wide_python = (sys.maxunicode > 0xFFFF) try: import decimal except ImportError: decimal = None # ==================== if hasattr(unittest, 'skipUnless'): def skipUnlessPython3(method): return unittest.skipUnless(method, is_python3) def skipUnlessPython27(method): return unittest.skipUnless(method, is_python27_plus) def skipUnlessWidePython(method): return unittest.skipUnless(method, is_wide_python) else: # Python <= 2.6 does not have skip* decorators, so # just make a dummy decorator that always passes the # test method. 
def skipUnlessPython3(method): def always_pass(self): print "\nSKIPPING TEST %s: Requires Python 3" % method.__name__ return True return always_pass def skipUnlessPython27(method): def always_pass(self): print "\nSKIPPING TEST %s: Requires Python 2.7 or greater" % method.__name__ return True return always_pass def skipUnlessWidePython(method): def always_pass(self): print "\nSKIPPING TEST %s: Requires Python with wide Unicode support (maxunicode > U+FFFF)" % method.__name__ return True return always_pass ## ------------------------------ def is_negzero( n ): if isinstance(n,float): return n == -0.0 and repr(n).startswith('-') elif decimal and isinstance(n,decimal.Decimal): return n.is_zero() and n.is_signed() else: return False def is_nan( n ): if isinstance(n,float): return n.hex() == 'nan' elif decimal and isinstance(n,decimal.Decimal): return n.is_nan() else: return False def is_infinity( n ): if isinstance(n,float): return n.hex() in ('inf', '-inf') elif decimal and isinstance(n,decimal.Decimal): return n.is_infinite() else: return False ## ------------------------------ def rawbytes( byte_list ): if is_python3: b = bytes( byte_list ) else: b = ''.join(chr(n) for n in byte_list) return b ## ------------------------------ try: import UserDict dict_mixin = UserDict.DictMixin except ImportError: # Python 3 has no UserDict. MutableMapping is close, but must # supply own __iter__() and __len__() methods. 
dict_mixin = collections.MutableMapping

# A class that behaves like a dict, but is not a subclass of dict
class LetterOrdDict(dict_mixin):
    def __init__(self, letters):
        self._letters = letters
    def __getitem__(self, key):
        try:
            if key in self._letters:
                return ord(key)
        except TypeError:
            raise KeyError('Key of wrong type: %r' % key)
        raise KeyError('No such key', key)
    def __setitem__(self, key, value):
        raise RuntimeError('read only object')
    def __delitem__(self, key):
        raise RuntimeError('read only object')
    def keys(self):
        return list(self._letters)
    def __len__(self):
        return len(self._letters)
    def __iter__(self):
        for v in self._letters:
            yield v

## ------------------------------

class rot_one(codecs.CodecInfo):
    """Dummy codec for ROT-1.

    Rotate by 1 character:  A->B, B->C, ..., Z->A
    """
    @staticmethod
    def lookup(name):
        if name.lower() in ('rot1', 'rot-1'):
            return codecs.CodecInfo(rot_one.encode, rot_one.decode, name='rot-1')
        return None
    @staticmethod
    def encode(s):
        byte_list = []
        for i, c in enumerate(s):
            if 'A' <= c <= 'Y':
                byte_list.append(ord(c) + 1)
            elif c == 'Z':
                byte_list.append(ord('A'))
            elif ord(c) <= 0x7f:
                byte_list.append(ord(c))
            else:
                raise UnicodeEncodeError('rot-1', s, i, i,
                                         "Can not encode code point U+%04X" % ord(c))
        return (rawbytes(byte_list), i + 1)
    @staticmethod
    def decode(byte_list):
        if is_python3:
            byte_values = byte_list
        else:
            byte_values = [ord(n) for n in byte_list]
        chars = []
        for i, b in enumerate(byte_values):
            if ord('B') <= b <= ord('Z'):
                chars.append(unichr(b - 1))
            elif b == ord('A'):
                chars.append(u'Z')
            elif b <= 0x7f:
                chars.append(unichr(b))
            else:
                raise UnicodeDecodeError('rot-1', byte_list, i, i,
                                         "Can not decode byte value 0x%02x" % b)
        return (u''.join(chars), i + 1)

## ------------------------------

class no_curly_braces(codecs.CodecInfo):
    """Degenerate codec that does not have curly braces.
    """
    @staticmethod
    def lookup(name):
        if name.lower() == 'degenerate':
            return codecs.CodecInfo(no_curly_braces.encode, no_curly_braces.decode, name='degenerate')
        return None
    @staticmethod
    def encode(s):
        byte_list = []
        for i, c in enumerate(s):
            if c == '{' or c == '}':
                raise UnicodeEncodeError('degenerate', s, i, i, "Can not encode curly braces")
            elif ord(c) <= 0x7f:
                byte_list.append(ord(c))
            else:
                raise UnicodeEncodeError('degenerate', s, i, i,
                                         "Can not encode code point U+%04X" % ord(c))
        return (rawbytes(byte_list), i + 1)
    @staticmethod
    def decode(byte_list):
        if is_python3:
            byte_values = byte_list
        else:
            byte_values = [ord(n) for n in byte_list]
        chars = []
        for i, b in enumerate(byte_values):
            if b > 0x7f or b == ord('{') or b == ord('}'):
                raise UnicodeDecodeError('degenerate', byte_list, i, i,
                                         "Can not decode byte value 0x%02x" % b)
            else:
                chars.append(unichr(b))
        return (u''.join(chars), i + 1)

## ------------------------------

if is_python3:
    def hexencode_bytes(bytelist):
        return ''.join(['%02x' % n for n in bytelist])

## ============================================================

class DemjsonTest(unittest.TestCase):
    """This class contains test cases for demjson.
    """
    def testConstants(self):
        self.assertTrue(isinstance(demjson.nan, float), "Missing nan constant")
        self.assertTrue(isinstance(demjson.inf, float), "Missing inf constant")
        self.assertTrue(isinstance(demjson.neginf, float), "Missing neginf constant")
        self.assertTrue(hasattr(demjson, 'undefined'), "Missing undefined constant")

    def testDecodeKeywords(self):
        self.assertEqual(demjson.decode('true'), True)
        self.assertEqual(demjson.decode('false'), False)
        self.assertEqual(demjson.decode('null'), None)
        self.assertEqual(demjson.decode('undefined'), demjson.undefined)

    def testEncodeKeywords(self):
        self.assertEqual(demjson.encode(None), 'null')
        self.assertEqual(demjson.encode(True), 'true')
        self.assertEqual(demjson.encode(False), 'false')
        self.assertEqual(demjson.encode(demjson.undefined), 'undefined')

    def testDecodeNumber(self):
        self.assertEqual(demjson.decode('0'), 0)
        self.assertEqual(demjson.decode('12345'), 12345)
        self.assertEqual(demjson.decode('-12345'), -12345)
        self.assertEqual(demjson.decode('1e6'), 1000000)
        self.assertEqual(demjson.decode('1.5'), 1.5)
        self.assertEqual(demjson.decode('-1.5'), -1.5)
        self.assertEqual(demjson.decode('3e10'), 30000000000)
        self.assertEqual(demjson.decode('3E10'), 30000000000)
        self.assertEqual(demjson.decode('3e+10'), 30000000000)
        self.assertEqual(demjson.decode('3E+10'), 30000000000)
        self.assertEqual(demjson.decode('3E+00010'), 30000000000)
        self.assertEqual(demjson.decode('1000e-2'), 10)
        self.assertEqual(demjson.decode('1.2E+3'), 1200)
        self.assertEqual(demjson.decode('3.5e+8'), 350000000)
        self.assertEqual(demjson.decode('-3.5e+8'), -350000000)
        self.assertAlmostEqual(demjson.decode('1.23456e+078'), 1.23456e78)
        self.assertAlmostEqual(demjson.decode('1.23456e-078'), 1.23456e-78)
        self.assertAlmostEqual(demjson.decode('-1.23456e+078'), -1.23456e78)
        self.assertAlmostEqual(demjson.decode('-1.23456e-078'), -1.23456e-78)

    def testDecodeStrictNumber(self):
        """Make sure that strict mode is picky about numbers."""
        for badnum in ['+1', '.5', '1.', '01', '0x1', '1e']:
            self.assertRaises(demjson.JSONDecodeError, demjson.decode,
                              badnum, strict=True, allow_any_type_at_start=True)

    def testDecodeHexNumbers(self):
        self.assertEqual(demjson.decode('0x0', allow_hex_numbers=True), 0)
        self.assertEqual(demjson.decode('0X0', allow_hex_numbers=True), 0)
        self.assertEqual(demjson.decode('0x0000', allow_hex_numbers=True), 0)
        self.assertEqual(demjson.decode('0x8', allow_hex_numbers=True), 8)
        self.assertEqual(demjson.decode('0x1f', allow_hex_numbers=True), 31)
        self.assertEqual(demjson.decode('0x1F', allow_hex_numbers=True), 31)
        self.assertEqual(demjson.decode('0xff', allow_hex_numbers=True), 255)
        self.assertEqual(demjson.decode('0xffff', allow_hex_numbers=True), 65535)
        self.assertEqual(demjson.decode('0xffffffff', allow_hex_numbers=True), 4294967295)
        self.assertEqual(demjson.decode('0x0F1a7Cb', allow_hex_numbers=True), 0xF1A7CB)
        self.assertEqual(demjson.decode('0X0F1a7Cb', allow_hex_numbers=True), 0xF1A7CB)
        self.assertEqual(demjson.decode('0x0000000000000000000000000000000000123', allow_hex_numbers=True), 0x123)
        self.assertEqual(demjson.decode('0x000000000000000000000000000000000012300', allow_hex_numbers=True), 0x12300)
        self.assertTrue(is_negzero(demjson.decode('-0x0', allow_hex_numbers=True)),
                        "Decoding negative zero hex numbers should give -0.0")
        self.assertEqual(demjson.decode('-0x1', allow_hex_numbers=True), -1)
        self.assertEqual(demjson.decode('-0x000Fc854ab', allow_hex_numbers=True), -0xFC854AB)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '0x', allow_hex_numbers=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '0xG', allow_hex_numbers=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '0x-3', allow_hex_numbers=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '0x3G', allow_hex_numbers=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '0x0x3', allow_hex_numbers=True)

    def testDecodeLargeIntegers(self):
        self.assertEqual(demjson.decode('9876543210123456789'), 9876543210123456789)
        self.assertEqual(demjson.decode('-9876543210123456789'), -9876543210123456789)
        self.assertEqual(demjson.decode('0xfedcba9876543210ABCDEF', allow_hex_numbers=True),
                         308109520888805757320678895)
        self.assertEqual(demjson.decode('-0xfedcba9876543210ABCDEF', allow_hex_numbers=True),
                         -308109520888805757320678895)
        self.assertEqual(demjson.decode('0177334565141662503102052746757',
                                        allow_leading_zeros=True, leading_zero_radix=8),
                         308109520888805757320678895)
        self.assertEqual(demjson.decode('-0177334565141662503102052746757',
                                        allow_leading_zeros=True, leading_zero_radix=8),
                         -308109520888805757320678895)

    def testDecodeOctalNumbers(self):
        self.assertEqual(demjson.decode('017', allow_leading_zeros=True, leading_zero_radix=8), 15)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '018',
                          allow_leading_zeros=True, leading_zero_radix=8)
        self.assertEqual(demjson.decode('00017', allow_leading_zeros=True, leading_zero_radix=8), 15)
        self.assertEqual(demjson.decode('-017', allow_leading_zeros=True, leading_zero_radix=8), -15)
        self.assertEqual(demjson.decode('00', allow_leading_zeros=True, leading_zero_radix=8), 0)
        self.assertTrue(is_negzero(demjson.decode('-00', allow_leading_zeros=True, leading_zero_radix=8)),
                        "Decoding negative zero octal number should give -0.0")

    def testDecodeNewOctalNumbers(self):
        self.assertEqual(demjson.decode('0o0'), 0)
        self.assertEqual(demjson.decode('0O0'), 0)
        self.assertEqual(demjson.decode('0o000'), 0)
        self.assertEqual(demjson.decode('0o1'), 1)
        self.assertEqual(demjson.decode('0o7'), 7)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '0o18')
        self.assertEqual(demjson.decode('0o17'), 15)
        self.assertEqual(demjson.decode('-0o17'), -15)
        self.assertEqual(demjson.decode('0o4036517'), 1064271)
        self.assertEqual(demjson.decode('0O4036517'), 1064271)
        self.assertEqual(demjson.decode('0o000000000000000000000000000000000000000017'), 15)
        self.assertEqual(demjson.decode('0o00000000000000000000000000000000000000001700'), 960)
        self.assertTrue(is_negzero(demjson.decode('-0o0')),
                        "Decoding negative zero octal number should give -0.0")
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '0o')
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '0oA')
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '0o-3')
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '0o3A')
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '0o0o3')

    def testDecodeBinaryNumbers(self):
        self.assertEqual(demjson.decode('0b0'), 0)
        self.assertEqual(demjson.decode('0B0'), 0)
        self.assertEqual(demjson.decode('0b000'), 0)
        self.assertEqual(demjson.decode('0b1'), 1)
        self.assertEqual(demjson.decode('0b01'), 1)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '0b2')
        self.assertEqual(demjson.decode('0b1101'), 13)
        self.assertEqual(demjson.decode('-0b1101'), -13)
        self.assertEqual(demjson.decode('0b11010001101111100010101101011'), 439862635)
        self.assertEqual(demjson.decode('0B11010001101111100010101101011'), 439862635)
        self.assertEqual(demjson.decode('0b00000000000000000000000000000000000000001101'), 13)
        self.assertEqual(demjson.decode('0b0000000000000000000000000000000000000000110100'), 52)
        self.assertTrue(is_negzero(demjson.decode('-0b0')),
                        "Decoding negative zero binary number should give -0.0")
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '0b')
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '0bA')
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '0b-1')
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '0b1A')
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '0b0b1')

    def testDecodeNegativeZero(self):
        """Makes sure 0 and -0 are distinct.

        This is not a JSON requirement, but is required by ECMAScript.
        """
        self.assertEqual(demjson.decode('-0.0'), -0.0)
        self.assertEqual(demjson.decode('0.0'), 0.0)
        self.assertTrue(demjson.decode('0.0') is not demjson.decode('-0.0'),
                        'Numbers 0.0 and -0.0 are not distinct')
        self.assertTrue(demjson.decode('0') is not demjson.decode('-0'),
                        'Numbers 0 and -0 are not distinct')

    def testDecodeNaN(self):
        """Checks parsing of JavaScript NaN.
        """
        # Have to use is_nan(), since by definition nan != nan
        self.assertTrue(isinstance(demjson.decode('NaN', allow_non_numbers=True), float))
        self.assertTrue(is_nan(demjson.decode('NaN', allow_non_numbers=True)))
        self.assertTrue(is_nan(demjson.decode('+NaN', allow_non_numbers=True)))
        self.assertTrue(is_nan(demjson.decode('-NaN', allow_non_numbers=True)))
        if decimal:
            self.assertTrue(isinstance(demjson.decode('NaN', allow_non_numbers=True,
                                                      float_type=demjson.NUMBER_DECIMAL),
                                       decimal.Decimal))
            self.assertTrue(is_nan(demjson.decode('NaN', allow_non_numbers=True,
                                                  float_type=demjson.NUMBER_DECIMAL)))
            self.assertTrue(is_nan(demjson.decode('+NaN', allow_non_numbers=True,
                                                  float_type=demjson.NUMBER_DECIMAL)))
            self.assertTrue(is_nan(demjson.decode('-NaN', allow_non_numbers=True,
                                                  float_type=demjson.NUMBER_DECIMAL)))
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, 'NaN', allow_non_numbers=False)

    def testDecodeInfinite(self):
        """Checks parsing of JavaScript Infinity.
        """
        # Use the is_infinity() helper to check the results
        self.assertTrue(isinstance(demjson.decode('Infinity', allow_non_numbers=True), float))
        self.assertTrue(is_infinity(demjson.decode('Infinity', allow_non_numbers=True)))
        self.assertTrue(is_infinity(demjson.decode('+Infinity', allow_non_numbers=True)))
        self.assertTrue(is_infinity(demjson.decode('-Infinity', allow_non_numbers=True)))
        self.assertTrue(demjson.decode('-Infinity', allow_non_numbers=True) < 0)
        if decimal:
            self.assertTrue(isinstance(demjson.decode('Infinity', allow_non_numbers=True,
                                                      float_type=demjson.NUMBER_DECIMAL),
                                       decimal.Decimal))
            self.assertTrue(is_infinity(demjson.decode('Infinity', allow_non_numbers=True,
                                                       float_type=demjson.NUMBER_DECIMAL)))
            self.assertTrue(is_infinity(demjson.decode('+Infinity', allow_non_numbers=True,
                                                       float_type=demjson.NUMBER_DECIMAL)))
            self.assertTrue(is_infinity(demjson.decode('-Infinity', allow_non_numbers=True,
                                                       float_type=demjson.NUMBER_DECIMAL)))
            self.assertTrue(demjson.decode('-Infinity', allow_non_numbers=True,
                                           float_type=demjson.NUMBER_DECIMAL).is_signed())
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, 'Infinity', allow_non_numbers=False)

    def assertMatchesRegex(self, value, pattern, msg=None):
        import re
        r = re.compile('^' + pattern + '$')
        try:
            m = r.match(value)
        except TypeError:
            raise self.failureException("can't compare non-string to regex: %r" % value)
        if m is None:
            raise self.failureException(msg or '%r !~ /%s/' % (value, pattern))

    def testEncodeNumber(self):
        self.assertEqual(demjson.encode(0), '0')
        self.assertEqual(demjson.encode(12345), '12345')
        self.assertEqual(demjson.encode(-12345), '-12345')
        # Floating point numbers must be "approximately" compared to
        # allow for slight changes due to rounding errors in the
        # least significant digits.
        self.assertMatchesRegex(demjson.encode(1.5),
                                r'1.((5(000+[0-9])?)|(4999(9+[0-9])?))')
        self.assertMatchesRegex(demjson.encode(-1.5),
                                r'-1.((5(000+[0-9])?)|(4999(9+[0-9])?))')
        self.assertMatchesRegex(demjson.encode(1.2300456e78),
                                r'1.230045((6(0+[0-9])?)|(59(9+[0-9])?))[eE][+]0*78')
        self.assertMatchesRegex(demjson.encode(1.2300456e-78),
                                r'1.230045((6(0+[0-9])?)|(59(9+[0-9])?))[eE][-]0*78')
        self.assertMatchesRegex(demjson.encode(-1.2300456e78),
                                r'-1.230045((6(0+[0-9])?)|(59(9+[0-9])?))[eE][+]0*78')
        self.assertMatchesRegex(demjson.encode(-1.2300456e-78),
                                r'-1.230045((6(0+[0-9])?)|(59(9+[0-9])?))[eE][-]0*78')
        self.assertMatchesRegex(demjson.encode(0.0000043),
                                r'4.3(0[0-9]*)?[eE]-0*6')
        self.assertMatchesRegex(demjson.encode(40000000000),
                                r'(4[eE][+]?0*10)|(40000000000)',
                                'Large integer not encoded properly')

    def testEncodeNegativeZero(self):
        self.assertTrue(demjson.encode(-0.0) in ['-0', '-0.0'],
                        'Float -0.0 is not encoded as a negative zero')
        if decimal:
            self.assertTrue(demjson.encode(decimal.Decimal('-0')) in ['-0', '-0.0'],
                            'Decimal -0 is not encoded as a negative zero')

    def testJsonInt(self):
        self.assertTrue(isinstance(demjson.json_int(0), (int, long)))
        self.assertEqual(demjson.json_int(0), 0)
        self.assertEqual(demjson.json_int(555999), 555999)
        self.assertEqual(demjson.json_int(-555999), -555999)
        self.assertEqual(demjson.json_int(12131415161718191029282726), 12131415161718191029282726)
        self.assertEqual(demjson.json_int('123'), 123)
        self.assertEqual(demjson.json_int('+123'), 123)
        self.assertEqual(demjson.json_int('-123'), -123)
        self.assertEqual(demjson.json_int('123', 8), 83)
        self.assertEqual(demjson.json_int('123', 16), 291)
        self.assertEqual(demjson.json_int('110101', 2), 53)
        self.assertEqual(123, demjson.json_int(123, number_format=demjson.NUMBER_FORMAT_DECIMAL))
        self.assertEqual(123, demjson.json_int(123, number_format=demjson.NUMBER_FORMAT_HEX))
        self.assertEqual(123, demjson.json_int(123, number_format=demjson.NUMBER_FORMAT_OCTAL))
        self.assertEqual(123, demjson.json_int(123, number_format=demjson.NUMBER_FORMAT_LEGACYOCTAL))
        self.assertEqual(123, demjson.json_int(123, number_format=demjson.NUMBER_FORMAT_BINARY))
        self.assertEqual(demjson.json_int(123), demjson.json_int(123, number_format=demjson.NUMBER_FORMAT_DECIMAL))
        self.assertEqual(demjson.json_int(123), demjson.json_int(123, number_format=demjson.NUMBER_FORMAT_HEX))
        self.assertEqual(demjson.json_int(123), demjson.json_int(123, number_format=demjson.NUMBER_FORMAT_OCTAL))
        self.assertEqual(demjson.json_int(123), demjson.json_int(123, number_format=demjson.NUMBER_FORMAT_LEGACYOCTAL))
        self.assertEqual(demjson.json_int(123), demjson.json_int(123, number_format=demjson.NUMBER_FORMAT_BINARY))
        self.assertEqual(demjson.json_int(123).json_format(), '123')
        self.assertEqual(demjson.json_int(123, number_format=demjson.NUMBER_FORMAT_DECIMAL).json_format(), '123')
        self.assertEqual(demjson.json_int(123, number_format=demjson.NUMBER_FORMAT_HEX).json_format(), '0x7b')
        self.assertEqual(demjson.json_int(123, number_format=demjson.NUMBER_FORMAT_OCTAL).json_format(), '0o173')
        self.assertEqual(demjson.json_int(123, number_format=demjson.NUMBER_FORMAT_LEGACYOCTAL).json_format(), '0173')
        self.assertEqual(demjson.json_int(0, number_format=demjson.NUMBER_FORMAT_LEGACYOCTAL).json_format(), '0')
        self.assertEqual(demjson.json_int(123, number_format=demjson.NUMBER_FORMAT_BINARY).json_format(), '0b1111011')

    def testEncodeDecimalIntegers(self):
        self.assertEqual(demjson.encode(demjson.json_int(0)), '0')
        self.assertEqual(demjson.encode(demjson.json_int(123)), '123')
        self.assertEqual(demjson.encode(demjson.json_int(-123)), '-123')
        self.assertEqual(demjson.encode(demjson.json_int(12345678901234567890888)), '12345678901234567890888')

    def testEncodeHexIntegers(self):
        self.assertEqual(demjson.encode(demjson.json_int(0x0, number_format=demjson.NUMBER_FORMAT_HEX)), '0x0')
        self.assertEqual(demjson.encode(demjson.json_int(0xff, number_format=demjson.NUMBER_FORMAT_HEX)), '0xff')
        self.assertEqual(demjson.encode(demjson.json_int(-0x7f, number_format=demjson.NUMBER_FORMAT_HEX)), '-0x7f')
        self.assertEqual(demjson.encode(demjson.json_int(0x123456789abcdef, number_format=demjson.NUMBER_FORMAT_HEX)), '0x123456789abcdef')

    def testEncodeOctalIntegers(self):
        self.assertEqual(demjson.encode(demjson.json_int(0, number_format=demjson.NUMBER_FORMAT_OCTAL)), '0o0')
        self.assertEqual(demjson.encode(demjson.json_int(359, number_format=demjson.NUMBER_FORMAT_OCTAL)), '0o547')
        self.assertEqual(demjson.encode(demjson.json_int(-359, number_format=demjson.NUMBER_FORMAT_OCTAL)), '-0o547')

    def testEncodeLegacyOctalIntegers(self):
        self.assertEqual(demjson.encode(demjson.json_int(0, number_format=demjson.NUMBER_FORMAT_LEGACYOCTAL)), '0')
        self.assertEqual(demjson.encode(demjson.json_int(1, number_format=demjson.NUMBER_FORMAT_LEGACYOCTAL)), '01')
        self.assertEqual(demjson.encode(demjson.json_int(359, number_format=demjson.NUMBER_FORMAT_LEGACYOCTAL)), '0547')
        self.assertEqual(demjson.encode(demjson.json_int(-359, number_format=demjson.NUMBER_FORMAT_LEGACYOCTAL)), '-0547')

    def testIntAsFloat(self):
        self.assertEqual(demjson.decode('[0,-5,600,0xFF]', int_as_float=True),
                         [0.0, -5.0, 600.0, 255.0])
        if decimal:
            self.assertEqual(demjson.decode('[0,-5,600,0xFF]', int_as_float=True,
                                            float_type=demjson.NUMBER_DECIMAL),
                             [decimal.Decimal('0.0'), decimal.Decimal('-5.0'),
                              decimal.Decimal('600.0'), decimal.Decimal('255.0')])
            self.assertEqual([type(x) for x in demjson.decode('[0,-5,600,0xFF]', int_as_float=True,
                                                              float_type=demjson.NUMBER_DECIMAL)],
                             [decimal.Decimal, decimal.Decimal, decimal.Decimal, decimal.Decimal])

    def testKeepFormat(self):
        self.assertEqual(demjson.encode(demjson.decode('[3,03,0o3,0x3,0b11]', keep_format=True)),
                         '[3,03,0o3,0x3,0b11]')

    def testEncodeNaN(self):
        self.assertEqual(demjson.encode(demjson.nan), 'NaN')
        self.assertEqual(demjson.encode(-demjson.nan), 'NaN')
        if decimal:
            self.assertEqual(demjson.encode(decimal.Decimal('NaN')), 'NaN')
            self.assertEqual(demjson.encode(decimal.Decimal('sNaN')), 'NaN')

    def testEncodeInfinity(self):
        self.assertEqual(demjson.encode(demjson.inf), 'Infinity')
        self.assertEqual(demjson.encode(-demjson.inf), '-Infinity')
        self.assertEqual(demjson.encode(demjson.neginf), '-Infinity')
        if decimal:
            self.assertEqual(demjson.encode(decimal.Decimal('Infinity')), 'Infinity')
            self.assertEqual(demjson.encode(decimal.Decimal('-Infinity')), '-Infinity')

    def testDecodeString(self):
        self.assertEqual(demjson.decode(r'""'), '')
        self.assertEqual(demjson.decode(r'"a"'), 'a')
        self.assertEqual(demjson.decode(r'"abc def"'), 'abc def')
        self.assertEqual(demjson.decode(r'"\n\t\\\"\b\r\f"'), '\n\t\\"\b\r\f')
        self.assertEqual(demjson.decode(r'"\abc def"'), 'abc def')

    def testEncodeString(self):
        self.assertEqual(demjson.encode(''), r'""')
        self.assertEqual(demjson.encode('a'), r'"a"')
        self.assertEqual(demjson.encode('abc def'), r'"abc def"')
        self.assertEqual(demjson.encode('\n'), r'"\n"')
        self.assertEqual(demjson.encode('\n\t\r\b\f'), r'"\n\t\r\b\f"')
        self.assertEqual(demjson.encode('\n'), r'"\n"')
        self.assertEqual(demjson.encode('"'), r'"\""')
        self.assertEqual(demjson.encode('\\'), '"\\\\"')

    def testDecodeStringWithNull(self):
        self.assertEqual(demjson.decode('"\x00"', warnings=False), '\0')
        self.assertEqual(demjson.decode('"a\x00b"', warnings=False), 'a\x00b')

    def testDecodeStringUnicodeEscape(self):
        self.assertEqual(demjson.decode(r'"\u0000"', warnings=False), '\0')
        self.assertEqual(demjson.decode(r'"\u0061"'), 'a')
        self.assertEqual(demjson.decode(r'"\u2012"'), u'\u2012')
        self.assertEqual(demjson.decode(r'"\u1eDc"'), u'\u1edc')
        self.assertEqual(demjson.decode(r'"\uffff"'), u'\uffff')
        self.assertEqual(demjson.decode(r'"\u00a012"'), u'\u00a0' + '12')
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, r'"\u041"', strict=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, r'"\u041Z"', strict=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, r'"\u"', strict=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, r'"\uZ"', strict=True)

    def testEncodeStringUnicodeEscape(self):
        self.assertEqual(demjson.encode('\0', escape_unicode=True), r'"\u0000"')
        self.assertEqual(demjson.encode(u'\u00e0', escape_unicode=True), r'"\u00e0"')
        self.assertEqual(demjson.encode(u'\u2012', escape_unicode=True), r'"\u2012"')

    def testHtmlSafe(self):
        self.assertEqual(demjson.encode('<', html_safe=True), r'"\u003c"')
        self.assertEqual(demjson.encode('>', html_safe=True), r'"\u003e"')
        self.assertEqual(demjson.encode('&', html_safe=True), r'"\u0026"')
        self.assertEqual(demjson.encode('/', html_safe=True), r'"\/"')
        self.assertEqual(demjson.encode('a<b>c&d/e', html_safe=True), r'"a\u003cb\u003ec\u0026d\/e"')
        self.assertEqual(demjson.encode('a<b>c&d/e', html_safe=False), r'"a<b>c&d/e"')

    def testDecodeStringExtendedUnicodeEscape(self):
        self.assertEqual(demjson.decode(r'"\u{0041}"', allow_extended_unicode_escapes=True), u'A')
        self.assertEqual(demjson.decode(r'"\u{1aFe}"', allow_extended_unicode_escapes=True), u'\u1afe')
        self.assertEqual(demjson.decode(r'"\u{41}"', allow_extended_unicode_escapes=True), u'A')
        self.assertEqual(demjson.decode(r'"\u{1}"', allow_extended_unicode_escapes=True), u'\u0001')
        self.assertEqual(demjson.decode(r'"\u{00000000000041}"', allow_extended_unicode_escapes=True), u'A')
        self.assertEqual(demjson.decode(r'"\u{1000a}"', allow_extended_unicode_escapes=True), u'\U0001000a')
        self.assertEqual(demjson.decode(r'"\u{10ffff}"', allow_extended_unicode_escapes=True), u'\U0010FFFF')
        self.assertEqual(demjson.decode(r'"\u{0000010ffff}"', allow_extended_unicode_escapes=True), u'\U0010FFFF')
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, r'"\u{0041}"', strict=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, r'"\u{110000}"',
                          allow_extended_unicode_escapes=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, r'"\u{012g}"',
                          allow_extended_unicode_escapes=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, r'"\u{ 0041}"',
                          allow_extended_unicode_escapes=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, r'"\u{0041 }"',
                          allow_extended_unicode_escapes=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, r'"\u{0041"',
                          allow_extended_unicode_escapes=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, r'"\u{}"',
                          allow_extended_unicode_escapes=True)

    def testAutoDetectEncodingWithCustomUTF32(self):
        old_use_custom = demjson.helpers.always_use_custom_codecs
        try:
            demjson.helpers.always_use_custom_codecs = True
            self.runTestAutoDetectEncoding()
        finally:
            demjson.helpers.always_use_custom_codecs = old_use_custom

    def testAutoDetectEncodingWithBuiltinUTF32(self):
        old_use_custom = demjson.helpers.always_use_custom_codecs
        try:
            demjson.helpers.always_use_custom_codecs = False
            self.runTestAutoDetectEncoding()
        finally:
            demjson.helpers.always_use_custom_codecs = old_use_custom

    def runTestAutoDetectEncoding(self):
        QT = ord('"')
        TAB = ord('\t')
        FOUR = ord('4')
        TWO = ord('2')
        # Plain byte strings, without BOM
        self.assertEqual(demjson.decode(rawbytes([0, 0, 0, FOUR])), 4)                    # UTF-32BE
        self.assertEqual(demjson.decode(rawbytes([0, 0, 0, FOUR, 0, 0, 0, TWO])), 42)
        self.assertEqual(demjson.decode(rawbytes([FOUR, 0, 0, 0])), 4)                    # UTF-32LE
        self.assertEqual(demjson.decode(rawbytes([FOUR, 0, 0, 0, TWO, 0, 0, 0])), 42)
        self.assertEqual(demjson.decode(rawbytes([0, FOUR, 0, TWO])), 42)                 # UTF-16BE
        self.assertEqual(demjson.decode(rawbytes([FOUR, 0, TWO, 0])), 42)                 # UTF-16LE
        self.assertEqual(demjson.decode(rawbytes([0, FOUR])), 4)                          # UTF-16BE
        self.assertEqual(demjson.decode(rawbytes([FOUR, 0])), 4)                          # UTF-16LE
        self.assertEqual(demjson.decode(rawbytes([FOUR, TWO])), 42)                       # UTF-8
        self.assertEqual(demjson.decode(rawbytes([TAB, FOUR, TWO])), 42)                  # UTF-8
        self.assertEqual(demjson.decode(rawbytes([FOUR])), 4)                             # UTF-8
        # With byte-order marks (BOM)
        #    UTF-32BE
        self.assertEqual(demjson.decode(rawbytes([0, 0, 0xFE, 0xFF, 0, 0, 0, FOUR])), 4)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode,
                          rawbytes([0, 0, 0xFE, 0xFF, FOUR, 0, 0, 0]))
        #    UTF-32LE
        self.assertEqual(demjson.decode(rawbytes([0xFF, 0xFE, 0, 0, FOUR, 0, 0, 0])), 4)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode,
                          rawbytes([0xFF, 0xFE, 0, 0, 0, 0, 0, FOUR]))
        #    UTF-16BE
        self.assertEqual(demjson.decode(rawbytes([0xFE, 0xFF, 0, FOUR])), 4)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode,
                          rawbytes([0xFE, 0xFF, FOUR, 0]))
        #    UTF-16LE
        self.assertEqual(demjson.decode(rawbytes([0xFF, 0xFE, FOUR, 0])), 4)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode,
                          rawbytes([0xFF, 0xFE, 0, FOUR]))
        # Invalid Unicode strings
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, rawbytes([0]))
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, rawbytes([TAB, FOUR, TWO, 0]))
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, rawbytes([FOUR, 0, 0]))
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, rawbytes([FOUR, 0, 0, TWO]))

    def testDecodeStringRawUnicode(self):
        QT = ord('"')
        self.assertEqual(demjson.decode(rawbytes([QT, 0xC3, 0xA0, QT]), encoding='utf-8'), u'\u00e0')
        self.assertEqual(demjson.decode(rawbytes([QT, 0, 0, 0, 0xE0, 0, 0, 0, QT, 0, 0, 0]),
                                        encoding='ucs4le'), u'\u00e0')
        self.assertEqual(demjson.decode(rawbytes([0, 0, 0, QT, 0, 0, 0, 0xE0, 0, 0, 0, QT]),
                                        encoding='ucs4be'), u'\u00e0')
        self.assertEqual(demjson.decode(rawbytes([0, 0, 0, QT, 0, 0, 0, 0xE0, 0, 0, 0, QT]),
                                        encoding='utf-32be'), u'\u00e0')
        self.assertEqual(demjson.decode(rawbytes([0, 0, 0xFE, 0xFF, 0, 0, 0, QT, 0, 0, 0, 0xE0, 0, 0, 0, QT]),
                                        encoding='ucs4'), u'\u00e0')

    def testEncodeStringRawUnicode(self):
        QT = ord('"')
        self.assertEqual(demjson.encode(u'\u00e0', escape_unicode=False, encoding='utf-8'),
                         rawbytes([QT, 0xC3, 0xA0, QT]))
        self.assertEqual(demjson.encode(u'\u00e0', escape_unicode=False, encoding='ucs4le'),
                         rawbytes([QT, 0, 0, 0, 0xE0, 0, 0, 0, QT, 0, 0, 0]))
        self.assertEqual(demjson.encode(u'\u00e0', escape_unicode=False, encoding='ucs4be'),
                         rawbytes([0, 0, 0, QT, 0, 0, 0, 0xE0, 0, 0, 0, QT]))
        self.assertEqual(demjson.encode(u'\u00e0', escape_unicode=False, encoding='utf-32be'),
                         rawbytes([0, 0, 0, QT, 0, 0, 0, 0xE0, 0, 0, 0, QT]))
        self.assertTrue(demjson.encode(u'\u00e0', escape_unicode=False, encoding='ucs4')
                        in [rawbytes([0, 0, 0xFE, 0xFF, 0, 0, 0, QT, 0, 0, 0, 0xE0, 0, 0, 0, QT]),
                            rawbytes([0xFF, 0xFE, 0, 0, QT, 0, 0, 0, 0xE0, 0, 0, 0, QT, 0, 0, 0])])

    def testEncodeStringWithSpecials(self):
        # Make sure that certain characters are always \u-encoded even if the
        # output encoding could have represented them in the raw.
        # Test U+001B escape - a control character
        self.assertEqual(demjson.encode(u'\u001B', escape_unicode=False, encoding='utf-8'),
                         rawbytes([ord(c) for c in '"\\u001b"']))
        # Test U+007F delete - a control character
        self.assertEqual(demjson.encode(u'\u007F', escape_unicode=False, encoding='utf-8'),
                         rawbytes([ord(c) for c in '"\\u007f"']))
        # Test U+00AD soft hyphen - a format control character
        self.assertEqual(demjson.encode(u'\u00AD', escape_unicode=False, encoding='utf-8'),
                         rawbytes([ord(c) for c in '"\\u00ad"']))
        # Test U+200F right-to-left mark
        self.assertEqual(demjson.encode(u'\u200F', escape_unicode=False, encoding='utf-8'),
                         rawbytes([ord(c) for c in '"\\u200f"']))
        # Test U+2028 line separator
        self.assertEqual(demjson.encode(u'\u2028', escape_unicode=False, encoding='utf-8'),
                         rawbytes([ord(c) for c in '"\\u2028"']))
        # Test U+2029 paragraph separator
        self.assertEqual(demjson.encode(u'\u2029', escape_unicode=False, encoding='utf-8'),
                         rawbytes([ord(c) for c in '"\\u2029"']))
        # Test U+E007F cancel tag
        self.assertEqual(demjson.encode(u'\U000E007F', escape_unicode=False, encoding='utf-8'),
                         rawbytes([ord(c) for c in '"\\udb40\\udc7f"']))

    def testDecodeSupplementalUnicode(self):
        import sys
        if sys.maxunicode > 65535:
            self.assertEqual(demjson.decode(rawbytes([ord(c) for c in r'"\udbc8\udf45"'])), u'\U00102345')
            self.assertEqual(demjson.decode(rawbytes([ord(c) for c in r'"\ud800\udc00"'])), u'\U00010000')
            self.assertEqual(demjson.decode(rawbytes([ord(c) for c in r'"\udbff\udfff"'])), u'\U0010ffff')
            for bad_case in [r'"\ud801"', r'"\udc02"', r'"\ud801\udbff"',
                             r'"\ud801\ue000"', r'"\ud801\u2345"']:
                self.assertRaises(demjson.JSONDecodeError, demjson.decode,
                                  rawbytes([ord(c) for c in bad_case]))

    def testEncodeSupplementalUnicode(self):
        import sys
        if sys.maxunicode > 65535:
            self.assertEqual(demjson.encode(u'\U00010000', encoding='ascii'),
                             rawbytes([ord(c) for c in r'"\ud800\udc00"']))
            self.assertEqual(demjson.encode(u'\U00102345', encoding='ascii'),
                             rawbytes([ord(c) for c in r'"\udbc8\udf45"']))
            self.assertEqual(demjson.encode(u'\U0010ffff', encoding='ascii'),
                             rawbytes([ord(c) for c in r'"\udbff\udfff"']))

    def have_codec(self, name):
        import codecs
        try:
            i = codecs.lookup(name)
        except LookupError:
            return False
        else:
            return True

    def testDecodeWithWindows1252(self):
        have_cp1252 = self.have_codec('cp1252')
        if have_cp1252:
            # Use Windows-1252 code page.  Note character 0x8c is U+0152, which
            # is different than ISO8859-1.
            d = rawbytes([ord('"'), ord('a'), 0xe0, 0x8c, ord('"')])
            self.assertEqual(demjson.decode(d, encoding='cp1252'), u"a\u00e0\u0152")

    def testDecodeWithEBCDIC(self):
        have_ebcdic = self.have_codec('ibm037')
        if have_ebcdic:
            # Try EBCDIC
            d = rawbytes([0x7f, 0xc1, 0xc0, 0x7c, 0xe0, 0xa4, 0xf0, 0xf1, 0xf5, 0xf2, 0x7f])
            self.assertEqual(demjson.decode(d, encoding='ibm037'), u"A{@\u0152")

    def testDecodeWithISO8859_1(self):
        have_iso8859_1 = self.have_codec('iso8859-1')
        if have_iso8859_1:
            # Try ISO-8859-1
            d = rawbytes([ord('"'), ord('a'), 0xe0, ord('\\'), ord('u'),
                          ord('0'), ord('1'), ord('5'), ord('2'), ord('"')])
            self.assertEqual(demjson.decode(d, encoding='iso8859-1'), u"a\u00e0\u0152")

    def testDecodeWithCustomCodec(self):
        # Try Rot-1
        ci = rot_one.lookup('rot-1')
        d = rawbytes([ord('"'), ord('A'), ord('B'), ord('Y'), ord('Z'), ord(' '), ord('5'), ord('"')])
        self.assertEqual(demjson.decode(d, encoding=ci), u"ZAXY 5")

    def testDecodeWithDegenerateCodec(self):
        ci = no_curly_braces.lookup('degenerate')
        d = rawbytes([ord(c) for c in '"abc"'])
        self.assertEqual(demjson.decode(d, encoding=ci), u"abc")
        d = rawbytes([ord(c) for c in '{"abc":42}'])
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, d, encoding=ci)

    def testEncodeWithWindows1252(self):
        have_cp1252 = self.have_codec('cp1252')
        if have_cp1252:
            s = u'a\u00e0\u0152'
            self.assertEqual(demjson.encode(s, encoding='cp1252'),
                             rawbytes([ord('"'), ord('a'), 0xe0, 0x8c, ord('"')]))

    def testEncodeWithEBCDIC(self):
        have_ebcdic = self.have_codec('ibm037')
        if have_ebcdic:
            s = u"A{@\u0152"
            self.assertEqual(demjson.encode(s, encoding='ibm037'),
                             rawbytes([0x7f, 0xc1, 0xc0, 0x7c, 0xe0, 0xa4, 0xf0, 0xf1, 0xf5, 0xf2, 0x7f]))

    def testEncodeWithISO8859_1(self):
        have_iso8859_1 = self.have_codec('iso8859-1')
        if have_iso8859_1:
            s = u'a\u00e0\u0152'
            self.assertEqual(demjson.encode(s, encoding='iso8859-1'),
                             rawbytes([ord('"'), ord('a'), 0xe0, ord('\\'), ord('u'),
                                       ord('0'), ord('1'), ord('5'), ord('2'), ord('"')]))

    def testEncodeWithCustomCodec(self):
        # Try Rot-1
        ci = rot_one.lookup('rot-1')
        d = u"ABYZ 5"
        self.assertEqual(demjson.encode(d, encoding=ci),
                         rawbytes([ord('"'), ord('B'), ord('C'), ord('Z'), ord('A'), ord(' '), ord('5'), ord('"')]))

    def testEncodeWithDegenerateCodec(self):
        ci = no_curly_braces.lookup('degenerate')
        self.assertRaises(demjson.JSONEncodeError, demjson.encode, u'"abc"', encoding=ci)
        self.assertRaises(demjson.JSONEncodeError, demjson.encode, u'{"abc":42}', encoding=ci)

    def testDecodeArraySimple(self):
        self.assertEqual(demjson.decode('[]'), [])
        self.assertEqual(demjson.decode('[ ]'), [])
        self.assertEqual(demjson.decode('[ 42 ]'), [42])
        self.assertEqual(demjson.decode('[ 42 ,99 ]'), [42, 99])
        self.assertEqual(demjson.decode('[ 42, ,99 ]', strict=False), [42, demjson.undefined, 99])
        self.assertEqual(demjson.decode('[ "z" ]'), ['z'])
        self.assertEqual(demjson.decode('[ "z[a]" ]'), ['z[a]'])

    def testDecodeArrayBad(self):
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '[,]', strict=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '[1,]', strict=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '[,1]', strict=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '[1,,2]', strict=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '[1 2]', strict=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '[[][]]', strict=True)

    def testDecodeArrayNested(self):
        self.assertEqual(demjson.decode('[[]]'), [[]])
        self.assertEqual(demjson.decode('[ [ ] ]'), [[]])
        self.assertEqual(demjson.decode('[[],[]]'), [[], []])
        self.assertEqual(demjson.decode('[[42]]'), [[42]])
        self.assertEqual(demjson.decode('[[42,99]]'), [[42, 99]])
        self.assertEqual(demjson.decode('[[42],33]'), [[42], 33])
        self.assertEqual(demjson.decode('[[42,[],44],[77]]'), [[42, [], 44], [77]])

    def testEncodeArraySimple(self):
        self.assertEqual(demjson.encode([]), '[]')
        self.assertEqual(demjson.encode([42]), '[42]')
        self.assertEqual(demjson.encode([42,99]), '[42,99]')
        self.assertEqual(demjson.encode([42,demjson.undefined,99],strict=False), '[42,undefined,99]')

    def testEncodeArrayNested(self):
        self.assertEqual(demjson.encode([[]]), '[[]]')
        self.assertEqual(demjson.encode([[42]]), '[[42]]')
        self.assertEqual(demjson.encode([[42, 99]]), '[[42,99]]')
        self.assertEqual(demjson.encode([[42], 33]), '[[42],33]')
        self.assertEqual(demjson.encode([[42, [], 44], [77]]), '[[42,[],44],[77]]')

    def testDecodeObjectSimple(self):
        self.assertEqual(demjson.decode('{}'), {})
        self.assertEqual(demjson.decode('{"":1}'), {'':1})
        self.assertEqual(demjson.decode('{"a":1}'), {'a':1})
        self.assertEqual(demjson.decode('{ "a" : 1}'), {'a':1})
        self.assertEqual(demjson.decode('{"a":1,"b":2}'), {'a':1,'b':2})
        self.assertEqual(demjson.decode(' { "a" : 1 , "b" : 2 } '), {'a':1,'b':2})

    def testDecodeObjectHarder(self):
        self.assertEqual(demjson.decode('{ "b" :\n2 , "a" : 1\t,"\\u0063"\n\t: 3 }'), {'a':1,'b':2,'c':3})
        self.assertEqual(demjson.decode('{"a":1,"b":2,"c{":3}'), {'a':1,'b':2,'c{':3})
        self.assertEqual(demjson.decode('{"a":1,"b":2,"d}":3}'), {'a':1,'b':2,'d}':3})
        self.assertEqual(demjson.decode('{"a:{":1,"b,":2,"d}":3}'), {'a:{':1,'b,':2,'d}':3})

    def testDecodeObjectWithDuplicates(self):
        self.assertEqual(demjson.decode('{"a":1,"a":2}'), {'a':2})
        self.assertEqual(demjson.decode('{"a":2,"a":1}'), {'a':1})
        self.assertEqual(demjson.decode('{"a":1,"b":99,"a":2,"b":42}'), {'a':2,'b':42})
        self.assertEqual(demjson.decode('{"a":1,"b":2}', prevent_duplicate_keys=True), {'a':1,'b':2})
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '{"a":1,"a":1}', prevent_duplicate_keys=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '{"a":1,"a":2}', prevent_duplicate_keys=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '{"b":9,"a":1,"c":42,"a":2}', prevent_duplicate_keys=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '{"a":1,"\u0061":1}', prevent_duplicate_keys=True)

    def testDecodeObjectBad(self):
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '{"a"}', strict=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '{"a":}', strict=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '{,"a":1}', strict=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '{"a":1,}', strict=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '{["a","b"]:1}', strict=True)

    def testDecodeObjectNested(self):
        self.assertEqual(demjson.decode('{"a":{"b":2}}'), {'a':{'b':2}})
        self.assertEqual(demjson.decode('{"a":{"b":2,"c":"{}"}}'), {'a':{'b':2,'c':'{}'}})
        self.assertEqual(demjson.decode('{"a":{"b":2},"c":{"d":4}}'),
                         {'a':{'b':2},'c':{'d':4}})

    def testEncodeObjectSimple(self):
        self.assertEqual(demjson.encode({}), '{}')
        self.assertEqual(demjson.encode({'':1}), '{"":1}')
        self.assertEqual(demjson.encode({'a':1}), '{"a":1}')
        self.assertEqual(demjson.encode({'a':1,'b':2}), '{"a":1,"b":2}')
        self.assertEqual(demjson.encode({'a':1,'c':3,'b':'xyz'}), '{"a":1,"b":"xyz","c":3}')

    def testEncodeObjectNested(self):
        self.assertEqual(demjson.encode({'a':{'b':{'c':99}}}), '{"a":{"b":{"c":99}}}')
        self.assertEqual(demjson.encode({'a':{'b':88},'c':99}), '{"a":{"b":88},"c":99}')

    def testEncodeBadObject(self):
        self.assertRaises(demjson.JSONEncodeError, demjson.encode, {1:True}, strict=True)
        self.assertRaises(demjson.JSONEncodeError, demjson.encode, {('a','b'):True}, strict=True)

    def testEncodeObjectDictLike(self):
        """Makes sure it can encode things which look like dictionaries but aren't."""
        letters = 'ABCDEFGHIJKL'
        mydict = LetterOrdDict( letters )
        self.assertEqual( demjson.encode(mydict),
                          '{' + ','.join(['"%s":%d'%(c,ord(c)) for c in letters]) + '}' )

    def testEncodeArrayLike(self):
        class LikeList(object):
            def __iter__(self):
                class i(object):
                    def __init__(self):
                        self.n = 0
                    def next(self):
                        self.n += 1
                        if self.n < 10:
                            return 2**self.n
                        raise StopIteration
                return i()
        mylist = LikeList()
        self.assertEqual(demjson.encode(mylist),
                         '[2,4,8,16,32,64,128,256,512]' )

    def testEncodeStringLike(self):
        import UserString
        class LikeString(UserString.UserString):
            pass
        mystring = LikeString('hello')
        self.assertEqual(demjson.encode(mystring), '"hello"')
        mystring = LikeString(u'hi\u2012there')
        self.assertEqual(demjson.encode(mystring, escape_unicode=True, encoding='utf-8'),
                         rawbytes([ ord(c) for c in r'"hi\u2012there"' ]) )

    def testObjectNonstringKeys(self):
        self.assertEqual(demjson.decode('{55:55}',strict=False), {55:55})
        self.assertEqual(demjson.decode('{fiftyfive:55}',strict=False), {'fiftyfive':55})
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '{fiftyfive:55}', strict=True)
        self.assertRaises(demjson.JSONEncodeError, demjson.encode, {55:'fiftyfive'}, strict=True)
        self.assertEqual(demjson.encode({55:55}, strict=False), '{55:55}')

    def testDecodeWhitespace(self):
        self.assertEqual(demjson.decode(' []'), [])
        self.assertEqual(demjson.decode('[] '), [])
        self.assertEqual(demjson.decode(' [ ] '), [])
        self.assertEqual(demjson.decode('\n[]\n'), [])
        self.assertEqual(demjson.decode('\t\r \n[\n\t]\n'), [])
        # Form-feed is not a valid JSON whitespace char
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, '\x0c[]', strict=True)
        # No-break-space is not a valid JSON whitespace char
        self.assertRaises(demjson.JSONDecodeError, demjson.decode, u'\u00a0[]', strict=True)

    def testDecodeInvalidStartingType(self):
        if False:
            # THESE TESTS NO LONGER APPLY WITH RFC 7158, WHICH SUPERSEDED RFC 4627
            self.assertRaises(demjson.JSONDecodeError, demjson.decode, '', strict=True)
            self.assertRaises(demjson.JSONDecodeError, demjson.decode, '1', strict=True)
            self.assertRaises(demjson.JSONDecodeError, demjson.decode, '1.5', strict=True)
            self.assertRaises(demjson.JSONDecodeError, demjson.decode, '"a"', strict=True)
            self.assertRaises(demjson.JSONDecodeError, demjson.decode, 'true', strict=True)
            self.assertRaises(demjson.JSONDecodeError, demjson.decode, 'null', strict=True)

    def testDecodeMixed(self):
        self.assertEqual(demjson.decode('[0.5,{"3e6":[true,"d{["]}]'),
                         [0.5, {'3e6': [True, 'd{[']}] )

    def testEncodeMixed(self):
        self.assertEqual(demjson.encode([0.5, {'3e6': [True, 'd{[']}] ),
                         '[0.5,{"3e6":[true,"d{["]}]' )

    def testDecodeComments(self):
        self.assertEqual(demjson.decode('//hi\n42', allow_comments=True), 42)
        self.assertEqual(demjson.decode('/*hi*/42', allow_comments=True), 42)
        self.assertEqual(demjson.decode('/*hi//x\n*/42', allow_comments=True), 42)
        self.assertEqual(demjson.decode('"a/*xx*/z"', allow_comments=True), 'a/*xx*/z')
        self.assertRaises(demjson.JSONDecodeError, demjson.decode,
                          '4/*aa*/2', allow_comments=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode,
                          '//hi/*x\n*/42', allow_comments=True)
        self.assertRaises(demjson.JSONDecodeError, demjson.decode,
                          '/*hi/*x*/42', allow_comments=True)

    def testNamedTuples(self):
        import collections
        point = collections.namedtuple('point',['x','y'])
        rgb = collections.namedtuple('RGB',['red','green','blue'])
        position = point( 7, 3 )
        orange = rgb( 255, 255, 0 )
        ispoint = lambda typ: isinstance(typ,point)
        iscolor = lambda typ: isinstance(typ,rgb)
        self.assertEqual(demjson.encode(position, encode_namedtuple_as_object=True, compactly=True),
                         '{"x":7,"y":3}' )
        self.assertEqual(demjson.encode(position, encode_namedtuple_as_object=False, compactly=True),
                         '[7,3]' )
        self.assertEqual(demjson.encode(orange, encode_namedtuple_as_object=ispoint, compactly=True),
                         '[255,255,0]' )
        self.assertEqual(demjson.encode(orange, encode_namedtuple_as_object=iscolor, compactly=True),
                         '{"blue":0,"green":255,"red":255}' )

    def testDecodeNumberHook(self):
        """Tests the 'decode_number' and 'decode_float' hooks."""
        def round_plus_one(s):
            return round(float(s)) + 1
        def negate(s):
            if s.startswith('-'):
                return float( s[1:] )
            else:
                return float( '-' + s )
        def xnonnum(s):
            if s=='NaN':
                return 'x-not-a-number'
            elif s=='Infinity' or s=='+Infinity':
                return 'x-infinity'
            elif s=='-Infinity':
                return 'x-neg-infinity'
            else:
                raise demjson.JSONSkipHook
        self.assertEqual(demjson.decode('[3.14,-2.7]'),
                         [3.14, -2.7] )
        self.assertEqual(demjson.decode('[3.14,-2.7]',decode_number=negate),
                         [-3.14, 2.7] )
        self.assertEqual(demjson.decode('[3.14,-2.7]',decode_number=round_plus_one),
                         [4.0, -2.0] )
        self.assertEqual(demjson.decode('[3.14,-2.7,8]',decode_float=negate),
                         [-3.14, 2.7, 8] )
        self.assertEqual(demjson.decode('[3.14,-2.7,8]',decode_float=negate,decode_number=round_plus_one),
                         [-3.14, 2.7, 9.0] )
        self.assertEqual(demjson.decode('[2,3.14,NaN,Infinity,+Infinity,-Infinity]',
                                        strict=False, decode_number=xnonnum),
                         [2, 3.14, 'x-not-a-number', 'x-infinity', 'x-infinity', 'x-neg-infinity'] )

    def testDecodeArrayHook(self):
        def reverse(arr):
            return list(reversed(arr))
        self.assertEqual(demjson.decode('[3, 8, 9, [1, 3, 5]]'),
                         [3, 8, 9, [1, 3, 5]] )
        self.assertEqual(demjson.decode('[3, 8, 9, [1, 3, 5]]', decode_array=reverse),
                         [[5, 3, 1], 9, 8, 3] )
        self.assertEqual(demjson.decode('[3, 8, 9, [1, 3, 5]]', decode_array=sum),
                         29 )

    def testDecodeObjectHook(self):
        def pairs(dct):
            return sorted(dct.items())
        self.assertEqual(demjson.decode('{"a":42, "b":{"c":99}}'),
                         {u'a': 42, u'b': {u'c': 99}} )
        self.assertEqual(demjson.decode('{"a":42, "b":{"c":99}}', decode_object=pairs),
                         [(u'a', 42), (u'b', [(u'c', 99)])] )

    def testDecodeStringHook(self):
        import string
        def s2num( s ):
            try:
                s = int(s)
            except ValueError:
                pass
            return s
        doc = '{"one":["two","three",{"four":"005"}]}'
        self.assertEqual(demjson.decode(doc),
                         {'one':['two','three',{'four':'005'}]} )
        self.assertEqual(demjson.decode(doc, decode_string=lambda s: s.capitalize()),
                         {'One':['Two','Three',{'Four':'005'}]} )
        self.assertEqual(demjson.decode(doc, decode_string=s2num),
                         {'one':['two','three',{'four':5}]} )

    def testEncodeDictKey(self):
        d1 = {42: "forty-two", "a":"Alpha"}
        d2 = {complex(0,42): "imaginary-forty-two", "a":"Alpha"}
        def make_key( k ):
            if isinstance(k,basestring):
                raise demjson.JSONSkipHook
            else:
                return repr(k)
        def make_key2( k ):
            if isinstance(k, (int,basestring)):
                raise demjson.JSONSkipHook
            else:
                return repr(k)
        self.assertRaises(demjson.JSONEncodeError, demjson.encode,
                          d1, strict=True)
        self.assertEqual(demjson.encode(d1,strict=False,sort_keys=demjson.SORT_ALPHA),
                         '{42:"forty-two","a":"Alpha"}' )
        self.assertEqual(demjson.encode(d1, encode_dict_key=make_key),
                         '{"42":"forty-two","a":"Alpha"}' )
        self.assertEqual(demjson.encode(d1,strict=False, encode_dict_key=make_key2, sort_keys=demjson.SORT_ALPHA),
                         '{42:"forty-two","a":"Alpha"}' )
        self.assertRaises(demjson.JSONEncodeError, demjson.encode,
                          d2, strict=True)
        self.assertEqual(demjson.encode(d2, encode_dict_key=make_key),
                         '{"%r":"imaginary-forty-two","a":"Alpha"}' % complex(0,42) )

    def testEncodeDict(self):
        def d2pairs( d ):
            return sorted( d.items() )
        def add_keys( d ):
            d['keys'] = list(sorted(d.keys()))
            return d
        d = {"a":42, "b":{"c":99,"d":7}}
        self.assertEqual(demjson.encode( d, encode_dict=d2pairs ),
                         '[["a",42],["b",[["c",99],["d",7]]]]' )
        self.assertEqual(demjson.encode( d, encode_dict=add_keys ),
                         '{"a":42,"b":{"c":99,"d":7,"keys":["c","d"]},"keys":["a","b"]}' )

    def testEncodeDictSorting(self):
        d = {'apple':1,'Ball':1,'cat':1,'dog1':1,'dog002':1,'dog10':1,'DOG03':1}
        self.assertEqual(demjson.encode( d, sort_keys=demjson.SORT_ALPHA ),
                         '{"Ball":1,"DOG03":1,"apple":1,"cat":1,"dog002":1,"dog1":1,"dog10":1}' )
        self.assertEqual(demjson.encode( d, sort_keys=demjson.SORT_ALPHA_CI ),
                         '{"apple":1,"Ball":1,"cat":1,"dog002":1,"DOG03":1,"dog1":1,"dog10":1}' )
        self.assertEqual(demjson.encode( d, sort_keys=demjson.SORT_SMART ),
                         '{"apple":1,"Ball":1,"cat":1,"dog1":1,"dog002":1,"DOG03":1,"dog10":1}' )

    @skipUnlessPython27
    def testEncodeDictPreserveSorting(self):
        import collections
        d = collections.OrderedDict()
        d['X'] = 42
        d['A'] = 99
        d['Z'] = 50
        self.assertEqual(demjson.encode( d, sort_keys=demjson.SORT_PRESERVE ),
                         '{"X":42,"A":99,"Z":50}')
        d['E'] = {'h':'H',"d":"D","b":"B"}
        d['C'] = 1
        self.assertEqual(demjson.encode( d, sort_keys=demjson.SORT_PRESERVE ),
                         '{"X":42,"A":99,"Z":50,"E":{"b":"B","d":"D","h":"H"},"C":1}')

    def testEncodeSequence(self):
        def list2hash( seq ):
            return dict([ (str(i),val) for i, val in enumerate(seq) ])
        d = [1,2,3,[4,5,6],7,8]
        self.assertEqual(demjson.encode( d, encode_sequence=reversed ),
                         '[8,7,[6,5,4],3,2,1]' )
        self.assertEqual(demjson.encode( d, encode_sequence=list2hash ),
                         '{"0":1,"1":2,"2":3,"3":{"0":4,"1":5,"2":6},"4":7,"5":8}' )

    @skipUnlessPython3
    def testEncodeBytes(self):
        no_bytes = bytes([])
        all_bytes = bytes( list(range(256)) )
        self.assertEqual(demjson.encode( no_bytes ), '[]' )
        self.assertEqual(demjson.encode( all_bytes ),
                         '[' + ','.join([str(n) for n in all_bytes]) + ']' )
        self.assertEqual(demjson.encode( no_bytes, encode_bytes=hexencode_bytes ), '""' )
        self.assertEqual(demjson.encode( all_bytes, encode_bytes=hexencode_bytes ),
                         '"' + hexencode_bytes(all_bytes) + '"' )

    def testEncodeValue(self):
        def enc_val( val ):
            if isinstance(val, complex):
                return {'real':val.real, 'imaginary':val.imag}
            elif isinstance(val, basestring):
                return val.upper()
            elif isinstance(val, datetime.date):
                return val.strftime("Year %Y Month %m Day %d")
            else:
                raise demjson.JSONSkipHook
        v = {'ten':10, 'number': complex(3, 7.25), 'asof': datetime.date(2014,1,17)}
        self.assertEqual(demjson.encode( v, encode_value=enc_val ),
                         u'{"ASOF":"YEAR 2014 MONTH 01 DAY 17","NUMBER":{"IMAGINARY":7.25,"REAL":3.0},"TEN":10}' )

    def testEncodeDefault(self):
        import datetime
        def dictkeys( d ):
            return "/".join( sorted([ str(k) for k in d.keys() ]) )
        def magic( d ):
            return complex( 1, len(d))
        class Anon(object):
            def __init__(self, val):
                self.v = val
            def __repr__(self):
                return ""
        class Anon2(object):
            def __init__(self, val):
                self.v = val
        def encode_anon( obj ):
            if isinstance(obj,Anon):
                return obj.v
            raise demjson.JSONSkipHook
        vals = [ "abc", 123, Anon("Hello"), sys, {'a':42,'wow':True} ]
        self.assertEqual(demjson.encode( vals, encode_default=repr ),
                         u'["abc",123,"%s","%s",{"a":42,"wow":true}]' % ( repr(vals[2]), repr(vals[3])) )
        self.assertEqual(demjson.encode( vals, encode_default=repr, encode_dict=dictkeys ),
                         u'["abc",123,"%s","%s","a/wow"]' % ( repr(vals[2]), repr(vals[3])) )
        self.assertEqual(demjson.encode( vals, encode_default=repr, encode_dict=magic ),
                         u'["abc",123,"%s","%s","%s"]' % ( repr(vals[2]), repr(vals[3]), repr(magic(vals[4])) ) )
        self.assertRaises( demjson.JSONEncodeError, demjson.encode, Anon("Hello") )
        self.assertEqual( demjson.encode( Anon("Hello"), encode_default=encode_anon ),
                          '"Hello"' )
        self.assertRaises( demjson.JSONEncodeError, demjson.encode,
                           Anon2("Hello"), encode_default=encode_anon )

    def testEncodeDate(self):
        d = datetime.date(2014,01,04)
        self.assertEqual(demjson.encode( d ), '"2014-01-04"' )
        self.assertEqual(demjson.encode( d, date_format='%m/%d/%Y' ),
                         '"01/04/2014"' )

    def testEncodeDatetime(self):
        d = datetime.datetime(2014,01,04,13,22,15)
        self.assertEqual(demjson.encode( d ), '"2014-01-04T13:22:15"' )
        self.assertEqual(demjson.encode( d, datetime_format='%m/%d/%Y %H hr %M min' ),
                         '"01/04/2014 13 hr 22 min"' )

    def testEncodeTime(self):
        pass #!!!

    def testEncodeTimedelta(self):
        pass #!!!
    def testStopProcessing(self):
        def jack_in_the_box( obj ):
            if obj == 42 or obj == "42":
                raise demjson.JSONStopProcessing
            else:
                raise demjson.JSONSkipHook
        self.assertEqual(demjson.encode( [1,2,3], encode_value=jack_in_the_box),
                         "[1,2,3]" )
        self.assertRaises( demjson.JSONEncodeError, demjson.encode,
                           [1,2,42], encode_value=jack_in_the_box )
        self.assertEqual(demjson.decode( '[1,2,3]', decode_number=jack_in_the_box),
                         [1,2,3] )

    def decode_stats(self, data, *args, **kwargs):
        """Runs demjson.decode() and returns the statistics object."""
        kwargs['return_stats'] = True
        res = demjson.decode( data, *args, **kwargs )
        if res:
            return res.stats
        return None

    def testStatsSimple(self):
        self.assertEqual( self.decode_stats( '1' ).num_ints, 1 )
        self.assertEqual( self.decode_stats( '3.14' ).num_floats, 1 )
        self.assertEqual( self.decode_stats( 'true' ).num_bools, 1 )
        self.assertEqual( self.decode_stats( 'false' ).num_bools, 1 )
        self.assertEqual( self.decode_stats( 'null' ).num_nulls, 1 )
        self.assertEqual( self.decode_stats( '"hello"' ).num_strings, 1 )
        self.assertEqual( self.decode_stats( '[]' ).num_arrays, 1 )
        self.assertEqual( self.decode_stats( '{}' ).num_objects, 1 )
        self.assertEqual( self.decode_stats( '1//HI' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '1/*HI*/' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( 'NaN' ).num_nans, 1 )
        self.assertEqual( self.decode_stats( 'Infinity' ).num_infinities, 1 )
        self.assertEqual( self.decode_stats( '-Infinity' ).num_infinities, 1 )
        self.assertEqual( self.decode_stats( 'undefined' ).num_undefineds, 1 )
        self.assertEqual( self.decode_stats( '[,1,2]' ).num_undefineds, 1 )
        self.assertEqual( self.decode_stats( '[1,,2]' ).num_undefineds, 1 )
        self.assertEqual( self.decode_stats( '[1,2,]' ).num_undefineds, 0 )
        self.assertEqual( self.decode_stats( '{hello:1}' ).num_identifiers, 1 )
        self.assertEqual( self.decode_stats( '[1,{"a":2},[{"b":{"c":[4,[[5]],6]}},7],8]' ).max_depth, 7 )

    def testStatsWhitespace(self):
        self.assertEqual( self.decode_stats( '1' ).num_excess_whitespace, 0 )
        self.assertEqual( self.decode_stats( ' 1' ).num_excess_whitespace, 1 )
        self.assertEqual( self.decode_stats( '1 ' ).num_excess_whitespace, 1 )
        self.assertEqual( self.decode_stats( ' 1 ' ).num_excess_whitespace, 2 )
        self.assertEqual( self.decode_stats( '[1, 2]' ).num_excess_whitespace, 1 )
        self.assertEqual( self.decode_stats( '[1 , 2]' ).num_excess_whitespace, 3 )
        self.assertEqual( self.decode_stats( '[ 1, 2]' ).num_excess_whitespace, 3 )
        self.assertEqual( self.decode_stats( '[1, 2 ]' ).num_excess_whitespace, 3 )
        self.assertEqual( self.decode_stats( '{"a": 1}' ).num_excess_whitespace, 1 )
        self.assertEqual( self.decode_stats( '{"a" : 1}' ).num_excess_whitespace, 3 )
        self.assertEqual( self.decode_stats( '{ "a": 1}' ).num_excess_whitespace, 2 )
        self.assertEqual( self.decode_stats( '{"a": 1 }' ).num_excess_whitespace, 3 )
        self.assertEqual( self.decode_stats( '{"a":1,"b":2}' ).num_excess_whitespace, 0 )
        self.assertEqual( self.decode_stats( '{"a":1, "b":2}' ).num_excess_whitespace, 2 )
        self.assertEqual( self.decode_stats( '{"a":1 ,"b":2}' ).num_excess_whitespace, 2 )
        self.assertEqual( self.decode_stats( '{"a":1 , "b":2}' ).num_excess_whitespace, 3 )
        self.assertEqual( self.decode_stats( '\n[\t1 , \t2 ]\n' ).num_excess_whitespace, 7 )

    def testStatsArrays(self):
        self.assertEqual( self.decode_stats( '123' ).num_arrays, 0 )
        self.assertEqual( self.decode_stats( '[]' ).num_arrays, 1 )
        self.assertEqual( self.decode_stats( '[1,2]' ).num_arrays, 1 )
        self.assertEqual( self.decode_stats( '{"a":[1,2]}' ).num_arrays, 1 )
        self.assertEqual( self.decode_stats( '[1,[2],3]' ).num_arrays, 2 )
        self.assertEqual( self.decode_stats( '[[[1,[],[[]],4],[3]]]' ).num_arrays, 7 )
        self.assertEqual( self.decode_stats( '123' ).max_depth, 0 )
        self.assertEqual( self.decode_stats( '[123]' ).max_items_in_array, 1 )
        self.assertEqual( self.decode_stats( '[[[1,[],[[]],4],[3]]]' ).max_depth, 5 )
        self.assertEqual( self.decode_stats( '123' ).max_items_in_array, 0 )
        self.assertEqual( self.decode_stats( '[]' ).max_items_in_array, 0 )
        self.assertEqual( self.decode_stats( '[[[[[[[]]]]]]]' ).max_items_in_array, 1 )
        self.assertEqual( self.decode_stats( '[[[[],[[]]],[]]]' ).max_items_in_array, 2 )
        self.assertEqual( self.decode_stats( '[[[1,[],[[]],4],[3]]]' ).max_items_in_array, 4 )

    def testStatsObjects(self):
        self.assertEqual( self.decode_stats( '123' ).num_objects, 0 )
        self.assertEqual( self.decode_stats( '{}' ).num_objects, 1 )
        self.assertEqual( self.decode_stats( '{"a":1}' ).num_objects, 1 )
        self.assertEqual( self.decode_stats( '[{"a":1}]' ).num_objects, 1 )
        self.assertEqual( self.decode_stats( '{"a":1,"b":2}' ).num_objects, 1 )
        self.assertEqual( self.decode_stats( '{"a":{}}' ).num_objects, 2 )
        self.assertEqual( self.decode_stats( '{"a":{"b":null}}' ).num_objects, 2 )
        self.assertEqual( self.decode_stats( '{"a":{"b":{"c":false}},"d":{}}' ).num_objects, 4 )
        self.assertEqual( self.decode_stats( '123' ).max_depth, 0 )
        self.assertEqual( self.decode_stats( '{}' ).max_depth, 1 )
        self.assertEqual( self.decode_stats( '{"a":{"b":{"c":false}},"d":{}}' ).max_depth, 3 )
        self.assertEqual( self.decode_stats( '123' ).max_items_in_object, 0 )
        self.assertEqual( self.decode_stats( '{}' ).max_items_in_object, 0 )
        self.assertEqual( self.decode_stats( '{"a":1}' ).max_items_in_object, 1 )
        self.assertEqual( self.decode_stats( '{"a":1,"b":2}' ).max_items_in_object, 2 )
        self.assertEqual( self.decode_stats( '{"a":{"b":{"c":false}},"d":{}}' ).max_items_in_object, 2 )

    def testStatsIntegers(self):
        n8s = [0,1,127,-127,-128]    # -128..127
        n16s = [128,255,32767,-129,-32768]    # -32768..32767
        n32s = [32768,2147483647,-32769,-2147483648]    # -2147483648..2147483647
        n64s = [2147483648,9223372036854775807,-2147483649,-9223372036854775808]    # -9223372036854775808..9223372036854775807
        nxls = [9223372036854775808,-9223372036854775809,10**20,-10**20]
        allnums = []
        allnums.extend(n8s)
        allnums.extend(n16s)
        allnums.extend(n32s)
        allnums.extend(n64s)
        allnums.extend(nxls)
        alljson = '[' + ','.join([str(n) for n in allnums]) + ']'
        self.assertEqual( self.decode_stats( 'true' ).num_ints, 0 )
        self.assertEqual( self.decode_stats( '1' ).num_ints, 1 )
        self.assertEqual( self.decode_stats( '[1,2,"a",3]' ).num_ints, 3 )
        self.assertEqual( self.decode_stats( '[1,2,{"a":3}]' ).num_ints, 3 )
        self.assertEqual( self.decode_stats( alljson ).num_ints_8bit, len(n8s) )
        self.assertEqual( self.decode_stats( alljson ).num_ints_16bit, len(n16s) )
        self.assertEqual( self.decode_stats( alljson ).num_ints_32bit, len(n32s) )
        self.assertEqual( self.decode_stats( alljson ).num_ints_64bit, len(n64s) )
        self.assertEqual( self.decode_stats( alljson ).num_ints_long, len(nxls) )
        n53s = [-9007199254740992,-9007199254740991, 9007199254740991,9007199254740992]    # -9007199254740991..9007199254740991
        self.assertEqual( self.decode_stats( repr(n53s).replace('L','') ).num_ints_53bit, 2 )

    def testStatsFloats(self):
        self.assertEqual( self.decode_stats( 'true' ).num_floats, 0 )
        self.assertEqual( self.decode_stats( '1' ).num_floats, 0 )
        self.assertEqual( self.decode_stats( '1.1' ).num_floats, 1 )
        self.assertEqual( self.decode_stats( '1e-8' ).num_floats, 1 )
        self.assertEqual( self.decode_stats( '[1.0,2.0,{"a":-3.0}]' ).num_floats, 3 )
        self.assertEqual( self.decode_stats( '0.0' ).num_negative_zero_floats, 0 )
        self.assertEqual( self.decode_stats( '-0.0' ).num_negative_zero_floats, 1 )
        self.assertEqual( self.decode_stats( '-0.0', float_type=demjson.NUMBER_DECIMAL ).num_negative_zero_floats, 1 )
        self.assertEqual( self.decode_stats( '-1.0e-500', float_type=demjson.NUMBER_FLOAT ).num_negative_zero_floats, 1 )
        self.assertEqual( self.decode_stats( '1.0e500', float_type=demjson.NUMBER_FLOAT ).num_infinities, 1 )
        self.assertEqual( self.decode_stats( '-1.0e500', float_type=demjson.NUMBER_FLOAT ).num_infinities, 1 )
        if decimal:
            self.assertEqual( self.decode_stats( '3.14e100' ).num_floats_decimal, 0 )
            self.assertEqual( self.decode_stats( '3.14e500' ).num_floats_decimal, 1 )
            self.assertEqual( self.decode_stats( '3.14e-500' ).num_floats_decimal, 1 )
            self.assertEqual( self.decode_stats( '3.14159265358979' ).num_floats_decimal, 0 )
            self.assertEqual( self.decode_stats( '3.141592653589793238462643383279502884197169399375105820974944592307816406286' ).num_floats_decimal, 1 )

    def testStatsStrings(self):
        self.assertEqual( self.decode_stats( 'true' ).num_strings, 0 )
        self.assertEqual( self.decode_stats( '""' ).num_strings, 1 )
        self.assertEqual( self.decode_stats( '"abc"' ).num_strings, 1 )
        self.assertEqual( self.decode_stats( '["a","b",null,{"c":"d","e":42}]' ).num_strings, 5 )
        self.assertEqual( self.decode_stats( '""' ).max_string_length, 0 )
        self.assertEqual( self.decode_stats( '""' ).total_string_length, 0 )
        self.assertEqual( self.decode_stats( '"abc"' ).max_string_length, 3 )
        self.assertEqual( self.decode_stats( '"abc"' ).total_string_length, 3 )
        self.assertEqual( self.decode_stats( r'"\u2020"' ).max_string_length, 1 )
        self.assertEqual( self.decode_stats( u'"\u2020"' ).max_string_length, 1 )
        self.assertEqual( self.decode_stats( u'"\U0010ffff"' ).max_string_length, (1 if is_wide_python else 2) )
        self.assertEqual( self.decode_stats( r'"\ud804\udc88"' ).max_string_length, (1 if is_wide_python else 2) )
        self.assertEqual( self.decode_stats( '["","abc","defghi"]' ).max_string_length, 6 )
        self.assertEqual( self.decode_stats( '["","abc","defghi"]' ).total_string_length, 9 )
        self.assertEqual( self.decode_stats( '""' ).min_codepoint, None )
        self.assertEqual( self.decode_stats( '""' ).max_codepoint, None )
        self.assertEqual( self.decode_stats( r'"\0"' ).min_codepoint, 0 )
        self.assertEqual( self.decode_stats( r'"\0"' ).max_codepoint, 0 )
        self.assertEqual( self.decode_stats( r'"\u0000"' ).min_codepoint, 0 )
        self.assertEqual( self.decode_stats( r'"\u0000"' ).max_codepoint, 0 )
        self.assertEqual( self.decode_stats( u'"\u0000"' ).min_codepoint, 0 )
        self.assertEqual( self.decode_stats( u'"\u0000"' ).max_codepoint, 0 )
        self.assertEqual( self.decode_stats( r'"\1"' ).min_codepoint, 1 )
        self.assertEqual( self.decode_stats( r'"\1"' ).max_codepoint, 1 )
        self.assertEqual( self.decode_stats( r'"\u0001"' ).min_codepoint, 1 )
        self.assertEqual( self.decode_stats( r'"\u0001"' ).max_codepoint, 1 )
        self.assertEqual( self.decode_stats( r'"\ud804\udc88"' ).min_codepoint, (69768 if is_wide_python else 0xd804) )
        self.assertEqual( self.decode_stats( r'"\ud804\udc88"' ).max_codepoint, (69768 if is_wide_python else 0xdc88) )
        self.assertEqual( self.decode_stats( r'"\u60ccABC\u0001"' ).min_codepoint, 1 )
        self.assertEqual( self.decode_stats( r'"\u60ccABC\u0001"' ).max_codepoint, 0x60cc )
        self.assertEqual( self.decode_stats( r'"\377"' ).min_codepoint, 255 )
        self.assertEqual( self.decode_stats( r'"\377"' ).max_codepoint, 255 )
        self.assertEqual( self.decode_stats( r'"\uffff"' ).min_codepoint, 0xffff )
        self.assertEqual( self.decode_stats( r'"\uffff"' ).max_codepoint, 0xffff )
        self.assertEqual( self.decode_stats( u'"\uffff"' ).min_codepoint, 0xffff )
        self.assertEqual( self.decode_stats( u'"\uffff"' ).max_codepoint, 0xffff )
        self.assertEqual( self.decode_stats( '["mnoapj","kzcde"]' ).min_codepoint, ord('a') )
        self.assertEqual( self.decode_stats( '["mnoapj","kzcde"]' ).max_codepoint, ord('z') )
        self.assertEqual( self.decode_stats( u'"\U0010ffff"' ).min_codepoint, (0x10ffff if is_wide_python else 0xdbff) )
        self.assertEqual( self.decode_stats( u'"\U0010ffff"' ).max_codepoint, (0x10ffff if is_wide_python else 0xdfff) )

    def testStatsComments(self):
        self.assertEqual( self.decode_stats( 'true' ).num_comments, 0 )
        self.assertEqual( self.decode_stats( '/**/true' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '/*hi*/true' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( 'true/*hi*/' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '/*hi*/true/*there*/' ).num_comments, 2 )
        self.assertEqual( self.decode_stats( 'true//' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( 'true//\n' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( 'true//hi' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( 'true//hi\n' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '//hi\ntrue' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '/**//**/true' ).num_comments, 2 )
        self.assertEqual( self.decode_stats( '/**/ /**/true' ).num_comments, 2 )
        self.assertEqual( self.decode_stats( 'true//hi/*there*/\n' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( 'true/*hi//there*/' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( 'true/*hi\nthere\nworld*/' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( 'true/*hi\n//there\nworld*/' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( 'true/*ab*cd*/' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '"abc/*HI*/xyz"' ).num_comments, 0 )
        self.assertEqual( self.decode_stats( '[1,2]' ).num_comments, 0 )
        self.assertEqual( self.decode_stats( '[/*hi*/1,2]' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '[1/*hi*/,2]' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '[1,/*hi*/2]' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '[1,2/*hi*/]' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '{"a":1,"b":2}' ).num_comments, 0 )
        self.assertEqual( self.decode_stats( '{/*hi*/"a":1,"b":2}' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '{"a"/*hi*/:1,"b":2}' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '{"a":/*hi*/1,"b":2}' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '{"a":1/*hi*/,"b":2}' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '{"a":1,/*hi*/"b":2}' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '{"a":1,"b"/*hi*/:2}' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '{"a":1,"b":/*hi*/2}' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '{"a":1,"b":2/*hi*/}' ).num_comments, 1 )
        self.assertEqual( self.decode_stats( '//\n[/*A*/1/**/,/*\n\n*/2/*C\n*///D\n]//' ).num_comments, 7 )

    def testStatsIdentifiers(self):
        self.assertEqual( self.decode_stats( 'true' ).num_identifiers, 0 )
        self.assertEqual( self.decode_stats( '{"a":2}' ).num_identifiers, 0 )
        self.assertEqual( self.decode_stats( '{a:2}' ).num_identifiers, 1 )
        self.assertEqual( self.decode_stats( '{a:2,xyz:4}' ).num_identifiers, 2 )


def run_all_tests():
    unicode_width = 'narrow' if sys.maxunicode<=0xFFFF else 'wide'
    print 'Running with demjson version %s, Python version %s with %s-Unicode' % \
        (demjson.__version__, sys.version.split(' ',1)[0], unicode_width)
    if int( demjson.__version__.split('.',1)[0] ) < 2:
        print 'WARNING: TESTING AGAINST AN OLD VERSION!'
    unittest.main()

if __name__ == '__main__':
    run_all_tests()

# end file