././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1736351034.095979 partial_json_parser-0.2.1.1.post5/LICENSE0000644000000000000000000000205214737516472014735 0ustar00
MIT License

Copyright (c) 2025 Promplate

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1736351034.095979 partial_json_parser-0.2.1.1.post5/README.md0000644000000000000000000001266014737516472015215 0ustar00
# Partial JSON Parser

Sometimes we need an **LLM (Large Language Model)** to produce **structured information** instead of natural language. The easiest way is to use JSON.

But until the last token of the response arrives, the JSON is incomplete, which means you can't decode it with `json.loads`. We still want to stream the data to the user.

Here comes `partial-json-parser`, a lightweight and customizable library for parsing partial JSON strings. Here is a [demo](https://promplate.dev/partial-json-parser).

(Note that there is [a JavaScript implementation](https://github.com/promplate/partial-json-parser-js) too)

## Installation

```sh
pip install partial-json-parser # or poetry / pdm / uv
```

`partial-json-parser` is implemented purely in Python, with good type hints. It is zero-dependency and works with Python 3.6+.

You can run its demo playground by installing `rich` as well, or by installing the `playground` extra:

```sh
pip install partial-json-parser[playground]
```

Then run `json-playground` in your terminal to try the parser interactively.

## Usage

```py
from partial_json_parser import loads

loads('{"key": "v')  # {'key': 'v'}
```

Alternatively, you can use `ensure_json` to get the completed JSON string:

```py
from partial_json_parser import ensure_json

ensure_json('{"key": "v')  # '{"key": "v"}'
```

### Detailed Usage

You can import the `loads` function and the `Allow` object from the library like this:

```py
from partial_json_parser import loads, Allow
```

The `Allow` object is just an Enum for options. It determines which types may be partial. A value whose type is not included in `allow` appears in the result only once it is complete.

### Parsing complete / partial JSON strings

The `loads` function works just like the built-in `json.loads` when parsing a complete JSON string:

```py
result = loads('{"key":"value"}')
print(result)  # Outputs: {'key': 'value'}
```

You can parse a partial JSON string by passing an additional parameter to the `loads` function. This parameter is a **bitwise OR** of the constants from the `Allow` flag:

(Note that you can directly import the constants you need from `partial_json_parser`)

```py
from partial_json_parser import loads, Allow, STR, OBJ

result = loads('{"key": "v', STR | OBJ)
print(result)  # Outputs: {'key': 'v'}
```

In this example, `Allow.STR` tells the parser that it's okay if a string is incomplete, and `Allow.OBJ` tells it the same for an object (a `dict`). The parser then tries to return as much data as it can.

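To make the flag arithmetic concrete, here is a minimal sketch that uses only the standard library. The `Allow` class below is a local stand-in that mirrors a few members of the library's flag (an assumption for illustration, so the snippet runs without the package installed); it shows how `|` combines flags and how `in` tests them:

```py
from enum import IntFlag, auto


class Allow(IntFlag):
    """Local stand-in mirroring part of the library's `Allow` flag (illustration only)."""

    STR = auto()
    NUM = auto()
    ARR = auto()
    OBJ = auto()
    COLLECTION = ARR | OBJ


# The same combination as in the example above
allow = Allow.STR | Allow.OBJ

print(Allow.STR in allow)  # True: partial strings are allowed
print(Allow.ARR in allow)  # False: partial arrays are not
print(Allow.COLLECTION in allow)  # False: a combined flag matches only if all of its bits are set
```

Because the flag is an `IntFlag`, a plain integer with the same bits should work equally well wherever an `Allow` value is expected.
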
If you don't allow partial strings, then it will not add `"key"` to the object because `"v` is not closed:

```py
result = loads('{"key": "v', OBJ)
print(result)  # Outputs: {}

result = loads('{"key": "value"', OBJ)
print(result)  # Outputs: {'key': 'value'}
```

Similarly, you can parse partial lists or even partial special values if you allow it:

(Note that `allow` defaults to `Allow.ALL`)

```py
result = loads('[ {"key1": "value1", "key2": [ "value2')
print(result)  # Outputs: [{'key1': 'value1', 'key2': ['value2']}]

result = loads("-Inf")
print(result)  # Outputs: -inf
```

### Handling malformed JSON

If the JSON string is malformed, the `loads` function will raise an error:

```py
loads("wrong")  # MalformedJSON: Unexpected character w
```

## API Reference

### loads(json_string, [allow_partial], [parser])

- `json_string` `str`: The (incomplete) JSON string to parse.
- `allow_partial` `Allow | int`: Specify what kind of partialness is allowed during JSON parsing (default: `Allow.ALL`).
- `parser` `(str) -> JSON`: An ordinary JSON parser. Default is `json.loads`.

Completes the JSON string and parses it with the `parser` function.

Returns the parsed Python value.

Alias: `decode`, `parse_json`.

### ensure_json(json_string, [allow_partial])

- `json_string` `str`: The (incomplete) JSON string to complete.
- `allow_partial` `Allow | int`: Specify what kind of partialness is allowed during JSON parsing (default: `Allow.ALL`).

Returns the completed JSON string.

### fix(json_string, [allow_partial])

- `json_string` `str`: The (incomplete) JSON string to complete.
- `allow_partial` `Allow | int`: Specify what kind of partialness is allowed during JSON parsing (default: `Allow.ALL`).

Returns a tuple of a slice of the input string and the completion.

Note that this is a low-level API, only useful for debugging and demonstration.

### Allow

Enum class that specifies what kind of partialness is allowed during JSON parsing. It has the following members:

- `STR`: Allow partial string.
- `NUM`: Allow partial number.
- `ARR`: Allow partial array.
- `OBJ`: Allow partial object.
- `NULL`: Allow partial null.
- `BOOL`: Allow partial boolean.
- `NAN`: Allow partial NaN.
- `INFINITY`: Allow partial Infinity.
- `_INFINITY`: Allow partial -Infinity.
- `INF`: Allow both partial Infinity and -Infinity.
- `SPECIAL`: Allow all special values.
- `ATOM`: Allow all atomic values.
- `COLLECTION`: Allow all collection values.
- `ALL`: Allow all values.

## Testing

To run the tests for this library, you should clone the repository and install the dependencies:

```sh
git clone https://github.com/promplate/partial-json-parser.git
cd partial-json-parser
pdm install
```

Then, you can run the tests using [Hypothesis](https://hypothesis.works/) and [Pytest](https://pytest.org/):

```sh
pdm test
```

Please note that while we strive to cover as many edge cases as possible, it's always possible that some cases might not be covered.

## License

This project is licensed under the MIT License.

././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1736351039.1150246 partial_json_parser-0.2.1.1.post5/pyproject.toml0000644000000000000000000000333314737516477016654 0ustar00
[project] name = "partial-json-parser" dynamic = [] description = "Parse partial JSON generated by LLM" authors = [ { name = "Muspi Merol", email = "me@promplate.dev" }, ] requires-python = ">=3.6" readme = "README.md" keywords = [ "JSON", "parser", "LLM", "nlp", ] classifiers = [ "Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", ] version = "0.2.1.1.post5" [project.optional-dependencies] playground = [ "rich", ] [project.license] text = "MIT" [project.scripts] json-playground = "partial_json_parser.playground:main" [project.urls] repository = "https://github.com/promplate/partial-json-parser" homepage = "https://promplate.dev/partial-json-parser" [build-system] requires = [ "pdm-backend", ] build-backend = "pdm.backend" [tool.pdm.build] excludes = [ "tests", ]
[tool.pdm.dev-dependencies] dev = [ "hypothesis", "tqdm", "pytest", ] [tool.pdm.scripts] test-examples = "pytest tests/test_examples.py" post_build = "python src/overrides.py 3.8" [tool.pdm.scripts.test-performance] call = "tests.test_performance:main" [tool.pdm.scripts.test-hypotheses] call = "tests.test_hypotheses:main" [tool.pdm.scripts.test] composite = [ "test-examples", "test-hypotheses", "test-performance", ] [tool.pdm.scripts.format] composite = [ "isort ./{args}", "black ./{args}", ] [tool.pdm.scripts.playground] call = "partial_json_parser.playground:main" [tool.pdm.scripts.pre_build] composite = [ "format", "python src/overrides.py 3.6", ] [tool.pdm.version] source = "file" path = "src/partial_json_parser/version.py" [tool.black] line-length = 160 [tool.isort] profile = "black" [tool.pyright] reportPossiblyUnboundVariable = false ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1736351034.095979 partial_json_parser-0.2.1.1.post5/src/partial_json_parser/__init__.py0000644000000000000000000000031514737516472022671 0ustar00from .core.api import JSON, ensure_json, parse_json from .core.complete import fix from .core.exceptions import * from .core.myelin import fix_fast from .core.options import * loads = decode = parse_json ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1736351034.096979 partial_json_parser-0.2.1.1.post5/src/partial_json_parser/core/api.py0000644000000000000000000000150214737516472022632 0ustar00from typing import Callable, Dict, List, Optional, Union from .complete import fix from .myelin import fix_fast from .options import * Number = Union[int, float] JSON = Union[str, bool, Number, List["JSON"], Dict[str, "JSON"], None] def parse_json(json_string: str, allow_partial: Union[Allow, int] = ALL, parser: Optional[Callable[[str], JSON]] = None, use_fast_fix=True) -> JSON: if parser is None: from json import loads as parser return parser(ensure_json(json_string, allow_partial, 
use_fast_fix)) def ensure_json(json_string: str, allow_partial: Union[Allow, int] = ALL, use_fast_fix=True) -> str: """get the completed JSON string""" if use_fast_fix: head, tail = fix_fast(json_string, allow_partial) else: head, tail = fix(json_string, allow_partial) return head + tail ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1736351034.096979 partial_json_parser-0.2.1.1.post5/src/partial_json_parser/core/complete.py0000644000000000000000000001521414737516472023676 0ustar00from typing import TYPE_CHECKING, Tuple, Union from .exceptions import MalformedJSON, PartialJSON from .options import * if TYPE_CHECKING: from typing import Literal CompleteResult = Union[Tuple[int, Union[str, "Literal[True]"]], "Literal[False]"] # (length, complete_string / already completed) / partial def fix(json_string: str, allow_partial: Union[Allow, int] = ALL): """get the original slice and the trailing suffix separately""" return _fix(json_string, Allow(allow_partial), True) def _fix(json_string: str, allow: Allow, is_top_level=False): try: result = complete_any(json_string.strip(), allow, is_top_level) if result is False: raise PartialJSON index, completion = result return json_string[:index], ("" if completion is True else completion) except (AssertionError, IndexError) as err: raise MalformedJSON(*err.args) from err def skip_blank(text: str, index: int): try: while text[index].isspace(): index += 1 finally: return index def complete_any(json_string: str, allow: Allow, is_top_level=False) -> CompleteResult: i = skip_blank(json_string, 0) char = json_string[i] if char == '"': return complete_str(json_string, allow) if char in "1234567890": return complete_num(json_string, allow, is_top_level) if char == "[": return complete_arr(json_string, allow) if char == "{": return complete_obj(json_string, allow) if json_string.startswith("null"): return (4, True) if "null".startswith(json_string): return (0, "null") if NULL in allow else False if 
json_string.startswith("true"): return (4, True) if "true".startswith(json_string): return (0, "true") if BOOL in allow else False if json_string.startswith("false"): return (5, True) if "false".startswith(json_string): return (0, "false") if BOOL in allow else False if json_string.startswith("Infinity"): return (8, True) if "Infinity".startswith(json_string): return (0, "Infinity") if INFINITY in allow else False if char == "-": if len(json_string) == 1: return False elif json_string[1] != "I": return complete_num(json_string, allow, is_top_level) if json_string.startswith("-Infinity"): return (9, True) if "-Infinity".startswith(json_string): return (0, "-Infinity") if _INFINITY in allow else False if json_string.startswith("NaN"): return (3, True) if "NaN".startswith(json_string): return (0, "NaN") if NAN in allow else False raise MalformedJSON(f"Unexpected character {char}") def complete_str(json_string: str, allow: Allow) -> CompleteResult: assert json_string[0] == '"' length = len(json_string) i = 1 try: while True: char = json_string[i] if char == "\\": if i + 1 == length: raise IndexError i += 2 continue if char == '"': return i + 1, True i += 1 except IndexError: if STR not in allow: return False def not_escaped(index: int): text_before = json_string[:index] count = index - len(text_before.rstrip("\\")) return count % 2 == 0 # \uXXXX _u = json_string.rfind("\\u", max(0, i - 5), i) if _u != -1 and not_escaped(_u): return _u, '"' # \UXXXXXXXX _U = json_string.rfind("\\U", max(0, i - 9), i) if _U != -1 and not_escaped(_U): return _U, '"' # \xXX _x = json_string.rfind("\\x", max(0, i - 3), i) if _x != -1 and not_escaped(_x): return _x, '"' return i, '"' def complete_arr(json_string: str, allow: Allow) -> CompleteResult: assert json_string[0] == "[" i = j = 1 try: while True: j = skip_blank(json_string, j) if json_string[j] == "]": return j + 1, True result = complete_any(json_string[j:], allow) if result is False: # incomplete return (i, "]") if ARR in allow 
else False if result[1] is True: # complete i = j = j + result[0] else: # incomplete return (j + result[0], result[1] + "]") if ARR in allow else False j = skip_blank(json_string, j) if json_string[j] == ",": j += 1 elif json_string[j] == "]": return j + 1, True else: raise MalformedJSON(f"Expected ',' or ']', got {json_string[j]}") except IndexError: return (i, "]") if ARR in allow else False def complete_obj(json_string: str, allow: Allow) -> CompleteResult: assert json_string[0] == "{" i = j = 1 try: while True: j = skip_blank(json_string, j) if json_string[j] == "}": return j + 1, True result = complete_str(json_string[j:], allow) if result and result[1] is True: # complete j += result[0] else: # incomplete return (i, "}") if OBJ in allow else False j = skip_blank(json_string, j) if json_string[j] != ":": raise MalformedJSON(f"Expected ':', got {json_string[j]}") j += 1 j = skip_blank(json_string, j) result = complete_any(json_string[j:], allow) if result is False: # incomplete return (i, "}") if OBJ in allow else False if result[1] is True: # complete i = j = j + result[0] else: # incomplete return (j + result[0], result[1] + "}") if OBJ in allow else False j = skip_blank(json_string, j) if json_string[j] == ",": j += 1 elif json_string[j] == "}": return j + 1, True else: raise MalformedJSON(f"Expected ',' or '}}', got {json_string[j]}") except IndexError: return (i, "}") if OBJ in allow else False def complete_num(json_string: str, allow: Allow, is_top_level=False) -> CompleteResult: i = 1 length = len(json_string) # forward while i < length and json_string[i] in "1234567890.-+eE": i += 1 modified = False # backward while json_string[i - 1] in ".-+eE": modified = True i -= 1 if modified or i == length and not is_top_level: return (i, "") if NUM in allow else False else: return i, True ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1736351034.096979 
partial_json_parser-0.2.1.1.post5/src/partial_json_parser/core/exceptions.py0000644000000000000000000000021414737516472024241 0ustar00class JSONDecodeError(ValueError): pass class PartialJSON(JSONDecodeError): pass class MalformedJSON(JSONDecodeError): pass ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1736351034.096979 partial_json_parser-0.2.1.1.post5/src/partial_json_parser/core/myelin.py0000644000000000000000000002155514737516472023370 0ustar00"""Myelin acts as the highway among neurons, epitomizing the leapfrog methodology within this algorithm.""" from re import compile from typing import List, Tuple, Union from .complete import _fix from .exceptions import PartialJSON from .options import * finditer = compile(r'["\[\]{}]').finditer def scan(json_string: str): return [(match.start(), match.group()) for match in finditer(json_string)] def join_closing_tokens(stack: List[Tuple[int, str]]): return "".join("}" if char == "{" else "]" for _, char in reversed(stack)) def fix_fast(json_string: str, allow_partial: Union[Allow, int] = ALL): allow = Allow(allow_partial) def is_escaped(index: int): text_before = json_string[:index] count = index - len(text_before.rstrip("\\")) return count % 2 stack = [] in_string = False last_string_start = -1 last_string_end = -1 tokens = scan(json_string) if not tokens or tokens[0][1] == '"': return _fix(json_string, allow, True) for i, char in tokens: if char == '"': if not in_string: in_string = True last_string_start = i elif not is_escaped(i): in_string = False last_string_end = i elif not in_string: if char == "}": _i, _char = stack.pop() assert _char == "{", f"Expected '{{' at index {_i}, got '{_char}'" elif char == "]": _i, _char = stack.pop() assert _char == "[", f"Expected '[' at index {_i}, got '{_char}'" else: stack.append((i, char)) if not stack: return json_string, "" # check if the opening tokens are allowed if (STR | COLLECTION) not in allow: def 
truncate_before_last_key_start(container_start: int, last_string_end: int, stack): last_key_start = last_string_end # backtrace the last key's start and retry finding the last comma while True: last_key_start = json_string.rfind('"', container_start, last_key_start) if last_key_start == -1: # this is the only key # { "key": "v return json_string[: container_start + 1], join_closing_tokens(stack) if is_escaped(last_key_start): last_key_start -= 1 else: last_comma = json_string.rfind(",", container_start, last_key_start) if last_comma == -1: # { "key": " return json_string[: container_start + 1], join_closing_tokens(stack) # # { ... "key": ... , " return json_string[:last_comma], join_closing_tokens(stack) if COLLECTION not in allow: for index, [_i, _char] in enumerate(stack): if _char == "{" and OBJ not in allow or _char == "[" and ARR not in allow: if index == 0: raise PartialJSON # to truncate before the last container token and the last comma (if exists) of its parent container # reset `last_string_end` to before `_i` if _i < last_string_start: if last_string_start < _i: # ... { "k last_string_end = json_string.rfind('"', last_string_end, _i) else: # ... { "" ... last_string_end = json_string.rfind('"', None, _i) last_comma = json_string.rfind(",", max(stack[index - 1][0], last_string_end) + 1, _i) if last_comma == -1: if stack[index - 1][1] == "[": # [ ... 
[ return json_string[:_i], join_closing_tokens(stack[:index]) # { "key": [ 1, 2, "v # { "key": [ 1, 2, "value" if last_string_start > last_string_end: return truncate_before_last_key_start(stack[index - 1][0], last_string_end, stack[:index]) last_comma = json_string.rfind(",", stack[index - 1][0] + 1, last_string_start) if last_comma == -1: return json_string[: stack[index - 1][0] + 1], join_closing_tokens(stack[:index]) return json_string[:last_comma], join_closing_tokens(stack[:index]) # { ..., "key": { # ..., { return json_string[:last_comma], join_closing_tokens(stack[:index]) if STR not in allow and in_string: # truncate before the last key if stack[-1][0] > last_string_end and stack[-1][1] == "{": # { "k return json_string[: stack[-1][0] + 1], join_closing_tokens(stack) last_comma = json_string.rfind(",", max(stack[-1][0], last_string_end) + 1, last_string_start - 1) if last_comma != -1: # { "key": "v", "k # { "key": 123, "k # [ 1, 2, 3, "k return json_string[:last_comma], join_closing_tokens(stack) # { ... "key": "v return truncate_before_last_key_start(stack[-1][0], last_string_end, stack) # only fix the rest of the container in O(1) time complexity assert in_string == (last_string_end < last_string_start) if in_string: if stack[-1][1] == "[": # [ ... "val head, tail = _fix(json_string[last_string_start:], allow) # fix the last string return json_string[:last_string_start] + head, tail + join_closing_tokens(stack) assert stack[-1][1] == "{" # { ... "val start = max(last_string_end, stack[-1][0]) if "," in json_string[start + 1 : last_string_start]: # { ... "k": "v", "key # { ... "k": 123, "key last_comma = json_string.rindex(",", start, last_string_start) head, tail = _fix(stack[-1][1] + json_string[last_comma + 1 :], allow) return json_string[:last_comma] + head[1:], tail + join_closing_tokens(stack[:-1]) if ":" in json_string[start + 1 : last_string_start]: # { ... 
": "val head, tail = _fix(json_string[last_string_start:], allow) # fix the last string (same as array) return json_string[:last_string_start] + head, tail + join_closing_tokens(stack) # {"key return json_string[:last_string_start], join_closing_tokens(stack) last_comma = json_string.rfind(",", max(last_string_end, i) + 1) if last_comma != -1: i, char = stack[-1] if not json_string[last_comma + 1 :].strip(): # comma at the end # { ... "key": "value", return json_string[:last_comma], join_closing_tokens(stack) assert char == "[", json_string # array with many non-string literals # [ ..., 1, 2, 3, 4 head, tail = _fix(char + json_string[last_comma + 1 :], allow) if not head[1:] + tail[:-1].strip(): # empty, so trim the last comma return json_string[:last_comma] + head[1:], tail + join_closing_tokens(stack[:-1]) return json_string[: last_comma + 1] + head[1:], tail + join_closing_tokens(stack[:-1]) # can't find comma after the last string and after the last container token if char in "]}": # ... [ ... ] # ... { ... } assert not json_string[i + 1 :].strip() return json_string, join_closing_tokens(stack) if char in "[{": # ... [ ... # ... { ... head, tail = _fix(json_string[i:], allow) return json_string[:i] + head, tail + join_closing_tokens(stack[:-1]) assert char == '"' i, char = stack[-1] if char == "[": # [ ... "val" return json_string, join_closing_tokens(stack) assert char == "{" last_colon = json_string.rfind(":", last_string_end) last_comma = json_string.rfind(",", i + 1, last_string_start) if last_comma == -1: # only 1 key # ... { "key" # ... { "key": "value" head, tail = _fix(json_string[i:], allow) return json_string[:i] + head, tail + join_closing_tokens(stack[:-1]) if last_colon == -1: if json_string.rfind(":", max(i, last_comma) + 1, last_string_start) != -1: # { ... , "key": "value" return json_string, join_closing_tokens(stack) # { ... 
, "key" head, tail = _fix("{" + json_string[last_comma + 1 :], allow) if not head[1:] + tail[:-1].strip(): return json_string[:last_comma] + head[1:], tail + join_closing_tokens(stack[:-1]) return json_string[: last_comma + 1] + head[1:], tail + join_closing_tokens(stack) assert last_colon > last_comma # { ... , "key": head, tail = _fix("{" + json_string[last_comma + 1 :], allow) if not head[1:] + tail[:-1].strip(): return json_string[:last_comma] + head[1:], tail + join_closing_tokens(stack[:-1]) return json_string[: last_comma + 1] + head[1:], tail + join_closing_tokens(stack[:-1]) ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1736351034.096979 partial_json_parser-0.2.1.1.post5/src/partial_json_parser/core/options.py0000644000000000000000000000165014737516472023560 0ustar00from enum import IntFlag, auto class Allow(IntFlag): """Specify what kind of partialness is allowed during JSON parsing""" STR = auto() NUM = auto() ARR = auto() OBJ = auto() NULL = auto() BOOL = auto() NAN = auto() INFINITY = auto() _INFINITY = auto() INF = INFINITY | _INFINITY SPECIAL = NULL | BOOL | INF | NAN ATOM = STR | NUM | SPECIAL COLLECTION = ARR | OBJ ALL = ATOM | COLLECTION STR = Allow.STR NUM = Allow.NUM ARR = Allow.ARR OBJ = Allow.OBJ NULL = Allow.NULL BOOL = Allow.BOOL NAN = Allow.NAN INFINITY = Allow.INFINITY _INFINITY = Allow._INFINITY INF = Allow.INF SPECIAL = Allow.SPECIAL ATOM = Allow.ATOM COLLECTION = Allow.COLLECTION ALL = Allow.ALL __all__ = [ "Allow", "STR", "NUM", "ARR", "OBJ", "NULL", "BOOL", "NAN", "INFINITY", "_INFINITY", "INF", "SPECIAL", "ATOM", "COLLECTION", "ALL", ] ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1736351034.096979 partial_json_parser-0.2.1.1.post5/src/partial_json_parser/options.py0000644000000000000000000000007714737516472022632 0ustar00"""For backward compatibility.""" from .core.options import * ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 
mtime=1736351034.096979 partial_json_parser-0.2.1.1.post5/src/partial_json_parser/playground.py0000644000000000000000000000154114737516472023320 0ustar00from rich.console import Console from rich.highlighter import JSONHighlighter from rich.style import Style from rich.text import Span from partial_json_parser import fix_fast console = Console() highlight = JSONHighlighter() def main(): while True: try: input_str = console.input("[d]>>> ") head, tail = fix_fast(input_str) json = head + tail rich_text = highlight(json) if tail: rich_text.spans.append(Span(len(head), len(json), Style(dim=True))) console.print(" " * 3, rich_text) except KeyboardInterrupt: return except Exception as err: console.print(f"{err.__class__.__name__}:", style="bold red", highlight=False, end=" ") console.print(" ".join(map(str, err.args)), style="yellow", highlight=False) ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1736351034.096979 partial_json_parser-0.2.1.1.post5/src/partial_json_parser/version.py0000644000000000000000000000003614737516472022617 0ustar00__version__ = "0.2.1.1.post5" partial_json_parser-0.2.1.1.post5/PKG-INFO0000644000000000000000000001376100000000000014757 0ustar00Metadata-Version: 2.1 Name: partial-json-parser Version: 0.2.1.1.post5 Summary: Parse partial JSON generated by LLM Keywords: JSON,parser,LLM,nlp Author-Email: Muspi Merol License: MIT Classifier: Development Status :: 5 - Production/Stable Classifier: Intended Audience :: Developers Project-URL: repository, https://github.com/promplate/partial-json-parser Project-URL: homepage, https://promplate.dev/partial-json-parser Requires-Python: >=3.6 Provides-Extra: playground Requires-Dist: rich; extra == "playground" Description-Content-Type: text/markdown