python-tokenize-rt_4.2.1.orig/LICENSE

Copyright (c) 2017 Anthony Sottile

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

python-tokenize-rt_4.2.1.orig/PKG-INFO

Metadata-Version: 2.1
Name: tokenize_rt
Version: 4.2.1
Summary: A wrapper around the stdlib `tokenize` which roundtrips.
Home-page: https://github.com/asottile/tokenize-rt
Author: Anthony Sottile
Author-email: asottile@umich.edu
License: MIT
Platform: UNKNOWN
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Requires-Python: >=3.6.1
Description-Content-Type: text/markdown
License-File: LICENSE
python-tokenize-rt_4.2.1.orig/README.md

[![Build Status](https://dev.azure.com/asottile/asottile/_apis/build/status/asottile.tokenize-rt?branchName=master)](https://dev.azure.com/asottile/asottile/_build/latest?definitionId=25&branchName=master)
[![Azure DevOps coverage](https://img.shields.io/azure-devops/coverage/asottile/asottile/25/master.svg)](https://dev.azure.com/asottile/asottile/_build/latest?definitionId=25&branchName=master)
[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/asottile/tokenize-rt/master.svg)](https://results.pre-commit.ci/latest/github/asottile/tokenize-rt/master)

tokenize-rt
===========

The stdlib `tokenize` module does not properly roundtrip.  This wrapper around the stdlib provides two additional tokens `ESCAPED_NL` and `UNIMPORTANT_WS`, and a `Token` data type.  Use `src_to_tokens` and `tokens_to_src` to roundtrip.

This library is useful if you're writing a refactoring tool based on the python tokenization.

## Installation

`pip install tokenize-rt`

## Usage

### datastructures

#### `tokenize_rt.Offset(line=None, utf8_byte_offset=None)`

A token offset, useful as a key when cross referencing the `ast` and the tokenized source.

#### `tokenize_rt.Token(name, src, line=None, utf8_byte_offset=None)`

Construct a token

- `name`: one of the token names listed in `token.tok_name` or `ESCAPED_NL` or `UNIMPORTANT_WS`
- `src`: token's source as text
- `line`: the line number that this token appears on
- `utf8_byte_offset`: the utf8 byte offset that this token appears on in the line

#### `tokenize_rt.Token.offset`

Retrieves an `Offset` for this token.

### converting to and from `Token` representations

#### `tokenize_rt.src_to_tokens(text: str) -> List[Token]`

#### `tokenize_rt.tokens_to_src(Iterable[Token]) -> str`
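A minimal roundtrip sketch (the `Token` reprs follow the namedtuple described above; the surrounding stdlib tokens such as `NEWLINE` and `ENDMARKER` are omitted here and may vary by Python version):

```pycon
>>> from tokenize_rt import UNIMPORTANT_WS, src_to_tokens, tokens_to_src
>>> src = 'x = 5\n'
>>> tokens = src_to_tokens(src)
>>> next(tok for tok in tokens if tok.name == UNIMPORTANT_WS)
Token(name='UNIMPORTANT_WS', src=' ', line=1, utf8_byte_offset=1)
>>> tokens_to_src(tokens) == src
True
```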
### additional tokens added by `tokenize-rt`

#### `tokenize_rt.ESCAPED_NL`

#### `tokenize_rt.UNIMPORTANT_WS`

### helpers

#### `tokenize_rt.NON_CODING_TOKENS`

A `frozenset` containing tokens which may appear between others while not affecting control flow or code:
- `COMMENT`
- `ESCAPED_NL`
- `NL`
- `UNIMPORTANT_WS`

#### `tokenize_rt.parse_string_literal(text: str) -> Tuple[str, str]`

parse a string literal into its prefix and string content

```pycon
>>> parse_string_literal('f"foo"')
('f', '"foo"')
```

#### `tokenize_rt.reversed_enumerate(Sequence[Token]) -> Iterator[Tuple[int, Token]]`

yields `(index, token)` pairs.  Useful for rewriting source.
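A sketch of the usual rewriting pattern: iterate in reverse so edits at higher indices do not invalidate lower ones (the parenthesis-stripping rule here is invented purely for illustration):

```pycon
>>> from tokenize_rt import reversed_enumerate, src_to_tokens, tokens_to_src
>>> tokens = src_to_tokens('x = (5)\n')
>>> for i, token in reversed_enumerate(tokens):
...     if token.src in {'(', ')'}:
...         tokens[i] = token._replace(src='')
...
>>> tokens_to_src(tokens)
'x = 5\n'
```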
#### `tokenize_rt.rfind_string_parts(Sequence[Token], i) -> Tuple[int, ...]`

find the indices of the string parts of a (joined) string literal

- `i` should start at the end of the string literal
- returns `()` (an empty tuple) for things which are not string literals

```pycon
>>> tokens = src_to_tokens('"foo" "bar".capitalize()')
>>> rfind_string_parts(tokens, 2)
(0, 2)
>>> tokens = src_to_tokens('("foo" "bar").capitalize()')
>>> rfind_string_parts(tokens, 4)
(1, 3)
```

## Differences from `tokenize`

- `tokenize-rt` adds `ESCAPED_NL` for a backslash-escaped newline "token"
- `tokenize-rt` adds `UNIMPORTANT_WS` for whitespace (discarded in `tokenize`), as shown in the sketch below
- `tokenize-rt` normalizes string prefixes, even if they are not parsed -- for instance, this means you'll see `Token('STRING', "f'foo'", ...)` even in python 2.
- `tokenize-rt` normalizes python 2 long literals (`4l` / `4L`) and octal literals (`0755`) in python 3 (for easier rewriting of python 2 code while running python 3).
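For instance, a backslash continuation survives because the surrounding whitespace and the escaped newline become explicit tokens; a small sketch (only the extra token types are printed):

```pycon
>>> from tokenize_rt import src_to_tokens
>>> for tok in src_to_tokens('x = \\\n    5\n'):
...     if tok.name in ('ESCAPED_NL', 'UNIMPORTANT_WS'):
...         print(tok)
...
Token(name='UNIMPORTANT_WS', src=' ', line=1, utf8_byte_offset=1)
Token(name='UNIMPORTANT_WS', src=' ', line=1, utf8_byte_offset=3)
Token(name='ESCAPED_NL', src='\\\n', line=1, utf8_byte_offset=4)
Token(name='UNIMPORTANT_WS', src='    ', line=2, utf8_byte_offset=0)
```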
## Sample usage

- https://github.com/asottile/add-trailing-comma
- https://github.com/asottile/future-annotations
- https://github.com/asottile/future-fstrings
- https://github.com/asottile/pyupgrade
- https://github.com/asottile/yesqa

python-tokenize-rt_4.2.1.orig/setup.cfg

[metadata]
name = tokenize_rt
version = 4.2.1
description = A wrapper around the stdlib `tokenize` which roundtrips.
long_description = file: README.md
long_description_content_type = text/markdown
url = https://github.com/asottile/tokenize-rt
author = Anthony Sottile
author_email = asottile@umich.edu
license = MIT
license_file = LICENSE
classifiers =
    License :: OSI Approved :: MIT License
    Programming Language :: Python :: 3
    Programming Language :: Python :: 3 :: Only
    Programming Language :: Python :: 3.6
    Programming Language :: Python :: 3.7
    Programming Language :: Python :: 3.8
    Programming Language :: Python :: 3.9
    Programming Language :: Python :: 3.10
    Programming Language :: Python :: Implementation :: CPython
    Programming Language :: Python :: Implementation :: PyPy

[options]
py_modules = tokenize_rt
python_requires = >=3.6.1

[options.entry_points]
console_scripts =
    tokenize-rt = tokenize_rt:main

[bdist_wheel]
universal = True

[coverage:run]
plugins = covdefaults

[mypy]
check_untyped_defs = true
disallow_any_generics = true
disallow_incomplete_defs = true
disallow_untyped_defs = true
no_implicit_optional = true
warn_redundant_casts = true
warn_unused_ignores = true

[mypy-testing.*]
disallow_untyped_defs = false

[mypy-tests.*]
disallow_untyped_defs = false

[egg_info]
tag_build =
tag_date = 0

python-tokenize-rt_4.2.1.orig/setup.py

from setuptools import setup
setup()

python-tokenize-rt_4.2.1.orig/tokenize_rt.egg-info/

python-tokenize-rt_4.2.1.orig/tokenize_rt.py

import argparse
import io
import keyword
import re
import sys
import tokenize
from typing import Generator
from typing import Iterable
from typing import List
from typing import NamedTuple
from typing import Optional
from typing import Pattern
from typing import Sequence
from typing import Tuple

# this is a performance hack.  see https://bugs.python.org/issue43014
if (
        sys.version_info < (3, 10) and
        callable(getattr(tokenize, '_compile', None))
):  # pragma: no cover (<py310)
    from functools import lru_cache
    tokenize._compile = lru_cache()(tokenize._compile)  # type: ignore

ESCAPED_NL = 'ESCAPED_NL'
UNIMPORTANT_WS = 'UNIMPORTANT_WS'
NON_CODING_TOKENS = frozenset(('COMMENT', ESCAPED_NL, 'NL', UNIMPORTANT_WS))


class Offset(NamedTuple):
    line: Optional[int] = None
    utf8_byte_offset: Optional[int] = None


class Token(NamedTuple):
    name: str
    src: str
    line: Optional[int] = None
    utf8_byte_offset: Optional[int] = None

    @property
    def offset(self) -> Offset:
        return Offset(self.line, self.utf8_byte_offset)


_string_re = re.compile('^([^\'"]*)(.*)$', re.DOTALL)
_string_prefixes = frozenset('bfru')
_escaped_nl_re = re.compile(r'\\(\n|\r\n|\r)')


def _re_partition(regex: Pattern[str], s: str) -> Tuple[str, str, str]:
    match = regex.search(s)
    if match:
        return s[:match.start()], s[slice(*match.span())], s[match.end():]
    else:
        return (s, '', '')


def src_to_tokens(src: str) -> List[Token]:
    tokenize_target = io.StringIO(src)
    lines = ('',) + tuple(tokenize_target)

    tokenize_target.seek(0)

    tokens = []
    last_line = 1
    last_col = 0
    end_offset = 0

    gen = tokenize.generate_tokens(tokenize_target.readline)
    for tok_type, tok_text, (sline, scol), (eline, ecol), line in gen:
        if sline > last_line:
            newtok = lines[last_line][last_col:]
            for lineno in range(last_line + 1, sline):
                newtok += lines[lineno]
            if scol > 0:
                newtok += lines[sline][:scol]

            # a multiline unimportant whitespace may contain escaped newlines
            while _escaped_nl_re.search(newtok):
                ws, nl, newtok = _re_partition(_escaped_nl_re, newtok)
                if ws:
                    tokens.append(
                        Token(UNIMPORTANT_WS, ws, last_line, end_offset),
                    )
                    end_offset += len(ws.encode())
                tokens.append(Token(ESCAPED_NL, nl, last_line, end_offset))
                end_offset = 0
                last_line += 1
            if newtok:
                tokens.append(Token(UNIMPORTANT_WS, newtok, sline, 0))
                end_offset = len(newtok.encode())
            else:
                end_offset = 0
        elif scol > last_col:
            newtok = line[last_col:scol]
            tokens.append(Token(UNIMPORTANT_WS, newtok, sline, end_offset))
            end_offset += len(newtok.encode())

        tok_name = tokenize.tok_name[tok_type]

        # when a string prefix is not recognized, the tokenizer produces a
        # NAME token followed by a STRING token
        if (
                tok_name == 'STRING' and
                tokens and
                tokens[-1].name == 'NAME' and
                frozenset(tokens[-1].src.lower()) <= _string_prefixes
        ):
            newsrc = tokens[-1].src + tok_text
            tokens[-1] = tokens[-1]._replace(src=newsrc, name=tok_name)
        # produce octal literals as a single token in python 3 as well
        elif (
                tok_name == 'NUMBER' and
                tokens and
                tokens[-1].name == 'NUMBER'
        ):
            tokens[-1] = tokens[-1]._replace(src=tokens[-1].src + tok_text)
        # produce long literals as a single token in python 3 as well
        elif (
                tok_name == 'NAME' and
                tok_text.lower() == 'l' and
                tokens and
                tokens[-1].name == 'NUMBER'
        ):
            tokens[-1] = tokens[-1]._replace(src=tokens[-1].src + tok_text)
        else:
            tokens.append(Token(tok_name, tok_text, sline, end_offset))

        last_line, last_col = eline, ecol
        if sline != eline:
            end_offset = len(lines[last_line][:last_col].encode())
        else:
            end_offset += len(tok_text.encode())

    return tokens


def tokens_to_src(tokens: Iterable[Token]) -> str:
    return ''.join(tok.src for tok in tokens)


def reversed_enumerate(
        tokens: Sequence[Token],
) -> Generator[Tuple[int, Token], None, None]:
    for i in reversed(range(len(tokens))):
        yield i, tokens[i]


def parse_string_literal(src: str) -> Tuple[str, str]:
    """parse a string literal's source into (prefix, string)"""
    match = _string_re.match(src)
    assert match is not None
    return match.group(1), match.group(2)
def rfind_string_parts(tokens: Sequence[Token], i: int) -> Tuple[int, ...]:
    """find the indices of the string parts of a (joined) string literal

    - `i` should start at the end of the string literal
    - returns `()` (an empty tuple) for things which are not string literals
    """
    ret = []
    depth = 0
    for i in range(i, -1, -1):
        token = tokens[i]
        if token.name == 'STRING':
            ret.append(i)
        elif token.name in NON_CODING_TOKENS:
            pass
        elif token.src == ')':
            depth += 1
        elif depth and token.src == '(':
            depth -= 1
            # if we closed the paren(s) make sure it was a parenthesized string
            # and not actually a call
            if depth == 0:
                for j in range(i - 1, -1, -1):
                    tok = tokens[j]
                    if tok.name in NON_CODING_TOKENS:
                        pass
                    # this was actually a call and not a parenthesized string
                    elif (
                            tok.src in {']', ')'} or
                            (
                                tok.name == 'NAME' and
                                tok.src not in keyword.kwlist
                            )
                    ):
                        return ()
                    else:
                        break
                break
        elif depth:  # it looked like a string but wasn't
            return ()
        else:
            break
    return tuple(reversed(ret))


def main(argv: Optional[Sequence[str]] = None) -> int:
    parser = argparse.ArgumentParser()
    parser.add_argument('filename')
    args = parser.parse_args(argv)
    with open(args.filename) as f:
        tokens = src_to_tokens(f.read())

    for token in tokens:
        line, col = str(token.line), str(token.utf8_byte_offset)
        print(f'{line}:{col} {token.name} {token.src!r}')

    return 0


if __name__ == '__main__':
    exit(main())
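The `console_scripts` entry in `setup.cfg` exposes `main` above as a `tokenize-rt` command that dumps a file's token stream for debugging; a minimal sketch of calling it in-process (the file name `t.py` is assumed purely for illustration):

```pycon
>>> from tokenize_rt import main
>>> with open('t.py', 'w') as f:
...     _ = f.write('x = 5\n')
...
>>> main(['t.py'])
1:0 NAME 'x'
1:1 UNIMPORTANT_WS ' '
1:2 OP '='
1:3 UNIMPORTANT_WS ' '
1:4 NUMBER '5'
1:5 NEWLINE '\n'
2:0 ENDMARKER ''
0
```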
python-tokenize-rt_4.2.1.orig/tokenize_rt.egg-info/PKG-INFO

Metadata-Version: 2.1
Name: tokenize-rt
Version: 4.2.1
Summary: A wrapper around the stdlib `tokenize` which roundtrips.
Home-page: https://github.com/asottile/tokenize-rt
Author: Anthony Sottile
Author-email: asottile@umich.edu
License: MIT
Platform: UNKNOWN
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Requires-Python: >=3.6.1
Description-Content-Type: text/markdown
License-File: LICENSE

python-tokenize-rt_4.2.1.orig/tokenize_rt.egg-info/SOURCES.txt

LICENSE
README.md
setup.cfg
setup.py
tokenize_rt.py
tokenize_rt.egg-info/PKG-INFO
tokenize_rt.egg-info/SOURCES.txt
tokenize_rt.egg-info/dependency_links.txt
tokenize_rt.egg-info/entry_points.txt
tokenize_rt.egg-info/top_level.txt

python-tokenize-rt_4.2.1.orig/tokenize_rt.egg-info/dependency_links.txt

python-tokenize-rt_4.2.1.orig/tokenize_rt.egg-info/entry_points.txt

[console_scripts]
tokenize-rt = tokenize_rt:main

python-tokenize-rt_4.2.1.orig/tokenize_rt.egg-info/top_level.txt

tokenize_rt