pax_global_header00006660000000000000000000000064145463452550014527gustar00rootroot0000000000000052 comment=a964097108c165c609ecf0272a8908ce10f4cc28 sqlparse-0.4.4/000077500000000000000000000000001454634525500133665ustar00rootroot00000000000000sqlparse-0.4.4/AUTHORS000066400000000000000000000060411454634525500144370ustar00rootroot00000000000000python-sqlparse is written and maintained by Andi Albrecht . This module contains code (namely the lexer and filter mechanism) from the pygments project that was written by Georg Brandl. This module contains code (Python 2/3 compatibility) from the six project: https://bitbucket.org/gutworth/six. Alphabetical list of contributors: * Adam Greenhall * Aki Ariga * Alexander Beedie * Alexey Malyshev * ali-tny * andrew deryabin * Andrew Tipton * atronah * casey * Cauê Beloni * Christian Clauss * circld * Corey Zumar * Cristian Orellana * Dag Wieers * Daniel Harding * Darik Gamble * Demetrio92 * Dennis Taylor * Dvořák Václav * Erik Cederstrand * Florian Bauer * Fredy Wijaya * Gavin Wahl * hurcy * Ian Robertson * JacekPliszka * JavierPan * Jean-Martin Archer * Jesús Leganés Combarro "Piranna" * Johannes Hoff * John Bodley * Jon Dufresne * Josh Soref * Kevin Jing Qiu * koljonen * Likai Liu * Long Le Xich * mathilde.oustlant * Michael Schuller * Mike Amy * mulos * Oleg Broytman * osmnv <80402144+osmnv@users.noreply.github.com> * Patrick Schemitz * Pi Delport * Prudhvi Vatala * quest * Robert Nix * Rocky Meza * Romain Rigaux * Rowan Seymour * Ryan Wooden * saaj * Shen Longxing * Simon Heisterkamp * Sjoerd Job Postmus * Soloman Weng * spigwitmer * Tao Wang * Tenghuan * Tim Graham * Victor Hahn * Victor Uriarte * Ville Skyttä * vthriller * wayne.wuw * Will Jones * William Ivanski * Yago Riveiro sqlparse-0.4.4/CHANGELOG000066400000000000000000000461161454634525500146100ustar00rootroot00000000000000Release 0.4.4 (Apr 18, 2023) ---------------------------- Notable Changes * IMPORTANT: This release fixes a security vulnerability in the 
parser where a regular expression vulnerable to ReDOS (Regular Expression Denial of Service) was used. See the security advisory for details: https://github.com/andialbrecht/sqlparse/security/advisories/GHSA-rrm6-wvj7-cwh2 The vulnerability was discovered by @erik-krogh from GitHub Security Lab (GHSL). Thanks for reporting! Bug Fixes * Revert a change from 0.4.0 that changed IN to be a comparison (issue694). The primary expectation is that IN is treated as a keyword and not as a comparison operator. That also follows the definition of reserved keywords for the major SQL syntax definitions. * Fix regular expressions for string parsing. Other * sqlparse now uses pyproject.toml instead of setup.cfg (issue685). Release 0.4.3 (Sep 23, 2022) ---------------------------- Enhancements * Add support for DIV operator (pr664, by chezou). * Add support for additional SPARK keywords (pr643, by mrmasterplan). * Avoid tokens copy (pr622, by living180). * Add REGEXP as a comparison (pr647, by PeterSandwich). * Add DISTINCTROW keyword for MS Access (issue677). * Improve parsing of CREATE TABLE AS SELECT (pr662, by chezou). Bug Fixes * Fix spelling of INDICATOR keyword (pr653, by ptld). * Fix formatting error in EXTRACT function (issue562, issue670, pr676, by ecederstrand). * Fix bad parsing of create table statements that use lower case (issue217, pr642, by mrmasterplan). * Handle backtick as valid quote char (issue628, pr629, by codenamelxl). * Allow any unicode character as valid identifier name (issue641). Other * Update github actions to test on Python 3.10 as well (pr661, by cclaus). Release 0.4.2 (Sep 10, 2021) ---------------------------- Notable Changes * IMPORTANT: This release fixes a security vulnerability in the strip comments filter. In this filter a regular expression that was vulnerable to ReDOS (Regular Expression Denial of Service) was used. 
See the security advisory for details: https://github.com/andialbrecht/sqlparse/security/advisories/GHSA-p5w8-wqhj-9hhf The vulnerability was discovered by @erik-krogh and @yoff from GitHub Security Lab (GHSL). Thanks for reporting! Enhancements * Add ELSIF as keyword (issue584). * Add CONFLICT and ON_ERROR_STOP keywords (pr595, by j-martin). Bug Fixes * Fix parsing of backticks (issue588). * Fix parsing of scientific number (issue399). Release 0.4.1 (Oct 08, 2020) ---------------------------- Bug Fixes * Just removed a debug print statement, sorry... Release 0.4.0 (Oct 07, 2020) ---------------------------- Notable Changes * Remove support for end-of-life Python 2.7 and 3.4. Python 3.5+ is now required. * Remaining strings that only consist of whitespaces are not treated as statements anymore. Code that ignored the last element from sqlparse.split() should be updated accordingly since that function now doesn't return an empty string as the last element in some cases (issue496). Enhancements * Add WINDOW keyword (pr579 by ali-tny). * Add RLIKE keyword (pr582 by wjones1). Bug Fixes * Improved parsing of IN(...) statements (issue566, pr567 by hurcy). * Preserve line breaks when removing comments (issue484). * Fix parsing error when using square bracket notation (issue583). * Fix splitting when using DECLARE ... HANDLER (issue581). * Fix splitting of statements using CASE ... WHEN (issue580). * Improve formatting of type casts in parentheses. * Stabilize formatting of invalid SQL statements. Release 0.3.1 (Feb 29, 2020) ---------------------------- Enhancements * Add HQL keywords (pr475, by matwalk). * Add support for time zone casts (issue489). * Enhance formatting of AS keyword (issue507, by john-bodley). * Stabilize grouping engine when parsing invalid SQL statements. Bug Fixes * Fix splitting of SQL with multiple statements inside parentheses (issue485, pr486 by win39). * Correctly identify NULLS FIRST / NULLS LAST as keywords (issue487). 
* Fix splitting of SQL statements that contain dollar signs in identifiers (issue491). * Remove support for parsing double slash comments introduced in 0.3.0 (issue456) as it had some side-effects with other dialects and doesn't seem to be widely used (issue476). * Restrict detection of alias names to objects that actually could have an alias (issue455, adopted some parts of pr509 by john-bodley). * Fix parsing of date/time literals (issue438, by vashek). * Fix initialization of TokenList (issue499, pr505 by john-bodley). * Fix parsing of LIKE (issue493, pr525 by dbczumar). * Improve parsing of identifiers (pr527 by liulk). Release 0.3.0 (Mar 11, 2019) ---------------------------- Notable Changes * Remove support for Python 3.3. Enhancements * New formatting option "--indent_after_first" (pr345, by johshoff). * New formatting option "--indent_columns" (pr393, by digitalarbeiter). * Add UPSERT keyword (issue408). * Strip multiple whitespace within parentheses (issue473, by john-bodley). * Support double slash (//) comments (issue456, by theianrobertson). * Support for Calcite temporal keywords (pr468, by john-bodley). Bug Fixes * Fix occasional IndexError (pr390, by circld, issue313). * Fix incorrect splitting of strings containing new lines (pr396, by fredyw). * Fix reindent issue for parenthesis (issue427, by fredyw). * Fix from( parsing issue (issue446, by fredyw) . * Fix for get_real_name() to return correct name (issue369, by fredyw). * Wrap function params when wrap_after is set (pr398, by soloman1124). * Fix parsing of "WHEN name" clauses (pr418, by andrew deryabin). * Add missing EXPLAIN keyword (issue421). * Fix issue with strip_comments causing a syntax error (issue425, by fredyw). * Fix formatting on INSERT which caused staircase effect on values (issue329, by fredyw). * Avoid formatting of psql commands (issue469). Internal Changes * Unify handling of GROUP BY/ORDER BY (pr457, by john-bodley). 
* Remove unnecessary compat shim for bytes (pr453, by jdufresne). Release 0.2.4 (Sep 27, 2017) ---------------------------- Enhancements * Add more keywords for MySQL table options (pr328, pr333, by phdru). * Add more PL/pgSQL keywords (pr357, by Demetrio92). * Improve parsing of floats (pr330, by atronah). Bug Fixes * Fix parsing of MySQL table names starting with digits (issue337). * Fix detection of identifiers using comparisons (issue327). * Fix parsing of UNION ALL after WHERE (issue349). * Fix handling of semicolon in assignments (issue359, issue358). Release 0.2.3 (Mar 02, 2017) ---------------------------- Enhancements * New command line option "--encoding" (by twang2218, pr317). * Support CONCURRENTLY keyword (issue322, by rowanseymour). Bug Fixes * Fix some edge-cases when parsing invalid SQL statements. * Fix indentation of LIMIT (by romainr, issue320). * Fix parsing of INTO keyword (issue324). Internal Changes * Several improvements regarding encodings. Release 0.2.2 (Oct 22, 2016) ---------------------------- Enhancements * Add comma_first option: When splitting list "comma first" notation is used (issue141). Bug Fixes * Fix parsing of incomplete AS (issue284, by vmuriart). * Fix parsing of Oracle names containing dollars (issue291). * Fix parsing of UNION ALL (issue294). * Fix grouping of identifiers containing typecasts (issue297). * Add Changelog to sdist again (issue302). Internal Changes * `is_whitespace` and `is_group` changed into properties Release 0.2.1 (Aug 13, 2016) ---------------------------- Notable Changes * PostgreSQL: Function bodies are parsed as literal string. Previously sqlparse assumed that all function bodies are parsable psql strings (see issue277). Bug Fixes * Fix a regression to parse streams again (issue273, reported and test case by gmccreight). * Improve Python 2/3 compatibility when using parsestream (issue190, by phdru). * Improve splitting of PostgreSQL functions (issue277). 
Release 0.2.0 (Jul 20, 2016) ---------------------------- IMPORTANT: The supported Python versions have changed with this release. sqlparse 0.2.x supports Python 2.7 and Python >= 3.3. Thanks to the many contributors for writing bug reports and working on pull requests who made this version possible! Internal Changes * sqlparse.SQLParseError was removed from top-level module and moved to sqlparse.exceptions. * sqlparse.sql.Token.to_unicode was removed. * The signature of a filter's process method has changed from process(stack, stream) -> to process(stream). Stack was never used at all. * Lots of code cleanups and modernization (thanks esp. to vmuriart!). * Improved grouping performance. (sjoerdjob) Enhancements * Support WHILE loops (issue215, by shenlongxing). * Better support for CTEs (issue217, by Andrew Tipton). * Recognize USING as a keyword more consistently (issue236, by koljonen). * Improve alignment of columns (issue207, issue235, by vmuriat). * Add wrap_after option for better alignment when formatting lists (issue248, by Dennis Taylor). * Add reindent-aligned option for alternate formatting (Adam Greenhall) * Improved grouping of operations (issue211, by vmuriat). Bug Fixes * Leading whitespaces are now removed when format() is called with strip_whitespace=True (issue213, by shenlongxing). * Fix typo in keywords list (issue229, by cbeloni). * Fix parsing of functions in comparisons (issue230, by saaj). * Fix grouping of identifiers (issue233). * Fix parsing of CREATE TABLE statements (issue242, by Tenghuan). * Minor bug fixes (issue101). * Improve formatting of CASE WHEN constructs (issue164, by vmuriat). Release 0.1.19 (Mar 07, 2016) ----------------------------- Bug Fixes * Fix IndexError when statement contains WITH clauses (issue205). Release 0.1.18 (Oct 25, 2015) ----------------------------- Bug Fixes * Remove universal wheel support, added in 0.1.17 by mistake. 
Release 0.1.17 (Oct 24, 2015) ----------------------------- Enhancements * Speed up parsing of large SQL statements (pull request: issue201, fixes the following issues: issue199, issue135, issue62, issue41, by Ryan Wooden). Bug Fixes * Fix another splitter bug regarding DECLARE (issue194). Misc * Packages on PyPI are signed from now on. Release 0.1.16 (Jul 26, 2015) ----------------------------- Bug Fixes * Fix a regression in get_alias() introduced in 0.1.15 (issue185). * Fix a bug in the splitter regarding DECLARE (issue193). * sqlformat command line tool doesn't duplicate newlines anymore (issue191). * Don't mix up MySQL comments starting with hash and MSSQL temp tables (issue192). * Statement.get_type() now ignores comments at the beginning of a statement (issue186). Release 0.1.15 (Apr 15, 2015) ----------------------------- Bug Fixes * Fix a regression for identifiers with square brackets notation (issue153, by darikg). * Add missing SQL types (issue154, issue155, issue156, by jukebox). * Fix parsing of multi-line comments (issue172, by JacekPliszka). * Fix parsing of escaped backslashes (issue174, by caseyching). * Fix parsing of identifiers starting with underscore (issue175). * Fix misinterpretation of IN keyword (issue183). Enhancements * Improve formatting of HAVING statements. * Improve parsing of inline comments (issue163). * Group comments to parent object (issue128, issue160). * Add double precision builtin (issue169, by darikg). * Add support for square bracket array indexing (issue170, issue176, issue177 by darikg). * Improve grouping of aliased elements (issue167, by darikg). * Support comments starting with '#' character (issue178). Release 0.1.14 (Nov 30, 2014) ----------------------------- Bug Fixes * Floats in UPDATE statements are now handled correctly (issue145). * Properly handle string literals in comparisons (issue148, change proposed by aadis). * Fix indentation when using tabs (issue146). 
Enhancements * Improved formatting in list when newlines precede commas (issue140). Release 0.1.13 (Oct 09, 2014) ----------------------------- Bug Fixes * Fix a regression in handling of NULL keywords introduced in 0.1.12. Release 0.1.12 (Sep 20, 2014) ----------------------------- Bug Fixes * Fix handling of NULL keywords in aliased identifiers. * Fix SerializerUnicode to split unquoted newlines (issue131, by Michael Schuller). * Fix handling of modulo operators without spaces (by gavinwahl). Enhancements * Improve parsing of identifier lists containing placeholders. * Speed up query parsing of unquoted lines (by Michael Schuller). Release 0.1.11 (Feb 07, 2014) ----------------------------- Bug Fixes * Fix incorrect parsing of string literals containing line breaks (issue118). * Fix typo in keywords, add MERGE, COLLECT keywords (issue122/124, by Cristian Orellana). * Improve parsing of string literals in columns. * Fix parsing and formatting of statements containing EXCEPT keyword. * Fix Function.get_parameters() (issue126/127, by spigwitmer). Enhancements * Classify DML keywords (issue116, by Victor Hahn). * Add missing FOREACH keyword. * Grouping of BEGIN/END blocks. Other * Python 2.5 isn't automatically tested anymore, neither Travis nor Tox still support it out of the box. Release 0.1.10 (Nov 02, 2013) ----------------------------- Bug Fixes * Removed buffered reading again, it obviously causes wrong parsing in some rare cases (issue114). * Fix regression in setup.py introduced 10 months ago (issue115). Enhancements * Improved support for JOINs, by Alexander Beedie. Release 0.1.9 (Sep 28, 2013) ---------------------------- Bug Fixes * Fix a regression introduced in 0.1.5 where sqlparse didn't properly distinguish between single and double quoted strings when tagging identifier (issue111). Enhancements * New option to truncate long string literals when formatting. * Scientific numbers are parsed correctly (issue107). 
* Support for arithmetic expressions (issue109, issue106; by prudhvi). Release 0.1.8 (Jun 29, 2013) ---------------------------- Bug Fixes * Whitespaces within certain keywords are now allowed (issue97, patch proposed by xcombelle). Enhancements * Improve parsing of assignments in UPDATE statements (issue90). * Add STRAIGHT_JOIN statement (by Yago Riveiro). * Function.get_parameters() now returns the parameter if only one parameter is given (issue94, by wayne.wuw). * sqlparse.split() now removes leading and trailing whitespaces from split statements. * Add USE as keyword token (by mulos). * Improve parsing of PEP249-style placeholders (issue103). Release 0.1.7 (Apr 06, 2013) ---------------------------- Bug Fixes * Fix Python 3 compatibility of sqlformat script (by Pi Delport). * Fix parsing of SQL statements that contain binary data (by Alexey Malyshev). * Fix a bug where keywords were identified as aliased identifiers in invalid SQL statements. * Fix parsing of identifier lists where identifiers are keywords too (issue10). Enhancements * Top-level API functions now accept encoding keyword to parse statements in certain encodings more reliably (issue20). * Improve parsing speed when SQL contains CLOBs or BLOBs (issue86). * Improve formatting of ORDER BY clauses (issue89). * Formatter now tries to detect runaway indentations caused by parsing errors or invalid SQL statements. When re-indenting such statements the formatter flips back to column 0 before going crazy. Other * Documentation updates. Release 0.1.6 (Jan 01, 2013) ---------------------------- sqlparse is now compatible with Python 3 without any patches. The Python 3 version is generated during install by 2to3. You'll need distribute to install sqlparse for Python 3. Bug Fixes * Fix parsing error with dollar-quoted procedure bodies (issue83). Other * Documentation updates. * Test suite now uses tox and pytest. * py3k fixes (by vthriller). * py3k fixes in setup.py (by Florian Bauer). 
* setup.py now requires distribute (by Florian Bauer). Release 0.1.5 (Nov 13, 2012) ---------------------------- Bug Fixes * Improve handling of quoted identifiers (issue78). * Improve grouping and formatting of identifiers with operators (issue53). * Improve grouping and formatting of concatenated strings (issue53). * Improve handling of varchar() (by Mike Amy). * Clean up handling of various SQL elements. * Switch to pytest and clean up tests. * Several minor fixes. Other * Deprecate sqlparse.SQLParseError. Please use sqlparse.exceptions.SQLParseError instead. * Add caching to speed up processing. * Add experimental filters for token processing. * Add sqlformat.parsestream (by quest). Release 0.1.4 (Apr 20, 2012) ---------------------------- Bug Fixes * Avoid "stair case" effects when identifiers, functions, placeholders or keywords are mixed in identifier lists (issue45, issue49, issue52) and when asterisks are used as operators (issue58). * Make keyword detection more restrictive (issue47). * Improve handling of CASE statements (issue46). * Fix statement splitting when parsing recursive statements (issue57, thanks to piranna). * Fix for negative numbers (issue56, thanks to kevinjqiu). * Pretty format comments in identifier lists (issue59). * Several minor bug fixes and improvements. Release 0.1.3 (Jul 29, 2011) ---------------------------- Bug Fixes * Improve parsing of floats (thanks to Kris). * When formatting a statement a space before LIMIT was removed (issue35). * Fix strip_comments flag (issue38, reported by ooberm...@gmail.com). * Avoid parsing names as keywords (issue39, reported by djo...@taket.org). * Make sure identifier lists in subselects are grouped (issue40, reported by djo...@taket.org). * Split statements with IF as functions correctly (issue33 and issue29, reported by charles....@unige.ch). * Relax detection of keywords, esp. when used as function names (issue36, nyuhu...@gmail.com). * Don't treat single characters as keywords (issue32). 
* Improve parsing of stand-alone comments (issue26). * Detection of placeholders in parameterized queries (issue22, reported by Glyph Lefkowitz). * Add parsing of MS Access column names with braces (issue27, reported by frankz...@gmail.com). Other * Replace Django by Flask in App Engine frontend (issue11). Release 0.1.2 (Nov 23, 2010) ---------------------------- Bug Fixes * Fixed incorrect detection of keyword fragments embedded in names (issue7, reported and initial patch by andyboyko). * Stricter detection of identifier aliases (issue8, reported by estama). * WHERE grouping consumed closing parenthesis (issue9, reported by estama). * Fixed an issue with trailing whitespaces (reported by Kris). * Better detection of escaped single quotes (issue13, reported by Martin Brochhaus, patch by bluemaro with test case by Dan Carley). * Ignore identifier in double-quotes when changing cases (issue 21). * Lots of minor fixes targeting encoding, indentation, statement parsing and more (issues 12, 14, 15, 16, 18, 19). * Code cleanup with a pinch of refactoring. Release 0.1.1 (May 6, 2009) --------------------------- Bug Fixes * Lexer preserves original line breaks (issue1). * Improved identifier parsing: backtick quotes, wildcards, T-SQL variables prefixed with @. * Improved parsing of identifier lists (issue2). * Recursive recognition of AS (issue4) and CASE. * Improved support for UPDATE statements. Other * Code cleanup and better test coverage. Release 0.1.0 (Apr 8, 2009) --------------------------- Initial release. sqlparse-0.4.4/LICENSE000066400000000000000000000030011454634525500143650ustar00rootroot00000000000000Copyright (c) 2016, Andi Albrecht All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of the authors nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. sqlparse-0.4.4/Makefile000066400000000000000000000007501454634525500150300ustar00rootroot00000000000000# Makefile to simplify some common development tasks. # Run 'make help' for a list of commands. 
PYTHON=`which python` default: help help: @echo "Available commands:" @sed -n '/^[a-zA-Z0-9_.]*:/s/:.*//p' Requires-Python: >=3.5 Description-Content-Type: text/x-rst Classifier: Development Status :: 5 - Production/Stable Classifier: Intended Audience :: Developers Classifier: License :: OSI Approved :: BSD License Classifier: Operating System :: OS Independent Classifier: Programming Language :: Python Classifier: Programming Language :: Python :: 3 Classifier: Programming Language :: Python :: 3 :: Only Classifier: Programming Language :: Python :: 3.5 Classifier: Programming Language :: Python :: 3.6 Classifier: Programming Language :: Python :: 3.7 Classifier: Programming Language :: Python :: 3.8 Classifier: Programming Language :: Python :: 3.9 Classifier: Programming Language :: Python :: 3.10 Classifier: Programming Language :: Python :: Implementation :: CPython Classifier: Programming Language :: Python :: Implementation :: PyPy Classifier: Topic :: Database Classifier: Topic :: Software Development Requires-Dist: flake8 ; extra == "dev" Requires-Dist: build ; extra == "dev" Requires-Dist: sphinx ; extra == "doc" Requires-Dist: pytest ; extra == "test" Requires-Dist: pytest-cov ; extra == "test" Project-URL: Documentation, https://sqlparse.readthedocs.io/ Project-URL: Home, https://github.com/andialbrecht/sqlparse Project-URL: Release Notes, https://sqlparse.readthedocs.io/en/latest/changes/ Project-URL: Source, https://github.com/andialbrecht/sqlparse Project-URL: Tracker, https://github.com/andialbrecht/sqlparse/issues Provides-Extra: dev Provides-Extra: doc Provides-Extra: test python-sqlparse - Parse SQL statements ====================================== |buildstatus|_ |coverage|_ |docs|_ |packageversion|_ .. docincludebegin sqlparse is a non-validating SQL parser for Python. It provides support for parsing, splitting and formatting SQL statements. The module is compatible with Python 3.5+ and released under the terms of the `New BSD license `_. 
Visit the project page at https://github.com/andialbrecht/sqlparse for further information about this project. Quick Start ----------- .. code-block:: sh $ pip install sqlparse .. code-block:: python >>> import sqlparse >>> # Split a string containing two SQL statements: >>> raw = 'select * from foo; select * from bar;' >>> statements = sqlparse.split(raw) >>> statements ['select * from foo;', 'select * from bar;'] >>> # Format the first statement and print it out: >>> first = statements[0] >>> print(sqlparse.format(first, reindent=True, keyword_case='upper')) SELECT * FROM foo; >>> # Parsing a SQL statement: >>> parsed = sqlparse.parse('select * from foo')[0] >>> parsed.tokens [, , >> Links ----- Project page https://github.com/andialbrecht/sqlparse Bug tracker https://github.com/andialbrecht/sqlparse/issues Documentation https://sqlparse.readthedocs.io/ Online Demo https://sqlformat.org/ sqlparse is licensed under the BSD license. Parts of the code are based on pygments written by Georg Brandl and others. pygments-Homepage: http://pygments.org/ .. |buildstatus| image:: https://github.com/andialbrecht/sqlparse/actions/workflows/python-app.yml/badge.svg .. _buildstatus: https://github.com/andialbrecht/sqlparse/actions/workflows/python-app.yml .. |coverage| image:: https://codecov.io/gh/andialbrecht/sqlparse/branch/master/graph/badge.svg .. _coverage: https://codecov.io/gh/andialbrecht/sqlparse .. |docs| image:: https://readthedocs.org/projects/sqlparse/badge/?version=latest .. _docs: https://sqlparse.readthedocs.io/en/latest/?badge=latest .. |packageversion| image:: https://img.shields.io/pypi/v/sqlparse?color=%2334D058&label=pypi%20package .. _packageversion: https://pypi.org/project/sqlparse sqlparse-0.4.4/README.rst000066400000000000000000000044151454634525500150610ustar00rootroot00000000000000python-sqlparse - Parse SQL statements ====================================== |buildstatus|_ |coverage|_ |docs|_ |packageversion|_ .. 
docincludebegin sqlparse is a non-validating SQL parser for Python. It provides support for parsing, splitting and formatting SQL statements. The module is compatible with Python 3.5+ and released under the terms of the `New BSD license `_. Visit the project page at https://github.com/andialbrecht/sqlparse for further information about this project. Quick Start ----------- .. code-block:: sh $ pip install sqlparse .. code-block:: python >>> import sqlparse >>> # Split a string containing two SQL statements: >>> raw = 'select * from foo; select * from bar;' >>> statements = sqlparse.split(raw) >>> statements ['select * from foo;', 'select * from bar;'] >>> # Format the first statement and print it out: >>> first = statements[0] >>> print(sqlparse.format(first, reindent=True, keyword_case='upper')) SELECT * FROM foo; >>> # Parsing a SQL statement: >>> parsed = sqlparse.parse('select * from foo')[0] >>> parsed.tokens [, , >> Links ----- Project page https://github.com/andialbrecht/sqlparse Bug tracker https://github.com/andialbrecht/sqlparse/issues Documentation https://sqlparse.readthedocs.io/ Online Demo https://sqlformat.org/ sqlparse is licensed under the BSD license. Parts of the code are based on pygments written by Georg Brandl and others. pygments-Homepage: http://pygments.org/ .. |buildstatus| image:: https://github.com/andialbrecht/sqlparse/actions/workflows/python-app.yml/badge.svg .. _buildstatus: https://github.com/andialbrecht/sqlparse/actions/workflows/python-app.yml .. |coverage| image:: https://codecov.io/gh/andialbrecht/sqlparse/branch/master/graph/badge.svg .. _coverage: https://codecov.io/gh/andialbrecht/sqlparse .. |docs| image:: https://readthedocs.org/projects/sqlparse/badge/?version=latest .. _docs: https://sqlparse.readthedocs.io/en/latest/?badge=latest .. |packageversion| image:: https://img.shields.io/pypi/v/sqlparse?color=%2334D058&label=pypi%20package .. 
_packageversion: https://pypi.org/project/sqlparse sqlparse-0.4.4/TODO000066400000000000000000000004421454634525500140560ustar00rootroot00000000000000* See https://groups.google.com/d/msg/sqlparse/huz9lKXt0Lc/11ybIKPJWbUJ for some interesting hints and suggestions. * Provide a function to replace tokens. See this thread: https://groups.google.com/d/msg/sqlparse/5xmBL2UKqX4/ZX9z_peve-AJ * Document filter stack and processing phases. sqlparse-0.4.4/docs/000077500000000000000000000000001454634525500143165ustar00rootroot00000000000000sqlparse-0.4.4/docs/Makefile000066400000000000000000000056621454634525500157670ustar00rootroot00000000000000# Makefile for Sphinx documentation # # You can set these variables from the command line. SPHINXOPTS = SPHINXBUILD = sphinx-build PAPER = # Internal variables. PAPEROPT_a4 = -D latex_paper_size=a4 PAPEROPT_letter = -D latex_paper_size=letter ALLSPHINXOPTS = -d build/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) source .PHONY: help clean html dirhtml pickle json htmlhelp qthelp latex changes linkcheck doctest help: @echo "Please use \`make ' where is one of" @echo " html to make standalone HTML files" @echo " dirhtml to make HTML files named index.html in directories" @echo " pickle to make pickle files" @echo " json to make JSON files" @echo " htmlhelp to make HTML files and a HTML help project" @echo " qthelp to make HTML files and a qthelp project" @echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter" @echo " changes to make an overview of all changed/added/deprecated items" @echo " linkcheck to check all external links for integrity" @echo " doctest to run all doctests embedded in the documentation (if enabled)" clean: -rm -rf build/* html: $(SPHINXBUILD) -b html $(ALLSPHINXOPTS) build/html @echo @echo "Build finished. The HTML pages are in build/html." dirhtml: $(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) build/dirhtml @echo @echo "Build finished. The HTML pages are in build/dirhtml." 
pickle: $(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) build/pickle @echo @echo "Build finished; now you can process the pickle files." json: $(SPHINXBUILD) -b json $(ALLSPHINXOPTS) build/json @echo @echo "Build finished; now you can process the JSON files." htmlhelp: $(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) build/htmlhelp @echo @echo "Build finished; now you can run HTML Help Workshop with the" \ ".hhp project file in build/htmlhelp." qthelp: $(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) build/qthelp @echo @echo "Build finished; now you can run "qcollectiongenerator" with the" \ ".qhcp project file in build/qthelp, like this:" @echo "# qcollectiongenerator build/qthelp/python-sqlparse.qhcp" @echo "To view the help file:" @echo "# assistant -collectionFile build/qthelp/python-sqlparse.qhc" latex: $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) build/latex @echo @echo "Build finished; the LaTeX files are in build/latex." @echo "Run \`make all-pdf' or \`make all-ps' in that directory to" \ "run these through (pdf)latex." changes: $(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) build/changes @echo @echo "The overview file is in build/changes." linkcheck: $(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) build/linkcheck @echo @echo "Link check complete; look for any errors in the above output " \ "or in build/linkcheck/output.txt." doctest: $(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) build/doctest @echo "Testing of doctests in the sources finished, look at the " \ "results in build/doctest/output.txt." sqlparse-0.4.4/docs/source/000077500000000000000000000000001454634525500156165ustar00rootroot00000000000000sqlparse-0.4.4/docs/source/analyzing.rst000066400000000000000000000025211454634525500203440ustar00rootroot00000000000000.. _analyze: Analyzing the Parsed Statement ============================== When the :meth:`~sqlparse.parse` function is called the returned value is a tree-ish representation of the analyzed statements. 
The returned objects can be used by applications to retrieve further information about the parsed SQL. Base Classes ------------ All returned objects inherit from these base classes. The :class:`~sqlparse.sql.Token` class represents a single token and :class:`~sqlparse.sql.TokenList` class is a group of tokens. The latter provides methods for inspecting its child tokens. .. autoclass:: sqlparse.sql.Token :members: .. autoclass:: sqlparse.sql.TokenList :members: SQL Representing Classes ------------------------ The following classes represent distinct parts of a SQL statement. .. autoclass:: sqlparse.sql.Statement :members: .. autoclass:: sqlparse.sql.Comment :members: .. autoclass:: sqlparse.sql.Identifier :members: .. autoclass:: sqlparse.sql.IdentifierList :members: .. autoclass:: sqlparse.sql.Where :members: .. autoclass:: sqlparse.sql.Case :members: .. autoclass:: sqlparse.sql.Parenthesis :members: .. autoclass:: sqlparse.sql.If :members: .. autoclass:: sqlparse.sql.For :members: .. autoclass:: sqlparse.sql.Assignment :members: .. autoclass:: sqlparse.sql.Comparison :members: sqlparse-0.4.4/docs/source/api.rst000066400000000000000000000040141454634525500171200ustar00rootroot00000000000000:mod:`sqlparse` -- Parse SQL statements ======================================= .. module:: sqlparse :synopsis: Parse SQL statements. The :mod:`sqlparse` module provides the following functions on module-level. .. autofunction:: sqlparse.split .. autofunction:: sqlparse.format .. autofunction:: sqlparse.parse In most cases there's no need to set the `encoding` parameter. If `encoding` is not set, sqlparse assumes that the given SQL statement is encoded either in utf-8 or latin-1. .. _formatting: Formatting of SQL Statements ---------------------------- The :meth:`~sqlparse.format` function accepts the following keyword arguments. ``keyword_case`` Changes how keywords are formatted. Allowed values are "upper", "lower" and "capitalize". 

``identifier_case``
  Changes how identifiers are formatted. Allowed values are "upper", "lower",
  and "capitalize".

``strip_comments``
  If ``True``, comments are removed from the statements.

``truncate_strings``
  If ``truncate_strings`` is a positive integer, string literals longer than
  the given value will be truncated.

``truncate_char`` (default: "[...]")
  If long string literals are truncated (see above), this value will be
  appended to the truncated string.

``reindent``
  If ``True``, the indentation of the statements is changed.

``reindent_aligned``
  If ``True``, the indentation of the statements is changed, and statements
  are aligned by keywords.

``use_space_around_operators``
  If ``True``, spaces are used around all operators.

``indent_tabs``
  If ``True``, tabs instead of spaces are used for indentation.

``indent_width``
  The width of the indentation, defaults to 2.

``wrap_after``
  The column limit (in characters) for wrapping comma-separated lists. If
  unspecified, every item in the list is put on its own line.

``output_format``
  If given, the output is additionally formatted to be used as a variable
  in a programming language. Allowed values are "python" and "php".

``comma_first``
  If ``True``, comma-first notation for column names is used.
sqlparse-0.4.4/docs/source/changes.rst000066400000000000000000000004671454634525500177670ustar00rootroot00000000000000
.. _changes:

============================
 Changes in python-sqlparse
============================

Upcoming Deprecations
=====================

* ``sqlparse.SQLParseError`` is deprecated (version 0.1.5), use
  ``sqlparse.exceptions.SQLParseError`` instead.

Changelog
=========

.. include:: ../../CHANGELOG
sqlparse-0.4.4/docs/source/conf.py000066400000000000000000000147201454634525500171210ustar00rootroot00000000000000
# python-sqlparse documentation build configuration file, created by
# sphinx-quickstart on Thu Feb 26 08:19:28 2009.
#
# This file is execfile()d with the current directory set to its containing dir.
# # Note that not all possible configuration values are present in this # autogenerated file. # # All configuration values have a default; values that are commented out # serve to show the default. import datetime import sys, os # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. #sys.path.append(os.path.abspath('.')) sys.path.insert(0, os.path.join(os.path.dirname(__file__), '../../')) import sqlparse # -- General configuration ----------------------------------------------------- # Add any Sphinx extension module names here, as strings. They can be extensions # coming with Sphinx (named 'sphinx.ext.*') or your custom ones. extensions = ['sphinx.ext.autodoc', 'sphinx.ext.todo', 'sphinx.ext.coverage', 'sphinx.ext.autosummary'] # Add any paths that contain templates here, relative to this directory. templates_path = ['_templates'] # The suffix of source filenames. source_suffix = '.rst' # The encoding of source files. #source_encoding = 'utf-8' # The master toctree document. master_doc = 'index' # General information about the project. project = 'python-sqlparse' copyright = '{:%Y}, Andi Albrecht'.format(datetime.date.today()) # The version info for the project you're documenting, acts as replacement for # |version| and |release|, also used in various other places throughout the # built documents. # # The short X.Y version. version = sqlparse.__version__ # The full version, including alpha/beta/rc tags. release = sqlparse.__version__ # The language for content autogenerated by Sphinx. Refer to documentation # for a list of supported languages. #language = None # There are two options for replacing |today|: either, you set today to some # non-false value, then it is used: #today = '' # Else, today_fmt is used as the format for a strftime call. 
#today_fmt = '%B %d, %Y' # List of documents that shouldn't be included in the build. #unused_docs = [] # List of directories, relative to source directory, that shouldn't be searched # for source files. exclude_trees = [] # The reST default role (used for this markup: `text`) to use for all documents. #default_role = None # If true, '()' will be appended to :func: etc. cross-reference text. #add_function_parentheses = True # If true, the current module name will be prepended to all description # unit titles (such as .. function::). #add_module_names = True # If true, sectionauthor and moduleauthor directives will be shown in the # output. They are ignored by default. #show_authors = False # The name of the Pygments (syntax highlighting) style to use. pygments_style = 'tango' # A list of ignored prefixes for module index sorting. #modindex_common_prefix = [] # -- Options for HTML output --------------------------------------------------- # The theme to use for HTML and HTML Help pages. Major themes that come with # Sphinx are currently 'default' and 'sphinxdoc'. #html_theme = 'agogo' # Theme options are theme-specific and customize the look and feel of a theme # further. For a list of options available for each theme, see the # documentation. #html_theme_options = {} # Add any paths that contain custom themes here, relative to this directory. #html_theme_path = [os.path.abspath('../')] # The name for this set of Sphinx documents. If None, it defaults to # " v documentation". #html_title = None # A shorter title for the navigation bar. Default is the same as html_title. #html_short_title = None # The name of an image file (relative to this directory) to place at the top # of the sidebar. #html_logo = None # The name of an image file (within the static path) to use as favicon of the # docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32 # pixels large. 
#html_favicon = None # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". #html_static_path = ['_static'] # If not '', a 'Last updated on:' timestamp is inserted at every page bottom, # using the given strftime format. #html_last_updated_fmt = '%b %d, %Y' # If true, SmartyPants will be used to convert quotes and dashes to # typographically correct entities. #html_use_smartypants = True # Custom sidebar templates, maps document names to template names. #html_sidebars = {} # Additional templates that should be rendered to pages, maps page names to # template names. #html_additional_pages = {} # If false, no module index is generated. #html_use_modindex = True # If false, no index is generated. #html_use_index = True # If true, the index is split into individual pages for each letter. #html_split_index = False # If true, links to the reST sources are added to the pages. #html_show_sourcelink = True # If true, an OpenSearch description file will be output, and all pages will # contain a tag referring to it. The value of this option must be the # base URL from which the finished HTML is served. #html_use_opensearch = '' # If nonempty, this is the file name suffix for HTML files (e.g. ".xhtml"). #html_file_suffix = '' # Output file base name for HTML help builder. htmlhelp_basename = 'python-sqlparsedoc' # -- Options for LaTeX output -------------------------------------------------- # The paper size ('letter' or 'a4'). #latex_paper_size = 'letter' # The font size ('10pt', '11pt' or '12pt'). #latex_font_size = '10pt' # Grouping the document tree into LaTeX files. List of tuples # (source start file, target name, title, author, documentclass [howto/manual]). 
latex_documents = [
  ('index', 'python-sqlparse.tex', 'python-sqlparse Documentation',
   'Andi Albrecht', 'manual'),
]

# The name of an image file (relative to this directory) to place at the top of
# the title page.
#latex_logo = None

# For "manual" documents, if this is true, then toplevel headings are parts,
# not chapters.
#latex_use_parts = False

# Additional stuff for the LaTeX preamble.
#latex_preamble = ''

# Documents to append as an appendix to all manuals.
#latex_appendices = []

# If false, no module index is generated.
#latex_use_modindex = True


todo_include_todos = True
sqlparse-0.4.4/docs/source/extending.rst000066400000000000000000000057211454634525500203420ustar00rootroot00000000000000
Extending :mod:`sqlparse`
=========================

.. module:: sqlparse
   :synopsis: Extending parsing capability of sqlparse.

The :mod:`sqlparse` module uses a SQL grammar that was tuned through usage and
numerous pull requests to fit a broad range of SQL syntaxes, but it cannot
cater to every case, since some SQL dialects have adopted conflicting meanings
for certain keywords. sqlparse therefore exposes a mechanism to configure the
fundamental keywords and regular expressions that parse the language, as
described below.

If you find an adaptation that works for your specific use case, please
consider contributing it back to the community by opening a PR on GitHub.

Configuring the Lexer
---------------------

The lexer is a singleton class that breaks down the stream of characters into
language tokens. It does this by using a sequence of regular expressions and
keywords that are listed in the module ``sqlparse.keywords``. Instead of
applying these fixed grammar definitions directly, the lexer loads them by
default in its ``default_initialization()`` method. As an API user, you can
adapt the lexer configuration by applying your own configuration logic.
To do so, start out by clearing previous configurations with ``.clear()``,
then apply the list of regular expressions with ``.set_SQL_REGEX(SQL_REGEX)``,
and apply keyword dictionaries with ``.add_keywords(KEYWORDS)``.

You can do so by re-using the expressions in ``sqlparse.keywords`` (see the
example below), leaving parts out, or by making up your own master list.

See the expected types of the arguments by inspecting their structure in
``sqlparse.keywords``.
(For compatibility with Python 3.4, this library does not use type hints.)

The following example adds support for the expression ``ZORDER BY``, and adds
``BAR`` as a keyword to the lexer:

.. code-block:: python

    import re

    import sqlparse
    from sqlparse import keywords
    from sqlparse.lexer import Lexer

    # get the lexer singleton object to configure it
    lex = Lexer.get_default_instance()

    # Clear the default configurations.
    # After this call, reg-exps and keyword dictionaries need to be loaded
    # to make the lexer functional again.
    lex.clear()

    my_regex = (r"ZORDER\s+BY\b", sqlparse.tokens.Keyword)

    # slice the default SQL_REGEX to inject the custom object
    lex.set_SQL_REGEX(
        keywords.SQL_REGEX[:38]
        + [my_regex]
        + keywords.SQL_REGEX[38:]
    )

    # add the default keyword dictionaries
    lex.add_keywords(keywords.KEYWORDS_COMMON)
    lex.add_keywords(keywords.KEYWORDS_ORACLE)
    lex.add_keywords(keywords.KEYWORDS_PLPGSQL)
    lex.add_keywords(keywords.KEYWORDS_HQL)
    lex.add_keywords(keywords.KEYWORDS_MSACCESS)
    lex.add_keywords(keywords.KEYWORDS)

    # add a custom keyword dictionary (a mapping of keyword to token type)
    lex.add_keywords({'BAR': sqlparse.tokens.Keyword})

    # no configuration is passed here. The lexer is used as a singleton.
    sqlparse.parse("select * from foo zorder by bar;")
sqlparse-0.4.4/docs/source/index.rst000066400000000000000000000013051454634525500174560ustar00rootroot00000000000000
.. python-sqlparse documentation master file, created by
   sphinx-quickstart on Thu Feb 26 08:19:28 2009.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

python-sqlparse
===============

.. include:: ../../README.rst
   :start-after: docincludebegin
   :end-before: Links

Contents
--------

.. toctree::
   :maxdepth: 2

   intro
   api
   analyzing
   ui
   extending
   changes
   license
   indices


Resources
---------

Project page
   https://github.com/andialbrecht/sqlparse

Bug tracker
   https://github.com/andialbrecht/sqlparse/issues

Documentation
   https://sqlparse.readthedocs.io/

Online Demo
   https://sqlformat.org/
sqlparse-0.4.4/docs/source/indices.rst000066400000000000000000000001341454634525500177640ustar00rootroot00000000000000
Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
sqlparse-0.4.4/docs/source/intro.rst000066400000000000000000000103021454634525500174770ustar00rootroot00000000000000
Introduction
============


Download & Installation
-----------------------

The latest released version can be obtained from the Python Package Index
(PyPI). To extract and install the module system-wide run

.. code-block:: bash

   $ tar xvfz python-sqlparse-VERSION.tar.gz
   $ cd python-sqlparse/
   $ sudo python setup.py install

Alternatively you can install :mod:`sqlparse` using :command:`pip`:

.. code-block:: bash

   $ pip install sqlparse


Getting Started
---------------

The :mod:`sqlparse` module provides three simple functions on module level
to achieve some common tasks when working with SQL statements.
This section shows some simple usage examples of these functions.

Let's get started with splitting a string containing one or more SQL
statements into a list of single statements using :meth:`~sqlparse.split`:

.. code-block:: python

   >>> import sqlparse
   >>> sql = 'select * from foo; select * from bar;'
   >>> sqlparse.split(sql)
   ['select * from foo;', 'select * from bar;']

The end of a statement is identified by the occurrence of a semicolon.
Semicolons within certain SQL constructs like ``BEGIN ... END`` blocks
are handled correctly by the splitting mechanism.

SQL statements can be beautified by using the :meth:`~sqlparse.format` function.

.. code-block:: python

   >>> sql = 'select * from foo where id in (select id from bar);'
   >>> print(sqlparse.format(sql, reindent=True, keyword_case='upper'))
   SELECT *
   FROM foo
   WHERE id IN
     (SELECT id
      FROM bar);

In this case all keywords in the given SQL are uppercased and the
indentation is changed to make it more readable. Read :ref:`formatting` for
a full reference of supported options given as keyword arguments
to that function.

Before proceeding with a closer look at the internal representation of
SQL statements, you should be aware that this SQL parser is intentionally
non-validating. It assumes that the given input is at least some kind
of SQL and then tries to analyze as much as possible without making
too many assumptions about the concrete dialect or the actual statement.
Ultimately, it is up to the user of this API to interpret the results
correctly.

When using the :meth:`~sqlparse.parse` function, a tuple of
:class:`~sqlparse.sql.Statement` instances is returned:

.. code-block:: python

   >>> sql = 'select * from "someschema"."mytable" where id = 1'
   >>> parsed = sqlparse.parse(sql)
   >>> parsed
   (,)

Each item of the tuple is a single statement as identified by the above
mentioned :meth:`~sqlparse.split` function. So let's grab the only element
from that tuple and have a look at the ``tokens`` attribute.
Sub-tokens are stored in this attribute.

.. code-block:: python

   >>> stmt = parsed[0]  # grab the Statement object
   >>> stmt.tokens
   (, , , , , , , , )

Each object can be converted back to a string at any time:

.. code-block:: python

   >>> str(stmt)
   'select * from "someschema"."mytable" where id = 1'
   >>> str(stmt.tokens[-1])  # or just the WHERE part
   'where id = 1'

Details of the returned objects are described in :ref:`analyze`.


Development & Contributing
--------------------------

To check out the latest sources of this module run

.. code-block:: bash

   $ git clone https://github.com/andialbrecht/sqlparse.git

:mod:`sqlparse` is currently tested under Python 3.5+ and PyPy. Tests are
automatically run on each commit and for each pull request on Travis:
https://travis-ci.org/andialbrecht/sqlparse

Make sure to run the test suite before sending a pull request by running

.. code-block:: bash

   $ tox

It's OK if :command:`tox` doesn't find all interpreters listed
above. Ideally, at least one of the supported Python 3 versions should be
tested locally.

Please file bug reports and feature requests on the project site at
https://github.com/andialbrecht/sqlparse/issues/new.
sqlparse-0.4.4/docs/source/license.rst000066400000000000000000000000531454634525500177700ustar00rootroot00000000000000
License
=======

.. include:: ../../LICENSE
sqlparse-0.4.4/docs/source/ui.rst000066400000000000000000000010571454634525500167700ustar00rootroot00000000000000
User Interfaces
===============

``sqlformat``
  The ``sqlformat`` command line script is distributed with the module.
  Run :command:`sqlformat --help` to list available options and for usage
  hints.

``sqlformat.appspot.com``
  An example Google App Engine application that exposes the formatting
  features using a web front-end. See https://sqlformat.org/ for details.
  The source for this application is available from a source code check out
  of the :mod:`sqlparse` module (see :file:`extras/appengine`).
sqlparse-0.4.4/docs/sqlformat.1000066400000000000000000000034001454634525500164050ustar00rootroot00000000000000
.\" Based on template /usr/share/man-db/examples/manpage.example provided by
.\" Tom Christiansen.
.TH SQLFORMAT "1" "December 2010" "python-sqlparse version: 0.1.2" "User Commands"
.SH NAME
sqlformat \- reformat SQL
.SH SYNOPSIS
.PP
.B sqlformat
[
.I "OPTION"
] ... [
.I "FILE"
] ...
.SH DESCRIPTION
.\" Putting a newline after each sentence can generate better output.
The `sqlformat' command-line tool can be used to reformat a SQL file according
to specified options or to prepare a snippet in some programming language
(only Python and PHP are currently supported).
Use "-" for
.I FILE
to read from stdin.
.SH OPTIONS
.TP
\fB\-i\fR \fICHOICE\fR|\fB\-\-identifiers\fR=\fIFORMAT\fR
Change case of identifiers.
.I FORMAT
is one of "upper", "lower", "capitalize".
.TP
\fB\-k\fR \fICHOICE\fR|\fB\-\-keywords\fR=\fIFORMAT\fR
Change case of keywords.
.I FORMAT
is one of "upper", "lower", "capitalize".
.TP
\fB\-l\fR \fICHOICE\fR|\fB\-\-language\fR=\fILANG\fR
Output a snippet in programming language LANG.
.I LANG
can be "python", "php".
.TP
\fB\-o\fR \fIFILE\fR|\fB\-\-outfile\fR=\fIFILE\fR
Write output to
.I FILE
(defaults to stdout).
.TP
.BR \-r | \-\-reindent
Reindent statements.
.TP
\fB\-\-indent_width\fR=\fIINDENT_WIDTH\fR
Set indent width to
.IR INDENT_WIDTH .
Default is 2 spaces.
.TP
\fB\-\-wrap_after\fR=\fIWRAP_AFTER\fR
The column limit for wrapping comma-separated lists.
If unspecified, every item in the list is put on its own line.
.TP
\fB\-\-strip\-comments\fR
Remove comments.
.TP
.BR \-h | \-\-help
Print a short help message and exit.
All subsequent options are ignored.
.TP
.BR \-\-verbose
Verbose output.
.TP
.BR \-\-version
Print program's version number and exit.
.SH AUTHORS
This man page was written by Andriy Senkovych
sqlparse-0.4.4/pyproject.toml000066400000000000000000000035131454634525500163040ustar00rootroot00000000000000
[build-system]
requires = ["flit_core >=3.2,<4"]
build-backend = "flit_core.buildapi"

[project]
name = "sqlparse"
description = "A non-validating SQL parser."
authors = [{name = "Andi Albrecht", email = "albrecht.andi@gmail.com"}] readme = "README.rst" dynamic = ["version"] classifiers = [ "Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", "License :: OSI Approved :: BSD License", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3 :: Only", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Programming Language :: Python :: 3.8", "Programming Language :: Python :: 3.9", "Programming Language :: Python :: 3.10", "Programming Language :: Python :: Implementation :: CPython", "Programming Language :: Python :: Implementation :: PyPy", "Topic :: Database", "Topic :: Software Development", ] requires-python = ">=3.5" [project.urls] Home = "https://github.com/andialbrecht/sqlparse" Documentation = "https://sqlparse.readthedocs.io/" "Release Notes" = "https://sqlparse.readthedocs.io/en/latest/changes/" Source = "https://github.com/andialbrecht/sqlparse" Tracker = "https://github.com/andialbrecht/sqlparse/issues" [project.scripts] sqlformat = "sqlparse.__main__:main" [project.optional-dependencies] dev = [ "flake8", "build", ] test = [ "pytest", "pytest-cov", ] doc = [ "sphinx", ] [tool.flit.sdist] include = [ "docs/source/", "docs/sqlformat.1", "docs/Makefile", "tests/*.py", "tests/files/*.sql", "LICENSE", "TODO", "AUTHORS", "CHANGELOG", "Makefile", "tox.ini", ] [tool.coverage.run] omit = ["sqlparse/__main__.py"] sqlparse-0.4.4/sqlparse/000077500000000000000000000000001454634525500152205ustar00rootroot00000000000000sqlparse-0.4.4/sqlparse/__init__.py000066400000000000000000000042041454634525500173310ustar00rootroot00000000000000# # Copyright (C) 2009-2020 the sqlparse authors and contributors # # # This module is part of python-sqlparse and is released under # the BSD License: https://opensource.org/licenses/BSD-3-Clause """Parse 
SQL statements.""" # Setup namespace from sqlparse import sql from sqlparse import cli from sqlparse import engine from sqlparse import tokens from sqlparse import filters from sqlparse import formatter __version__ = '0.4.4' __all__ = ['engine', 'filters', 'formatter', 'sql', 'tokens', 'cli'] def parse(sql, encoding=None): """Parse sql and return a list of statements. :param sql: A string containing one or more SQL statements. :param encoding: The encoding of the statement (optional). :returns: A tuple of :class:`~sqlparse.sql.Statement` instances. """ return tuple(parsestream(sql, encoding)) def parsestream(stream, encoding=None): """Parses sql statements from file-like object. :param stream: A file-like object. :param encoding: The encoding of the stream contents (optional). :returns: A generator of :class:`~sqlparse.sql.Statement` instances. """ stack = engine.FilterStack() stack.enable_grouping() return stack.run(stream, encoding) def format(sql, encoding=None, **options): """Format *sql* according to *options*. Available options are documented in :ref:`formatting`. In addition to the formatting options this function accepts the keyword "encoding" which determines the encoding of the statement. :returns: The formatted SQL statement as string. """ stack = engine.FilterStack() options = formatter.validate_options(options) stack = formatter.build_filter_stack(stack, options) stack.postprocess.append(filters.SerializerUnicode()) return ''.join(stack.run(sql, encoding)) def split(sql, encoding=None): """Split *sql* into single statements. :param sql: A string containing one or more SQL statements. :param encoding: The encoding of the statement (optional). :returns: A list of strings. 
""" stack = engine.FilterStack() return [str(stmt).strip() for stmt in stack.run(sql, encoding)] sqlparse-0.4.4/sqlparse/__main__.py000066400000000000000000000011421454634525500173100ustar00rootroot00000000000000#!/usr/bin/env python # # Copyright (C) 2009-2020 the sqlparse authors and contributors # # # This module is part of python-sqlparse and is released under # the BSD License: https://opensource.org/licenses/BSD-3-Clause """Entrypoint module for `python -m sqlparse`. Why does this file exist, and why __main__? For more info, read: - https://www.python.org/dev/peps/pep-0338/ - https://docs.python.org/2/using/cmdline.html#cmdoption-m - https://docs.python.org/3/using/cmdline.html#cmdoption-m """ import sys from sqlparse.cli import main if __name__ == '__main__': sys.exit(main()) sqlparse-0.4.4/sqlparse/cli.py000077500000000000000000000131201454634525500163410ustar00rootroot00000000000000#!/usr/bin/env python # # Copyright (C) 2009-2020 the sqlparse authors and contributors # # # This module is part of python-sqlparse and is released under # the BSD License: https://opensource.org/licenses/BSD-3-Clause """Module that contains the command line app. Why does this file exist, and why not put this in __main__? You might be tempted to import things from __main__ later, but that will cause problems: the code will get executed twice: - When you run `python -m sqlparse` python will execute ``__main__.py`` as a script. That means there won't be any ``sqlparse.__main__`` in ``sys.modules``. - When you import __main__ it will get executed again (as a module) because there's no ``sqlparse.__main__`` in ``sys.modules``. 
Also see (1) from http://click.pocoo.org/5/setuptools/#setuptools-integration """ import argparse import sys from io import TextIOWrapper import sqlparse from sqlparse.exceptions import SQLParseError # TODO: Add CLI Tests # TODO: Simplify formatter by using argparse `type` arguments def create_parser(): _CASE_CHOICES = ['upper', 'lower', 'capitalize'] parser = argparse.ArgumentParser( prog='sqlformat', description='Format FILE according to OPTIONS. Use "-" as FILE ' 'to read from stdin.', usage='%(prog)s [OPTIONS] FILE, ...', ) parser.add_argument('filename') parser.add_argument( '-o', '--outfile', dest='outfile', metavar='FILE', help='write output to FILE (defaults to stdout)') parser.add_argument( '--version', action='version', version=sqlparse.__version__) group = parser.add_argument_group('Formatting Options') group.add_argument( '-k', '--keywords', metavar='CHOICE', dest='keyword_case', choices=_CASE_CHOICES, help='change case of keywords, CHOICE is one of {}'.format( ', '.join('"{}"'.format(x) for x in _CASE_CHOICES))) group.add_argument( '-i', '--identifiers', metavar='CHOICE', dest='identifier_case', choices=_CASE_CHOICES, help='change case of identifiers, CHOICE is one of {}'.format( ', '.join('"{}"'.format(x) for x in _CASE_CHOICES))) group.add_argument( '-l', '--language', metavar='LANG', dest='output_format', choices=['python', 'php'], help='output a snippet in programming language LANG, ' 'choices are "python", "php"') group.add_argument( '--strip-comments', dest='strip_comments', action='store_true', default=False, help='remove comments') group.add_argument( '-r', '--reindent', dest='reindent', action='store_true', default=False, help='reindent statements') group.add_argument( '--indent_width', dest='indent_width', default=2, type=int, help='indentation width (defaults to 2 spaces)') group.add_argument( '--indent_after_first', dest='indent_after_first', action='store_true', default=False, help='indent after first line of statement (e.g. 
SELECT)') group.add_argument( '--indent_columns', dest='indent_columns', action='store_true', default=False, help='indent all columns by indent_width instead of keyword length') group.add_argument( '-a', '--reindent_aligned', action='store_true', default=False, help='reindent statements to aligned format') group.add_argument( '-s', '--use_space_around_operators', action='store_true', default=False, help='place spaces around mathematical operators') group.add_argument( '--wrap_after', dest='wrap_after', default=0, type=int, help='Column after which lists should be wrapped') group.add_argument( '--comma_first', dest='comma_first', default=False, type=bool, help='Insert linebreak before comma (default False)') group.add_argument( '--encoding', dest='encoding', default='utf-8', help='Specify the input encoding (default utf-8)') return parser def _error(msg): """Print msg and optionally exit with return code exit_.""" sys.stderr.write('[ERROR] {}\n'.format(msg)) return 1 def main(args=None): parser = create_parser() args = parser.parse_args(args) if args.filename == '-': # read from stdin wrapper = TextIOWrapper(sys.stdin.buffer, encoding=args.encoding) try: data = wrapper.read() finally: wrapper.detach() else: try: with open(args.filename, encoding=args.encoding) as f: data = ''.join(f.readlines()) except OSError as e: return _error( 'Failed to read {}: {}'.format(args.filename, e)) close_stream = False if args.outfile: try: stream = open(args.outfile, 'w', encoding=args.encoding) close_stream = True except OSError as e: return _error('Failed to open {}: {}'.format(args.outfile, e)) else: stream = sys.stdout formatter_opts = vars(args) try: formatter_opts = sqlparse.formatter.validate_options(formatter_opts) except SQLParseError as e: return _error('Invalid options: {}'.format(e)) s = sqlparse.format(data, **formatter_opts) stream.write(s) stream.flush() if close_stream: stream.close() return 0 
sqlparse-0.4.4/sqlparse/engine/000077500000000000000000000000001454634525500164655ustar00rootroot00000000000000sqlparse-0.4.4/sqlparse/engine/__init__.py000066400000000000000000000006771454634525500206100ustar00rootroot00000000000000# # Copyright (C) 2009-2020 the sqlparse authors and contributors # # # This module is part of python-sqlparse and is released under # the BSD License: https://opensource.org/licenses/BSD-3-Clause from sqlparse.engine import grouping from sqlparse.engine.filter_stack import FilterStack from sqlparse.engine.statement_splitter import StatementSplitter __all__ = [ 'grouping', 'FilterStack', 'StatementSplitter', ] sqlparse-0.4.4/sqlparse/engine/filter_stack.py000066400000000000000000000022511454634525500215110ustar00rootroot00000000000000# # Copyright (C) 2009-2020 the sqlparse authors and contributors # # # This module is part of python-sqlparse and is released under # the BSD License: https://opensource.org/licenses/BSD-3-Clause """filter""" from sqlparse import lexer from sqlparse.engine import grouping from sqlparse.engine.statement_splitter import StatementSplitter class FilterStack: def __init__(self): self.preprocess = [] self.stmtprocess = [] self.postprocess = [] self._grouping = False def enable_grouping(self): self._grouping = True def run(self, sql, encoding=None): stream = lexer.tokenize(sql, encoding) # Process token stream for filter_ in self.preprocess: stream = filter_.process(stream) stream = StatementSplitter().process(stream) # Output: Stream processed Statements for stmt in stream: if self._grouping: stmt = grouping.group(stmt) for filter_ in self.stmtprocess: filter_.process(stmt) for filter_ in self.postprocess: stmt = filter_.process(stmt) yield stmt sqlparse-0.4.4/sqlparse/engine/grouping.py000066400000000000000000000330021454634525500206670ustar00rootroot00000000000000# # Copyright (C) 2009-2020 the sqlparse authors and contributors # # # This module is part of python-sqlparse and is released under # the BSD 
License: https://opensource.org/licenses/BSD-3-Clause

from sqlparse import sql
from sqlparse import tokens as T
from sqlparse.utils import recurse, imt

T_NUMERICAL = (T.Number, T.Number.Integer, T.Number.Float)
T_STRING = (T.String, T.String.Single, T.String.Symbol)
T_NAME = (T.Name, T.Name.Placeholder)


def _group_matching(tlist, cls):
    """Groups Tokens that have beginning and end."""
    opens = []
    tidx_offset = 0
    for idx, token in enumerate(list(tlist)):
        tidx = idx - tidx_offset

        if token.is_whitespace:
            # ~50% of tokens will be whitespace. Checking for them early
            # avoids 3 comparisons, but then adds 1 more comparison
            # for the other ~50% of tokens...
            continue

        if token.is_group and not isinstance(token, cls):
            # Check inside a previously grouped token (e.g. parenthesis) in
            # case a group of a different type is inside (e.g. case). Ideally
            # we should check for all open/close tokens at once to avoid
            # recursion.
            _group_matching(token, cls)
            continue

        if token.match(*cls.M_OPEN):
            opens.append(tidx)

        elif token.match(*cls.M_CLOSE):
            try:
                open_idx = opens.pop()
            except IndexError:
                # this indicates invalid sql and unbalanced tokens.
# instead of break, continue in case other "valid" groups exist continue close_idx = tidx tlist.group_tokens(cls, open_idx, close_idx) tidx_offset += close_idx - open_idx def group_brackets(tlist): _group_matching(tlist, sql.SquareBrackets) def group_parenthesis(tlist): _group_matching(tlist, sql.Parenthesis) def group_case(tlist): _group_matching(tlist, sql.Case) def group_if(tlist): _group_matching(tlist, sql.If) def group_for(tlist): _group_matching(tlist, sql.For) def group_begin(tlist): _group_matching(tlist, sql.Begin) def group_typecasts(tlist): def match(token): return token.match(T.Punctuation, '::') def valid(token): return token is not None def post(tlist, pidx, tidx, nidx): return pidx, nidx valid_prev = valid_next = valid _group(tlist, sql.Identifier, match, valid_prev, valid_next, post) def group_tzcasts(tlist): def match(token): return token.ttype == T.Keyword.TZCast def valid_prev(token): return token is not None def valid_next(token): return token is not None and ( token.is_whitespace or token.match(T.Keyword, 'AS') or token.match(*sql.TypedLiteral.M_CLOSE) ) def post(tlist, pidx, tidx, nidx): return pidx, nidx _group(tlist, sql.Identifier, match, valid_prev, valid_next, post) def group_typed_literal(tlist): # definitely not complete, see e.g.: # https://docs.microsoft.com/en-us/sql/odbc/reference/appendixes/interval-literal-syntax # https://docs.microsoft.com/en-us/sql/odbc/reference/appendixes/interval-literals # https://www.postgresql.org/docs/9.1/datatype-datetime.html # https://www.postgresql.org/docs/9.1/functions-datetime.html def match(token): return imt(token, m=sql.TypedLiteral.M_OPEN) def match_to_extend(token): return isinstance(token, sql.TypedLiteral) def valid_prev(token): return token is not None def valid_next(token): return token is not None and token.match(*sql.TypedLiteral.M_CLOSE) def valid_final(token): return token is not None and token.match(*sql.TypedLiteral.M_EXTEND) def post(tlist, pidx, tidx, nidx): return tidx, nidx 
_group(tlist, sql.TypedLiteral, match, valid_prev, valid_next, post, extend=False) _group(tlist, sql.TypedLiteral, match_to_extend, valid_prev, valid_final, post, extend=True) def group_period(tlist): def match(token): return token.match(T.Punctuation, '.') def valid_prev(token): sqlcls = sql.SquareBrackets, sql.Identifier ttypes = T.Name, T.String.Symbol return imt(token, i=sqlcls, t=ttypes) def valid_next(token): # issue261, allow invalid next token return True def post(tlist, pidx, tidx, nidx): # next_ validation is being performed here. issue261 sqlcls = sql.SquareBrackets, sql.Function ttypes = T.Name, T.String.Symbol, T.Wildcard next_ = tlist[nidx] if nidx is not None else None valid_next = imt(next_, i=sqlcls, t=ttypes) return (pidx, nidx) if valid_next else (pidx, tidx) _group(tlist, sql.Identifier, match, valid_prev, valid_next, post) def group_as(tlist): def match(token): return token.is_keyword and token.normalized == 'AS' def valid_prev(token): return token.normalized == 'NULL' or not token.is_keyword def valid_next(token): ttypes = T.DML, T.DDL, T.CTE return not imt(token, t=ttypes) and token is not None def post(tlist, pidx, tidx, nidx): return pidx, nidx _group(tlist, sql.Identifier, match, valid_prev, valid_next, post) def group_assignment(tlist): def match(token): return token.match(T.Assignment, ':=') def valid(token): return token is not None and token.ttype not in (T.Keyword) def post(tlist, pidx, tidx, nidx): m_semicolon = T.Punctuation, ';' snidx, _ = tlist.token_next_by(m=m_semicolon, idx=nidx) nidx = snidx or nidx return pidx, nidx valid_prev = valid_next = valid _group(tlist, sql.Assignment, match, valid_prev, valid_next, post) def group_comparison(tlist): sqlcls = (sql.Parenthesis, sql.Function, sql.Identifier, sql.Operation, sql.TypedLiteral) ttypes = T_NUMERICAL + T_STRING + T_NAME def match(token): return token.ttype == T.Operator.Comparison def valid(token): if imt(token, t=ttypes, i=sqlcls): return True elif token and token.is_keyword 
and token.normalized == 'NULL': return True else: return False def post(tlist, pidx, tidx, nidx): return pidx, nidx valid_prev = valid_next = valid _group(tlist, sql.Comparison, match, valid_prev, valid_next, post, extend=False) @recurse(sql.Identifier) def group_identifier(tlist): ttypes = (T.String.Symbol, T.Name) tidx, token = tlist.token_next_by(t=ttypes) while token: tlist.group_tokens(sql.Identifier, tidx, tidx) tidx, token = tlist.token_next_by(t=ttypes, idx=tidx) def group_arrays(tlist): sqlcls = sql.SquareBrackets, sql.Identifier, sql.Function ttypes = T.Name, T.String.Symbol def match(token): return isinstance(token, sql.SquareBrackets) def valid_prev(token): return imt(token, i=sqlcls, t=ttypes) def valid_next(token): return True def post(tlist, pidx, tidx, nidx): return pidx, tidx _group(tlist, sql.Identifier, match, valid_prev, valid_next, post, extend=True, recurse=False) def group_operator(tlist): ttypes = T_NUMERICAL + T_STRING + T_NAME sqlcls = (sql.SquareBrackets, sql.Parenthesis, sql.Function, sql.Identifier, sql.Operation, sql.TypedLiteral) def match(token): return imt(token, t=(T.Operator, T.Wildcard)) def valid(token): return imt(token, i=sqlcls, t=ttypes) \ or (token and token.match( T.Keyword, ('CURRENT_DATE', 'CURRENT_TIME', 'CURRENT_TIMESTAMP'))) def post(tlist, pidx, tidx, nidx): tlist[tidx].ttype = T.Operator return pidx, nidx valid_prev = valid_next = valid _group(tlist, sql.Operation, match, valid_prev, valid_next, post, extend=False) def group_identifier_list(tlist): m_role = T.Keyword, ('null', 'role') sqlcls = (sql.Function, sql.Case, sql.Identifier, sql.Comparison, sql.IdentifierList, sql.Operation) ttypes = (T_NUMERICAL + T_STRING + T_NAME + (T.Keyword, T.Comment, T.Wildcard)) def match(token): return token.match(T.Punctuation, ',') def valid(token): return imt(token, i=sqlcls, m=m_role, t=ttypes) def post(tlist, pidx, tidx, nidx): return pidx, nidx valid_prev = valid_next = valid _group(tlist, sql.IdentifierList, match, 
valid_prev, valid_next, post, extend=True) @recurse(sql.Comment) def group_comments(tlist): tidx, token = tlist.token_next_by(t=T.Comment) while token: eidx, end = tlist.token_not_matching( lambda tk: imt(tk, t=T.Comment) or tk.is_whitespace, idx=tidx) if end is not None: eidx, end = tlist.token_prev(eidx, skip_ws=False) tlist.group_tokens(sql.Comment, tidx, eidx) tidx, token = tlist.token_next_by(t=T.Comment, idx=tidx) @recurse(sql.Where) def group_where(tlist): tidx, token = tlist.token_next_by(m=sql.Where.M_OPEN) while token: eidx, end = tlist.token_next_by(m=sql.Where.M_CLOSE, idx=tidx) if end is None: end = tlist._groupable_tokens[-1] else: end = tlist.tokens[eidx - 1] # TODO: convert this to eidx instead of end token. # i think above values are len(tlist) and eidx-1 eidx = tlist.token_index(end) tlist.group_tokens(sql.Where, tidx, eidx) tidx, token = tlist.token_next_by(m=sql.Where.M_OPEN, idx=tidx) @recurse() def group_aliased(tlist): I_ALIAS = (sql.Parenthesis, sql.Function, sql.Case, sql.Identifier, sql.Operation, sql.Comparison) tidx, token = tlist.token_next_by(i=I_ALIAS, t=T.Number) while token: nidx, next_ = tlist.token_next(tidx) if isinstance(next_, sql.Identifier): tlist.group_tokens(sql.Identifier, tidx, nidx, extend=True) tidx, token = tlist.token_next_by(i=I_ALIAS, t=T.Number, idx=tidx) @recurse(sql.Function) def group_functions(tlist): has_create = False has_table = False has_as = False for tmp_token in tlist.tokens: if tmp_token.value.upper() == 'CREATE': has_create = True if tmp_token.value.upper() == 'TABLE': has_table = True if tmp_token.value == 'AS': has_as = True if has_create and has_table and not has_as: return tidx, token = tlist.token_next_by(t=T.Name) while token: nidx, next_ = tlist.token_next(tidx) if isinstance(next_, sql.Parenthesis): tlist.group_tokens(sql.Function, tidx, nidx) tidx, token = tlist.token_next_by(t=T.Name, idx=tidx) def group_order(tlist): """Group together Identifier and Asc/Desc token""" tidx, token = 
tlist.token_next_by(t=T.Keyword.Order) while token: pidx, prev_ = tlist.token_prev(tidx) if imt(prev_, i=sql.Identifier, t=T.Number): tlist.group_tokens(sql.Identifier, pidx, tidx) tidx = pidx tidx, token = tlist.token_next_by(t=T.Keyword.Order, idx=tidx) @recurse() def align_comments(tlist): tidx, token = tlist.token_next_by(i=sql.Comment) while token: pidx, prev_ = tlist.token_prev(tidx) if isinstance(prev_, sql.TokenList): tlist.group_tokens(sql.TokenList, pidx, tidx, extend=True) tidx = pidx tidx, token = tlist.token_next_by(i=sql.Comment, idx=tidx) def group_values(tlist): tidx, token = tlist.token_next_by(m=(T.Keyword, 'VALUES')) start_idx = tidx end_idx = -1 while token: if isinstance(token, sql.Parenthesis): end_idx = tidx tidx, token = tlist.token_next(tidx) if end_idx != -1: tlist.group_tokens(sql.Values, start_idx, end_idx, extend=True) def group(stmt): for func in [ group_comments, # _group_matching group_brackets, group_parenthesis, group_case, group_if, group_for, group_begin, group_functions, group_where, group_period, group_arrays, group_identifier, group_order, group_typecasts, group_tzcasts, group_typed_literal, group_operator, group_comparison, group_as, group_aliased, group_assignment, align_comments, group_identifier_list, group_values, ]: func(stmt) return stmt def _group(tlist, cls, match, valid_prev=lambda t: True, valid_next=lambda t: True, post=None, extend=True, recurse=True ): """Groups together tokens that are joined by a middle token. i.e. 
x < y""" tidx_offset = 0 pidx, prev_ = None, None for idx, token in enumerate(list(tlist)): tidx = idx - tidx_offset if tidx < 0: # tidx shouldn't get negative continue if token.is_whitespace: continue if recurse and token.is_group and not isinstance(token, cls): _group(token, cls, match, valid_prev, valid_next, post, extend) if match(token): nidx, next_ = tlist.token_next(tidx) if prev_ and valid_prev(prev_) and valid_next(next_): from_idx, to_idx = post(tlist, pidx, tidx, nidx) grp = tlist.group_tokens(cls, from_idx, to_idx, extend=extend) tidx_offset += to_idx - from_idx pidx, prev_ = from_idx, grp continue pidx, prev_ = tidx, token sqlparse-0.4.4/sqlparse/engine/statement_splitter.py000066400000000000000000000072561454634525500230030ustar00rootroot00000000000000# # Copyright (C) 2009-2020 the sqlparse authors and contributors # # # This module is part of python-sqlparse and is released under # the BSD License: https://opensource.org/licenses/BSD-3-Clause from sqlparse import sql, tokens as T class StatementSplitter: """Filter that split stream at individual statements""" def __init__(self): self._reset() def _reset(self): """Set the filter attributes to its default values""" self._in_declare = False self._is_create = False self._begin_depth = 0 self.consume_ws = False self.tokens = [] self.level = 0 def _change_splitlevel(self, ttype, value): """Get the new split level (increase, decrease or remain equal)""" # parenthesis increase/decrease a level if ttype is T.Punctuation and value == '(': return 1 elif ttype is T.Punctuation and value == ')': return -1 elif ttype not in T.Keyword: # if normal token return return 0 # Everything after here is ttype = T.Keyword # Also to note, once entered an If statement you are done and basically # returning unified = value.upper() # three keywords begin with CREATE, but only one of them is DDL # DDL Create though can contain more words such as "or replace" if ttype is T.Keyword.DDL and unified.startswith('CREATE'): 
            self._is_create = True
            return 0

        # can have a nested DECLARE inside of BEGIN...
        if unified == 'DECLARE' and self._is_create and self._begin_depth == 0:
            self._in_declare = True
            return 1

        if unified == 'BEGIN':
            self._begin_depth += 1
            if self._is_create:
                # FIXME(andi): This makes no sense.
                return 1
            return 0

        # Should this respect a preceding BEGIN?
        # In CASE ... WHEN ... END this results in a split level -1.
        # Would having multiple CASE WHEN END and an Assignment Operator
        # cause the statement to cut off prematurely?
        if unified == 'END':
            self._begin_depth = max(0, self._begin_depth - 1)
            return -1

        if (unified in ('IF', 'FOR', 'WHILE', 'CASE')
                and self._is_create and self._begin_depth > 0):
            return 1

        if unified in ('END IF', 'END FOR', 'END WHILE'):
            return -1

        # Default
        return 0

    def process(self, stream):
        """Process the stream"""
        EOS_TTYPE = T.Whitespace, T.Comment.Single

        # Run over all stream tokens
        for ttype, value in stream:
            # Yield the statement if we finished one and the current token
            # is not whitespace. A newline token counts as non-whitespace
            # here. In this context whitespace ignores newlines.
            # why don't multi line comments also count?
if self.consume_ws and ttype not in EOS_TTYPE: yield sql.Statement(self.tokens) # Reset filter and prepare to process next statement self._reset() # Change current split level (increase, decrease or remain equal) self.level += self._change_splitlevel(ttype, value) # Append the token to the current statement self.tokens.append(sql.Token(ttype, value)) # Check if we get the end of a statement if self.level <= 0 and ttype is T.Punctuation and value == ';': self.consume_ws = True # Yield pending statement (if any) if self.tokens and not all(t.is_whitespace for t in self.tokens): yield sql.Statement(self.tokens) sqlparse-0.4.4/sqlparse/exceptions.py000066400000000000000000000005261454634525500177560ustar00rootroot00000000000000# # Copyright (C) 2009-2020 the sqlparse authors and contributors # # # This module is part of python-sqlparse and is released under # the BSD License: https://opensource.org/licenses/BSD-3-Clause """Exceptions used in this package.""" class SQLParseError(Exception): """Base class for exceptions in this module.""" sqlparse-0.4.4/sqlparse/filters/000077500000000000000000000000001454634525500166705ustar00rootroot00000000000000sqlparse-0.4.4/sqlparse/filters/__init__.py000066400000000000000000000023321454634525500210010ustar00rootroot00000000000000# # Copyright (C) 2009-2020 the sqlparse authors and contributors # # # This module is part of python-sqlparse and is released under # the BSD License: https://opensource.org/licenses/BSD-3-Clause from sqlparse.filters.others import SerializerUnicode from sqlparse.filters.others import StripCommentsFilter from sqlparse.filters.others import StripWhitespaceFilter from sqlparse.filters.others import SpacesAroundOperatorsFilter from sqlparse.filters.output import OutputPHPFilter from sqlparse.filters.output import OutputPythonFilter from sqlparse.filters.tokens import KeywordCaseFilter from sqlparse.filters.tokens import IdentifierCaseFilter from sqlparse.filters.tokens import TruncateStringFilter from 
sqlparse.filters.reindent import ReindentFilter from sqlparse.filters.right_margin import RightMarginFilter from sqlparse.filters.aligned_indent import AlignedIndentFilter __all__ = [ 'SerializerUnicode', 'StripCommentsFilter', 'StripWhitespaceFilter', 'SpacesAroundOperatorsFilter', 'OutputPHPFilter', 'OutputPythonFilter', 'KeywordCaseFilter', 'IdentifierCaseFilter', 'TruncateStringFilter', 'ReindentFilter', 'RightMarginFilter', 'AlignedIndentFilter', ] sqlparse-0.4.4/sqlparse/filters/aligned_indent.py000066400000000000000000000117661454634525500222210ustar00rootroot00000000000000# # Copyright (C) 2009-2020 the sqlparse authors and contributors # # # This module is part of python-sqlparse and is released under # the BSD License: https://opensource.org/licenses/BSD-3-Clause from sqlparse import sql, tokens as T from sqlparse.utils import offset, indent class AlignedIndentFilter: join_words = (r'((LEFT\s+|RIGHT\s+|FULL\s+)?' r'(INNER\s+|OUTER\s+|STRAIGHT\s+)?|' r'(CROSS\s+|NATURAL\s+)?)?JOIN\b') by_words = r'(GROUP|ORDER)\s+BY\b' split_words = ('FROM', join_words, 'ON', by_words, 'WHERE', 'AND', 'OR', 'HAVING', 'LIMIT', 'UNION', 'VALUES', 'SET', 'BETWEEN', 'EXCEPT') def __init__(self, char=' ', n='\n'): self.n = n self.offset = 0 self.indent = 0 self.char = char self._max_kwd_len = len('select') def nl(self, offset=1): # offset = 1 represent a single space after SELECT offset = -len(offset) if not isinstance(offset, int) else offset # add two for the space and parenthesis indent = self.indent * (2 + self._max_kwd_len) return sql.Token(T.Whitespace, self.n + self.char * ( self._max_kwd_len + offset + indent + self.offset)) def _process_statement(self, tlist): if len(tlist.tokens) > 0 and tlist.tokens[0].is_whitespace \ and self.indent == 0: tlist.tokens.pop(0) # process the main query body self._process(sql.TokenList(tlist.tokens)) def _process_parenthesis(self, tlist): # if this isn't a subquery, don't re-indent _, token = tlist.token_next_by(m=(T.DML, 'SELECT')) if 
token is not None: with indent(self): tlist.insert_after(tlist[0], self.nl('SELECT')) # process the inside of the parenthesis self._process_default(tlist) # de-indent last parenthesis tlist.insert_before(tlist[-1], self.nl()) def _process_identifierlist(self, tlist): # columns being selected identifiers = list(tlist.get_identifiers()) identifiers.pop(0) [tlist.insert_before(token, self.nl()) for token in identifiers] self._process_default(tlist) def _process_case(self, tlist): offset_ = len('case ') + len('when ') cases = tlist.get_cases(skip_ws=True) # align the end as well end_token = tlist.token_next_by(m=(T.Keyword, 'END'))[1] cases.append((None, [end_token])) condition_width = [len(' '.join(map(str, cond))) if cond else 0 for cond, _ in cases] max_cond_width = max(condition_width) for i, (cond, value) in enumerate(cases): # cond is None when 'else or end' stmt = cond[0] if cond else value[0] if i > 0: tlist.insert_before(stmt, self.nl(offset_ - len(str(stmt)))) if cond: ws = sql.Token(T.Whitespace, self.char * ( max_cond_width - condition_width[i])) tlist.insert_after(cond[-1], ws) def _next_token(self, tlist, idx=-1): split_words = T.Keyword, self.split_words, True tidx, token = tlist.token_next_by(m=split_words, idx=idx) # treat "BETWEEN x and y" as a single statement if token and token.normalized == 'BETWEEN': tidx, token = self._next_token(tlist, tidx) if token and token.normalized == 'AND': tidx, token = self._next_token(tlist, tidx) return tidx, token def _split_kwds(self, tlist): tidx, token = self._next_token(tlist) while token: # joins, group/order by are special case. 
only consider the first # word as aligner if ( token.match(T.Keyword, self.join_words, regex=True) or token.match(T.Keyword, self.by_words, regex=True) ): token_indent = token.value.split()[0] else: token_indent = str(token) tlist.insert_before(token, self.nl(token_indent)) tidx += 1 tidx, token = self._next_token(tlist, tidx) def _process_default(self, tlist): self._split_kwds(tlist) # process any sub-sub statements for sgroup in tlist.get_sublists(): idx = tlist.token_index(sgroup) pidx, prev_ = tlist.token_prev(idx) # HACK: make "group/order by" work. Longer than max_len. offset_ = 3 if ( prev_ and prev_.match(T.Keyword, self.by_words, regex=True) ) else 0 with offset(self, offset_): self._process(sgroup) def _process(self, tlist): func_name = '_process_{cls}'.format(cls=type(tlist).__name__) func = getattr(self, func_name.lower(), self._process_default) func(tlist) def process(self, stmt): self._process(stmt) return stmt sqlparse-0.4.4/sqlparse/filters/others.py000066400000000000000000000120741454634525500205520ustar00rootroot00000000000000# # Copyright (C) 2009-2020 the sqlparse authors and contributors # # # This module is part of python-sqlparse and is released under # the BSD License: https://opensource.org/licenses/BSD-3-Clause import re from sqlparse import sql, tokens as T from sqlparse.utils import split_unquoted_newlines class StripCommentsFilter: @staticmethod def _process(tlist): def get_next_comment(): # TODO(andi) Comment types should be unified, see related issue38 return tlist.token_next_by(i=sql.Comment, t=T.Comment) def _get_insert_token(token): """Returns either a whitespace or the line breaks from token.""" # See issue484 why line breaks should be preserved. # Note: The actual value for a line break is replaced by \n # in SerializerUnicode which will be executed in the # postprocessing state. 
m = re.search(r'((\r|\n)+) *$', token.value) if m is not None: return sql.Token(T.Whitespace.Newline, m.groups()[0]) else: return sql.Token(T.Whitespace, ' ') tidx, token = get_next_comment() while token: pidx, prev_ = tlist.token_prev(tidx, skip_ws=False) nidx, next_ = tlist.token_next(tidx, skip_ws=False) # Replace by whitespace if prev and next exist and if they're not # whitespaces. This doesn't apply if prev or next is a parenthesis. if (prev_ is None or next_ is None or prev_.is_whitespace or prev_.match(T.Punctuation, '(') or next_.is_whitespace or next_.match(T.Punctuation, ')')): # Insert a whitespace to ensure the following SQL produces # a valid SQL (see #425). if prev_ is not None and not prev_.match(T.Punctuation, '('): tlist.tokens.insert(tidx, _get_insert_token(token)) tlist.tokens.remove(token) else: tlist.tokens[tidx] = _get_insert_token(token) tidx, token = get_next_comment() def process(self, stmt): [self.process(sgroup) for sgroup in stmt.get_sublists()] StripCommentsFilter._process(stmt) return stmt class StripWhitespaceFilter: def _stripws(self, tlist): func_name = '_stripws_{cls}'.format(cls=type(tlist).__name__) func = getattr(self, func_name.lower(), self._stripws_default) func(tlist) @staticmethod def _stripws_default(tlist): last_was_ws = False is_first_char = True for token in tlist.tokens: if token.is_whitespace: token.value = '' if last_was_ws or is_first_char else ' ' last_was_ws = token.is_whitespace is_first_char = False def _stripws_identifierlist(self, tlist): # Removes newlines before commas, see issue140 last_nl = None for token in list(tlist.tokens): if last_nl and token.ttype is T.Punctuation and token.value == ',': tlist.tokens.remove(last_nl) last_nl = token if token.is_whitespace else None # next_ = tlist.token_next(token, skip_ws=False) # if (next_ and not next_.is_whitespace and # token.ttype is T.Punctuation and token.value == ','): # tlist.insert_after(token, sql.Token(T.Whitespace, ' ')) return 
self._stripws_default(tlist) def _stripws_parenthesis(self, tlist): while tlist.tokens[1].is_whitespace: tlist.tokens.pop(1) while tlist.tokens[-2].is_whitespace: tlist.tokens.pop(-2) self._stripws_default(tlist) def process(self, stmt, depth=0): [self.process(sgroup, depth + 1) for sgroup in stmt.get_sublists()] self._stripws(stmt) if depth == 0 and stmt.tokens and stmt.tokens[-1].is_whitespace: stmt.tokens.pop(-1) return stmt class SpacesAroundOperatorsFilter: @staticmethod def _process(tlist): ttypes = (T.Operator, T.Comparison) tidx, token = tlist.token_next_by(t=ttypes) while token: nidx, next_ = tlist.token_next(tidx, skip_ws=False) if next_ and next_.ttype != T.Whitespace: tlist.insert_after(tidx, sql.Token(T.Whitespace, ' ')) pidx, prev_ = tlist.token_prev(tidx, skip_ws=False) if prev_ and prev_.ttype != T.Whitespace: tlist.insert_before(tidx, sql.Token(T.Whitespace, ' ')) tidx += 1 # has to shift since token inserted before it # assert tlist.token_index(token) == tidx tidx, token = tlist.token_next_by(t=ttypes, idx=tidx) def process(self, stmt): [self.process(sgroup) for sgroup in stmt.get_sublists()] SpacesAroundOperatorsFilter._process(stmt) return stmt # --------------------------- # postprocess class SerializerUnicode: @staticmethod def process(stmt): lines = split_unquoted_newlines(stmt) return '\n'.join(line.rstrip() for line in lines) sqlparse-0.4.4/sqlparse/filters/output.py000066400000000000000000000076411454634525500206120ustar00rootroot00000000000000# # Copyright (C) 2009-2020 the sqlparse authors and contributors # # # This module is part of python-sqlparse and is released under # the BSD License: https://opensource.org/licenses/BSD-3-Clause from sqlparse import sql, tokens as T class OutputFilter: varname_prefix = '' def __init__(self, varname='sql'): self.varname = self.varname_prefix + varname self.count = 0 def _process(self, stream, varname, has_nl): raise NotImplementedError def process(self, stmt): self.count += 1 if self.count > 1: 
varname = '{f.varname}{f.count}'.format(f=self) else: varname = self.varname has_nl = len(str(stmt).strip().splitlines()) > 1 stmt.tokens = self._process(stmt.tokens, varname, has_nl) return stmt class OutputPythonFilter(OutputFilter): def _process(self, stream, varname, has_nl): # SQL query assignation to varname if self.count > 1: yield sql.Token(T.Whitespace, '\n') yield sql.Token(T.Name, varname) yield sql.Token(T.Whitespace, ' ') yield sql.Token(T.Operator, '=') yield sql.Token(T.Whitespace, ' ') if has_nl: yield sql.Token(T.Operator, '(') yield sql.Token(T.Text, "'") # Print the tokens on the quote for token in stream: # Token is a new line separator if token.is_whitespace and '\n' in token.value: # Close quote and add a new line yield sql.Token(T.Text, " '") yield sql.Token(T.Whitespace, '\n') # Quote header on secondary lines yield sql.Token(T.Whitespace, ' ' * (len(varname) + 4)) yield sql.Token(T.Text, "'") # Indentation after_lb = token.value.split('\n', 1)[1] if after_lb: yield sql.Token(T.Whitespace, after_lb) continue # Token has escape chars elif "'" in token.value: token.value = token.value.replace("'", "\\'") # Put the token yield sql.Token(T.Text, token.value) # Close quote yield sql.Token(T.Text, "'") if has_nl: yield sql.Token(T.Operator, ')') class OutputPHPFilter(OutputFilter): varname_prefix = '$' def _process(self, stream, varname, has_nl): # SQL query assignation to varname (quote header) if self.count > 1: yield sql.Token(T.Whitespace, '\n') yield sql.Token(T.Name, varname) yield sql.Token(T.Whitespace, ' ') if has_nl: yield sql.Token(T.Whitespace, ' ') yield sql.Token(T.Operator, '=') yield sql.Token(T.Whitespace, ' ') yield sql.Token(T.Text, '"') # Print the tokens on the quote for token in stream: # Token is a new line separator if token.is_whitespace and '\n' in token.value: # Close quote and add a new line yield sql.Token(T.Text, ' ";') yield sql.Token(T.Whitespace, '\n') # Quote header on secondary lines yield sql.Token(T.Name, 
varname) yield sql.Token(T.Whitespace, ' ') yield sql.Token(T.Operator, '.=') yield sql.Token(T.Whitespace, ' ') yield sql.Token(T.Text, '"') # Indentation after_lb = token.value.split('\n', 1)[1] if after_lb: yield sql.Token(T.Whitespace, after_lb) continue # Token has escape chars elif '"' in token.value: token.value = token.value.replace('"', '\\"') # Put the token yield sql.Token(T.Text, token.value) # Close quote yield sql.Token(T.Text, '"') yield sql.Token(T.Punctuation, ';') sqlparse-0.4.4/sqlparse/filters/reindent.py000066400000000000000000000225151454634525500210570ustar00rootroot00000000000000# # Copyright (C) 2009-2020 the sqlparse authors and contributors # # # This module is part of python-sqlparse and is released under # the BSD License: https://opensource.org/licenses/BSD-3-Clause from sqlparse import sql, tokens as T from sqlparse.utils import offset, indent class ReindentFilter: def __init__(self, width=2, char=' ', wrap_after=0, n='\n', comma_first=False, indent_after_first=False, indent_columns=False): self.n = n self.width = width self.char = char self.indent = 1 if indent_after_first else 0 self.offset = 0 self.wrap_after = wrap_after self.comma_first = comma_first self.indent_columns = indent_columns self._curr_stmt = None self._last_stmt = None self._last_func = None def _flatten_up_to_token(self, token): """Yields all tokens up to token but excluding current.""" if token.is_group: token = next(token.flatten()) for t in self._curr_stmt.flatten(): if t == token: break yield t @property def leading_ws(self): return self.offset + self.indent * self.width def _get_offset(self, token): raw = ''.join(map(str, self._flatten_up_to_token(token))) line = (raw or '\n').splitlines()[-1] # Now take current offset into account and return relative offset. 
return len(line) - len(self.char * self.leading_ws) def nl(self, offset=0): return sql.Token( T.Whitespace, self.n + self.char * max(0, self.leading_ws + offset)) def _next_token(self, tlist, idx=-1): split_words = ('FROM', 'STRAIGHT_JOIN$', 'JOIN$', 'AND', 'OR', 'GROUP BY', 'ORDER BY', 'UNION', 'VALUES', 'SET', 'BETWEEN', 'EXCEPT', 'HAVING', 'LIMIT') m_split = T.Keyword, split_words, True tidx, token = tlist.token_next_by(m=m_split, idx=idx) if token and token.normalized == 'BETWEEN': tidx, token = self._next_token(tlist, tidx) if token and token.normalized == 'AND': tidx, token = self._next_token(tlist, tidx) return tidx, token def _split_kwds(self, tlist): tidx, token = self._next_token(tlist) while token: pidx, prev_ = tlist.token_prev(tidx, skip_ws=False) uprev = str(prev_) if prev_ and prev_.is_whitespace: del tlist.tokens[pidx] tidx -= 1 if not (uprev.endswith('\n') or uprev.endswith('\r')): tlist.insert_before(tidx, self.nl()) tidx += 1 tidx, token = self._next_token(tlist, tidx) def _split_statements(self, tlist): ttypes = T.Keyword.DML, T.Keyword.DDL tidx, token = tlist.token_next_by(t=ttypes) while token: pidx, prev_ = tlist.token_prev(tidx, skip_ws=False) if prev_ and prev_.is_whitespace: del tlist.tokens[pidx] tidx -= 1 # only break if it's not the first token if prev_: tlist.insert_before(tidx, self.nl()) tidx += 1 tidx, token = tlist.token_next_by(t=ttypes, idx=tidx) def _process(self, tlist): func_name = '_process_{cls}'.format(cls=type(tlist).__name__) func = getattr(self, func_name.lower(), self._process_default) func(tlist) def _process_where(self, tlist): tidx, token = tlist.token_next_by(m=(T.Keyword, 'WHERE')) if not token: return # issue121, errors in statement fixed?? 
        tlist.insert_before(tidx, self.nl())
        with indent(self):
            self._process_default(tlist)

    def _process_parenthesis(self, tlist):
        ttypes = T.Keyword.DML, T.Keyword.DDL
        _, is_dml_dll = tlist.token_next_by(t=ttypes)
        fidx, first = tlist.token_next_by(m=sql.Parenthesis.M_OPEN)
        if first is None:
            return

        with indent(self, 1 if is_dml_dll else 0):
            tlist.tokens.insert(0, self.nl()) if is_dml_dll else None
            with offset(self, self._get_offset(first) + 1):
                self._process_default(tlist, not is_dml_dll)

    def _process_function(self, tlist):
        self._last_func = tlist[0]
        self._process_default(tlist)

    def _process_identifierlist(self, tlist):
        identifiers = list(tlist.get_identifiers())
        if self.indent_columns:
            first = next(identifiers[0].flatten())
            num_offset = 1 if self.char == '\t' else self.width
        else:
            first = next(identifiers.pop(0).flatten())
            num_offset = 1 if self.char == '\t' else self._get_offset(first)

        if not tlist.within(sql.Function) and not tlist.within(sql.Values):
            with offset(self, num_offset):
                position = 0
                for token in identifiers:
                    # Add 1 for the "," separator
                    position += len(token.value) + 1
                    if position > (self.wrap_after - self.offset):
                        adjust = 0
                        if self.comma_first:
                            adjust = -2
                            _, comma = tlist.token_prev(
                                tlist.token_index(token))
                            if comma is None:
                                continue
                            token = comma
                        tlist.insert_before(token, self.nl(offset=adjust))
                        if self.comma_first:
                            _, ws = tlist.token_next(
                                tlist.token_index(token), skip_ws=False)
                            if (ws is not None
                                    and ws.ttype is not T.Text.Whitespace):
                                tlist.insert_after(
                                    token, sql.Token(T.Whitespace, ' '))
                        position = 0
        else:
            # ensure whitespace
            for token in tlist:
                _, next_ws = tlist.token_next(
                    tlist.token_index(token), skip_ws=False)
                if token.value == ',' and not next_ws.is_whitespace:
                    tlist.insert_after(
                        token, sql.Token(T.Whitespace, ' '))

            end_at = self.offset + sum(len(i.value) + 1 for i in identifiers)
            adjusted_offset = 0
            if (self.wrap_after > 0
                    and end_at > (self.wrap_after - self.offset)
                    and self._last_func):
                adjusted_offset = -len(self._last_func.value) - 1

            with offset(self, adjusted_offset), indent(self):
                if adjusted_offset < 0:
                    tlist.insert_before(identifiers[0], self.nl())
                position = 0
                for token in identifiers:
                    # Add 1 for the "," separator
                    position += len(token.value) + 1
                    if (self.wrap_after > 0
                            and position > (self.wrap_after - self.offset)):
                        adjust = 0
                        tlist.insert_before(token, self.nl(offset=adjust))
                        position = 0
        self._process_default(tlist)

    def _process_case(self, tlist):
        iterable = iter(tlist.get_cases())
        cond, _ = next(iterable)
        first = next(cond[0].flatten())

        with offset(self, self._get_offset(tlist[0])):
            with offset(self, self._get_offset(first)):
                for cond, value in iterable:
                    token = value[0] if cond is None else cond[0]
                    tlist.insert_before(token, self.nl())

                # Line breaks on group level are done. let's add an offset of
                # len "when ", "then ", "else "
                with offset(self, len("WHEN ")):
                    self._process_default(tlist)
            end_idx, end = tlist.token_next_by(m=sql.Case.M_CLOSE)
            if end_idx is not None:
                tlist.insert_before(end_idx, self.nl())

    def _process_values(self, tlist):
        tlist.insert_before(0, self.nl())
        tidx, token = tlist.token_next_by(i=sql.Parenthesis)
        first_token = token
        while token:
            ptidx, ptoken = tlist.token_next_by(m=(T.Punctuation, ','),
                                                idx=tidx)
            if ptoken:
                if self.comma_first:
                    adjust = -2
                    offset = self._get_offset(first_token) + adjust
                    tlist.insert_before(ptoken, self.nl(offset))
                else:
                    tlist.insert_after(ptoken,
                                       self.nl(self._get_offset(token)))
            tidx, token = tlist.token_next_by(i=sql.Parenthesis, idx=tidx)

    def _process_default(self, tlist, stmts=True):
        self._split_statements(tlist) if stmts else None
        self._split_kwds(tlist)
        for sgroup in tlist.get_sublists():
            self._process(sgroup)

    def process(self, stmt):
        self._curr_stmt = stmt
        self._process(stmt)

        if self._last_stmt is not None:
            nl = '\n' if str(self._last_stmt).endswith('\n') else '\n\n'
            stmt.tokens.insert(0, sql.Token(T.Whitespace, nl))

        self._last_stmt = stmt
        return stmt
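
A note on the `process` method that closes this file: once an earlier statement has been seen, it prepends one newline to the current statement if the earlier statement's text already ends with a newline, and two newlines otherwise. The standalone sketch below mimics only that rule on plain strings, without importing sqlparse. The names `statement_separator` and `join_statements` are hypothetical helpers made up for this illustration and are not part of the library.

```python
def statement_separator(previous_sql):
    """Whitespace to place before a statement that follows `previous_sql`.

    Mirrors the rule in ReindentFilter.process: one newline when the previous
    statement already ends with a newline, otherwise two.
    """
    return '\n' if previous_sql.endswith('\n') else '\n\n'


def join_statements(statements):
    """Join statement strings the way consecutive process() calls would.

    Hypothetical helper: the real filter inserts a whitespace token at the
    front of each later statement instead of building one string.
    """
    parts = []
    previous = None
    for stmt in statements:
        if previous is not None:
            parts.append(statement_separator(previous))
        parts.append(stmt)
        previous = stmt
    return ''.join(parts)


if __name__ == '__main__':
    print(join_statements(['select 1;', 'select 2;']))
```

In both common cases (previous statement with or without a trailing newline) the result is the same: exactly one blank line between the two statements.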
sqlparse-0.4.4/sqlparse/filters/right_margin.py
#
# Copyright (C) 2009-2020 the sqlparse authors and contributors
#
#
# This module is part of python-sqlparse and is released under
# the BSD License: https://opensource.org/licenses/BSD-3-Clause

import re

from sqlparse import sql, tokens as T


# FIXME: Doesn't work
class RightMarginFilter:
    keep_together = (
        # sql.TypeCast, sql.Identifier, sql.Alias,
    )

    def __init__(self, width=79):
        self.width = width
        self.line = ''

    def _process(self, group, stream):
        for token in stream:
            if token.is_whitespace and '\n' in token.value:
                if token.value.endswith('\n'):
                    self.line = ''
                else:
                    self.line = token.value.splitlines()[-1]
            elif token.is_group and type(token) not in self.keep_together:
                token.tokens = self._process(token, token.tokens)
            else:
                val = str(token)
                if len(self.line) + len(val) > self.width:
                    match = re.search(r'^ +', self.line)
                    if match is not None:
                        indent = match.group()
                    else:
                        indent = ''
                    yield sql.Token(T.Whitespace, '\n{}'.format(indent))
                    self.line = indent
                self.line += val
            yield token

    def process(self, group):
        # return
        # group.tokens = self._process(group, group.tokens)
        raise NotImplementedError

sqlparse-0.4.4/sqlparse/filters/tokens.py
#
# Copyright (C) 2009-2020 the sqlparse authors and contributors
#
#
# This module is part of python-sqlparse and is released under
# the BSD License: https://opensource.org/licenses/BSD-3-Clause

from sqlparse import tokens as T


class _CaseFilter:
    ttype = None

    def __init__(self, case=None):
        case = case or 'upper'
        self.convert = getattr(str, case)

    def process(self, stream):
        for ttype, value in stream:
            if ttype in self.ttype:
                value = self.convert(value)
            yield ttype, value


class KeywordCaseFilter(_CaseFilter):
    ttype = T.Keyword


class IdentifierCaseFilter(_CaseFilter):
    ttype = T.Name, T.String.Symbol

    def process(self, stream):
        for ttype, value in stream:
            if ttype in self.ttype and value.strip()[0] != '"':
                value = self.convert(value)
            yield ttype, value


class TruncateStringFilter:
    def __init__(self, width, char):
        self.width = width
        self.char = char

    def process(self, stream):
        for ttype, value in stream:
            if ttype != T.Literal.String.Single:
                yield ttype, value
                continue

            if value[:2] == "''":
                inner = value[2:-2]
                quote = "''"
            else:
                inner = value[1:-1]
                quote = "'"

            if len(inner) > self.width:
                value = ''.join((quote, inner[:self.width], self.char,
                                 quote))
            yield ttype, value

sqlparse-0.4.4/sqlparse/formatter.py
#
# Copyright (C) 2009-2020 the sqlparse authors and contributors
#
#
# This module is part of python-sqlparse and is released under
# the BSD License: https://opensource.org/licenses/BSD-3-Clause

"""SQL formatter"""

from sqlparse import filters
from sqlparse.exceptions import SQLParseError


def validate_options(options):
    """Validates options."""
    kwcase = options.get('keyword_case')
    if kwcase not in [None, 'upper', 'lower', 'capitalize']:
        raise SQLParseError('Invalid value for keyword_case: '
                            '{!r}'.format(kwcase))

    idcase = options.get('identifier_case')
    if idcase not in [None, 'upper', 'lower', 'capitalize']:
        raise SQLParseError('Invalid value for identifier_case: '
                            '{!r}'.format(idcase))

    ofrmt = options.get('output_format')
    if ofrmt not in [None, 'sql', 'python', 'php']:
        raise SQLParseError('Unknown output format: '
                            '{!r}'.format(ofrmt))

    strip_comments = options.get('strip_comments', False)
    if strip_comments not in [True, False]:
        raise SQLParseError('Invalid value for strip_comments: '
                            '{!r}'.format(strip_comments))

    space_around_operators = options.get('use_space_around_operators', False)
    if space_around_operators not in [True, False]:
        raise SQLParseError('Invalid value for use_space_around_operators: '
                            '{!r}'.format(space_around_operators))

    strip_ws = options.get('strip_whitespace', False)
    if strip_ws not in [True, False]:
        raise SQLParseError('Invalid value for strip_whitespace: '
                            '{!r}'.format(strip_ws))

    truncate_strings = options.get('truncate_strings')
    if truncate_strings is not None:
        try:
            truncate_strings = int(truncate_strings)
        except (ValueError, TypeError):
            raise SQLParseError('Invalid value for truncate_strings: '
                                '{!r}'.format(truncate_strings))
        if truncate_strings <= 1:
            raise SQLParseError('Invalid value for truncate_strings: '
                                '{!r}'.format(truncate_strings))
        options['truncate_strings'] = truncate_strings
        options['truncate_char'] = options.get('truncate_char', '[...]')

    indent_columns = options.get('indent_columns', False)
    if indent_columns not in [True, False]:
        raise SQLParseError('Invalid value for indent_columns: '
                            '{!r}'.format(indent_columns))
    elif indent_columns:
        options['reindent'] = True  # enforce reindent
    options['indent_columns'] = indent_columns

    reindent = options.get('reindent', False)
    if reindent not in [True, False]:
        raise SQLParseError('Invalid value for reindent: '
                            '{!r}'.format(reindent))
    elif reindent:
        options['strip_whitespace'] = True

    reindent_aligned = options.get('reindent_aligned', False)
    if reindent_aligned not in [True, False]:
        raise SQLParseError('Invalid value for reindent_aligned: '
                            '{!r}'.format(reindent_aligned))
    elif reindent_aligned:
        options['strip_whitespace'] = True

    indent_after_first = options.get('indent_after_first', False)
    if indent_after_first not in [True, False]:
        raise SQLParseError('Invalid value for indent_after_first: '
                            '{!r}'.format(indent_after_first))
    options['indent_after_first'] = indent_after_first

    indent_tabs = options.get('indent_tabs', False)
    if indent_tabs not in [True, False]:
        raise SQLParseError('Invalid value for indent_tabs: '
                            '{!r}'.format(indent_tabs))
    elif indent_tabs:
        options['indent_char'] = '\t'
    else:
        options['indent_char'] = ' '

    indent_width = options.get('indent_width', 2)
    try:
        indent_width = int(indent_width)
    except (TypeError, ValueError):
        raise SQLParseError('indent_width requires an integer')
    if indent_width < 1:
        raise SQLParseError('indent_width requires a positive integer')
    options['indent_width'] = indent_width

    wrap_after = options.get('wrap_after', 0)
    try:
        wrap_after = int(wrap_after)
    except (TypeError, ValueError):
        raise SQLParseError('wrap_after requires an integer')
    if wrap_after < 0:
        raise SQLParseError('wrap_after requires a non-negative integer')
    options['wrap_after'] = wrap_after

    comma_first = options.get('comma_first', False)
    if comma_first not in [True, False]:
        raise SQLParseError('comma_first requires a boolean value')
    options['comma_first'] = comma_first

    right_margin = options.get('right_margin')
    if right_margin is not None:
        try:
            right_margin = int(right_margin)
        except (TypeError, ValueError):
            raise SQLParseError('right_margin requires an integer')
        if right_margin < 10:
            raise SQLParseError('right_margin requires an integer >= 10')
    options['right_margin'] = right_margin

    return options


def build_filter_stack(stack, options):
    """Set up and return a filter stack.

    Args:
      stack: :class:`~sqlparse.filters.FilterStack` instance
      options: Dictionary with options validated by validate_options.
    """
    # Token filter
    if options.get('keyword_case'):
        stack.preprocess.append(
            filters.KeywordCaseFilter(options['keyword_case']))

    if options.get('identifier_case'):
        stack.preprocess.append(
            filters.IdentifierCaseFilter(options['identifier_case']))

    if options.get('truncate_strings'):
        stack.preprocess.append(filters.TruncateStringFilter(
            width=options['truncate_strings'], char=options['truncate_char']))

    if options.get('use_space_around_operators', False):
        stack.enable_grouping()
        stack.stmtprocess.append(filters.SpacesAroundOperatorsFilter())

    # After grouping
    if options.get('strip_comments'):
        stack.enable_grouping()
        stack.stmtprocess.append(filters.StripCommentsFilter())

    if options.get('strip_whitespace') or options.get('reindent'):
        stack.enable_grouping()
        stack.stmtprocess.append(filters.StripWhitespaceFilter())

    if options.get('reindent'):
        stack.enable_grouping()
        stack.stmtprocess.append(
            filters.ReindentFilter(
                char=options['indent_char'],
                width=options['indent_width'],
                indent_after_first=options['indent_after_first'],
                indent_columns=options['indent_columns'],
                wrap_after=options['wrap_after'],
                comma_first=options['comma_first']))

    if options.get('reindent_aligned', False):
        stack.enable_grouping()
        stack.stmtprocess.append(
            filters.AlignedIndentFilter(char=options['indent_char']))

    if options.get('right_margin'):
        stack.enable_grouping()
        stack.stmtprocess.append(
            filters.RightMarginFilter(width=options['right_margin']))

    # Serializer
    if options.get('output_format'):
        frmt = options['output_format']
        if frmt.lower() == 'php':
            fltr = filters.OutputPHPFilter()
        elif frmt.lower() == 'python':
            fltr = filters.OutputPythonFilter()
        else:
            fltr = None
        if fltr is not None:
            stack.postprocess.append(fltr)

    return stack

sqlparse-0.4.4/sqlparse/keywords.py
#
# Copyright (C) 2009-2020 the sqlparse authors and contributors
#
#
# This module is part of python-sqlparse and is released under
# the BSD License:
https://opensource.org/licenses/BSD-3-Clause from sqlparse import tokens # object() only supports "is" and is useful as a marker # use this marker to specify that the given regex in SQL_REGEX # shall be processed further through a lookup in the KEYWORDS dictionaries PROCESS_AS_KEYWORD = object() SQL_REGEX = [ (r'(--|# )\+.*?(\r\n|\r|\n|$)', tokens.Comment.Single.Hint), (r'/\*\+[\s\S]*?\*/', tokens.Comment.Multiline.Hint), (r'(--|# ).*?(\r\n|\r|\n|$)', tokens.Comment.Single), (r'/\*[\s\S]*?\*/', tokens.Comment.Multiline), (r'(\r\n|\r|\n)', tokens.Newline), (r'\s+?', tokens.Whitespace), (r':=', tokens.Assignment), (r'::', tokens.Punctuation), (r'\*', tokens.Wildcard), (r"`(``|[^`])*`", tokens.Name), (r"´(´´|[^´])*´", tokens.Name), (r'((?=~!]+', tokens.Operator.Comparison), (r'[+/@#%^&|^-]+', tokens.Operator), ] KEYWORDS = { 'ABORT': tokens.Keyword, 'ABS': tokens.Keyword, 'ABSOLUTE': tokens.Keyword, 'ACCESS': tokens.Keyword, 'ADA': tokens.Keyword, 'ADD': tokens.Keyword, 'ADMIN': tokens.Keyword, 'AFTER': tokens.Keyword, 'AGGREGATE': tokens.Keyword, 'ALIAS': tokens.Keyword, 'ALL': tokens.Keyword, 'ALLOCATE': tokens.Keyword, 'ANALYSE': tokens.Keyword, 'ANALYZE': tokens.Keyword, 'ANY': tokens.Keyword, 'ARRAYLEN': tokens.Keyword, 'ARE': tokens.Keyword, 'ASC': tokens.Keyword.Order, 'ASENSITIVE': tokens.Keyword, 'ASSERTION': tokens.Keyword, 'ASSIGNMENT': tokens.Keyword, 'ASYMMETRIC': tokens.Keyword, 'AT': tokens.Keyword, 'ATOMIC': tokens.Keyword, 'AUDIT': tokens.Keyword, 'AUTHORIZATION': tokens.Keyword, 'AUTO_INCREMENT': tokens.Keyword, 'AVG': tokens.Keyword, 'BACKWARD': tokens.Keyword, 'BEFORE': tokens.Keyword, 'BEGIN': tokens.Keyword, 'BETWEEN': tokens.Keyword, 'BITVAR': tokens.Keyword, 'BIT_LENGTH': tokens.Keyword, 'BOTH': tokens.Keyword, 'BREADTH': tokens.Keyword, # 'C': tokens.Keyword, # most likely this is an alias 'CACHE': tokens.Keyword, 'CALL': tokens.Keyword, 'CALLED': tokens.Keyword, 'CARDINALITY': tokens.Keyword, 'CASCADE': tokens.Keyword, 'CASCADED': 
tokens.Keyword, 'CAST': tokens.Keyword, 'CATALOG': tokens.Keyword, 'CATALOG_NAME': tokens.Keyword, 'CHAIN': tokens.Keyword, 'CHARACTERISTICS': tokens.Keyword, 'CHARACTER_LENGTH': tokens.Keyword, 'CHARACTER_SET_CATALOG': tokens.Keyword, 'CHARACTER_SET_NAME': tokens.Keyword, 'CHARACTER_SET_SCHEMA': tokens.Keyword, 'CHAR_LENGTH': tokens.Keyword, 'CHARSET': tokens.Keyword, 'CHECK': tokens.Keyword, 'CHECKED': tokens.Keyword, 'CHECKPOINT': tokens.Keyword, 'CLASS': tokens.Keyword, 'CLASS_ORIGIN': tokens.Keyword, 'CLOB': tokens.Keyword, 'CLOSE': tokens.Keyword, 'CLUSTER': tokens.Keyword, 'COALESCE': tokens.Keyword, 'COBOL': tokens.Keyword, 'COLLATE': tokens.Keyword, 'COLLATION': tokens.Keyword, 'COLLATION_CATALOG': tokens.Keyword, 'COLLATION_NAME': tokens.Keyword, 'COLLATION_SCHEMA': tokens.Keyword, 'COLLECT': tokens.Keyword, 'COLUMN': tokens.Keyword, 'COLUMN_NAME': tokens.Keyword, 'COMPRESS': tokens.Keyword, 'COMMAND_FUNCTION': tokens.Keyword, 'COMMAND_FUNCTION_CODE': tokens.Keyword, 'COMMENT': tokens.Keyword, 'COMMIT': tokens.Keyword.DML, 'COMMITTED': tokens.Keyword, 'COMPLETION': tokens.Keyword, 'CONCURRENTLY': tokens.Keyword, 'CONDITION_NUMBER': tokens.Keyword, 'CONNECT': tokens.Keyword, 'CONNECTION': tokens.Keyword, 'CONNECTION_NAME': tokens.Keyword, 'CONSTRAINT': tokens.Keyword, 'CONSTRAINTS': tokens.Keyword, 'CONSTRAINT_CATALOG': tokens.Keyword, 'CONSTRAINT_NAME': tokens.Keyword, 'CONSTRAINT_SCHEMA': tokens.Keyword, 'CONSTRUCTOR': tokens.Keyword, 'CONTAINS': tokens.Keyword, 'CONTINUE': tokens.Keyword, 'CONVERSION': tokens.Keyword, 'CONVERT': tokens.Keyword, 'COPY': tokens.Keyword, 'CORRESPONDING': tokens.Keyword, 'COUNT': tokens.Keyword, 'CREATEDB': tokens.Keyword, 'CREATEUSER': tokens.Keyword, 'CROSS': tokens.Keyword, 'CUBE': tokens.Keyword, 'CURRENT': tokens.Keyword, 'CURRENT_DATE': tokens.Keyword, 'CURRENT_PATH': tokens.Keyword, 'CURRENT_ROLE': tokens.Keyword, 'CURRENT_TIME': tokens.Keyword, 'CURRENT_TIMESTAMP': tokens.Keyword, 'CURRENT_USER': tokens.Keyword, 
'CURSOR': tokens.Keyword, 'CURSOR_NAME': tokens.Keyword, 'CYCLE': tokens.Keyword, 'DATA': tokens.Keyword, 'DATABASE': tokens.Keyword, 'DATETIME_INTERVAL_CODE': tokens.Keyword, 'DATETIME_INTERVAL_PRECISION': tokens.Keyword, 'DAY': tokens.Keyword, 'DEALLOCATE': tokens.Keyword, 'DECLARE': tokens.Keyword, 'DEFAULT': tokens.Keyword, 'DEFAULTS': tokens.Keyword, 'DEFERRABLE': tokens.Keyword, 'DEFERRED': tokens.Keyword, 'DEFINED': tokens.Keyword, 'DEFINER': tokens.Keyword, 'DELIMITER': tokens.Keyword, 'DELIMITERS': tokens.Keyword, 'DEREF': tokens.Keyword, 'DESC': tokens.Keyword.Order, 'DESCRIBE': tokens.Keyword, 'DESCRIPTOR': tokens.Keyword, 'DESTROY': tokens.Keyword, 'DESTRUCTOR': tokens.Keyword, 'DETERMINISTIC': tokens.Keyword, 'DIAGNOSTICS': tokens.Keyword, 'DICTIONARY': tokens.Keyword, 'DISABLE': tokens.Keyword, 'DISCONNECT': tokens.Keyword, 'DISPATCH': tokens.Keyword, 'DIV': tokens.Operator, 'DO': tokens.Keyword, 'DOMAIN': tokens.Keyword, 'DYNAMIC': tokens.Keyword, 'DYNAMIC_FUNCTION': tokens.Keyword, 'DYNAMIC_FUNCTION_CODE': tokens.Keyword, 'EACH': tokens.Keyword, 'ENABLE': tokens.Keyword, 'ENCODING': tokens.Keyword, 'ENCRYPTED': tokens.Keyword, 'END-EXEC': tokens.Keyword, 'ENGINE': tokens.Keyword, 'EQUALS': tokens.Keyword, 'ESCAPE': tokens.Keyword, 'EVERY': tokens.Keyword, 'EXCEPT': tokens.Keyword, 'EXCEPTION': tokens.Keyword, 'EXCLUDING': tokens.Keyword, 'EXCLUSIVE': tokens.Keyword, 'EXEC': tokens.Keyword, 'EXECUTE': tokens.Keyword, 'EXISTING': tokens.Keyword, 'EXISTS': tokens.Keyword, 'EXPLAIN': tokens.Keyword, 'EXTERNAL': tokens.Keyword, 'EXTRACT': tokens.Keyword, 'FALSE': tokens.Keyword, 'FETCH': tokens.Keyword, 'FILE': tokens.Keyword, 'FINAL': tokens.Keyword, 'FIRST': tokens.Keyword, 'FORCE': tokens.Keyword, 'FOREACH': tokens.Keyword, 'FOREIGN': tokens.Keyword, 'FORTRAN': tokens.Keyword, 'FORWARD': tokens.Keyword, 'FOUND': tokens.Keyword, 'FREE': tokens.Keyword, 'FREEZE': tokens.Keyword, 'FULL': tokens.Keyword, 'FUNCTION': tokens.Keyword, # 'G': tokens.Keyword, 
'GENERAL': tokens.Keyword, 'GENERATED': tokens.Keyword, 'GET': tokens.Keyword, 'GLOBAL': tokens.Keyword, 'GO': tokens.Keyword, 'GOTO': tokens.Keyword, 'GRANT': tokens.Keyword, 'GRANTED': tokens.Keyword, 'GROUPING': tokens.Keyword, 'HAVING': tokens.Keyword, 'HIERARCHY': tokens.Keyword, 'HOLD': tokens.Keyword, 'HOUR': tokens.Keyword, 'HOST': tokens.Keyword, 'IDENTIFIED': tokens.Keyword, 'IDENTITY': tokens.Keyword, 'IGNORE': tokens.Keyword, 'ILIKE': tokens.Keyword, 'IMMEDIATE': tokens.Keyword, 'IMMUTABLE': tokens.Keyword, 'IMPLEMENTATION': tokens.Keyword, 'IMPLICIT': tokens.Keyword, 'INCLUDING': tokens.Keyword, 'INCREMENT': tokens.Keyword, 'INDEX': tokens.Keyword, 'INDICATOR': tokens.Keyword, 'INFIX': tokens.Keyword, 'INHERITS': tokens.Keyword, 'INITIAL': tokens.Keyword, 'INITIALIZE': tokens.Keyword, 'INITIALLY': tokens.Keyword, 'INOUT': tokens.Keyword, 'INPUT': tokens.Keyword, 'INSENSITIVE': tokens.Keyword, 'INSTANTIABLE': tokens.Keyword, 'INSTEAD': tokens.Keyword, 'INTERSECT': tokens.Keyword, 'INTO': tokens.Keyword, 'INVOKER': tokens.Keyword, 'IS': tokens.Keyword, 'ISNULL': tokens.Keyword, 'ISOLATION': tokens.Keyword, 'ITERATE': tokens.Keyword, # 'K': tokens.Keyword, 'KEY': tokens.Keyword, 'KEY_MEMBER': tokens.Keyword, 'KEY_TYPE': tokens.Keyword, 'LANCOMPILER': tokens.Keyword, 'LANGUAGE': tokens.Keyword, 'LARGE': tokens.Keyword, 'LAST': tokens.Keyword, 'LATERAL': tokens.Keyword, 'LEADING': tokens.Keyword, 'LENGTH': tokens.Keyword, 'LESS': tokens.Keyword, 'LEVEL': tokens.Keyword, 'LIMIT': tokens.Keyword, 'LISTEN': tokens.Keyword, 'LOAD': tokens.Keyword, 'LOCAL': tokens.Keyword, 'LOCALTIME': tokens.Keyword, 'LOCALTIMESTAMP': tokens.Keyword, 'LOCATION': tokens.Keyword, 'LOCATOR': tokens.Keyword, 'LOCK': tokens.Keyword, 'LOWER': tokens.Keyword, # 'M': tokens.Keyword, 'MAP': tokens.Keyword, 'MATCH': tokens.Keyword, 'MAXEXTENTS': tokens.Keyword, 'MAXVALUE': tokens.Keyword, 'MESSAGE_LENGTH': tokens.Keyword, 'MESSAGE_OCTET_LENGTH': tokens.Keyword, 'MESSAGE_TEXT': 
tokens.Keyword, 'METHOD': tokens.Keyword, 'MINUTE': tokens.Keyword, 'MINUS': tokens.Keyword, 'MINVALUE': tokens.Keyword, 'MOD': tokens.Keyword, 'MODE': tokens.Keyword, 'MODIFIES': tokens.Keyword, 'MODIFY': tokens.Keyword, 'MONTH': tokens.Keyword, 'MORE': tokens.Keyword, 'MOVE': tokens.Keyword, 'MUMPS': tokens.Keyword, 'NAMES': tokens.Keyword, 'NATIONAL': tokens.Keyword, 'NATURAL': tokens.Keyword, 'NCHAR': tokens.Keyword, 'NCLOB': tokens.Keyword, 'NEW': tokens.Keyword, 'NEXT': tokens.Keyword, 'NO': tokens.Keyword, 'NOAUDIT': tokens.Keyword, 'NOCOMPRESS': tokens.Keyword, 'NOCREATEDB': tokens.Keyword, 'NOCREATEUSER': tokens.Keyword, 'NONE': tokens.Keyword, 'NOT': tokens.Keyword, 'NOTFOUND': tokens.Keyword, 'NOTHING': tokens.Keyword, 'NOTIFY': tokens.Keyword, 'NOTNULL': tokens.Keyword, 'NOWAIT': tokens.Keyword, 'NULL': tokens.Keyword, 'NULLABLE': tokens.Keyword, 'NULLIF': tokens.Keyword, 'OBJECT': tokens.Keyword, 'OCTET_LENGTH': tokens.Keyword, 'OF': tokens.Keyword, 'OFF': tokens.Keyword, 'OFFLINE': tokens.Keyword, 'OFFSET': tokens.Keyword, 'OIDS': tokens.Keyword, 'OLD': tokens.Keyword, 'ONLINE': tokens.Keyword, 'ONLY': tokens.Keyword, 'OPEN': tokens.Keyword, 'OPERATION': tokens.Keyword, 'OPERATOR': tokens.Keyword, 'OPTION': tokens.Keyword, 'OPTIONS': tokens.Keyword, 'ORDINALITY': tokens.Keyword, 'OUT': tokens.Keyword, 'OUTPUT': tokens.Keyword, 'OVERLAPS': tokens.Keyword, 'OVERLAY': tokens.Keyword, 'OVERRIDING': tokens.Keyword, 'OWNER': tokens.Keyword, 'QUARTER': tokens.Keyword, 'PAD': tokens.Keyword, 'PARAMETER': tokens.Keyword, 'PARAMETERS': tokens.Keyword, 'PARAMETER_MODE': tokens.Keyword, 'PARAMETER_NAME': tokens.Keyword, 'PARAMETER_ORDINAL_POSITION': tokens.Keyword, 'PARAMETER_SPECIFIC_CATALOG': tokens.Keyword, 'PARAMETER_SPECIFIC_NAME': tokens.Keyword, 'PARAMETER_SPECIFIC_SCHEMA': tokens.Keyword, 'PARTIAL': tokens.Keyword, 'PASCAL': tokens.Keyword, 'PCTFREE': tokens.Keyword, 'PENDANT': tokens.Keyword, 'PLACING': tokens.Keyword, 'PLI': tokens.Keyword, 'POSITION': 
tokens.Keyword, 'POSTFIX': tokens.Keyword, 'PRECISION': tokens.Keyword, 'PREFIX': tokens.Keyword, 'PREORDER': tokens.Keyword, 'PREPARE': tokens.Keyword, 'PRESERVE': tokens.Keyword, 'PRIMARY': tokens.Keyword, 'PRIOR': tokens.Keyword, 'PRIVILEGES': tokens.Keyword, 'PROCEDURAL': tokens.Keyword, 'PROCEDURE': tokens.Keyword, 'PUBLIC': tokens.Keyword, 'RAISE': tokens.Keyword, 'RAW': tokens.Keyword, 'READ': tokens.Keyword, 'READS': tokens.Keyword, 'RECHECK': tokens.Keyword, 'RECURSIVE': tokens.Keyword, 'REF': tokens.Keyword, 'REFERENCES': tokens.Keyword, 'REFERENCING': tokens.Keyword, 'REINDEX': tokens.Keyword, 'RELATIVE': tokens.Keyword, 'RENAME': tokens.Keyword, 'REPEATABLE': tokens.Keyword, 'RESET': tokens.Keyword, 'RESOURCE': tokens.Keyword, 'RESTART': tokens.Keyword, 'RESTRICT': tokens.Keyword, 'RESULT': tokens.Keyword, 'RETURN': tokens.Keyword, 'RETURNED_LENGTH': tokens.Keyword, 'RETURNED_OCTET_LENGTH': tokens.Keyword, 'RETURNED_SQLSTATE': tokens.Keyword, 'RETURNING': tokens.Keyword, 'RETURNS': tokens.Keyword, 'REVOKE': tokens.Keyword, 'RIGHT': tokens.Keyword, 'ROLE': tokens.Keyword, 'ROLLBACK': tokens.Keyword.DML, 'ROLLUP': tokens.Keyword, 'ROUTINE': tokens.Keyword, 'ROUTINE_CATALOG': tokens.Keyword, 'ROUTINE_NAME': tokens.Keyword, 'ROUTINE_SCHEMA': tokens.Keyword, 'ROW': tokens.Keyword, 'ROWS': tokens.Keyword, 'ROW_COUNT': tokens.Keyword, 'RULE': tokens.Keyword, 'SAVE_POINT': tokens.Keyword, 'SCALE': tokens.Keyword, 'SCHEMA': tokens.Keyword, 'SCHEMA_NAME': tokens.Keyword, 'SCOPE': tokens.Keyword, 'SCROLL': tokens.Keyword, 'SEARCH': tokens.Keyword, 'SECOND': tokens.Keyword, 'SECURITY': tokens.Keyword, 'SELF': tokens.Keyword, 'SENSITIVE': tokens.Keyword, 'SEQUENCE': tokens.Keyword, 'SERIALIZABLE': tokens.Keyword, 'SERVER_NAME': tokens.Keyword, 'SESSION': tokens.Keyword, 'SESSION_USER': tokens.Keyword, 'SETOF': tokens.Keyword, 'SETS': tokens.Keyword, 'SHARE': tokens.Keyword, 'SHOW': tokens.Keyword, 'SIMILAR': tokens.Keyword, 'SIMPLE': tokens.Keyword, 'SIZE': 
tokens.Keyword, 'SOME': tokens.Keyword, 'SOURCE': tokens.Keyword, 'SPACE': tokens.Keyword, 'SPECIFIC': tokens.Keyword, 'SPECIFICTYPE': tokens.Keyword, 'SPECIFIC_NAME': tokens.Keyword, 'SQL': tokens.Keyword, 'SQLBUF': tokens.Keyword, 'SQLCODE': tokens.Keyword, 'SQLERROR': tokens.Keyword, 'SQLEXCEPTION': tokens.Keyword, 'SQLSTATE': tokens.Keyword, 'SQLWARNING': tokens.Keyword, 'STABLE': tokens.Keyword, 'START': tokens.Keyword.DML, # 'STATE': tokens.Keyword, 'STATEMENT': tokens.Keyword, 'STATIC': tokens.Keyword, 'STATISTICS': tokens.Keyword, 'STDIN': tokens.Keyword, 'STDOUT': tokens.Keyword, 'STORAGE': tokens.Keyword, 'STRICT': tokens.Keyword, 'STRUCTURE': tokens.Keyword, 'STYPE': tokens.Keyword, 'SUBCLASS_ORIGIN': tokens.Keyword, 'SUBLIST': tokens.Keyword, 'SUBSTRING': tokens.Keyword, 'SUCCESSFUL': tokens.Keyword, 'SUM': tokens.Keyword, 'SYMMETRIC': tokens.Keyword, 'SYNONYM': tokens.Keyword, 'SYSID': tokens.Keyword, 'SYSTEM': tokens.Keyword, 'SYSTEM_USER': tokens.Keyword, 'TABLE': tokens.Keyword, 'TABLE_NAME': tokens.Keyword, 'TEMP': tokens.Keyword, 'TEMPLATE': tokens.Keyword, 'TEMPORARY': tokens.Keyword, 'TERMINATE': tokens.Keyword, 'THAN': tokens.Keyword, 'TIMESTAMP': tokens.Keyword, 'TIMEZONE_HOUR': tokens.Keyword, 'TIMEZONE_MINUTE': tokens.Keyword, 'TO': tokens.Keyword, 'TOAST': tokens.Keyword, 'TRAILING': tokens.Keyword, 'TRANSATION': tokens.Keyword, 'TRANSACTIONS_COMMITTED': tokens.Keyword, 'TRANSACTIONS_ROLLED_BACK': tokens.Keyword, 'TRANSATION_ACTIVE': tokens.Keyword, 'TRANSFORM': tokens.Keyword, 'TRANSFORMS': tokens.Keyword, 'TRANSLATE': tokens.Keyword, 'TRANSLATION': tokens.Keyword, 'TREAT': tokens.Keyword, 'TRIGGER': tokens.Keyword, 'TRIGGER_CATALOG': tokens.Keyword, 'TRIGGER_NAME': tokens.Keyword, 'TRIGGER_SCHEMA': tokens.Keyword, 'TRIM': tokens.Keyword, 'TRUE': tokens.Keyword, 'TRUNCATE': tokens.Keyword, 'TRUSTED': tokens.Keyword, 'TYPE': tokens.Keyword, 'UID': tokens.Keyword, 'UNCOMMITTED': tokens.Keyword, 'UNDER': tokens.Keyword, 'UNENCRYPTED': 
tokens.Keyword, 'UNION': tokens.Keyword, 'UNIQUE': tokens.Keyword, 'UNKNOWN': tokens.Keyword, 'UNLISTEN': tokens.Keyword, 'UNNAMED': tokens.Keyword, 'UNNEST': tokens.Keyword, 'UNTIL': tokens.Keyword, 'UPPER': tokens.Keyword, 'USAGE': tokens.Keyword, 'USE': tokens.Keyword, 'USER': tokens.Keyword, 'USER_DEFINED_TYPE_CATALOG': tokens.Keyword, 'USER_DEFINED_TYPE_NAME': tokens.Keyword, 'USER_DEFINED_TYPE_SCHEMA': tokens.Keyword, 'USING': tokens.Keyword, 'VACUUM': tokens.Keyword, 'VALID': tokens.Keyword, 'VALIDATE': tokens.Keyword, 'VALIDATOR': tokens.Keyword, 'VALUES': tokens.Keyword, 'VARIABLE': tokens.Keyword, 'VERBOSE': tokens.Keyword, 'VERSION': tokens.Keyword, 'VIEW': tokens.Keyword, 'VOLATILE': tokens.Keyword, 'WEEK': tokens.Keyword, 'WHENEVER': tokens.Keyword, 'WITH': tokens.Keyword.CTE, 'WITHOUT': tokens.Keyword, 'WORK': tokens.Keyword, 'WRITE': tokens.Keyword, 'YEAR': tokens.Keyword, 'ZONE': tokens.Keyword, # Name.Builtin 'ARRAY': tokens.Name.Builtin, 'BIGINT': tokens.Name.Builtin, 'BINARY': tokens.Name.Builtin, 'BIT': tokens.Name.Builtin, 'BLOB': tokens.Name.Builtin, 'BOOLEAN': tokens.Name.Builtin, 'CHAR': tokens.Name.Builtin, 'CHARACTER': tokens.Name.Builtin, 'DATE': tokens.Name.Builtin, 'DEC': tokens.Name.Builtin, 'DECIMAL': tokens.Name.Builtin, 'FILE_TYPE': tokens.Name.Builtin, 'FLOAT': tokens.Name.Builtin, 'INT': tokens.Name.Builtin, 'INT8': tokens.Name.Builtin, 'INTEGER': tokens.Name.Builtin, 'INTERVAL': tokens.Name.Builtin, 'LONG': tokens.Name.Builtin, 'NATURALN': tokens.Name.Builtin, 'NVARCHAR': tokens.Name.Builtin, 'NUMBER': tokens.Name.Builtin, 'NUMERIC': tokens.Name.Builtin, 'PLS_INTEGER': tokens.Name.Builtin, 'POSITIVE': tokens.Name.Builtin, 'POSITIVEN': tokens.Name.Builtin, 'REAL': tokens.Name.Builtin, 'ROWID': tokens.Name.Builtin, 'ROWLABEL': tokens.Name.Builtin, 'ROWNUM': tokens.Name.Builtin, 'SERIAL': tokens.Name.Builtin, 'SERIAL8': tokens.Name.Builtin, 'SIGNED': tokens.Name.Builtin, 'SIGNTYPE': tokens.Name.Builtin, 'SIMPLE_DOUBLE': 
tokens.Name.Builtin, 'SIMPLE_FLOAT': tokens.Name.Builtin, 'SIMPLE_INTEGER': tokens.Name.Builtin, 'SMALLINT': tokens.Name.Builtin, 'SYS_REFCURSOR': tokens.Name.Builtin, 'SYSDATE': tokens.Name, 'TEXT': tokens.Name.Builtin, 'TINYINT': tokens.Name.Builtin, 'UNSIGNED': tokens.Name.Builtin, 'UROWID': tokens.Name.Builtin, 'UTL_FILE': tokens.Name.Builtin, 'VARCHAR': tokens.Name.Builtin, 'VARCHAR2': tokens.Name.Builtin, 'VARYING': tokens.Name.Builtin, } KEYWORDS_COMMON = { 'SELECT': tokens.Keyword.DML, 'INSERT': tokens.Keyword.DML, 'DELETE': tokens.Keyword.DML, 'UPDATE': tokens.Keyword.DML, 'UPSERT': tokens.Keyword.DML, 'REPLACE': tokens.Keyword.DML, 'MERGE': tokens.Keyword.DML, 'DROP': tokens.Keyword.DDL, 'CREATE': tokens.Keyword.DDL, 'ALTER': tokens.Keyword.DDL, 'WHERE': tokens.Keyword, 'FROM': tokens.Keyword, 'INNER': tokens.Keyword, 'JOIN': tokens.Keyword, 'STRAIGHT_JOIN': tokens.Keyword, 'AND': tokens.Keyword, 'OR': tokens.Keyword, 'LIKE': tokens.Keyword, 'ON': tokens.Keyword, 'IN': tokens.Keyword, 'SET': tokens.Keyword, 'BY': tokens.Keyword, 'GROUP': tokens.Keyword, 'ORDER': tokens.Keyword, 'LEFT': tokens.Keyword, 'OUTER': tokens.Keyword, 'FULL': tokens.Keyword, 'IF': tokens.Keyword, 'END': tokens.Keyword, 'THEN': tokens.Keyword, 'LOOP': tokens.Keyword, 'AS': tokens.Keyword, 'ELSE': tokens.Keyword, 'FOR': tokens.Keyword, 'WHILE': tokens.Keyword, 'CASE': tokens.Keyword, 'WHEN': tokens.Keyword, 'MIN': tokens.Keyword, 'MAX': tokens.Keyword, 'DISTINCT': tokens.Keyword, } KEYWORDS_ORACLE = { 'ARCHIVE': tokens.Keyword, 'ARCHIVELOG': tokens.Keyword, 'BACKUP': tokens.Keyword, 'BECOME': tokens.Keyword, 'BLOCK': tokens.Keyword, 'BODY': tokens.Keyword, 'CANCEL': tokens.Keyword, 'CHANGE': tokens.Keyword, 'COMPILE': tokens.Keyword, 'CONTENTS': tokens.Keyword, 'CONTROLFILE': tokens.Keyword, 'DATAFILE': tokens.Keyword, 'DBA': tokens.Keyword, 'DISMOUNT': tokens.Keyword, 'DOUBLE': tokens.Keyword, 'DUMP': tokens.Keyword, 'ELSIF': tokens.Keyword, 'EVENTS': tokens.Keyword, 'EXCEPTIONS': 
tokens.Keyword, 'EXPLAIN': tokens.Keyword, 'EXTENT': tokens.Keyword, 'EXTERNALLY': tokens.Keyword, 'FLUSH': tokens.Keyword, 'FREELIST': tokens.Keyword, 'FREELISTS': tokens.Keyword, # groups seems too common as table name # 'GROUPS': tokens.Keyword, 'INDICATOR': tokens.Keyword, 'INITRANS': tokens.Keyword, 'INSTANCE': tokens.Keyword, 'LAYER': tokens.Keyword, 'LINK': tokens.Keyword, 'LISTS': tokens.Keyword, 'LOGFILE': tokens.Keyword, 'MANAGE': tokens.Keyword, 'MANUAL': tokens.Keyword, 'MAXDATAFILES': tokens.Keyword, 'MAXINSTANCES': tokens.Keyword, 'MAXLOGFILES': tokens.Keyword, 'MAXLOGHISTORY': tokens.Keyword, 'MAXLOGMEMBERS': tokens.Keyword, 'MAXTRANS': tokens.Keyword, 'MINEXTENTS': tokens.Keyword, 'MODULE': tokens.Keyword, 'MOUNT': tokens.Keyword, 'NOARCHIVELOG': tokens.Keyword, 'NOCACHE': tokens.Keyword, 'NOCYCLE': tokens.Keyword, 'NOMAXVALUE': tokens.Keyword, 'NOMINVALUE': tokens.Keyword, 'NOORDER': tokens.Keyword, 'NORESETLOGS': tokens.Keyword, 'NORMAL': tokens.Keyword, 'NOSORT': tokens.Keyword, 'OPTIMAL': tokens.Keyword, 'OWN': tokens.Keyword, 'PACKAGE': tokens.Keyword, 'PARALLEL': tokens.Keyword, 'PCTINCREASE': tokens.Keyword, 'PCTUSED': tokens.Keyword, 'PLAN': tokens.Keyword, 'PRIVATE': tokens.Keyword, 'PROFILE': tokens.Keyword, 'QUOTA': tokens.Keyword, 'RECOVER': tokens.Keyword, 'RESETLOGS': tokens.Keyword, 'RESTRICTED': tokens.Keyword, 'REUSE': tokens.Keyword, 'ROLES': tokens.Keyword, 'SAVEPOINT': tokens.Keyword, 'SCN': tokens.Keyword, 'SECTION': tokens.Keyword, 'SEGMENT': tokens.Keyword, 'SHARED': tokens.Keyword, 'SNAPSHOT': tokens.Keyword, 'SORT': tokens.Keyword, 'STATEMENT_ID': tokens.Keyword, 'STOP': tokens.Keyword, 'SWITCH': tokens.Keyword, 'TABLES': tokens.Keyword, 'TABLESPACE': tokens.Keyword, 'THREAD': tokens.Keyword, 'TIME': tokens.Keyword, 'TRACING': tokens.Keyword, 'TRANSACTION': tokens.Keyword, 'TRIGGERS': tokens.Keyword, 'UNLIMITED': tokens.Keyword, 'UNLOCK': tokens.Keyword, } # PostgreSQL Syntax KEYWORDS_PLPGSQL = { 'CONFLICT': tokens.Keyword, 
'WINDOW': tokens.Keyword, 'PARTITION': tokens.Keyword, 'OVER': tokens.Keyword, 'PERFORM': tokens.Keyword, 'NOTICE': tokens.Keyword, 'PLPGSQL': tokens.Keyword, 'INHERIT': tokens.Keyword, 'INDEXES': tokens.Keyword, 'ON_ERROR_STOP': tokens.Keyword, 'BYTEA': tokens.Keyword, 'BIGSERIAL': tokens.Keyword, 'BIT VARYING': tokens.Keyword, 'BOX': tokens.Keyword, 'CHARACTER': tokens.Keyword, 'CHARACTER VARYING': tokens.Keyword, 'CIDR': tokens.Keyword, 'CIRCLE': tokens.Keyword, 'DOUBLE PRECISION': tokens.Keyword, 'INET': tokens.Keyword, 'JSON': tokens.Keyword, 'JSONB': tokens.Keyword, 'LINE': tokens.Keyword, 'LSEG': tokens.Keyword, 'MACADDR': tokens.Keyword, 'MONEY': tokens.Keyword, 'PATH': tokens.Keyword, 'PG_LSN': tokens.Keyword, 'POINT': tokens.Keyword, 'POLYGON': tokens.Keyword, 'SMALLSERIAL': tokens.Keyword, 'TSQUERY': tokens.Keyword, 'TSVECTOR': tokens.Keyword, 'TXID_SNAPSHOT': tokens.Keyword, 'UUID': tokens.Keyword, 'XML': tokens.Keyword, 'FOR': tokens.Keyword, 'IN': tokens.Keyword, 'LOOP': tokens.Keyword, } # Hive Syntax KEYWORDS_HQL = { 'EXPLODE': tokens.Keyword, 'DIRECTORY': tokens.Keyword, 'DISTRIBUTE': tokens.Keyword, 'INCLUDE': tokens.Keyword, 'LOCATE': tokens.Keyword, 'OVERWRITE': tokens.Keyword, 'POSEXPLODE': tokens.Keyword, 'ARRAY_CONTAINS': tokens.Keyword, 'CMP': tokens.Keyword, 'COLLECT_LIST': tokens.Keyword, 'CONCAT': tokens.Keyword, 'CONDITION': tokens.Keyword, 'DATE_ADD': tokens.Keyword, 'DATE_SUB': tokens.Keyword, 'DECODE': tokens.Keyword, 'DBMS_OUTPUT': tokens.Keyword, 'ELEMENTS': tokens.Keyword, 'EXCHANGE': tokens.Keyword, 'EXTENDED': tokens.Keyword, 'FLOOR': tokens.Keyword, 'FOLLOWING': tokens.Keyword, 'FROM_UNIXTIME': tokens.Keyword, 'FTP': tokens.Keyword, 'HOUR': tokens.Keyword, 'INLINE': tokens.Keyword, 'INSTR': tokens.Keyword, 'LEN': tokens.Keyword, 'MAP': tokens.Name.Builtin, 'MAXELEMENT': tokens.Keyword, 'MAXINDEX': tokens.Keyword, 'MAX_PART_DATE': tokens.Keyword, 'MAX_PART_INT': tokens.Keyword, 'MAX_PART_STRING': tokens.Keyword, 'MINELEMENT': 
tokens.Keyword, 'MININDEX': tokens.Keyword, 'MIN_PART_DATE': tokens.Keyword, 'MIN_PART_INT': tokens.Keyword, 'MIN_PART_STRING': tokens.Keyword, 'NOW': tokens.Keyword, 'NVL': tokens.Keyword, 'NVL2': tokens.Keyword, 'PARSE_URL_TUPLE': tokens.Keyword, 'PART_LOC': tokens.Keyword, 'PART_COUNT': tokens.Keyword, 'PART_COUNT_BY': tokens.Keyword, 'PRINT': tokens.Keyword, 'PUT_LINE': tokens.Keyword, 'RANGE': tokens.Keyword, 'REDUCE': tokens.Keyword, 'REGEXP_REPLACE': tokens.Keyword, 'RESIGNAL': tokens.Keyword, 'RTRIM': tokens.Keyword, 'SIGN': tokens.Keyword, 'SIGNAL': tokens.Keyword, 'SIN': tokens.Keyword, 'SPLIT': tokens.Keyword, 'SQRT': tokens.Keyword, 'STACK': tokens.Keyword, 'STR': tokens.Keyword, 'STRING': tokens.Name.Builtin, 'STRUCT': tokens.Name.Builtin, 'SUBSTR': tokens.Keyword, 'SUMMARY': tokens.Keyword, 'TBLPROPERTIES': tokens.Keyword, 'TIMESTAMP': tokens.Name.Builtin, 'TIMESTAMP_ISO': tokens.Keyword, 'TO_CHAR': tokens.Keyword, 'TO_DATE': tokens.Keyword, 'TO_TIMESTAMP': tokens.Keyword, 'TRUNC': tokens.Keyword, 'UNBOUNDED': tokens.Keyword, 'UNIQUEJOIN': tokens.Keyword, 'UNIX_TIMESTAMP': tokens.Keyword, 'UTC_TIMESTAMP': tokens.Keyword, 'VIEWS': tokens.Keyword, 'EXIT': tokens.Keyword, 'BREAK': tokens.Keyword, 'LEAVE': tokens.Keyword, } KEYWORDS_MSACCESS = { 'DISTINCTROW': tokens.Keyword, } sqlparse-0.4.4/sqlparse/lexer.py000066400000000000000000000132321454634525500167120ustar00rootroot00000000000000# # Copyright (C) 2009-2020 the sqlparse authors and contributors # # # This module is part of python-sqlparse and is released under # the BSD License: https://opensource.org/licenses/BSD-3-Clause """SQL Lexer""" import re # This code is based on the SqlLexer in pygments. # http://pygments.org/ # It's separated from the rest of pygments to increase performance # and to allow some customizations. from io import TextIOBase from sqlparse import tokens, keywords from sqlparse.utils import consume class Lexer: """The Lexer supports configurable syntax. 
    To add support for additional keywords, use the `add_keywords` method."""

    _default_intance = None

    # Development notes:
    # - This class is prepared to be able to support additional SQL dialects
    #   in the future by adding additional functions that take the place of
    #   the function default_initialization()
    # - The lexer class uses an explicit singleton behavior with the
    #   instance-getter method get_default_instance(). This mechanism has
    #   the advantage that the call signatures of the entry points to the
    #   sqlparse library are not affected. Also, usage of sqlparse in third
    #   party code does not need to be adapted. On the other hand, singleton
    #   behavior is not thread-safe, and the current implementation does not
    #   easily allow for multiple SQL dialects to be parsed in the same
    #   process. Such behavior can be supported in the future by passing a
    #   suitably initialized lexer object as an additional parameter to the
    #   entry-point functions (such as `parse`). Code will need to be written
    #   to pass down and utilize such an object. The current implementation
    #   is prepared to support this thread-safe approach without the
    #   default_instance part needing to change its interface.

    @classmethod
    def get_default_instance(cls):
        """Returns the lexer instance used internally
        by the sqlparse core functions."""
        if cls._default_intance is None:
            cls._default_intance = cls()
            cls._default_intance.default_initialization()
        return cls._default_intance

    def default_initialization(self):
        """Initialize the lexer with default dictionaries.
        Useful if you need to revert custom syntax settings."""
        self.clear()
        self.set_SQL_REGEX(keywords.SQL_REGEX)
        self.add_keywords(keywords.KEYWORDS_COMMON)
        self.add_keywords(keywords.KEYWORDS_ORACLE)
        self.add_keywords(keywords.KEYWORDS_PLPGSQL)
        self.add_keywords(keywords.KEYWORDS_HQL)
        self.add_keywords(keywords.KEYWORDS_MSACCESS)
        self.add_keywords(keywords.KEYWORDS)

    def clear(self):
        """Clear all syntax configurations.
        Useful if you want to load a reduced set of syntax configurations.
        After this call, regexps and keyword dictionaries need to be loaded
        to make the lexer functional again."""
        self._SQL_REGEX = []
        self._keywords = []

    def set_SQL_REGEX(self, SQL_REGEX):
        """Set the list of regex that will parse the SQL."""
        FLAGS = re.IGNORECASE | re.UNICODE
        self._SQL_REGEX = [
            (re.compile(rx, FLAGS).match, tt)
            for rx, tt in SQL_REGEX
        ]

    def add_keywords(self, keywords):
        """Add keyword dictionaries. Keywords are looked up in the same order
        that dictionaries were added."""
        self._keywords.append(keywords)

    def is_keyword(self, value):
        """Checks for a keyword.

        If the given value is in one of the KEYWORDS_* dictionaries
        it's considered a keyword and the matching token type is returned.
        Otherwise, tokens.Name is returned. In both cases the result is a
        (tokentype, value) tuple.
        """
        val = value.upper()
        for kwdict in self._keywords:
            if val in kwdict:
                return kwdict[val], value
        else:
            return tokens.Name, value

    def get_tokens(self, text, encoding=None):
        """
        Return an iterable of (tokentype, value) pairs generated from
        `text`.

        `text` may be a str, a bytes object, or a text stream (which is
        read in full). Bytes are decoded with `encoding` if given;
        otherwise UTF-8 is tried first, falling back to 'unicode-escape'.
        """
        if isinstance(text, TextIOBase):
            text = text.read()

        if isinstance(text, str):
            pass
        elif isinstance(text, bytes):
            if encoding:
                text = text.decode(encoding)
            else:
                try:
                    text = text.decode('utf-8')
                except UnicodeDecodeError:
                    text = text.decode('unicode-escape')
        else:
            raise TypeError("Expected text or file-like object, got {!r}".
format(type(text))) iterable = enumerate(text) for pos, char in iterable: for rexmatch, action in self._SQL_REGEX: m = rexmatch(text, pos) if not m: continue elif isinstance(action, tokens._TokenType): yield action, m.group() elif action is keywords.PROCESS_AS_KEYWORD: yield self.is_keyword(m.group()) consume(iterable, m.end() - pos - 1) break else: yield tokens.Error, char def tokenize(sql, encoding=None): """Tokenize sql. Tokenize *sql* using the :class:`Lexer` and return a 2-tuple stream of ``(token type, value)`` items. """ return Lexer.get_default_instance().get_tokens(sql, encoding) sqlparse-0.4.4/sqlparse/sql.py000066400000000000000000000476611454634525500164070ustar00rootroot00000000000000# # Copyright (C) 2009-2020 the sqlparse authors and contributors # # # This module is part of python-sqlparse and is released under # the BSD License: https://opensource.org/licenses/BSD-3-Clause """This module contains classes representing syntactical elements of SQL.""" import re from sqlparse import tokens as T from sqlparse.utils import imt, remove_quotes class NameAliasMixin: """Implements get_real_name and get_alias.""" def get_real_name(self): """Returns the real name (object name) of this identifier.""" # a.b dot_idx, _ = self.token_next_by(m=(T.Punctuation, '.')) return self._get_first_name(dot_idx, real_name=True) def get_alias(self): """Returns the alias for this identifier or ``None``.""" # "name AS alias" kw_idx, kw = self.token_next_by(m=(T.Keyword, 'AS')) if kw is not None: return self._get_first_name(kw_idx + 1, keywords=True) # "name alias" or "complicated column expression alias" _, ws = self.token_next_by(t=T.Whitespace) if len(self.tokens) > 2 and ws is not None: return self._get_first_name(reverse=True) class Token: """Base class for all other classes in this module. It represents a single token and has two instance attributes: ``value`` is the unchanged value of the token and ``ttype`` is the type of the token. 
""" __slots__ = ('value', 'ttype', 'parent', 'normalized', 'is_keyword', 'is_group', 'is_whitespace') def __init__(self, ttype, value): value = str(value) self.value = value self.ttype = ttype self.parent = None self.is_group = False self.is_keyword = ttype in T.Keyword self.is_whitespace = self.ttype in T.Whitespace self.normalized = value.upper() if self.is_keyword else value def __str__(self): return self.value # Pending tokenlist __len__ bug fix # def __len__(self): # return len(self.value) def __repr__(self): cls = self._get_repr_name() value = self._get_repr_value() q = '"' if value.startswith("'") and value.endswith("'") else "'" return "<{cls} {q}{value}{q} at 0x{id:2X}>".format( id=id(self), **locals()) def _get_repr_name(self): return str(self.ttype).split('.')[-1] def _get_repr_value(self): raw = str(self) if len(raw) > 7: raw = raw[:6] + '...' return re.sub(r'\s+', ' ', raw) def flatten(self): """Resolve subgroups.""" yield self def match(self, ttype, values, regex=False): """Checks whether the token matches the given arguments. *ttype* is a token type. If this token doesn't match the given token type. *values* is a list of possible values for this token. The values are OR'ed together so if only one of the values matches ``True`` is returned. Except for keyword tokens the comparison is case-sensitive. For convenience it's OK to pass in a single string. If *regex* is ``True`` (default is ``False``) the given values are treated as regular expressions. 
""" type_matched = self.ttype is ttype if not type_matched or values is None: return type_matched if isinstance(values, str): values = (values,) if regex: # TODO: Add test for regex with is_keyboard = false flag = re.IGNORECASE if self.is_keyword else 0 values = (re.compile(v, flag) for v in values) for pattern in values: if pattern.search(self.normalized): return True return False if self.is_keyword: values = (v.upper() for v in values) return self.normalized in values def within(self, group_cls): """Returns ``True`` if this token is within *group_cls*. Use this method for example to check if an identifier is within a function: ``t.within(sql.Function)``. """ parent = self.parent while parent: if isinstance(parent, group_cls): return True parent = parent.parent return False def is_child_of(self, other): """Returns ``True`` if this token is a direct child of *other*.""" return self.parent == other def has_ancestor(self, other): """Returns ``True`` if *other* is in this tokens ancestry.""" parent = self.parent while parent: if parent == other: return True parent = parent.parent return False class TokenList(Token): """A group of tokens. It has an additional instance attribute ``tokens`` which holds a list of child-tokens. 
""" __slots__ = 'tokens' def __init__(self, tokens=None): self.tokens = tokens or [] [setattr(token, 'parent', self) for token in self.tokens] super().__init__(None, str(self)) self.is_group = True def __str__(self): return ''.join(token.value for token in self.flatten()) # weird bug # def __len__(self): # return len(self.tokens) def __iter__(self): return iter(self.tokens) def __getitem__(self, item): return self.tokens[item] def _get_repr_name(self): return type(self).__name__ def _pprint_tree(self, max_depth=None, depth=0, f=None, _pre=''): """Pretty-print the object tree.""" token_count = len(self.tokens) for idx, token in enumerate(self.tokens): cls = token._get_repr_name() value = token._get_repr_value() last = idx == (token_count - 1) pre = '`- ' if last else '|- ' q = '"' if value.startswith("'") and value.endswith("'") else "'" print("{_pre}{pre}{idx} {cls} {q}{value}{q}" .format(**locals()), file=f) if token.is_group and (max_depth is None or depth < max_depth): parent_pre = ' ' if last else '| ' token._pprint_tree(max_depth, depth + 1, f, _pre + parent_pre) def get_token_at_offset(self, offset): """Returns the token that is on position offset.""" idx = 0 for token in self.flatten(): end = idx + len(token.value) if idx <= offset < end: return token idx = end def flatten(self): """Generator yielding ungrouped tokens. This method is recursively called for all child tokens. 
""" for token in self.tokens: if token.is_group: yield from token.flatten() else: yield token def get_sublists(self): for token in self.tokens: if token.is_group: yield token @property def _groupable_tokens(self): return self.tokens def _token_matching(self, funcs, start=0, end=None, reverse=False): """next token that match functions""" if start is None: return None if not isinstance(funcs, (list, tuple)): funcs = (funcs,) if reverse: assert end is None indexes = range(start - 2, -1, -1) else: if end is None: end = len(self.tokens) indexes = range(start, end) for idx in indexes: token = self.tokens[idx] for func in funcs: if func(token): return idx, token return None, None def token_first(self, skip_ws=True, skip_cm=False): """Returns the first child token. If *skip_ws* is ``True`` (the default), whitespace tokens are ignored. if *skip_cm* is ``True`` (default: ``False``), comments are ignored too. """ # this on is inconsistent, using Comment instead of T.Comment... def matcher(tk): return not ((skip_ws and tk.is_whitespace) or (skip_cm and imt(tk, t=T.Comment, i=Comment))) return self._token_matching(matcher)[1] def token_next_by(self, i=None, m=None, t=None, idx=-1, end=None): idx += 1 return self._token_matching(lambda tk: imt(tk, i, m, t), idx, end) def token_not_matching(self, funcs, idx): funcs = (funcs,) if not isinstance(funcs, (list, tuple)) else funcs funcs = [lambda tk: not func(tk) for func in funcs] return self._token_matching(funcs, idx) def token_matching(self, funcs, idx): return self._token_matching(funcs, idx)[1] def token_prev(self, idx, skip_ws=True, skip_cm=False): """Returns the previous token relative to *idx*. If *skip_ws* is ``True`` (the default) whitespace tokens are ignored. If *skip_cm* is ``True`` comments are ignored. ``None`` is returned if there's no previous token. 
""" return self.token_next(idx, skip_ws, skip_cm, _reverse=True) # TODO: May need to re-add default value to idx def token_next(self, idx, skip_ws=True, skip_cm=False, _reverse=False): """Returns the next token relative to *idx*. If *skip_ws* is ``True`` (the default) whitespace tokens are ignored. If *skip_cm* is ``True`` comments are ignored. ``None`` is returned if there's no next token. """ if idx is None: return None, None idx += 1 # alot of code usage current pre-compensates for this def matcher(tk): return not ((skip_ws and tk.is_whitespace) or (skip_cm and imt(tk, t=T.Comment, i=Comment))) return self._token_matching(matcher, idx, reverse=_reverse) def token_index(self, token, start=0): """Return list index of token.""" start = start if isinstance(start, int) else self.token_index(start) return start + self.tokens[start:].index(token) def group_tokens(self, grp_cls, start, end, include_end=True, extend=False): """Replace tokens by an instance of *grp_cls*.""" start_idx = start start = self.tokens[start_idx] end_idx = end + include_end # will be needed later for new group_clauses # while skip_ws and tokens and tokens[-1].is_whitespace: # tokens = tokens[:-1] if extend and isinstance(start, grp_cls): subtokens = self.tokens[start_idx + 1:end_idx] grp = start grp.tokens.extend(subtokens) del self.tokens[start_idx + 1:end_idx] grp.value = str(start) else: subtokens = self.tokens[start_idx:end_idx] grp = grp_cls(subtokens) self.tokens[start_idx:end_idx] = [grp] grp.parent = self for token in subtokens: token.parent = grp return grp def insert_before(self, where, token): """Inserts *token* before *where*.""" if not isinstance(where, int): where = self.token_index(where) token.parent = self self.tokens.insert(where, token) def insert_after(self, where, token, skip_ws=True): """Inserts *token* after *where*.""" if not isinstance(where, int): where = self.token_index(where) nidx, next_ = self.token_next(where, skip_ws=skip_ws) token.parent = self if next_ is None: 
            self.tokens.append(token)
        else:
            self.tokens.insert(nidx, token)

    def has_alias(self):
        """Returns ``True`` if an alias is present."""
        return self.get_alias() is not None

    def get_alias(self):
        """Returns the alias for this identifier or ``None``."""
        return None

    def get_name(self):
        """Returns the name of this identifier.

        This is either its alias or its real name. The returned value can
        be considered as the name under which the object corresponding to
        this identifier is known within the current statement.
        """
        return self.get_alias() or self.get_real_name()

    def get_real_name(self):
        """Returns the real name (object name) of this identifier."""
        return None

    def get_parent_name(self):
        """Return name of the parent object if any.

        A parent object is identified by the first occurring dot.
        """
        dot_idx, _ = self.token_next_by(m=(T.Punctuation, '.'))
        _, prev_ = self.token_prev(dot_idx)
        return remove_quotes(prev_.value) if prev_ is not None else None

    def _get_first_name(self, idx=None, reverse=False, keywords=False,
                        real_name=False):
        """Returns the name of the first token with a name"""

        tokens = self.tokens[idx:] if idx else self.tokens
        tokens = reversed(tokens) if reverse else tokens
        types = [T.Name, T.Wildcard, T.String.Symbol]

        if keywords:
            types.append(T.Keyword)

        for token in tokens:
            if token.ttype in types:
                return remove_quotes(token.value)
            elif isinstance(token, (Identifier, Function)):
                return token.get_real_name() if real_name else token.get_name()


class Statement(TokenList):
    """Represents a SQL statement."""

    def get_type(self):
        """Returns the type of a statement.

        The returned value is a string holding an upper-cased reprint of
        the first DML or DDL keyword. If the first token in this group
        isn't a DML or DDL keyword "UNKNOWN" is returned.

        Whitespace and comments at the beginning of the statement
        are ignored.
        """
        token = self.token_first(skip_cm=True)
        if token is None:
            # An "empty" statement that either has no tokens at all
            # or only whitespace tokens.
            return 'UNKNOWN'

        elif token.ttype in (T.Keyword.DML, T.Keyword.DDL):
            return token.normalized

        elif token.ttype == T.Keyword.CTE:
            # The WITH keyword should be followed by either an Identifier or
            # an IdentifierList containing the CTE definitions; the actual
            # DML keyword (e.g. SELECT, INSERT) will follow next.
            tidx = self.token_index(token)
            while tidx is not None:
                tidx, token = self.token_next(tidx, skip_ws=True)
                if isinstance(token, (Identifier, IdentifierList)):
                    tidx, token = self.token_next(tidx, skip_ws=True)

                    if token is not None \
                            and token.ttype == T.Keyword.DML:
                        return token.normalized

        # Hmm, probably invalid syntax, so return unknown.
        return 'UNKNOWN'


class Identifier(NameAliasMixin, TokenList):
    """Represents an identifier.

    Identifiers may have aliases or typecasts.
    """

    def is_wildcard(self):
        """Return ``True`` if this identifier contains a wildcard."""
        _, token = self.token_next_by(t=T.Wildcard)
        return token is not None

    def get_typecast(self):
        """Returns the typecast of this object as a string, or ``None``."""
        midx, marker = self.token_next_by(m=(T.Punctuation, '::'))
        nidx, next_ = self.token_next(midx, skip_ws=False)
        return next_.value if next_ else None

    def get_ordering(self):
        """Returns the ordering as an uppercase string, or ``None``."""
        _, ordering = self.token_next_by(t=T.Keyword.Order)
        return ordering.normalized if ordering else None

    def get_array_indices(self):
        """Returns an iterator of index token lists"""

        for token in self.tokens:
            if isinstance(token, SquareBrackets):
                # Use [1:-1] index to discard the square brackets
                yield token.tokens[1:-1]


class IdentifierList(TokenList):
    """A list of :class:`~sqlparse.sql.Identifier`\'s."""

    def get_identifiers(self):
        """Returns the identifiers.

        Whitespace and punctuation tokens are not included in this generator.
""" for token in self.tokens: if not (token.is_whitespace or token.match(T.Punctuation, ',')): yield token class TypedLiteral(TokenList): """A typed literal, such as "date '2001-09-28'" or "interval '2 hours'".""" M_OPEN = [(T.Name.Builtin, None), (T.Keyword, "TIMESTAMP")] M_CLOSE = T.String.Single, None M_EXTEND = T.Keyword, ("DAY", "HOUR", "MINUTE", "MONTH", "SECOND", "YEAR") class Parenthesis(TokenList): """Tokens between parenthesis.""" M_OPEN = T.Punctuation, '(' M_CLOSE = T.Punctuation, ')' @property def _groupable_tokens(self): return self.tokens[1:-1] class SquareBrackets(TokenList): """Tokens between square brackets""" M_OPEN = T.Punctuation, '[' M_CLOSE = T.Punctuation, ']' @property def _groupable_tokens(self): return self.tokens[1:-1] class Assignment(TokenList): """An assignment like 'var := val;'""" class If(TokenList): """An 'if' clause with possible 'else if' or 'else' parts.""" M_OPEN = T.Keyword, 'IF' M_CLOSE = T.Keyword, 'END IF' class For(TokenList): """A 'FOR' loop.""" M_OPEN = T.Keyword, ('FOR', 'FOREACH') M_CLOSE = T.Keyword, 'END LOOP' class Comparison(TokenList): """A comparison used for example in WHERE clauses.""" @property def left(self): return self.tokens[0] @property def right(self): return self.tokens[-1] class Comment(TokenList): """A comment.""" def is_multiline(self): return self.tokens and self.tokens[0].ttype == T.Comment.Multiline class Where(TokenList): """A WHERE clause.""" M_OPEN = T.Keyword, 'WHERE' M_CLOSE = T.Keyword, ( 'ORDER BY', 'GROUP BY', 'LIMIT', 'UNION', 'UNION ALL', 'EXCEPT', 'HAVING', 'RETURNING', 'INTO') class Having(TokenList): """A HAVING clause.""" M_OPEN = T.Keyword, 'HAVING' M_CLOSE = T.Keyword, ('ORDER BY', 'LIMIT') class Case(TokenList): """A CASE statement with one or more WHEN and possibly an ELSE part.""" M_OPEN = T.Keyword, 'CASE' M_CLOSE = T.Keyword, 'END' def get_cases(self, skip_ws=False): """Returns a list of 2-tuples (condition, value). If an ELSE exists condition is None. 
""" CONDITION = 1 VALUE = 2 ret = [] mode = CONDITION for token in self.tokens: # Set mode from the current statement if token.match(T.Keyword, 'CASE'): continue elif skip_ws and token.ttype in T.Whitespace: continue elif token.match(T.Keyword, 'WHEN'): ret.append(([], [])) mode = CONDITION elif token.match(T.Keyword, 'THEN'): mode = VALUE elif token.match(T.Keyword, 'ELSE'): ret.append((None, [])) mode = VALUE elif token.match(T.Keyword, 'END'): mode = None # First condition without preceding WHEN if mode and not ret: ret.append(([], [])) # Append token depending of the current mode if mode == CONDITION: ret[-1][0].append(token) elif mode == VALUE: ret[-1][1].append(token) # Return cases list return ret class Function(NameAliasMixin, TokenList): """A function or procedure call.""" def get_parameters(self): """Return a list of parameters.""" parenthesis = self.tokens[-1] for token in parenthesis.tokens: if isinstance(token, IdentifierList): return token.get_identifiers() elif imt(token, i=(Function, Identifier), t=T.Literal): return [token, ] return [] class Begin(TokenList): """A BEGIN/END block.""" M_OPEN = T.Keyword, 'BEGIN' M_CLOSE = T.Keyword, 'END' class Operation(TokenList): """Grouping of operations""" class Values(TokenList): """Grouping of values""" class Command(TokenList): """Grouping of CLI commands.""" sqlparse-0.4.4/sqlparse/tokens.py000066400000000000000000000031751454634525500171030ustar00rootroot00000000000000# # Copyright (C) 2009-2020 the sqlparse authors and contributors # # # This module is part of python-sqlparse and is released under # the BSD License: https://opensource.org/licenses/BSD-3-Clause # # The Token implementation is based on pygment's token system written # by Georg Brandl. 
# http://pygments.org/ """Tokens""" class _TokenType(tuple): parent = None def __contains__(self, item): return item is not None and (self is item or item[:len(self)] == self) def __getattr__(self, name): new = _TokenType(self + (name,)) setattr(self, name, new) new.parent = self return new def __repr__(self): # self can be False only if its the `root` i.e. Token itself return 'Token' + ('.' if self else '') + '.'.join(self) Token = _TokenType() # Special token types Text = Token.Text Whitespace = Text.Whitespace Newline = Whitespace.Newline Error = Token.Error # Text that doesn't belong to this lexer (e.g. HTML in PHP) Other = Token.Other # Common token types for source code Keyword = Token.Keyword Name = Token.Name Literal = Token.Literal String = Literal.String Number = Literal.Number Punctuation = Token.Punctuation Operator = Token.Operator Comparison = Operator.Comparison Wildcard = Token.Wildcard Comment = Token.Comment Assignment = Token.Assignment # Generic types for non-source code Generic = Token.Generic Command = Generic.Command # String and some others are not direct children of Token. # alias them: Token.Token = Token Token.String = String Token.Number = Number # SQL specific tokens DML = Keyword.DML DDL = Keyword.DDL CTE = Keyword.CTE sqlparse-0.4.4/sqlparse/utils.py000066400000000000000000000065661454634525500167470ustar00rootroot00000000000000# # Copyright (C) 2009-2020 the sqlparse authors and contributors # # # This module is part of python-sqlparse and is released under # the BSD License: https://opensource.org/licenses/BSD-3-Clause import itertools import re from collections import deque from contextlib import contextmanager # This regular expression replaces the home-cooked parser that was here before. # It is much faster, but requires an extra post-processing step to get the # desired results (that are compatible with what you would expect from the # str.splitlines() method). 
#
# It matches groups of characters: newlines, quoted strings, or unquoted text,
# and splits on that basis. The post-processing step puts those back together
# into the actual lines of SQL.
SPLIT_REGEX = re.compile(r"""
(
 (?:                     # Start of non-capturing group
  (?:\r\n|\r|\n)      |  # Match any single newline, or
  [^\r\n'"]+          |  # Match any character series without quotes or
                         # newlines, or
  "(?:[^"\\]|\\.)*"   |  # Match double-quoted strings, or
  '(?:[^'\\]|\\.)*'      # Match single quoted strings
 )
)
""", re.VERBOSE)

LINE_MATCH = re.compile(r'(\r\n|\r|\n)')


def split_unquoted_newlines(stmt):
    """Split a string on all unquoted newlines.

    Unlike str.splitlines(), this will ignore CR/LF/CR+LF if the requisite
    character is inside of a string."""
    text = str(stmt)
    lines = SPLIT_REGEX.split(text)
    outputlines = ['']
    for line in lines:
        if not line:
            continue
        elif LINE_MATCH.match(line):
            outputlines.append('')
        else:
            outputlines[-1] += line
    return outputlines


def remove_quotes(val):
    """Helper that removes surrounding quotes from strings."""
    if val is None:
        return
    if val[0] in ('"', "'", '`') and val[0] == val[-1]:
        val = val[1:-1]
    return val


def recurse(*cls):
    """Function decorator to help with recursion

    :param cls: Classes to not recurse over
    :return: function
    """
    def wrap(f):
        def wrapped_f(tlist):
            for sgroup in tlist.get_sublists():
                if not isinstance(sgroup, cls):
                    wrapped_f(sgroup)
            f(tlist)

        return wrapped_f

    return wrap


def imt(token, i=None, m=None, t=None):
    """Helper function to simplify comparisons Instance, Match and TokenType
    :param token:
    :param i: Class or Tuple/List of Classes
    :param m: Tuple of TokenType & Value.
Can be list of Tuple for multiple :param t: TokenType or Tuple/List of TokenTypes :return: bool """ clss = i types = [t, ] if t and not isinstance(t, list) else t mpatterns = [m, ] if m and not isinstance(m, list) else m if token is None: return False elif clss and isinstance(token, clss): return True elif mpatterns and any(token.match(*pattern) for pattern in mpatterns): return True elif types and any(token.ttype in ttype for ttype in types): return True else: return False def consume(iterator, n): """Advance the iterator n-steps ahead. If n is none, consume entirely.""" deque(itertools.islice(iterator, n), maxlen=0) @contextmanager def offset(filter_, n=0): filter_.offset += n yield filter_.offset -= n @contextmanager def indent(filter_, n=1): filter_.indent += n yield filter_.indent -= n sqlparse-0.4.4/tests/000077500000000000000000000000001454634525500145305ustar00rootroot00000000000000sqlparse-0.4.4/tests/__init__.py000066400000000000000000000000001454634525500166270ustar00rootroot00000000000000sqlparse-0.4.4/tests/conftest.py000066400000000000000000000030431454634525500167270ustar00rootroot00000000000000"""Helpers for testing.""" import io import os import pytest DIR_PATH = os.path.dirname(__file__) FILES_DIR = os.path.join(DIR_PATH, 'files') @pytest.fixture() def filepath(): """Returns full file path for test files.""" def make_filepath(filename): # https://stackoverflow.com/questions/18011902/py-test-pass-a-parameter-to-a-fixture-function # Alternate solution is to use parametrization `indirect=True` # https://stackoverflow.com/questions/18011902/py-test-pass-a-parameter-to-a-fixture-function/33879151#33879151 # Syntax is noisy and requires specific variable names return os.path.join(FILES_DIR, filename) return make_filepath @pytest.fixture() def load_file(filepath): """Opens filename with encoding and return its contents.""" def make_load_file(filename, encoding='utf-8'): # 
https://stackoverflow.com/questions/18011902/py-test-pass-a-parameter-to-a-fixture-function # Alternate solution is to use parametrization `indirect=True` # https://stackoverflow.com/questions/18011902/py-test-pass-a-parameter-to-a-fixture-function/33879151#33879151 # Syntax is noisy and requires specific variable names # And seems to be limited to only 1 argument. with open(filepath(filename), encoding=encoding) as f: return f.read().strip() return make_load_file @pytest.fixture() def get_stream(filepath): def make_stream(filename, encoding='utf-8'): return open(filepath(filename), encoding=encoding) return make_stream sqlparse-0.4.4/tests/files/000077500000000000000000000000001454634525500156325ustar00rootroot00000000000000sqlparse-0.4.4/tests/files/_Make_DirEntry.sql000066400000000000000000000001551454634525500212100ustar00rootroot00000000000000-- Make a new dir entry -- and return its inode INSERT INTO dir_entries(type) VALUES(:type)sqlparse-0.4.4/tests/files/begintag.sql000066400000000000000000000000551454634525500201330ustar00rootroot00000000000000begin; update foo set bar = 1; commit;sqlparse-0.4.4/tests/files/begintag_2.sql000066400000000000000000000005451454634525500203600ustar00rootroot00000000000000CREATE TRIGGER IF NOT EXISTS remove_if_it_was_the_last_file_link -- Delete the direntry when is removed it's last static link AFTER DELETE ON links WHEN NOT EXISTS ( SELECT * FROM links WHERE child_entry = OLD.child_entry LIMIT 1 ) BEGIN DELETE FROM dir_entries WHERE dir_entries.inode = OLD.child_entry; END;sqlparse-0.4.4/tests/files/casewhen_procedure.sql000066400000000000000000000002261454634525500222200ustar00rootroot00000000000000create procedure procName() begin select case when column = 'value' then column else 0 end; end; create procedure procName() begin select 1; end; sqlparse-0.4.4/tests/files/dashcomment.sql000066400000000000000000000001301454634525500206470ustar00rootroot00000000000000select * from user; --select * from host; select * from user; 
select * -- foo; from foo;sqlparse-0.4.4/tests/files/encoding_gbk.sql000066400000000000000000000000621454634525500207620ustar00rootroot00000000000000select * from foo where bar = 'ϲԼ'sqlparse-0.4.4/tests/files/encoding_utf8.sql000066400000000000000000000001141454634525500211030ustar00rootroot00000000000000select * from foo where bar = '齐天大圣.カラフルな雲.사랑해요'sqlparse-0.4.4/tests/files/function.sql000066400000000000000000000002731454634525500202020ustar00rootroot00000000000000CREATE OR REPLACE FUNCTION foo( p_in1 VARCHAR , p_in2 INTEGER ) RETURNS INTEGER AS DECLARE v_foo INTEGER; BEGIN SELECT * FROM foo INTO v_foo; RETURN v_foo.id; END;sqlparse-0.4.4/tests/files/function_psql.sql000066400000000000000000000057721454634525500212520ustar00rootroot00000000000000CREATE OR REPLACE FUNCTION public.delete_data ( p_tabelle VARCHAR , p_key VARCHAR , p_value INTEGER ) RETURNS INTEGER AS $$ DECLARE p_retval INTEGER; v_constraint RECORD; v_count INTEGER; v_data RECORD; v_fieldname VARCHAR; v_sql VARCHAR; v_key VARCHAR; v_value INTEGER; BEGIN v_sql := 'SELECT COUNT(*) FROM ' || p_tabelle || ' WHERE ' || p_key || ' = ' || p_value; --RAISE NOTICE '%', v_sql; EXECUTE v_sql INTO v_count; IF v_count::integer != 0 THEN SELECT att.attname INTO v_key FROM pg_attribute att LEFT JOIN pg_constraint con ON con.conrelid = att.attrelid AND con.conkey[1] = att.attnum AND con.contype = 'p', pg_type typ, pg_class rel, pg_namespace ns WHERE att.attrelid = rel.oid AND att.attnum > 0 AND typ.oid = att.atttypid AND att.attisdropped = false AND rel.relname = p_tabelle AND con.conkey[1] = 1 AND ns.oid = rel.relnamespace AND ns.nspname = 'public' ORDER BY att.attnum; v_sql := 'SELECT ' || v_key || ' AS id FROM ' || p_tabelle || ' WHERE ' || p_key || ' = ' || p_value; FOR v_data IN EXECUTE v_sql LOOP --RAISE NOTICE ' -> % %', p_tabelle, v_data.id; FOR v_constraint IN SELECT t.constraint_name , t.constraint_type , t.table_name , c.column_name FROM public.v_table_constraints t , public.v_constraint_columns c 
WHERE t.constraint_name = c.constraint_name AND t.constraint_type = 'FOREIGN KEY' AND c.table_name = p_tabelle AND t.table_schema = 'public' AND c.table_schema = 'public' LOOP v_fieldname := substring(v_constraint.constraint_name from 1 for length(v_constraint.constraint_name) - length(v_constraint.column_name) - 1); IF (v_constraint.table_name = p_tabelle) AND (p_value = v_data.id) THEN --RAISE NOTICE 'Skip (Selbstverweis)'; CONTINUE; ELSE PERFORM delete_data(v_constraint.table_name::varchar, v_fieldname::varchar, v_data.id::integer); END IF; END LOOP; END LOOP; v_sql := 'DELETE FROM ' || p_tabelle || ' WHERE ' || p_key || ' = ' || p_value; --RAISE NOTICE '%', v_sql; EXECUTE v_sql; p_retval := 1; ELSE --RAISE NOTICE ' -> Keine Sätze gefunden'; p_retval := 0; END IF; RETURN p_retval; END; $$ LANGUAGE plpgsql;sqlparse-0.4.4/tests/files/function_psql2.sql000066400000000000000000000002611454634525500213200ustar00rootroot00000000000000CREATE OR REPLACE FUNCTION update_something() RETURNS void AS $body$ BEGIN raise notice 'foo'; END; $body$ LANGUAGE 'plpgsql' VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;sqlparse-0.4.4/tests/files/function_psql3.sql000066400000000000000000000002531454634525500213220ustar00rootroot00000000000000CREATE OR REPLACE FUNCTION foo() RETURNS integer AS $body$ DECLARE BEGIN select * from foo; END; $body$ LANGUAGE 'plpgsql' VOLATILE CALLED ON NULL INPUT SECURITY INVOKER;sqlparse-0.4.4/tests/files/function_psql4.sql000066400000000000000000000003351454634525500213240ustar00rootroot00000000000000CREATE FUNCTION doubledollarinbody(var1 text) RETURNS text /* see issue277 */ LANGUAGE plpgsql AS $_$ DECLARE str text; BEGIN str = $$'foo'$$||var1; execute 'select '||str into str; return str; END $_$; sqlparse-0.4.4/tests/files/huge_select.sql000066400000000000000000000223321454634525500206440ustar00rootroot00000000000000select case when i = 0 then 1 else 0 end as col0, case when i = 1 then 1 else 1 end as col1, case when i = 2 then 1 else 2 end as col2, 
case when i = 3 then 1 else 3 end as col3, case when i = 4 then 1 else 4 end as col4, case when i = 5 then 1 else 5 end as col5, case when i = 6 then 1 else 6 end as col6, case when i = 7 then 1 else 7 end as col7, case when i = 8 then 1 else 8 end as col8, case when i = 9 then 1 else 9 end as col9, case when i = 10 then 1 else 10 end as col10, case when i = 11 then 1 else 11 end as col11, case when i = 12 then 1 else 12 end as col12, case when i = 13 then 1 else 13 end as col13, case when i = 14 then 1 else 14 end as col14, case when i = 15 then 1 else 15 end as col15, case when i = 16 then 1 else 16 end as col16, case when i = 17 then 1 else 17 end as col17, case when i = 18 then 1 else 18 end as col18, case when i = 19 then 1 else 19 end as col19, case when i = 20 then 1 else 20 end as col20, case when i = 21 then 1 else 21 end as col21, case when i = 22 then 1 else 22 end as col22 from foo UNION select case when i = 0 then 1 else 0 end as col0, case when i = 1 then 1 else 1 end as col1, case when i = 2 then 1 else 2 end as col2, case when i = 3 then 1 else 3 end as col3, case when i = 4 then 1 else 4 end as col4, case when i = 5 then 1 else 5 end as col5, case when i = 6 then 1 else 6 end as col6, case when i = 7 then 1 else 7 end as col7, case when i = 8 then 1 else 8 end as col8, case when i = 9 then 1 else 9 end as col9, case when i = 10 then 1 else 10 end as col10, case when i = 11 then 1 else 11 end as col11, case when i = 12 then 1 else 12 end as col12, case when i = 13 then 1 else 13 end as col13, case when i = 14 then 1 else 14 end as col14, case when i = 15 then 1 else 15 end as col15, case when i = 16 then 1 else 16 end as col16, case when i = 17 then 1 else 17 end as col17, case when i = 18 then 1 else 18 end as col18, case when i = 19 then 1 else 19 end as col19, case when i = 20 then 1 else 20 end as col20, case when i = 21 then 1 else 21 end as col21, case when i = 22 then 1 else 22 end as col22 from foo UNION select case when i = 0 then 1 else 0 
end as col0, case when i = 1 then 1 else 1 end as col1, case when i = 2 then 1 else 2 end as col2, case when i = 3 then 1 else 3 end as col3, case when i = 4 then 1 else 4 end as col4, case when i = 5 then 1 else 5 end as col5, case when i = 6 then 1 else 6 end as col6, case when i = 7 then 1 else 7 end as col7, case when i = 8 then 1 else 8 end as col8, case when i = 9 then 1 else 9 end as col9, case when i = 10 then 1 else 10 end as col10, case when i = 11 then 1 else 11 end as col11, case when i = 12 then 1 else 12 end as col12, case when i = 13 then 1 else 13 end as col13, case when i = 14 then 1 else 14 end as col14, case when i = 15 then 1 else 15 end as col15, case when i = 16 then 1 else 16 end as col16, case when i = 17 then 1 else 17 end as col17, case when i = 18 then 1 else 18 end as col18, case when i = 19 then 1 else 19 end as col19, case when i = 20 then 1 else 20 end as col20, case when i = 21 then 1 else 21 end as col21, case when i = 22 then 1 else 22 end as col22 from foo UNION select case when i = 0 then 1 else 0 end as col0, case when i = 1 then 1 else 1 end as col1, case when i = 2 then 1 else 2 end as col2, case when i = 3 then 1 else 3 end as col3, case when i = 4 then 1 else 4 end as col4, case when i = 5 then 1 else 5 end as col5, case when i = 6 then 1 else 6 end as col6, case when i = 7 then 1 else 7 end as col7, case when i = 8 then 1 else 8 end as col8, case when i = 9 then 1 else 9 end as col9, case when i = 10 then 1 else 10 end as col10, case when i = 11 then 1 else 11 end as col11, case when i = 12 then 1 else 12 end as col12, case when i = 13 then 1 else 13 end as col13, case when i = 14 then 1 else 14 end as col14, case when i = 15 then 1 else 15 end as col15, case when i = 16 then 1 else 16 end as col16, case when i = 17 then 1 else 17 end as col17, case when i = 18 then 1 else 18 end as col18, case when i = 19 then 1 else 19 end as col19, case when i = 20 then 1 else 20 end as col20, case when i = 21 then 1 else 21 end as 
col21, case when i = 22 then 1 else 22 end as col22 from foo UNION select case when i = 0 then 1 else 0 end as col0, case when i = 1 then 1 else 1 end as col1, case when i = 2 then 1 else 2 end as col2, case when i = 3 then 1 else 3 end as col3, case when i = 4 then 1 else 4 end as col4, case when i = 5 then 1 else 5 end as col5, case when i = 6 then 1 else 6 end as col6, case when i = 7 then 1 else 7 end as col7, case when i = 8 then 1 else 8 end as col8, case when i = 9 then 1 else 9 end as col9, case when i = 10 then 1 else 10 end as col10, case when i = 11 then 1 else 11 end as col11, case when i = 12 then 1 else 12 end as col12, case when i = 13 then 1 else 13 end as col13, case when i = 14 then 1 else 14 end as col14, case when i = 15 then 1 else 15 end as col15, case when i = 16 then 1 else 16 end as col16, case when i = 17 then 1 else 17 end as col17, case when i = 18 then 1 else 18 end as col18, case when i = 19 then 1 else 19 end as col19, case when i = 20 then 1 else 20 end as col20, case when i = 21 then 1 else 21 end as col21, case when i = 22 then 1 else 22 end as col22 from foo UNION select case when i = 0 then 1 else 0 end as col0, case when i = 1 then 1 else 1 end as col1, case when i = 2 then 1 else 2 end as col2, case when i = 3 then 1 else 3 end as col3, case when i = 4 then 1 else 4 end as col4, case when i = 5 then 1 else 5 end as col5, case when i = 6 then 1 else 6 end as col6, case when i = 7 then 1 else 7 end as col7, case when i = 8 then 1 else 8 end as col8, case when i = 9 then 1 else 9 end as col9, case when i = 10 then 1 else 10 end as col10, case when i = 11 then 1 else 11 end as col11, case when i = 12 then 1 else 12 end as col12, case when i = 13 then 1 else 13 end as col13, case when i = 14 then 1 else 14 end as col14, case when i = 15 then 1 else 15 end as col15, case when i = 16 then 1 else 16 end as col16, case when i = 17 then 1 else 17 end as col17, case when i = 18 then 1 else 18 end as col18, case when i = 19 then 1 else 19 
end as col19, case when i = 20 then 1 else 20 end as col20, case when i = 21 then 1 else 21 end as col21, case when i = 22 then 1 else 22 end as col22 from foo UNION select case when i = 0 then 1 else 0 end as col0, case when i = 1 then 1 else 1 end as col1, case when i = 2 then 1 else 2 end as col2, case when i = 3 then 1 else 3 end as col3, case when i = 4 then 1 else 4 end as col4, case when i = 5 then 1 else 5 end as col5, case when i = 6 then 1 else 6 end as col6, case when i = 7 then 1 else 7 end as col7, case when i = 8 then 1 else 8 end as col8, case when i = 9 then 1 else 9 end as col9, case when i = 10 then 1 else 10 end as col10, case when i = 11 then 1 else 11 end as col11, case when i = 12 then 1 else 12 end as col12, case when i = 13 then 1 else 13 end as col13, case when i = 14 then 1 else 14 end as col14, case when i = 15 then 1 else 15 end as col15, case when i = 16 then 1 else 16 end as col16, case when i = 17 then 1 else 17 end as col17, case when i = 18 then 1 else 18 end as col18, case when i = 19 then 1 else 19 end as col19, case when i = 20 then 1 else 20 end as col20, case when i = 21 then 1 else 21 end as col21, case when i = 22 then 1 else 22 end as col22 from foo UNION select case when i = 0 then 1 else 0 end as col0, case when i = 1 then 1 else 1 end as col1, case when i = 2 then 1 else 2 end as col2, case when i = 3 then 1 else 3 end as col3, case when i = 4 then 1 else 4 end as col4, case when i = 5 then 1 else 5 end as col5, case when i = 6 then 1 else 6 end as col6, case when i = 7 then 1 else 7 end as col7, case when i = 8 then 1 else 8 end as col8, case when i = 9 then 1 else 9 end as col9, case when i = 10 then 1 else 10 end as col10, case when i = 11 then 1 else 11 end as col11, case when i = 12 then 1 else 12 end as col12, case when i = 13 then 1 else 13 end as col13, case when i = 14 then 1 else 14 end as col14, case when i = 15 then 1 else 15 end as col15, case when i = 16 then 1 else 16 end as col16, case when i = 17 then 1 
else 17 end as col17, case when i = 18 then 1 else 18 end as col18, case when i = 19 then 1 else 19 end as col19, case when i = 20 then 1 else 20 end as col20, case when i = 21 then 1 else 21 end as col21, case when i = 22 then 1 else 22 end as col22 from foo UNION select case when i = 0 then 1 else 0 end as col0, case when i = 1 then 1 else 1 end as col1, case when i = 2 then 1 else 2 end as col2, case when i = 3 then 1 else 3 end as col3, case when i = 4 then 1 else 4 end as col4, case when i = 5 then 1 else 5 end as col5, case when i = 6 then 1 else 6 end as col6, case when i = 7 then 1 else 7 end as col7, case when i = 8 then 1 else 8 end as col8, case when i = 9 then 1 else 9 end as col9, case when i = 10 then 1 else 10 end as col10, case when i = 11 then 1 else 11 end as col11, case when i = 12 then 1 else 12 end as col12, case when i = 13 then 1 else 13 end as col13, case when i = 14 then 1 else 14 end as col14, case when i = 15 then 1 else 15 end as col15, case when i = 16 then 1 else 16 end as col16, case when i = 17 then 1 else 17 end as col17, case when i = 18 then 1 else 18 end as col18, case when i = 19 then 1 else 19 end as col19, case when i = 20 then 1 else 20 end as col20, case when i = 21 then 1 else 21 end as col21, case when i = 22 then 1 else 22 end as col22 from foosqlparse-0.4.4/tests/files/mysql_handler.sql000066400000000000000000000002061454634525500212130ustar00rootroot00000000000000create procedure proc1() begin declare handler for foo begin end; select 1; end; create procedure proc2() begin select 1; end; sqlparse-0.4.4/tests/files/stream.sql000066400000000000000000000000541454634525500176450ustar00rootroot00000000000000-- this file is streamed in insert into foo sqlparse-0.4.4/tests/files/test_cp1251.sql000066400000000000000000000000611454634525500203220ustar00rootroot00000000000000insert into foo values (1); -- sqlparse-0.4.4/tests/test_cli.py000066400000000000000000000067151454634525500167210ustar00rootroot00000000000000import 
subprocess import sys import pytest import sqlparse def test_cli_main_empty(): with pytest.raises(SystemExit): sqlparse.cli.main([]) def test_parser_empty(): with pytest.raises(SystemExit): parser = sqlparse.cli.create_parser() parser.parse_args([]) def test_main_help(): # Call with the --help option as a basic sanity check. with pytest.raises(SystemExit) as exinfo: sqlparse.cli.main(["--help", ]) assert exinfo.value.code == 0 def test_valid_args(filepath): # test doesn't abort path = filepath('function.sql') assert sqlparse.cli.main([path, '-r']) is not None def test_invalid_choice(filepath): path = filepath('function.sql') with pytest.raises(SystemExit): sqlparse.cli.main([path, '-l', 'Spanish']) def test_invalid_args(filepath, capsys): path = filepath('function.sql') sqlparse.cli.main([path, '-r', '--indent_width', '0']) _, err = capsys.readouterr() assert err == ("[ERROR] Invalid options: indent_width requires " "a positive integer\n") def test_invalid_infile(filepath, capsys): path = filepath('missing.sql') sqlparse.cli.main([path, '-r']) _, err = capsys.readouterr() assert err[:22] == "[ERROR] Failed to read" def test_invalid_outfile(filepath, capsys): path = filepath('function.sql') outpath = filepath('/missing/function.sql') sqlparse.cli.main([path, '-r', '-o', outpath]) _, err = capsys.readouterr() assert err[:22] == "[ERROR] Failed to open" def test_stdout(filepath, load_file, capsys): path = filepath('begintag.sql') expected = load_file('begintag.sql') sqlparse.cli.main([path]) out, _ = capsys.readouterr() assert out == expected def test_script(): # Call with the --help option as a basic sanity check. 
cmd = "{:s} -m sqlparse.cli --help".format(sys.executable) assert subprocess.call(cmd.split()) == 0 @pytest.mark.parametrize('fpath, encoding', ( ('encoding_utf8.sql', 'utf-8'), ('encoding_gbk.sql', 'gbk'), )) def test_encoding_stdout(fpath, encoding, filepath, load_file, capfd): path = filepath(fpath) expected = load_file(fpath, encoding) sqlparse.cli.main([path, '--encoding', encoding]) out, _ = capfd.readouterr() assert out == expected @pytest.mark.parametrize('fpath, encoding', ( ('encoding_utf8.sql', 'utf-8'), ('encoding_gbk.sql', 'gbk'), )) def test_encoding_output_file(fpath, encoding, filepath, load_file, tmpdir): in_path = filepath(fpath) expected = load_file(fpath, encoding) out_path = tmpdir.dirname + '/encoding_out.sql' sqlparse.cli.main([in_path, '--encoding', encoding, '-o', out_path]) out = load_file(out_path, encoding) assert out == expected @pytest.mark.parametrize('fpath, encoding', ( ('encoding_utf8.sql', 'utf-8'), ('encoding_gbk.sql', 'gbk'), )) def test_encoding_stdin(fpath, encoding, filepath, load_file, capfd): path = filepath(fpath) expected = load_file(fpath, encoding) old_stdin = sys.stdin with open(path) as f: sys.stdin = f sqlparse.cli.main(['-', '--encoding', encoding]) sys.stdin = old_stdin out, _ = capfd.readouterr() assert out == expected def test_encoding(filepath, capsys): path = filepath('test_cp1251.sql') expected = 'insert into foo values (1); -- Песня про надежду\n' sqlparse.cli.main([path, '--encoding=cp1251']) out, _ = capsys.readouterr() assert out == expected sqlparse-0.4.4/tests/test_format.py000066400000000000000000000631671454634525500174460ustar00rootroot00000000000000import pytest import sqlparse from sqlparse.exceptions import SQLParseError class TestFormat: def test_keywordcase(self): sql = 'select * from bar; -- select foo\n' res = sqlparse.format(sql, keyword_case='upper') assert res == 'SELECT * FROM bar; -- select foo\n' res = sqlparse.format(sql, keyword_case='capitalize') assert res == 'Select * From bar; -- 
select foo\n' res = sqlparse.format(sql.upper(), keyword_case='lower') assert res == 'select * from BAR; -- SELECT FOO\n' def test_keywordcase_invalid_option(self): sql = 'select * from bar; -- select foo\n' with pytest.raises(SQLParseError): sqlparse.format(sql, keyword_case='foo') def test_identifiercase(self): sql = 'select * from bar; -- select foo\n' res = sqlparse.format(sql, identifier_case='upper') assert res == 'select * from BAR; -- select foo\n' res = sqlparse.format(sql, identifier_case='capitalize') assert res == 'select * from Bar; -- select foo\n' res = sqlparse.format(sql.upper(), identifier_case='lower') assert res == 'SELECT * FROM bar; -- SELECT FOO\n' def test_identifiercase_invalid_option(self): sql = 'select * from bar; -- select foo\n' with pytest.raises(SQLParseError): sqlparse.format(sql, identifier_case='foo') def test_identifiercase_quotes(self): sql = 'select * from "foo"."bar"' res = sqlparse.format(sql, identifier_case="upper") assert res == 'select * from "foo"."bar"' def test_strip_comments_single(self): sql = 'select *-- statement starts here\nfrom foo' res = sqlparse.format(sql, strip_comments=True) assert res == 'select *\nfrom foo' sql = 'select * -- statement starts here\nfrom foo' res = sqlparse.format(sql, strip_comments=True) assert res == 'select *\nfrom foo' sql = 'select-- foo\nfrom -- bar\nwhere' res = sqlparse.format(sql, strip_comments=True) assert res == 'select\nfrom\nwhere' sql = 'select *-- statement starts here\n\nfrom foo' res = sqlparse.format(sql, strip_comments=True) assert res == 'select *\n\nfrom foo' sql = 'select * from foo-- statement starts here\nwhere' res = sqlparse.format(sql, strip_comments=True) assert res == 'select * from foo\nwhere' sql = 'select a-- statement starts here\nfrom foo' res = sqlparse.format(sql, strip_comments=True) assert res == 'select a\nfrom foo' sql = '--comment\nselect a-- statement starts here\n' \ 'from foo--comment\nf' res = sqlparse.format(sql, strip_comments=True) assert 
res == 'select a\nfrom foo\nf' def test_strip_comments_invalid_option(self): sql = 'select-- foo\nfrom -- bar\nwhere' with pytest.raises(SQLParseError): sqlparse.format(sql, strip_comments=None) def test_strip_comments_multi(self): sql = '/* sql starts here */\nselect' res = sqlparse.format(sql, strip_comments=True) assert res == 'select' sql = '/* sql starts here */ select' res = sqlparse.format(sql, strip_comments=True) assert res == 'select' sql = '/*\n * sql starts here\n */\nselect' res = sqlparse.format(sql, strip_comments=True) assert res == 'select' sql = 'select (/* sql starts here */ select 2)' res = sqlparse.format(sql, strip_comments=True) assert res == 'select (select 2)' sql = 'select (/* sql /* starts here */ select 2)' res = sqlparse.format(sql, strip_comments=True) assert res == 'select (select 2)' def test_strip_comments_preserves_linebreak(self): sql = 'select * -- a comment\r\nfrom foo' res = sqlparse.format(sql, strip_comments=True) assert res == 'select *\nfrom foo' sql = 'select * -- a comment\nfrom foo' res = sqlparse.format(sql, strip_comments=True) assert res == 'select *\nfrom foo' sql = 'select * -- a comment\rfrom foo' res = sqlparse.format(sql, strip_comments=True) assert res == 'select *\nfrom foo' sql = 'select * -- a comment\r\n\r\nfrom foo' res = sqlparse.format(sql, strip_comments=True) assert res == 'select *\n\nfrom foo' sql = 'select * -- a comment\n\nfrom foo' res = sqlparse.format(sql, strip_comments=True) assert res == 'select *\n\nfrom foo' def test_strip_ws(self): f = lambda sql: sqlparse.format(sql, strip_whitespace=True) s = 'select\n* from foo\n\twhere ( 1 = 2 )\n' assert f(s) == 'select * from foo where (1 = 2)' s = 'select -- foo\nfrom bar\n' assert f(s) == 'select -- foo\nfrom bar' def test_strip_ws_invalid_option(self): s = 'select -- foo\nfrom bar\n' with pytest.raises(SQLParseError): sqlparse.format(s, strip_whitespace=None) def test_preserve_ws(self): # preserve at least one whitespace after subgroups f = lambda 
sql: sqlparse.format(sql, strip_whitespace=True) s = 'select\n* /* foo */ from bar ' assert f(s) == 'select * /* foo */ from bar' def test_notransform_of_quoted_crlf(self): # Make sure that CR/CR+LF characters inside string literals don't get # affected by the formatter. s1 = "SELECT some_column LIKE 'value\r'" s2 = "SELECT some_column LIKE 'value\r'\r\nWHERE id = 1\n" s3 = "SELECT some_column LIKE 'value\\'\r' WHERE id = 1\r" s4 = "SELECT some_column LIKE 'value\\\\\\'\r' WHERE id = 1\r\n" f = lambda x: sqlparse.format(x) # Because of the use of assert f(s1) == "SELECT some_column LIKE 'value\r'" assert f(s2) == "SELECT some_column LIKE 'value\r'\nWHERE id = 1\n" assert f(s3) == "SELECT some_column LIKE 'value\\'\r' WHERE id = 1\n" assert (f(s4) == "SELECT some_column LIKE 'value\\\\\\'\r' WHERE id = 1\n") class TestFormatReindentAligned: @staticmethod def formatter(sql): return sqlparse.format(sql, reindent_aligned=True) def test_basic(self): sql = """ select a, b as bb,c from table join (select a * 2 as a from new_table) other on table.a = other.a where c is true and b between 3 and 4 or d is 'blue' limit 10 """ assert self.formatter(sql) == '\n'.join([ 'select a,', ' b as bb,', ' c', ' from table', ' join (', ' select a * 2 as a', ' from new_table', ' ) other', ' on table.a = other.a', ' where c is true', ' and b between 3 and 4', " or d is 'blue'", ' limit 10']) def test_joins(self): sql = """ select * from a join b on a.one = b.one left join c on c.two = a.two and c.three = a.three full outer join d on d.three = a.three cross join e on e.four = a.four join f using (one, two, three) """ assert self.formatter(sql) == '\n'.join([ 'select *', ' from a', ' join b', ' on a.one = b.one', ' left join c', ' on c.two = a.two', ' and c.three = a.three', ' full outer join d', ' on d.three = a.three', ' cross join e', ' on e.four = a.four', ' join f using (one, two, three)']) def test_case_statement(self): sql = """ select a, case when a = 0 then 1 when bb = 1 then 1 when 
c = 2 then 2 else 0 end as d, extra_col from table where c is true and b between 3 and 4 """ assert self.formatter(sql) == '\n'.join([ 'select a,', ' case when a = 0 then 1', ' when bb = 1 then 1', ' when c = 2 then 2', ' else 0', ' end as d,', ' extra_col', ' from table', ' where c is true', ' and b between 3 and 4']) def test_case_statement_with_between(self): sql = """ select a, case when a = 0 then 1 when bb = 1 then 1 when c = 2 then 2 when d between 3 and 5 then 3 else 0 end as d, extra_col from table where c is true and b between 3 and 4 """ assert self.formatter(sql) == '\n'.join([ 'select a,', ' case when a = 0 then 1', ' when bb = 1 then 1', ' when c = 2 then 2', ' when d between 3 and 5 then 3', ' else 0', ' end as d,', ' extra_col', ' from table', ' where c is true', ' and b between 3 and 4']) def test_group_by(self): sql = """ select a, b, c, sum(x) as sum_x, count(y) as cnt_y from table group by a,b,c having sum(x) > 1 and count(y) > 5 order by 3,2,1 """ assert self.formatter(sql) == '\n'.join([ 'select a,', ' b,', ' c,', ' sum(x) as sum_x,', ' count(y) as cnt_y', ' from table', ' group by a,', ' b,', ' c', 'having sum(x) > 1', ' and count(y) > 5', ' order by 3,', ' 2,', ' 1']) def test_group_by_subquery(self): # TODO: add subquery alias when test_identifier_list_subquery fixed sql = """ select *, sum_b + 2 as mod_sum from ( select a, sum(b) as sum_b from table group by a,z) order by 1,2 """ assert self.formatter(sql) == '\n'.join([ 'select *,', ' sum_b + 2 as mod_sum', ' from (', ' select a,', ' sum(b) as sum_b', ' from table', ' group by a,', ' z', ' )', ' order by 1,', ' 2']) def test_window_functions(self): sql = """ select a, SUM(a) OVER (PARTITION BY b ORDER BY c ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as sum_a, ROW_NUMBER() OVER (PARTITION BY b, c ORDER BY d DESC) as row_num from table""" assert self.formatter(sql) == '\n'.join([ 'select a,', ' SUM(a) OVER (PARTITION BY b ORDER BY c ROWS ' 'BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) 
as sum_a,', ' ROW_NUMBER() OVER ' '(PARTITION BY b, c ORDER BY d DESC) as row_num', ' from table']) class TestSpacesAroundOperators: @staticmethod def formatter(sql): return sqlparse.format(sql, use_space_around_operators=True) def test_basic(self): sql = ('select a+b as d from table ' 'where (c-d)%2= 1 and e> 3.0/4 and z^2 <100') assert self.formatter(sql) == ( 'select a + b as d from table ' 'where (c - d) % 2 = 1 and e > 3.0 / 4 and z ^ 2 < 100') def test_bools(self): sql = 'select * from table where a &&b or c||d' assert self.formatter( sql) == 'select * from table where a && b or c || d' def test_nested(self): sql = 'select *, case when a-b then c end from table' assert self.formatter( sql) == 'select *, case when a - b then c end from table' def test_wildcard_vs_mult(self): sql = 'select a*b-c from table' assert self.formatter(sql) == 'select a * b - c from table' class TestFormatReindent: def test_option(self): with pytest.raises(SQLParseError): sqlparse.format('foo', reindent=2) with pytest.raises(SQLParseError): sqlparse.format('foo', indent_tabs=2) with pytest.raises(SQLParseError): sqlparse.format('foo', reindent=True, indent_width='foo') with pytest.raises(SQLParseError): sqlparse.format('foo', reindent=True, indent_width=-12) with pytest.raises(SQLParseError): sqlparse.format('foo', reindent=True, wrap_after='foo') with pytest.raises(SQLParseError): sqlparse.format('foo', reindent=True, wrap_after=-12) with pytest.raises(SQLParseError): sqlparse.format('foo', reindent=True, comma_first='foo') def test_stmts(self): f = lambda sql: sqlparse.format(sql, reindent=True) s = 'select foo; select bar' assert f(s) == 'select foo;\n\nselect bar' s = 'select foo' assert f(s) == 'select foo' s = 'select foo; -- test\n select bar' assert f(s) == 'select foo; -- test\n\nselect bar' def test_keywords(self): f = lambda sql: sqlparse.format(sql, reindent=True) s = 'select * from foo union select * from bar;' assert f(s) == '\n'.join([ 'select *', 'from foo', 'union', 
'select *', 'from bar;']) def test_keywords_between(self): # issue 14 # don't break AND after BETWEEN f = lambda sql: sqlparse.format(sql, reindent=True) s = 'and foo between 1 and 2 and bar = 3' assert f(s) == '\n'.join([ '', 'and foo between 1 and 2', 'and bar = 3']) def test_parenthesis(self): f = lambda sql: sqlparse.format(sql, reindent=True) s = 'select count(*) from (select * from foo);' assert f(s) == '\n'.join([ 'select count(*)', 'from', ' (select *', ' from foo);']) assert f("select f(1)") == 'select f(1)' assert f("select f( 1 )") == 'select f(1)' assert f("select f(\n\n\n1\n\n\n)") == 'select f(1)' assert f("select f(\n\n\n 1 \n\n\n)") == 'select f(1)' assert f("select f(\n\n\n 1 \n\n\n)") == 'select f(1)' def test_where(self): f = lambda sql: sqlparse.format(sql, reindent=True) s = 'select * from foo where bar = 1 and baz = 2 or bzz = 3;' assert f(s) == '\n'.join([ 'select *', 'from foo', 'where bar = 1', ' and baz = 2', ' or bzz = 3;']) s = 'select * from foo where bar = 1 and (baz = 2 or bzz = 3);' assert f(s) == '\n'.join([ 'select *', 'from foo', 'where bar = 1', ' and (baz = 2', ' or bzz = 3);']) def test_join(self): f = lambda sql: sqlparse.format(sql, reindent=True) s = 'select * from foo join bar on 1 = 2' assert f(s) == '\n'.join([ 'select *', 'from foo', 'join bar on 1 = 2']) s = 'select * from foo inner join bar on 1 = 2' assert f(s) == '\n'.join([ 'select *', 'from foo', 'inner join bar on 1 = 2']) s = 'select * from foo left outer join bar on 1 = 2' assert f(s) == '\n'.join([ 'select *', 'from foo', 'left outer join bar on 1 = 2']) s = 'select * from foo straight_join bar on 1 = 2' assert f(s) == '\n'.join([ 'select *', 'from foo', 'straight_join bar on 1 = 2']) def test_identifier_list(self): f = lambda sql: sqlparse.format(sql, reindent=True) s = 'select foo, bar, baz from table1, table2 where 1 = 2' assert f(s) == '\n'.join([ 'select foo,', ' bar,', ' baz', 'from table1,', ' table2', 'where 1 = 2']) s = 'select a.*, b.id from a, b' 
assert f(s) == '\n'.join([ 'select a.*,', ' b.id', 'from a,', ' b']) def test_identifier_list_with_wrap_after(self): f = lambda sql: sqlparse.format(sql, reindent=True, wrap_after=14) s = 'select foo, bar, baz from table1, table2 where 1 = 2' assert f(s) == '\n'.join([ 'select foo, bar,', ' baz', 'from table1, table2', 'where 1 = 2']) def test_identifier_list_comment_first(self): f = lambda sql: sqlparse.format(sql, reindent=True, comma_first=True) # note the 3: it cleans up whitespace too! s = 'select foo, bar, baz from table where foo in (1, 2,3)' assert f(s) == '\n'.join([ 'select foo', ' , bar', ' , baz', 'from table', 'where foo in (1', ' , 2', ' , 3)']) def test_identifier_list_with_functions(self): f = lambda sql: sqlparse.format(sql, reindent=True) s = ("select 'abc' as foo, coalesce(col1, col2)||col3 as bar," "col3 from my_table") assert f(s) == '\n'.join([ "select 'abc' as foo,", " coalesce(col1, col2)||col3 as bar,", " col3", "from my_table"]) def test_long_identifier_list_with_functions(self): f = lambda sql: sqlparse.format(sql, reindent=True, wrap_after=30) s = ("select 'abc' as foo, json_build_object('a', a," "'b', b, 'c', c, 'd', d, 'e', e) as col2" "col3 from my_table") assert f(s) == '\n'.join([ "select 'abc' as foo,", " json_build_object('a',", " a, 'b', b, 'c', c, 'd', d,", " 'e', e) as col2col3", "from my_table"]) def test_case(self): f = lambda sql: sqlparse.format(sql, reindent=True) s = 'case when foo = 1 then 2 when foo = 3 then 4 else 5 end' assert f(s) == '\n'.join([ 'case', ' when foo = 1 then 2', ' when foo = 3 then 4', ' else 5', 'end']) def test_case2(self): f = lambda sql: sqlparse.format(sql, reindent=True) s = 'case(foo) when bar = 1 then 2 else 3 end' assert f(s) == '\n'.join([ 'case(foo)', ' when bar = 1 then 2', ' else 3', 'end']) def test_nested_identifier_list(self): # issue4 f = lambda sql: sqlparse.format(sql, reindent=True) s = '(foo as bar, bar1, bar2 as bar3, b4 as b5)' assert f(s) == '\n'.join([ '(foo as bar,', ' bar1,', 
' bar2 as bar3,', ' b4 as b5)']) def test_duplicate_linebreaks(self): # issue3 f = lambda sql: sqlparse.format(sql, reindent=True) s = 'select c1 -- column1\nfrom foo' assert f(s) == '\n'.join([ 'select c1 -- column1', 'from foo']) s = 'select c1 -- column1\nfrom foo' r = sqlparse.format(s, reindent=True, strip_comments=True) assert r == '\n'.join([ 'select c1', 'from foo']) s = 'select c1\nfrom foo\norder by c1' assert f(s) == '\n'.join([ 'select c1', 'from foo', 'order by c1']) s = 'select c1 from t1 where (c1 = 1) order by c1' assert f(s) == '\n'.join([ 'select c1', 'from t1', 'where (c1 = 1)', 'order by c1']) def test_keywordfunctions(self): # issue36 f = lambda sql: sqlparse.format(sql, reindent=True) s = 'select max(a) b, foo, bar' assert f(s) == '\n'.join([ 'select max(a) b,', ' foo,', ' bar']) def test_identifier_and_functions(self): # issue45 f = lambda sql: sqlparse.format(sql, reindent=True) s = 'select foo.bar, nvl(1) from dual' assert f(s) == '\n'.join([ 'select foo.bar,', ' nvl(1)', 'from dual']) def test_insert_values(self): # issue 329 f = lambda sql: sqlparse.format(sql, reindent=True) s = 'insert into foo values (1, 2)' assert f(s) == '\n'.join([ 'insert into foo', 'values (1, 2)']) s = 'insert into foo values (1, 2), (3, 4), (5, 6)' assert f(s) == '\n'.join([ 'insert into foo', 'values (1, 2),', ' (3, 4),', ' (5, 6)']) s = 'insert into foo(a, b) values (1, 2), (3, 4), (5, 6)' assert f(s) == '\n'.join([ 'insert into foo(a, b)', 'values (1, 2),', ' (3, 4),', ' (5, 6)']) f = lambda sql: sqlparse.format(sql, reindent=True, comma_first=True) s = 'insert into foo values (1, 2)' assert f(s) == '\n'.join([ 'insert into foo', 'values (1, 2)']) s = 'insert into foo values (1, 2), (3, 4), (5, 6)' assert f(s) == '\n'.join([ 'insert into foo', 'values (1, 2)', ' , (3, 4)', ' , (5, 6)']) s = 'insert into foo(a, b) values (1, 2), (3, 4), (5, 6)' assert f(s) == '\n'.join([ 'insert into foo(a, b)', 'values (1, 2)', ' , (3, 4)', ' , (5, 6)']) class 
TestOutputFormat: def test_python(self): sql = 'select * from foo;' f = lambda sql: sqlparse.format(sql, output_format='python') assert f(sql) == "sql = 'select * from foo;'" f = lambda sql: sqlparse.format(sql, output_format='python', reindent=True) assert f(sql) == '\n'.join([ "sql = ('select * '", " 'from foo;')"]) def test_python_multiple_statements(self): sql = 'select * from foo; select 1 from dual' f = lambda sql: sqlparse.format(sql, output_format='python') assert f(sql) == '\n'.join([ "sql = 'select * from foo; '", "sql2 = 'select 1 from dual'"]) @pytest.mark.xfail(reason="Needs fixing") def test_python_multiple_statements_with_formatting(self): sql = 'select * from foo; select 1 from dual' f = lambda sql: sqlparse.format(sql, output_format='python', reindent=True) assert f(sql) == '\n'.join([ "sql = ('select * '", " 'from foo;')", "sql2 = ('select 1 '", " 'from dual')"]) def test_php(self): sql = 'select * from foo;' f = lambda sql: sqlparse.format(sql, output_format='php') assert f(sql) == '$sql = "select * from foo;";' f = lambda sql: sqlparse.format(sql, output_format='php', reindent=True) assert f(sql) == '\n'.join([ '$sql = "select * ";', '$sql .= "from foo;";']) def test_sql(self): # "sql" is an allowed option but has no effect sql = 'select * from foo;' f = lambda sql: sqlparse.format(sql, output_format='sql') assert f(sql) == 'select * from foo;' def test_invalid_option(self): sql = 'select * from foo;' with pytest.raises(SQLParseError): sqlparse.format(sql, output_format='foo') def test_format_column_ordering(): # issue89 sql = 'select * from foo order by c1 desc, c2, c3;' formatted = sqlparse.format(sql, reindent=True) expected = '\n'.join([ 'select *', 'from foo', 'order by c1 desc,', ' c2,', ' c3;']) assert formatted == expected def test_truncate_strings(): sql = "update foo set value = '{}';".format('x' * 1000) formatted = sqlparse.format(sql, truncate_strings=10) assert formatted == "update foo set value = 'xxxxxxxxxx[...]';" formatted = 
sqlparse.format(sql, truncate_strings=3, truncate_char='YYY') assert formatted == "update foo set value = 'xxxYYY';" @pytest.mark.parametrize('option', ['bar', -1, 0]) def test_truncate_strings_invalid_option2(option): with pytest.raises(SQLParseError): sqlparse.format('foo', truncate_strings=option) @pytest.mark.parametrize('sql', [ 'select verrrylongcolumn from foo', 'select "verrrylongcolumn" from "foo"']) def test_truncate_strings_doesnt_truncate_identifiers(sql): formatted = sqlparse.format(sql, truncate_strings=2) assert formatted == sql def test_having_produces_newline(): sql = ('select * from foo, bar where bar.id = foo.bar_id ' 'having sum(bar.value) > 100') formatted = sqlparse.format(sql, reindent=True) expected = [ 'select *', 'from foo,', ' bar', 'where bar.id = foo.bar_id', 'having sum(bar.value) > 100'] assert formatted == '\n'.join(expected) @pytest.mark.parametrize('right_margin', ['ten', 2]) def test_format_right_margin_invalid_option(right_margin): with pytest.raises(SQLParseError): sqlparse.format('foo', right_margin=right_margin) @pytest.mark.xfail(reason="Needs fixing") def test_format_right_margin(): # TODO: Needs better test, only raises exception right now sqlparse.format('foo', right_margin="79") sqlparse-0.4.4/tests/test_grouping.py000066400000000000000000000541541454634525500200040ustar00rootroot00000000000000import pytest import sqlparse from sqlparse import sql, tokens as T def test_grouping_parenthesis(): s = 'select (select (x3) x2) and (y2) bar' parsed = sqlparse.parse(s)[0] assert str(parsed) == s assert len(parsed.tokens) == 7 assert isinstance(parsed.tokens[2], sql.Parenthesis) assert isinstance(parsed.tokens[-1], sql.Identifier) assert len(parsed.tokens[2].tokens) == 5 assert isinstance(parsed.tokens[2].tokens[3], sql.Identifier) assert isinstance(parsed.tokens[2].tokens[3].tokens[0], sql.Parenthesis) assert len(parsed.tokens[2].tokens[3].tokens) == 3 def test_grouping_comments(): s = '/*\n * foo\n */ \n bar' parsed = 
sqlparse.parse(s)[0] assert str(parsed) == s assert len(parsed.tokens) == 2 @pytest.mark.parametrize('s', ['foo := 1;', 'foo := 1']) def test_grouping_assignment(s): parsed = sqlparse.parse(s)[0] assert len(parsed.tokens) == 1 assert isinstance(parsed.tokens[0], sql.Assignment) @pytest.mark.parametrize('s', ["x > DATE '2020-01-01'", "x > TIMESTAMP '2020-01-01 00:00:00'"]) def test_grouping_typed_literal(s): parsed = sqlparse.parse(s)[0] assert isinstance(parsed[0][4], sql.TypedLiteral) @pytest.mark.parametrize('s, a, b', [ ('select a from b where c < d + e', sql.Identifier, sql.Identifier), ('select a from b where c < d + interval \'1 day\'', sql.Identifier, sql.TypedLiteral), ('select a from b where c < d + interval \'6\' month', sql.Identifier, sql.TypedLiteral), ('select a from b where c < current_timestamp - interval \'1 day\'', sql.Token, sql.TypedLiteral), ]) def test_compare_expr(s, a, b): parsed = sqlparse.parse(s)[0] assert str(parsed) == s assert isinstance(parsed.tokens[2], sql.Identifier) assert isinstance(parsed.tokens[6], sql.Identifier) assert isinstance(parsed.tokens[8], sql.Where) assert len(parsed.tokens) == 9 where = parsed.tokens[8] assert isinstance(where.tokens[2], sql.Comparison) assert len(where.tokens) == 3 comparison = where.tokens[2] assert isinstance(comparison.tokens[0], sql.Identifier) assert comparison.tokens[2].ttype is T.Operator.Comparison assert isinstance(comparison.tokens[4], sql.Operation) assert len(comparison.tokens) == 5 operation = comparison.tokens[4] assert isinstance(operation.tokens[0], a) assert operation.tokens[2].ttype is T.Operator assert isinstance(operation.tokens[4], b) assert len(operation.tokens) == 5 def test_grouping_identifiers(): s = 'select foo.bar from "myscheme"."table" where fail. 
order' parsed = sqlparse.parse(s)[0] assert str(parsed) == s assert isinstance(parsed.tokens[2], sql.Identifier) assert isinstance(parsed.tokens[6], sql.Identifier) assert isinstance(parsed.tokens[8], sql.Where) s = 'select * from foo where foo.id = 1' parsed = sqlparse.parse(s)[0] assert str(parsed) == s assert isinstance(parsed.tokens[-1].tokens[-1].tokens[0], sql.Identifier) s = 'select * from (select "foo"."id" from foo)' parsed = sqlparse.parse(s)[0] assert str(parsed) == s assert isinstance(parsed.tokens[-1].tokens[3], sql.Identifier) for s in ["INSERT INTO `test` VALUES('foo', 'bar');", "INSERT INTO `test` VALUES(1, 2), (3, 4), (5, 6);", "INSERT INTO `test(a, b)` VALUES(1, 2), (3, 4), (5, 6);"]: parsed = sqlparse.parse(s)[0] types = [l.ttype for l in parsed.tokens if not l.is_whitespace] assert types == [T.DML, T.Keyword, None, None, T.Punctuation] assert isinstance(parsed.tokens[6], sql.Values) s = "select 1.0*(a+b) as col, sum(c)/sum(d) from myschema.mytable" parsed = sqlparse.parse(s)[0] assert len(parsed.tokens) == 7 assert isinstance(parsed.tokens[2], sql.IdentifierList) assert len(parsed.tokens[2].tokens) == 4 identifiers = list(parsed.tokens[2].get_identifiers()) assert len(identifiers) == 2 assert identifiers[0].get_alias() == "col" @pytest.mark.parametrize('s', [ '1 as f', 'foo as f', 'foo f', '1/2 as f', '1/2 f', '1<2 as f', # issue327 '1<2 f', ]) def test_simple_identifiers(s): parsed = sqlparse.parse(s)[0] assert isinstance(parsed.tokens[0], sql.Identifier) @pytest.mark.parametrize('s', [ 'foo, bar', 'sum(a), sum(b)', 'sum(a) as x, b as y', 'sum(a)::integer, b', 'sum(a)/count(b) as x, y', 'sum(a)::integer as x, y', 'sum(a)::integer/count(b) as x, y', # issue297 ]) def test_group_identifier_list(s): parsed = sqlparse.parse(s)[0] assert isinstance(parsed.tokens[0], sql.IdentifierList) def test_grouping_identifier_wildcard(): p = sqlparse.parse('a.*, b.id')[0] assert isinstance(p.tokens[0], sql.IdentifierList) assert 
isinstance(p.tokens[0].tokens[0], sql.Identifier) assert isinstance(p.tokens[0].tokens[-1], sql.Identifier) def test_grouping_identifier_name_wildcard(): p = sqlparse.parse('a.*')[0] t = p.tokens[0] assert t.get_name() == '*' assert t.is_wildcard() is True def test_grouping_identifier_invalid(): p = sqlparse.parse('a.')[0] assert isinstance(p.tokens[0], sql.Identifier) assert p.tokens[0].has_alias() is False assert p.tokens[0].get_name() is None assert p.tokens[0].get_real_name() is None assert p.tokens[0].get_parent_name() == 'a' def test_grouping_identifier_invalid_in_middle(): # issue261 s = 'SELECT foo. FROM foo' p = sqlparse.parse(s)[0] assert isinstance(p[2], sql.Identifier) assert p[2][1].ttype == T.Punctuation assert p[3].ttype == T.Whitespace assert str(p[2]) == 'foo.' @pytest.mark.parametrize('s', ['foo as (select *)', 'foo as(select *)']) def test_grouping_identifer_as(s): # issue507 p = sqlparse.parse(s)[0] assert isinstance(p.tokens[0], sql.Identifier) token = p.tokens[0].tokens[2] assert token.ttype == T.Keyword assert token.normalized == 'AS' def test_grouping_identifier_as_invalid(): # issue8 p = sqlparse.parse('foo as select *')[0] assert len(p.tokens), 5 assert isinstance(p.tokens[0], sql.Identifier) assert len(p.tokens[0].tokens) == 1 assert p.tokens[2].ttype == T.Keyword def test_grouping_identifier_function(): p = sqlparse.parse('foo() as bar')[0] assert isinstance(p.tokens[0], sql.Identifier) assert isinstance(p.tokens[0].tokens[0], sql.Function) p = sqlparse.parse('foo()||col2 bar')[0] assert isinstance(p.tokens[0], sql.Identifier) assert isinstance(p.tokens[0].tokens[0], sql.Operation) assert isinstance(p.tokens[0].tokens[0].tokens[0], sql.Function) @pytest.mark.parametrize('s', ['foo+100', 'foo + 100', 'foo*100']) def test_grouping_operation(s): p = sqlparse.parse(s)[0] assert isinstance(p.tokens[0], sql.Operation) def test_grouping_identifier_list(): p = sqlparse.parse('a, b, c')[0] assert isinstance(p.tokens[0], sql.IdentifierList) p = 
sqlparse.parse('(a, b, c)')[0] assert isinstance(p.tokens[0].tokens[1], sql.IdentifierList) def test_grouping_identifier_list_subquery(): """identifier lists should still work in subqueries with aliases""" p = sqlparse.parse("select * from (" "select a, b + c as d from table) sub")[0] subquery = p.tokens[-1].tokens[0] idx, iden_list = subquery.token_next_by(i=sql.IdentifierList) assert iden_list is not None # all the identifiers should be within the IdentifierList _, ilist = subquery.token_next_by(i=sql.Identifier, idx=idx) assert ilist is None def test_grouping_identifier_list_case(): p = sqlparse.parse('a, case when 1 then 2 else 3 end as b, c')[0] assert isinstance(p.tokens[0], sql.IdentifierList) p = sqlparse.parse('(a, case when 1 then 2 else 3 end as b, c)')[0] assert isinstance(p.tokens[0].tokens[1], sql.IdentifierList) def test_grouping_identifier_list_other(): # issue2 p = sqlparse.parse("select *, null, 1, 'foo', bar from mytable, x")[0] assert isinstance(p.tokens[2], sql.IdentifierList) assert len(p.tokens[2].tokens) == 13 def test_grouping_identifier_list_with_inline_comments(): # issue163 p = sqlparse.parse('foo /* a comment */, bar')[0] assert isinstance(p.tokens[0], sql.IdentifierList) assert isinstance(p.tokens[0].tokens[0], sql.Identifier) assert isinstance(p.tokens[0].tokens[3], sql.Identifier) def test_grouping_identifiers_with_operators(): p = sqlparse.parse('a+b as c from table where (d-e)%2= 1')[0] assert len([x for x in p.flatten() if x.ttype == T.Name]) == 5 def test_grouping_identifier_list_with_order(): # issue101 p = sqlparse.parse('1, 2 desc, 3')[0] assert isinstance(p.tokens[0], sql.IdentifierList) assert isinstance(p.tokens[0].tokens[3], sql.Identifier) assert str(p.tokens[0].tokens[3]) == '2 desc' def test_grouping_where(): s = 'select * from foo where bar = 1 order by id desc' p = sqlparse.parse(s)[0] assert str(p) == s assert len(p.tokens) == 12 s = 'select x from (select y from foo where bar = 1) z' p = sqlparse.parse(s)[0] assert 
str(p) == s assert isinstance(p.tokens[-1].tokens[0].tokens[-2], sql.Where) @pytest.mark.parametrize('s', ( 'select 1 where 1 = 2 union select 2', 'select 1 where 1 = 2 union all select 2', )) def test_grouping_where_union(s): p = sqlparse.parse(s)[0] assert p.tokens[5].value.startswith('union') def test_returning_kw_ends_where_clause(): s = 'delete from foo where x > y returning z' p = sqlparse.parse(s)[0] assert isinstance(p.tokens[6], sql.Where) assert p.tokens[7].ttype == T.Keyword assert p.tokens[7].value == 'returning' def test_into_kw_ends_where_clause(): # issue324 s = 'select * from foo where a = 1 into baz' p = sqlparse.parse(s)[0] assert isinstance(p.tokens[8], sql.Where) assert p.tokens[9].ttype == T.Keyword assert p.tokens[9].value == 'into' @pytest.mark.parametrize('sql, expected', [ # note: typecast needs to be 2nd token for this test ('select foo::integer from bar', 'integer'), ('select (current_database())::information_schema.sql_identifier', 'information_schema.sql_identifier'), ]) def test_grouping_typecast(sql, expected): p = sqlparse.parse(sql)[0] assert p.tokens[2].get_typecast() == expected def test_grouping_alias(): s = 'select foo as bar from mytable' p = sqlparse.parse(s)[0] assert str(p) == s assert p.tokens[2].get_real_name() == 'foo' assert p.tokens[2].get_alias() == 'bar' s = 'select foo from mytable t1' p = sqlparse.parse(s)[0] assert str(p) == s assert p.tokens[6].get_real_name() == 'mytable' assert p.tokens[6].get_alias() == 't1' s = 'select foo::integer as bar from mytable' p = sqlparse.parse(s)[0] assert str(p) == s assert p.tokens[2].get_alias() == 'bar' s = ('SELECT DISTINCT ' '(current_database())::information_schema.sql_identifier AS view') p = sqlparse.parse(s)[0] assert str(p) == s assert p.tokens[4].get_alias() == 'view' def test_grouping_alias_case(): # see issue46 p = sqlparse.parse('CASE WHEN 1 THEN 2 ELSE 3 END foo')[0] assert len(p.tokens) == 1 assert p.tokens[0].get_alias() == 'foo' def test_grouping_alias_ctas(): p = 
sqlparse.parse('CREATE TABLE tbl1 AS SELECT coalesce(t1.col1, 0) AS col1 FROM t1')[0] assert p.tokens[10].get_alias() == 'col1' assert isinstance(p.tokens[10].tokens[0], sql.Function) def test_grouping_subquery_no_parens(): # Not totally sure if this is the right approach... # When a THEN clause contains a subquery without parentheses around it *and* # a WHERE condition, the WHERE grouper consumes END too. # This test makes sure that it doesn't fail. p = sqlparse.parse('CASE WHEN 1 THEN select 2 where foo = 1 end')[0] assert len(p.tokens) == 1 assert isinstance(p.tokens[0], sql.Case) @pytest.mark.parametrize('s', ['foo.bar', 'x, y', 'x > y', 'x / y']) def test_grouping_alias_returns_none(s): # see issue185 and issue445 p = sqlparse.parse(s)[0] assert len(p.tokens) == 1 assert p.tokens[0].get_alias() is None def test_grouping_idlist_function(): # see issue10 too p = sqlparse.parse('foo(1) x, bar')[0] assert isinstance(p.tokens[0], sql.IdentifierList) def test_grouping_comparison_exclude(): # make sure operators are not handled too lazily p = sqlparse.parse('(=)')[0] assert isinstance(p.tokens[0], sql.Parenthesis) assert not isinstance(p.tokens[0].tokens[1], sql.Comparison) p = sqlparse.parse('(a=1)')[0] assert isinstance(p.tokens[0].tokens[1], sql.Comparison) p = sqlparse.parse('(a>=1)')[0] assert isinstance(p.tokens[0].tokens[1], sql.Comparison) def test_grouping_function(): p = sqlparse.parse('foo()')[0] assert isinstance(p.tokens[0], sql.Function) p = sqlparse.parse('foo(null, bar)')[0] assert isinstance(p.tokens[0], sql.Function) assert len(list(p.tokens[0].get_parameters())) == 2 def test_grouping_function_not_in(): # issue183 p = sqlparse.parse('in(1, 2)')[0] assert len(p.tokens) == 2 assert p.tokens[0].ttype == T.Keyword assert isinstance(p.tokens[1], sql.Parenthesis) def test_grouping_varchar(): p = sqlparse.parse('"text" Varchar(50) NOT NULL')[0] assert isinstance(p.tokens[2], sql.Function) def test_statement_get_type(): def f(sql): return sqlparse.parse(sql)[0]
assert f('select * from foo').get_type() == 'SELECT' assert f('update foo').get_type() == 'UPDATE' assert f(' update foo').get_type() == 'UPDATE' assert f('\nupdate foo').get_type() == 'UPDATE' assert f('foo').get_type() == 'UNKNOWN' def test_identifier_with_operators(): # issue 53 p = sqlparse.parse('foo||bar')[0] assert len(p.tokens) == 1 assert isinstance(p.tokens[0], sql.Operation) # again with whitespaces p = sqlparse.parse('foo || bar')[0] assert len(p.tokens) == 1 assert isinstance(p.tokens[0], sql.Operation) def test_identifier_with_op_trailing_ws(): # make sure trailing whitespace isn't grouped with identifier p = sqlparse.parse('foo || bar ')[0] assert len(p.tokens) == 2 assert isinstance(p.tokens[0], sql.Operation) assert p.tokens[1].ttype is T.Whitespace def test_identifier_with_string_literals(): p = sqlparse.parse("foo + 'bar'")[0] assert len(p.tokens) == 1 assert isinstance(p.tokens[0], sql.Operation) # This test seems to be wrong. It was introduced when fixing #53, but #111 # showed that this shouldn't be an identifier at all. I'm leaving this # commented in the source for a while. 
# def test_identifier_string_concat(): # p = sqlparse.parse("'foo' || bar")[0] # assert len(p.tokens) == 1 # assert isinstance(p.tokens[0], sql.Identifier) def test_identifier_consumes_ordering(): # issue89 p = sqlparse.parse('select * from foo order by c1 desc, c2, c3')[0] assert isinstance(p.tokens[-1], sql.IdentifierList) ids = list(p.tokens[-1].get_identifiers()) assert len(ids) == 3 assert ids[0].get_name() == 'c1' assert ids[0].get_ordering() == 'DESC' assert ids[1].get_name() == 'c2' assert ids[1].get_ordering() is None def test_comparison_with_keywords(): # issue90 # in fact these are assignments, but for now we don't distinguish them p = sqlparse.parse('foo = NULL')[0] assert len(p.tokens) == 1 assert isinstance(p.tokens[0], sql.Comparison) assert len(p.tokens[0].tokens) == 5 assert p.tokens[0].left.value == 'foo' assert p.tokens[0].right.value == 'NULL' # make sure it's case-insensitive p = sqlparse.parse('foo = null')[0] assert len(p.tokens) == 1 assert isinstance(p.tokens[0], sql.Comparison) def test_comparison_with_floats(): # issue145 p = sqlparse.parse('foo = 25.5')[0] assert len(p.tokens) == 1 assert isinstance(p.tokens[0], sql.Comparison) assert len(p.tokens[0].tokens) == 5 assert p.tokens[0].left.value == 'foo' assert p.tokens[0].right.value == '25.5' def test_comparison_with_parenthesis(): # issue23 p = sqlparse.parse('(3 + 4) = 7')[0] assert len(p.tokens) == 1 assert isinstance(p.tokens[0], sql.Comparison) comp = p.tokens[0] assert isinstance(comp.left, sql.Parenthesis) assert comp.right.ttype is T.Number.Integer @pytest.mark.parametrize('operator', ( '=', '!=', '>', '<', '<=', '>=', '~', '~~', '!~~', 'LIKE', 'NOT LIKE', 'ILIKE', 'NOT ILIKE', )) def test_comparison_with_strings(operator): # issue148 p = sqlparse.parse("foo {} 'bar'".format(operator))[0] assert len(p.tokens) == 1 assert isinstance(p.tokens[0], sql.Comparison) assert p.tokens[0].right.value == "'bar'" assert p.tokens[0].right.ttype == T.String.Single def 
test_like_and_ilike_comparison(): def validate_where_clause(where_clause, expected_tokens): assert len(where_clause.tokens) == len(expected_tokens) for where_token, expected_token in zip(where_clause, expected_tokens): expected_ttype, expected_value = expected_token if where_token.ttype is not None: assert where_token.match(expected_ttype, expected_value, regex=True) else: # Certain tokens, such as comparison tokens, do not define a ttype that can be # matched against. For these tokens, we ensure that the token instance is of # the expected type and has a value conforming to the specified regular expression import re assert (isinstance(where_token, expected_ttype) and re.match(expected_value, where_token.value)) [p1] = sqlparse.parse("select * from mytable where mytable.mycolumn LIKE 'expr%' limit 5;") [p1_where] = [token for token in p1 if isinstance(token, sql.Where)] validate_where_clause(p1_where, [ (T.Keyword, "where"), (T.Whitespace, None), (sql.Comparison, r"mytable.mycolumn LIKE.*"), (T.Whitespace, None), ]) [p2] = sqlparse.parse( "select * from mytable where mycolumn NOT ILIKE '-expr' group by othercolumn;") [p2_where] = [token for token in p2 if isinstance(token, sql.Where)] validate_where_clause(p2_where, [ (T.Keyword, "where"), (T.Whitespace, None), (sql.Comparison, r"mycolumn NOT ILIKE.*"), (T.Whitespace, None), ]) def test_comparison_with_functions(): # issue230 p = sqlparse.parse('foo = DATE(bar.baz)')[0] assert len(p.tokens) == 1 assert isinstance(p.tokens[0], sql.Comparison) assert len(p.tokens[0].tokens) == 5 assert p.tokens[0].left.value == 'foo' assert p.tokens[0].right.value == 'DATE(bar.baz)' p = sqlparse.parse('DATE(foo.bar) = DATE(bar.baz)')[0] assert len(p.tokens) == 1 assert isinstance(p.tokens[0], sql.Comparison) assert len(p.tokens[0].tokens) == 5 assert p.tokens[0].left.value == 'DATE(foo.bar)' assert p.tokens[0].right.value == 'DATE(bar.baz)' p = sqlparse.parse('DATE(foo.bar) = bar.baz')[0] assert len(p.tokens) == 1 assert
isinstance(p.tokens[0], sql.Comparison) assert len(p.tokens[0].tokens) == 5 assert p.tokens[0].left.value == 'DATE(foo.bar)' assert p.tokens[0].right.value == 'bar.baz' def test_comparison_with_typed_literal(): p = sqlparse.parse("foo = DATE 'bar.baz'")[0] assert len(p.tokens) == 1 comp = p.tokens[0] assert isinstance(comp, sql.Comparison) assert len(comp.tokens) == 5 assert comp.left.value == 'foo' assert isinstance(comp.right, sql.TypedLiteral) assert comp.right.value == "DATE 'bar.baz'" @pytest.mark.parametrize('start', ['FOR', 'FOREACH']) def test_forloops(start): p = sqlparse.parse('{} foo in bar LOOP foobar END LOOP'.format(start))[0] assert (len(p.tokens)) == 1 assert isinstance(p.tokens[0], sql.For) def test_nested_for(): p = sqlparse.parse('FOR foo LOOP FOR bar LOOP END LOOP END LOOP')[0] assert len(p.tokens) == 1 for1 = p.tokens[0] assert for1.tokens[0].value == 'FOR' assert for1.tokens[-1].value == 'END LOOP' for2 = for1.tokens[6] assert isinstance(for2, sql.For) assert for2.tokens[0].value == 'FOR' assert for2.tokens[-1].value == 'END LOOP' def test_begin(): p = sqlparse.parse('BEGIN foo END')[0] assert len(p.tokens) == 1 assert isinstance(p.tokens[0], sql.Begin) def test_keyword_followed_by_parenthesis(): p = sqlparse.parse('USING(somecol')[0] assert len(p.tokens) == 3 assert p.tokens[0].ttype == T.Keyword assert p.tokens[1].ttype == T.Punctuation def test_nested_begin(): p = sqlparse.parse('BEGIN foo BEGIN bar END END')[0] assert len(p.tokens) == 1 outer = p.tokens[0] assert outer.tokens[0].value == 'BEGIN' assert outer.tokens[-1].value == 'END' inner = outer.tokens[4] assert inner.tokens[0].value == 'BEGIN' assert inner.tokens[-1].value == 'END' assert isinstance(inner, sql.Begin) def test_aliased_column_without_as(): p = sqlparse.parse('foo bar')[0].tokens assert len(p) == 1 assert p[0].get_real_name() == 'foo' assert p[0].get_alias() == 'bar' p = sqlparse.parse('foo.bar baz')[0].tokens[0] assert p.get_parent_name() == 'foo' assert p.get_real_name() 
== 'bar' assert p.get_alias() == 'baz' def test_qualified_function(): p = sqlparse.parse('foo()')[0].tokens[0] assert p.get_parent_name() is None assert p.get_real_name() == 'foo' p = sqlparse.parse('foo.bar()')[0].tokens[0] assert p.get_parent_name() == 'foo' assert p.get_real_name() == 'bar' def test_aliased_function_without_as(): p = sqlparse.parse('foo() bar')[0].tokens[0] assert p.get_parent_name() is None assert p.get_real_name() == 'foo' assert p.get_alias() == 'bar' p = sqlparse.parse('foo.bar() baz')[0].tokens[0] assert p.get_parent_name() == 'foo' assert p.get_real_name() == 'bar' assert p.get_alias() == 'baz' def test_aliased_literal_without_as(): p = sqlparse.parse('1 foo')[0].tokens assert len(p) == 1 assert p[0].get_alias() == 'foo' def test_grouping_as_cte(): p = sqlparse.parse('foo AS WITH apple AS 1, banana AS 2')[0].tokens assert len(p) > 4 assert p[0].get_alias() is None assert p[2].value == 'AS' assert p[4].value == 'WITH' def test_grouping_create_table(): p = sqlparse.parse("create table db.tbl (a string)")[0].tokens assert p[4].value == "db.tbl" sqlparse-0.4.4/tests/test_keywords.py000066400000000000000000000006771454634525500200220ustar00rootroot00000000000000import pytest from sqlparse import tokens from sqlparse.lexer import Lexer class TestSQLREGEX: @pytest.mark.parametrize('number', ['1.0', '-1.0', '1.', '-1.', '.1', '-.1']) def test_float_numbers(self, number): ttype = next(tt for action, tt in Lexer.get_default_instance()._SQL_REGEX if action(number)) assert tokens.Number.Float == ttype sqlparse-0.4.4/tests/test_parse.py000066400000000000000000000421461454634525500172620ustar00rootroot00000000000000"""Tests sqlparse.parse().""" from io import StringIO import pytest import sqlparse from sqlparse import sql, tokens as T, keywords from sqlparse.lexer import Lexer def test_parse_tokenize(): s = 'select * from foo;' stmts = sqlparse.parse(s) assert len(stmts) == 1 assert str(stmts[0]) == s def test_parse_multistatement(): sql1 = 'select * 
from foo;' sql2 = 'select * from bar;' stmts = sqlparse.parse(sql1 + sql2) assert len(stmts) == 2 assert str(stmts[0]) == sql1 assert str(stmts[1]) == sql2 @pytest.mark.parametrize('s', ['select\n*from foo;', 'select\r\n*from foo', 'select\r*from foo', 'select\r\n*from foo\n']) def test_parse_newlines(s): p = sqlparse.parse(s)[0] assert str(p) == s def test_parse_within(): s = 'foo(col1, col2)' p = sqlparse.parse(s)[0] col1 = p.tokens[0].tokens[1].tokens[1].tokens[0] assert col1.within(sql.Function) def test_parse_child_of(): s = '(col1, col2)' p = sqlparse.parse(s)[0] assert p.tokens[0].tokens[1].is_child_of(p.tokens[0]) s = 'select foo' p = sqlparse.parse(s)[0] assert not p.tokens[2].is_child_of(p.tokens[0]) assert p.tokens[2].is_child_of(p) def test_parse_has_ancestor(): s = 'foo or (bar, baz)' p = sqlparse.parse(s)[0] baz = p.tokens[-1].tokens[1].tokens[-1] assert baz.has_ancestor(p.tokens[-1].tokens[1]) assert baz.has_ancestor(p.tokens[-1]) assert baz.has_ancestor(p) @pytest.mark.parametrize('s', ['.5', '.51', '1.5', '12.5']) def test_parse_float(s): t = sqlparse.parse(s)[0].tokens assert len(t) == 1 assert t[0].ttype is sqlparse.tokens.Number.Float @pytest.mark.parametrize('s, holder', [ ('select * from foo where user = ?', '?'), ('select * from foo where user = :1', ':1'), ('select * from foo where user = :name', ':name'), ('select * from foo where user = %s', '%s'), ('select * from foo where user = $a', '$a')]) def test_parse_placeholder(s, holder): t = sqlparse.parse(s)[0].tokens[-1].tokens assert t[-1].ttype is sqlparse.tokens.Name.Placeholder assert t[-1].value == holder def test_parse_modulo_not_placeholder(): tokens = list(sqlparse.lexer.tokenize('x %3')) assert tokens[2][0] == sqlparse.tokens.Operator def test_parse_access_symbol(): # see issue27 t = sqlparse.parse('select a.[foo bar] as foo')[0].tokens assert isinstance(t[-1], sql.Identifier) assert t[-1].get_name() == 'foo' assert t[-1].get_real_name() == '[foo bar]' assert t[-1].get_parent_name() 
== 'a' def test_parse_square_brackets_notation_isnt_too_greedy(): # see issue153 t = sqlparse.parse('[foo], [bar]')[0].tokens assert isinstance(t[0], sql.IdentifierList) assert len(t[0].tokens) == 4 assert t[0].tokens[0].get_real_name() == '[foo]' assert t[0].tokens[-1].get_real_name() == '[bar]' def test_parse_square_brackets_notation_isnt_too_greedy2(): # see issue583 t = sqlparse.parse('[(foo[i])]')[0].tokens assert isinstance(t[0], sql.SquareBrackets) # not Identifier! def test_parse_keyword_like_identifier(): # see issue47 t = sqlparse.parse('foo.key')[0].tokens assert len(t) == 1 assert isinstance(t[0], sql.Identifier) def test_parse_function_parameter(): # see issue94 t = sqlparse.parse('abs(some_col)')[0].tokens[0].get_parameters() assert len(t) == 1 assert isinstance(t[0], sql.Identifier) def test_parse_function_param_single_literal(): t = sqlparse.parse('foo(5)')[0].tokens[0].get_parameters() assert len(t) == 1 assert t[0].ttype is T.Number.Integer def test_parse_nested_function(): t = sqlparse.parse('foo(bar(5))')[0].tokens[0].get_parameters() assert len(t) == 1 assert type(t[0]) is sql.Function def test_parse_div_operator(): p = sqlparse.parse('col1 DIV 5 AS div_col1')[0].tokens assert p[0].tokens[0].tokens[2].ttype is T.Operator assert p[0].get_alias() == 'div_col1' def test_quoted_identifier(): t = sqlparse.parse('select x.y as "z" from foo')[0].tokens assert isinstance(t[2], sql.Identifier) assert t[2].get_name() == 'z' assert t[2].get_real_name() == 'y' @pytest.mark.parametrize('name', [ 'foo', '_foo', # issue175 '1_data', # valid MySQL table name, see issue337 '業者名稱', # valid at least for SQLite3, see issue641 ]) def test_valid_identifier_names(name): t = sqlparse.parse(name)[0].tokens assert isinstance(t[0], sql.Identifier) assert t[0].get_name() == name def test_psql_quotation_marks(): # issue83 # regression: make sure plain $$ work t = sqlparse.split(""" CREATE OR REPLACE FUNCTION testfunc1(integer) RETURNS integer AS $$ .... 
$$ LANGUAGE plpgsql; CREATE OR REPLACE FUNCTION testfunc2(integer) RETURNS integer AS $$ .... $$ LANGUAGE plpgsql;""") assert len(t) == 2 # make sure $SOMETHING$ works too t = sqlparse.split(""" CREATE OR REPLACE FUNCTION testfunc1(integer) RETURNS integer AS $PROC_1$ .... $PROC_1$ LANGUAGE plpgsql; CREATE OR REPLACE FUNCTION testfunc2(integer) RETURNS integer AS $PROC_2$ .... $PROC_2$ LANGUAGE plpgsql;""") assert len(t) == 2 def test_double_precision_is_builtin(): s = 'DOUBLE PRECISION' t = sqlparse.parse(s)[0].tokens assert len(t) == 1 assert t[0].ttype == sqlparse.tokens.Name.Builtin assert t[0].value == 'DOUBLE PRECISION' @pytest.mark.parametrize('ph', ['?', ':1', ':foo', '%s', '%(foo)s']) def test_placeholder(ph): p = sqlparse.parse(ph)[0].tokens assert len(p) == 1 assert p[0].ttype is T.Name.Placeholder @pytest.mark.parametrize('num, expected', [ ('6.67428E-8', T.Number.Float), ('1.988e33', T.Number.Float), ('1e-12', T.Number.Float), ('e1', None), ]) def test_scientific_numbers(num, expected): p = sqlparse.parse(num)[0].tokens assert len(p) == 1 assert p[0].ttype is expected def test_single_quotes_are_strings(): p = sqlparse.parse("'foo'")[0].tokens assert len(p) == 1 assert p[0].ttype is T.String.Single def test_double_quotes_are_identifiers(): p = sqlparse.parse('"foo"')[0].tokens assert len(p) == 1 assert isinstance(p[0], sql.Identifier) def test_single_quotes_with_linebreaks(): # issue118 p = sqlparse.parse("'f\nf'")[0].tokens assert len(p) == 1 assert p[0].ttype is T.String.Single def test_sqlite_identifiers(): # Make sure we still parse sqlite style escapes p = sqlparse.parse('[col1],[col2]')[0].tokens id_names = [id_.get_name() for id_ in p[0].get_identifiers()] assert len(p) == 1 assert isinstance(p[0], sql.IdentifierList) assert id_names == ['[col1]', '[col2]'] p = sqlparse.parse('[col1]+[col2]')[0] types = [tok.ttype for tok in p.flatten()] assert types == [T.Name, T.Operator, T.Name] def test_simple_1d_array_index(): p = 
sqlparse.parse('col[1]')[0].tokens assert len(p) == 1 assert p[0].get_name() == 'col' indices = list(p[0].get_array_indices()) assert len(indices) == 1 # 1-dimensional index assert len(indices[0]) == 1 # index is single token assert indices[0][0].value == '1' def test_2d_array_index(): p = sqlparse.parse('col[x][(y+1)*2]')[0].tokens assert len(p) == 1 assert p[0].get_name() == 'col' assert len(list(p[0].get_array_indices())) == 2 # 2-dimensional index def test_array_index_function_result(): p = sqlparse.parse('somefunc()[1]')[0].tokens assert len(p) == 1 assert len(list(p[0].get_array_indices())) == 1 def test_schema_qualified_array_index(): p = sqlparse.parse('schem.col[1]')[0].tokens assert len(p) == 1 assert p[0].get_parent_name() == 'schem' assert p[0].get_name() == 'col' assert list(p[0].get_array_indices())[0][0].value == '1' def test_aliased_array_index(): p = sqlparse.parse('col[1] x')[0].tokens assert len(p) == 1 assert p[0].get_alias() == 'x' assert p[0].get_real_name() == 'col' assert list(p[0].get_array_indices())[0][0].value == '1' def test_array_literal(): # See issue #176 p = sqlparse.parse('ARRAY[%s, %s]')[0] assert len(p.tokens) == 2 assert len(list(p.flatten())) == 7 def test_typed_array_definition(): # array indices aren't grouped with built-ins, but make sure we can extract # identifier names p = sqlparse.parse('x int, y int[], z int')[0] names = [x.get_name() for x in p.get_sublists() if isinstance(x, sql.Identifier)] assert names == ['x', 'y', 'z'] @pytest.mark.parametrize('s', ['select 1 -- foo', 'select 1 # foo']) def test_single_line_comments(s): # see issue178 p = sqlparse.parse(s)[0] assert len(p.tokens) == 5 assert p.tokens[-1].ttype == T.Comment.Single @pytest.mark.parametrize('s', ['foo', '@foo', '#foo', '##foo']) def test_names_and_special_names(s): # see issue192 p = sqlparse.parse(s)[0] assert len(p.tokens) == 1 assert isinstance(p.tokens[0], sql.Identifier) def test_get_token_at_offset(): p = sqlparse.parse('select * from dual')[0] 
    #                   0123456789
    assert p.get_token_at_offset(0) == p.tokens[0]
    assert p.get_token_at_offset(1) == p.tokens[0]
    assert p.get_token_at_offset(6) == p.tokens[1]
    assert p.get_token_at_offset(7) == p.tokens[2]
    assert p.get_token_at_offset(8) == p.tokens[3]
    assert p.get_token_at_offset(9) == p.tokens[4]
    assert p.get_token_at_offset(10) == p.tokens[4]


def test_pprint():
    p = sqlparse.parse('select a0, b0, c0, d0, e0 from '
                       '(select * from dual) q0 where 1=1 and 2=2')[0]

    output = StringIO()

    p._pprint_tree(f=output)
    pprint = '\n'.join([
        "|- 0 DML 'select'",
        "|- 1 Whitespace ' '",
        "|- 2 IdentifierList 'a0, b0...'",
        "|  |- 0 Identifier 'a0'",
        "|  |  `- 0 Name 'a0'",
        "|  |- 1 Punctuation ','",
        "|  |- 2 Whitespace ' '",
        "|  |- 3 Identifier 'b0'",
        "|  |  `- 0 Name 'b0'",
        "|  |- 4 Punctuation ','",
        "|  |- 5 Whitespace ' '",
        "|  |- 6 Identifier 'c0'",
        "|  |  `- 0 Name 'c0'",
        "|  |- 7 Punctuation ','",
        "|  |- 8 Whitespace ' '",
        "|  |- 9 Identifier 'd0'",
        "|  |  `- 0 Name 'd0'",
        "|  |- 10 Punctuation ','",
        "|  |- 11 Whitespace ' '",
        "|  `- 12 Identifier 'e0'",
        "|     `- 0 Name 'e0'",
        "|- 3 Whitespace ' '",
        "|- 4 Keyword 'from'",
        "|- 5 Whitespace ' '",
        "|- 6 Identifier '(selec...'",
        "|  |- 0 Parenthesis '(selec...'",
        "|  |  |- 0 Punctuation '('",
        "|  |  |- 1 DML 'select'",
        "|  |  |- 2 Whitespace ' '",
        "|  |  |- 3 Wildcard '*'",
        "|  |  |- 4 Whitespace ' '",
        "|  |  |- 5 Keyword 'from'",
        "|  |  |- 6 Whitespace ' '",
        "|  |  |- 7 Identifier 'dual'",
        "|  |  |  `- 0 Name 'dual'",
        "|  |  `- 8 Punctuation ')'",
        "|  |- 1 Whitespace ' '",
        "|  `- 2 Identifier 'q0'",
        "|     `- 0 Name 'q0'",
        "|- 7 Whitespace ' '",
        "`- 8 Where 'where ...'",
        "   |- 0 Keyword 'where'",
        "   |- 1 Whitespace ' '",
        "   |- 2 Comparison '1=1'",
        "   |  |- 0 Integer '1'",
        "   |  |- 1 Comparison '='",
        "   |  `- 2 Integer '1'",
        "   |- 3 Whitespace ' '",
        "   |- 4 Keyword 'and'",
        "   |- 5 Whitespace ' '",
        "   `- 6 Comparison '2=2'",
        "      |- 0 Integer '2'",
        "      |- 1 Comparison '='",
        "      `- 2 Integer '2'",
        ""])
    assert output.getvalue() == pprint


def test_wildcard_multiplication():
    p = sqlparse.parse('select * from dual')[0]
    assert p.tokens[2].ttype == T.Wildcard

    p = sqlparse.parse('select a0.* from dual a0')[0]
    assert p.tokens[2][2].ttype == T.Wildcard

    p = sqlparse.parse('select 1 * 2 from dual')[0]
    assert p.tokens[2][2].ttype == T.Operator


def test_stmt_tokens_parents():
    # see issue 226
    s = "CREATE TABLE test();"
    stmt = sqlparse.parse(s)[0]
    for token in stmt.tokens:
        assert token.has_ancestor(stmt)


@pytest.mark.parametrize('sql, is_literal', [
    ('$$foo$$', True),
    ('$_$foo$_$', True),
    ('$token$ foo $token$', True),
    # don't parse inner tokens
    ('$_$ foo $token$bar$token$ baz$_$', True),
    ('$A$ foo $B$', False)  # tokens don't match
])
def test_dbldollar_as_literal(sql, is_literal):
    # see issue 277
    p = sqlparse.parse(sql)[0]
    if is_literal:
        assert len(p.tokens) == 1
        assert p.tokens[0].ttype == T.Literal
    else:
        for token in p.tokens:
            assert token.ttype != T.Literal


def test_non_ascii():
    _test_non_ascii = "insert into test (id, name) values (1, 'тест');"

    s = _test_non_ascii
    stmts = sqlparse.parse(s)
    assert len(stmts) == 1
    statement = stmts[0]
    assert str(statement) == s
    assert statement._pprint_tree() is None

    s = _test_non_ascii.encode('utf-8')
    stmts = sqlparse.parse(s, 'utf-8')
    assert len(stmts) == 1
    statement = stmts[0]
    assert str(statement) == _test_non_ascii
    assert statement._pprint_tree() is None


def test_get_real_name():
    # issue 369
    s = "update a t set t.b=1"
    stmts = sqlparse.parse(s)
    assert len(stmts) == 1
    assert 'a' == stmts[0].tokens[2].get_real_name()
    assert 't' == stmts[0].tokens[2].get_alias()


def test_from_subquery():
    # issue 446
    s = 'from(select 1)'
    stmts = sqlparse.parse(s)
    assert len(stmts) == 1
    assert len(stmts[0].tokens) == 2
    assert stmts[0].tokens[0].value == 'from'
    assert stmts[0].tokens[0].ttype == T.Keyword

    s = 'from (select 1)'
    stmts = sqlparse.parse(s)
    assert len(stmts) == 1
    assert len(stmts[0].tokens) == 3
    assert stmts[0].tokens[0].value == 'from'
    assert stmts[0].tokens[0].ttype == T.Keyword
    assert stmts[0].tokens[1].ttype == T.Whitespace


def test_parenthesis():
    tokens = sqlparse.parse("(\n\n1\n\n)")[0].tokens[0].tokens
    assert list(map(lambda t: t.ttype, tokens)) == [T.Punctuation,
                                                    T.Newline,
                                                    T.Newline,
                                                    T.Number.Integer,
                                                    T.Newline,
                                                    T.Newline,
                                                    T.Punctuation]
    tokens = sqlparse.parse("(\n\n 1 \n\n)")[0].tokens[0].tokens
    assert list(map(lambda t: t.ttype, tokens)) == [T.Punctuation,
                                                    T.Newline,
                                                    T.Newline,
                                                    T.Whitespace,
                                                    T.Number.Integer,
                                                    T.Whitespace,
                                                    T.Newline,
                                                    T.Newline,
                                                    T.Punctuation]


def test_configurable_keywords():
    sql = """select * from foo BACON SPAM EGGS;"""
    tokens = sqlparse.parse(sql)[0]

    assert list(
        (t.ttype, t.value)
        for t in tokens
        if t.ttype not in sqlparse.tokens.Whitespace
    ) == [
        (sqlparse.tokens.Keyword.DML, "select"),
        (sqlparse.tokens.Wildcard, "*"),
        (sqlparse.tokens.Keyword, "from"),
        (None, "foo BACON"),
        (None, "SPAM EGGS"),
        (sqlparse.tokens.Punctuation, ";"),
    ]

    Lexer.get_default_instance().add_keywords(
        {
            "BACON": sqlparse.tokens.Name.Builtin,
            "SPAM": sqlparse.tokens.Keyword,
            "EGGS": sqlparse.tokens.Keyword,
        }
    )

    tokens = sqlparse.parse(sql)[0]

    # reset the syntax for later tests.
    Lexer.get_default_instance().default_initialization()

    assert list(
        (t.ttype, t.value)
        for t in tokens
        if t.ttype not in sqlparse.tokens.Whitespace
    ) == [
        (sqlparse.tokens.Keyword.DML, "select"),
        (sqlparse.tokens.Wildcard, "*"),
        (sqlparse.tokens.Keyword, "from"),
        (None, "foo"),
        (sqlparse.tokens.Name.Builtin, "BACON"),
        (sqlparse.tokens.Keyword, "SPAM"),
        (sqlparse.tokens.Keyword, "EGGS"),
        (sqlparse.tokens.Punctuation, ";"),
    ]


def test_configurable_regex():
    lex = Lexer.get_default_instance()
    lex.clear()

    my_regex = (r"ZORDER\s+BY\b", sqlparse.tokens.Keyword)

    lex.set_SQL_REGEX(
        keywords.SQL_REGEX[:38]
        + [my_regex]
        + keywords.SQL_REGEX[38:]
    )
    lex.add_keywords(keywords.KEYWORDS_COMMON)
    lex.add_keywords(keywords.KEYWORDS_ORACLE)
    lex.add_keywords(keywords.KEYWORDS_PLPGSQL)
    lex.add_keywords(keywords.KEYWORDS_HQL)
    lex.add_keywords(keywords.KEYWORDS_MSACCESS)
    lex.add_keywords(keywords.KEYWORDS)

    tokens = sqlparse.parse("select * from foo zorder by bar;")[0]

    # reset the syntax for later tests.
    Lexer.get_default_instance().default_initialization()

    assert list(
        (t.ttype, t.value)
        for t in tokens
        if t.ttype not in sqlparse.tokens.Whitespace
    )[4] == (sqlparse.tokens.Keyword, "zorder by")
sqlparse-0.4.4/tests/test_regressions.py
import pytest

import sqlparse
from sqlparse import sql, tokens as T


def test_issue9():
    # make sure where doesn't consume parenthesis
    p = sqlparse.parse('(where 1)')[0]
    assert isinstance(p, sql.Statement)
    assert len(p.tokens) == 1
    assert isinstance(p.tokens[0], sql.Parenthesis)
    prt = p.tokens[0]
    assert len(prt.tokens) == 3
    assert prt.tokens[0].ttype == T.Punctuation
    assert prt.tokens[-1].ttype == T.Punctuation


def test_issue13():
    parsed = sqlparse.parse("select 'one';\n"
                            "select 'two\\'';\n"
                            "select 'three';")
    assert len(parsed) == 3
    assert str(parsed[1]).strip() == "select 'two\\'';"


@pytest.mark.parametrize('s', ['--hello', '-- hello', '--hello\n',
                               '--', '--\n'])
def test_issue26(s):
    # parse stand-alone comments
    p = sqlparse.parse(s)[0]
    assert len(p.tokens) == 1
    assert p.tokens[0].ttype is T.Comment.Single


@pytest.mark.parametrize('value', ['create', 'CREATE'])
def test_issue34(value):
    t = sqlparse.parse("create")[0].token_first()
    assert t.match(T.Keyword.DDL, value) is True


def test_issue35():
    # missing space before LIMIT. Updated for #321
    sql = sqlparse.format("select * from foo where bar = 1 limit 1",
                          reindent=True)
    assert sql == "\n".join([
        "select *",
        "from foo",
        "where bar = 1",
        "limit 1"])


def test_issue38():
    sql = sqlparse.format("SELECT foo; -- comment", strip_comments=True)
    assert sql == "SELECT foo;"
    sql = sqlparse.format("/* foo */", strip_comments=True)
    assert sql == ""


def test_issue39():
    p = sqlparse.parse('select user.id from user')[0]
    assert len(p.tokens) == 7
    idt = p.tokens[2]
    assert idt.__class__ == sql.Identifier
    assert len(idt.tokens) == 3
    assert idt.tokens[0].match(T.Name, 'user') is True
    assert idt.tokens[1].match(T.Punctuation, '.') is True
    assert idt.tokens[2].match(T.Name, 'id') is True


def test_issue40():
    # make sure identifier lists in subselects are grouped
    p = sqlparse.parse('SELECT id, name FROM '
                       '(SELECT id, name FROM bar) as foo')[0]
    assert len(p.tokens) == 7
    assert p.tokens[2].__class__ == sql.IdentifierList
    assert p.tokens[-1].__class__ == sql.Identifier
    assert p.tokens[-1].get_name() == 'foo'
    sp = p.tokens[-1].tokens[0]
    assert sp.tokens[3].__class__ == sql.IdentifierList
    # make sure that formatting works as expected
    s = sqlparse.format('SELECT id ==  name FROM '
                        '(SELECT id, name FROM bar)', reindent=True)
    assert s == '\n'.join([
        'SELECT id == name',
        'FROM',
        '  (SELECT id,',
        '          name',
        '   FROM bar)'])

    s = sqlparse.format('SELECT id ==  name FROM '
                        '(SELECT id, name FROM bar) as foo', reindent=True)
    assert s == '\n'.join([
        'SELECT id == name',
        'FROM',
        '  (SELECT id,',
        '          name',
        '   FROM bar) as foo'])


@pytest.mark.parametrize('s', ['select x.y::text as z from foo',
                               'select x.y::text as "z" from foo',
                               'select x."y"::text as z from foo',
                               'select x."y"::text as "z" from foo',
                               'select "x".y::text as z from foo',
                               'select "x".y::text as "z" from foo',
                               'select "x"."y"::text as z from foo',
                               'select "x"."y"::text as "z" from foo'])
@pytest.mark.parametrize('func_name, result', [('get_name', 'z'),
                                               ('get_real_name', 'y'),
                                               ('get_parent_name', 'x'),
                                               ('get_alias', 'z'),
                                               ('get_typecast', 'text')])
def test_issue78(s, func_name, result):
    # the bug author provided these nice examples, let's use them!
    p = sqlparse.parse(s)[0]
    i = p.tokens[2]
    assert isinstance(i, sql.Identifier)

    func = getattr(i, func_name)
    assert func() == result


def test_issue83():
    sql = """
    CREATE OR REPLACE FUNCTION func_a(text)
      RETURNS boolean LANGUAGE plpgsql STRICT IMMUTABLE AS
    $_$
    BEGIN
     ...
    END;
    $_$;

    CREATE OR REPLACE FUNCTION func_b(text)
      RETURNS boolean LANGUAGE plpgsql STRICT IMMUTABLE AS
    $_$
    BEGIN
     ...
    END;
    $_$;

    ALTER TABLE..... ;"""
    t = sqlparse.split(sql)
    assert len(t) == 3


def test_comment_encoding_when_reindent():
    # There was a UnicodeEncodeError in the reindent filter that
    # cast every comment followed by a keyword to str.
    sql = 'select foo -- Comment containing Ümläuts\nfrom bar'
    formatted = sqlparse.format(sql, reindent=True)
    assert formatted == sql


def test_parse_sql_with_binary():
    # See https://github.com/andialbrecht/sqlparse/pull/88
    # digest = '‚|ËêŠplL4¡h‘øN{'
    digest = '\x82|\xcb\x0e\xea\x8aplL4\xa1h\x91\xf8N{'
    sql = "select * from foo where bar = '{}'".format(digest)
    formatted = sqlparse.format(sql, reindent=True)
    tformatted = "select *\nfrom foo\nwhere bar = '{}'".format(digest)
    assert formatted == tformatted


def test_dont_alias_keywords():
    # The _group_left_right function had a bug where the check for the
    # left side wasn't handled correctly. In one case this resulted in
    # a keyword turning into an identifier.
    p = sqlparse.parse('FROM AS foo')[0]
    assert len(p.tokens) == 5
    assert p.tokens[0].ttype is T.Keyword
    assert p.tokens[2].ttype is T.Keyword


def test_format_accepts_encoding(load_file):
    # issue20
    sql = load_file('test_cp1251.sql', 'cp1251')
    formatted = sqlparse.format(sql, reindent=True, encoding='cp1251')
    tformatted = 'insert into foo\nvalues (1); -- Песня про надежду'

    assert formatted == tformatted


def test_stream(get_stream):
    with get_stream("stream.sql") as stream:
        p = sqlparse.parse(stream)[0]
        assert p.get_type() == 'INSERT'


def test_issue90():
    sql = ('UPDATE "gallery_photo" SET "owner_id" = 4018, "deleted_at" = NULL,'
           ' "width" = NULL, "height" = NULL, "rating_votes" = 0,'
           ' "rating_score" = 0, "thumbnail_width" = NULL,'
           ' "thumbnail_height" = NULL, "price" = 1, "description" = NULL')
    formatted = sqlparse.format(sql, reindent=True)
    tformatted = '\n'.join([
        'UPDATE "gallery_photo"',
        'SET "owner_id" = 4018,',
        '    "deleted_at" = NULL,',
        '    "width" = NULL,',
        '    "height" = NULL,',
        '    "rating_votes" = 0,',
        '    "rating_score" = 0,',
        '    "thumbnail_width" = NULL,',
        '    "thumbnail_height" = NULL,',
        '    "price" = 1,',
        '    "description" = NULL'])
    assert formatted == tformatted


def test_except_formatting():
    sql = 'SELECT 1 FROM foo WHERE 2 = 3 EXCEPT SELECT 2 FROM bar WHERE 1 = 2'
    formatted = sqlparse.format(sql, reindent=True)
    tformatted = '\n'.join([
        'SELECT 1',
        'FROM foo',
        'WHERE 2 = 3',
        'EXCEPT',
        'SELECT 2',
        'FROM bar',
        'WHERE 1 = 2'])
    assert formatted == tformatted


def test_null_with_as():
    sql = 'SELECT NULL AS c1, NULL AS c2 FROM t1'
    formatted = sqlparse.format(sql, reindent=True)
    tformatted = '\n'.join([
        'SELECT NULL AS c1,',
        '       NULL AS c2',
        'FROM t1'])
    assert formatted == tformatted


def test_issue190_open_file(filepath):
    path = filepath('stream.sql')
    with open(path) as stream:
        p = sqlparse.parse(stream)[0]
        assert p.get_type() == 'INSERT'


def test_issue193_splitting_function():
    sql = """
    CREATE FUNCTION a(x VARCHAR(20)) RETURNS VARCHAR(20)
    BEGIN
     DECLARE y VARCHAR(20);
     RETURN x;
    END;
    SELECT * FROM a.b;"""
    statements = sqlparse.split(sql)
    assert len(statements) == 2


def test_issue194_splitting_function():
    sql = """
    CREATE FUNCTION a(x VARCHAR(20)) RETURNS VARCHAR(20)
    BEGIN
     DECLARE y VARCHAR(20);
     IF (1 = 1) THEN
     SET x = y;
     END IF;
     RETURN x;
    END;
    SELECT * FROM a.b;"""
    statements = sqlparse.split(sql)
    assert len(statements) == 2


def test_issue186_get_type():
    sql = "-- comment\ninsert into foo"
    p = sqlparse.parse(sql)[0]
    assert p.get_type() == 'INSERT'


def test_issue212_py2unicode():
    t1 = sql.Token(T.String, 'schöner ')
    t2 = sql.Token(T.String, 'bug')
    token_list = sql.TokenList([t1, t2])
    assert str(token_list) == 'schöner bug'


def test_issue213_leadingws():
    sql = " select * from foo"
    assert sqlparse.format(sql, strip_whitespace=True) == "select * from foo"


def test_issue227_gettype_cte():
    select_stmt = sqlparse.parse('SELECT 1, 2, 3 FROM foo;')
    assert select_stmt[0].get_type() == 'SELECT'
    with_stmt = sqlparse.parse('WITH foo AS (SELECT 1, 2, 3)'
                               'SELECT * FROM foo;')
    assert with_stmt[0].get_type() == 'SELECT'
    with2_stmt = sqlparse.parse("""
        WITH foo AS (SELECT 1 AS abc, 2 AS def),
             bar AS (SELECT * FROM something WHERE x > 1)
        INSERT INTO elsewhere SELECT * FROM foo JOIN bar;""")
    assert with2_stmt[0].get_type() == 'INSERT'


def test_issue207_runaway_format():
    sql = 'select 1 from (select 1 as one, 2 as two, 3 from dual) t0'
    p = sqlparse.format(sql, reindent=True)
    assert p == '\n'.join([
        "select 1",
        "from",
        "  (select 1 as one,",
        "          2 as two,",
        "          3",
        "   from dual) t0"])


def test_token_next_doesnt_ignore_skip_cm():
    sql = '--comment\nselect 1'
    tok = sqlparse.parse(sql)[0].token_next(-1, skip_cm=True)[1]
    assert tok.value == 'select'


@pytest.mark.parametrize('s', [
    'SELECT x AS',
    'AS'
])
def test_issue284_as_grouping(s):
    p = sqlparse.parse(s)[0]
    assert s == str(p)


def test_issue315_utf8_by_default():
    # Make sure the lexer can handle utf-8 string by default correctly
    # digest = '齐天大圣.カラフルな雲.사랑해요'
    # The digest contains Chinese, Japanese and Korean characters
    # All in 'utf-8' encoding.
    digest = (
        '\xe9\xbd\x90\xe5\xa4\xa9\xe5\xa4\xa7\xe5\x9c\xa3.'
        '\xe3\x82\xab\xe3\x83\xa9\xe3\x83\x95\xe3\x83\xab\xe3\x81\xaa\xe9'
        '\x9b\xb2.'
        '\xec\x82\xac\xeb\x9e\x91\xed\x95\xb4\xec\x9a\x94'
    )
    sql = "select * from foo where bar = '{}'".format(digest)
    formatted = sqlparse.format(sql, reindent=True)
    tformatted = "select *\nfrom foo\nwhere bar = '{}'".format(digest)
    assert formatted == tformatted


def test_issue322_concurrently_is_keyword():
    s = 'CREATE INDEX CONCURRENTLY myindex ON mytable(col1);'
    p = sqlparse.parse(s)[0]

    assert len(p.tokens) == 12
    assert p.tokens[0].ttype is T.Keyword.DDL  # CREATE
    assert p.tokens[2].ttype is T.Keyword  # INDEX
    assert p.tokens[4].ttype is T.Keyword  # CONCURRENTLY
    assert p.tokens[4].value == 'CONCURRENTLY'
    assert isinstance(p.tokens[6], sql.Identifier)
    assert p.tokens[6].value == 'myindex'


@pytest.mark.parametrize('s', [
    'SELECT @min_price:=MIN(price), @max_price:=MAX(price) FROM shop;',
    'SELECT @min_price:=MIN(price), @max_price:=MAX(price) FROM shop',
])
def test_issue359_index_error_assignments(s):
    sqlparse.parse(s)
    sqlparse.format(s, strip_comments=True)


def test_issue469_copy_as_psql_command():
    formatted = sqlparse.format(
        '\\copy select * from foo',
        keyword_case='upper', identifier_case='capitalize')
    assert formatted == '\\copy SELECT * FROM Foo'


@pytest.mark.xfail(reason='Needs to be fixed')
def test_issue484_comments_and_newlines():
    formatted = sqlparse.format('\n'.join([
        'Create table myTable',
        '(',
        ' myId TINYINT NOT NULL, --my special comment',
        ' myName VARCHAR2(100) NOT NULL',
        ')']),
        strip_comments=True)
    assert formatted == ('\n'.join([
        'Create table myTable',
        '(',
        ' myId TINYINT NOT NULL,',
        ' myName VARCHAR2(100) NOT NULL',
        ')']))


def test_issue485_split_multi():
    p_sql = '''CREATE OR REPLACE RULE ruled_tab_2rules AS
    ON INSERT TO public.ruled_tab
    DO instead (
    select 1;
    select 2;
    );'''
    assert len(sqlparse.split(p_sql)) == 1


def test_issue489_tzcasts():
    p = sqlparse.parse('select bar at time zone \'UTC\' as foo')[0]
    assert p.tokens[-1].has_alias() is True
    assert p.tokens[-1].get_alias() == 'foo'


def test_issue562_tzcasts():
    # Test that whitespace between 'from' and 'bar' is retained
    formatted = sqlparse.format(
        'SELECT f(HOUR from bar AT TIME ZONE \'UTC\') from foo', reindent=True
    )
    assert formatted == \
        'SELECT f(HOUR\n         from bar AT TIME ZONE \'UTC\')\nfrom foo'


def test_as_in_parentheses_indents():
    # did raise NoneType has no attribute is_group in _process_parentheses
    formatted = sqlparse.format('(as foo)', reindent=True)
    assert formatted == '(as foo)'


def test_format_invalid_where_clause():
    # did raise ValueError
    formatted = sqlparse.format('where, foo', reindent=True)
    assert formatted == 'where, foo'


def test_splitting_at_and_backticks_issue588():
    splitted = sqlparse.split(
        'grant foo to user1@`myhost`; grant bar to user1@`myhost`;')
    assert len(splitted) == 2
    assert splitted[-1] == 'grant bar to user1@`myhost`;'


def test_comment_between_cte_clauses_issue632():
    p, = sqlparse.parse("""
        WITH foo AS (),
             -- A comment before baz subquery
             baz AS ()
        SELECT * FROM baz;""")
    assert p.get_type() == "SELECT"
sqlparse-0.4.4/tests/test_split.py
# Tests splitting functions.

import types
from io import StringIO

import pytest

import sqlparse


def test_split_semicolon():
    sql1 = 'select * from foo;'
    sql2 = "select * from foo where bar = 'foo;bar';"
    stmts = sqlparse.parse(''.join([sql1, sql2]))
    assert len(stmts) == 2
    assert str(stmts[0]) == sql1
    assert str(stmts[1]) == sql2


def test_split_backslash():
    stmts = sqlparse.parse("select '\'; select '\'';")
    assert len(stmts) == 2


@pytest.mark.parametrize('fn', ['function.sql',
                                'function_psql.sql',
                                'function_psql2.sql',
                                'function_psql3.sql',
                                'function_psql4.sql'])
def test_split_create_function(load_file, fn):
    sql = load_file(fn)
    stmts = sqlparse.parse(sql)
    assert len(stmts) == 1
    assert str(stmts[0]) == sql


def test_split_dashcomments(load_file):
    sql = load_file('dashcomment.sql')
    stmts = sqlparse.parse(sql)
    assert len(stmts) == 3
    assert ''.join(str(q) for q in stmts) == sql


@pytest.mark.parametrize('s', ['select foo; -- comment\n',
                               'select foo; -- comment\r',
                               'select foo; -- comment\r\n',
                               'select foo; -- comment'])
def test_split_dashcomments_eol(s):
    stmts = sqlparse.parse(s)
    assert len(stmts) == 1


def test_split_begintag(load_file):
    sql = load_file('begintag.sql')
    stmts = sqlparse.parse(sql)
    assert len(stmts) == 3
    assert ''.join(str(q) for q in stmts) == sql


def test_split_begintag_2(load_file):
    sql = load_file('begintag_2.sql')
    stmts = sqlparse.parse(sql)
    assert len(stmts) == 1
    assert ''.join(str(q) for q in stmts) == sql


def test_split_dropif():
    sql = 'DROP TABLE IF EXISTS FOO;\n\nSELECT * FROM BAR;'
    stmts = sqlparse.parse(sql)
    assert len(stmts) == 2
    assert ''.join(str(q) for q in stmts) == sql


def test_split_comment_with_umlaut():
    sql = ('select * from foo;\n'
           '-- Testing an umlaut: ä\n'
           'select * from bar;')
    stmts = sqlparse.parse(sql)
    assert len(stmts) == 2
    assert ''.join(str(q) for q in stmts) == sql


def test_split_comment_end_of_line():
    sql = ('select * from foo; -- foo\n'
           'select * from bar;')
    stmts = sqlparse.parse(sql)
    assert len(stmts) == 2
    assert ''.join(str(q) for q in stmts) == sql
    # make sure the comment belongs to first query
    assert str(stmts[0]) == 'select * from foo; -- foo\n'


def test_split_casewhen():
    sql = ("SELECT case when val = 1 then 2 else null end as foo;\n"
           "comment on table actor is 'The actor table.';")
    stmts = sqlparse.split(sql)
    assert len(stmts) == 2


def test_split_casewhen_procedure(load_file):
    # see issue580
    stmts = sqlparse.split(load_file('casewhen_procedure.sql'))
    assert len(stmts) == 2


def test_split_cursor_declare():
    sql = ('DECLARE CURSOR "foo" AS SELECT 1;\n'
           'SELECT 2;')
    stmts = sqlparse.split(sql)
    assert len(stmts) == 2


def test_split_if_function():
    # see issue 33
    # don't let IF as a function confuse the splitter
    sql = ('CREATE TEMPORARY TABLE tmp '
           'SELECT IF(a=1, a, b) AS o FROM one; '
           'SELECT t FROM two')
    stmts = sqlparse.split(sql)
    assert len(stmts) == 2


def test_split_stream():
    stream = StringIO("SELECT 1; SELECT 2;")
    stmts = sqlparse.parsestream(stream)
    assert isinstance(stmts, types.GeneratorType)
    assert len(list(stmts)) == 2


def test_split_encoding_parsestream():
    stream = StringIO("SELECT 1; SELECT 2;")
    stmts = list(sqlparse.parsestream(stream))
    assert isinstance(stmts[0].tokens[0].value, str)


def test_split_unicode_parsestream():
    stream = StringIO('SELECT ö')
    stmts = list(sqlparse.parsestream(stream))
    assert str(stmts[0]) == 'SELECT ö'


def test_split_simple():
    stmts = sqlparse.split('select * from foo; select * from bar;')
    assert len(stmts) == 2
    assert stmts[0] == 'select * from foo;'
    assert stmts[1] == 'select * from bar;'


def test_split_ignores_empty_newlines():
    stmts = sqlparse.split('select foo;\nselect bar;\n')
    assert len(stmts) == 2
    assert stmts[0] == 'select foo;'
    assert stmts[1] == 'select bar;'


def test_split_quotes_with_new_line():
    stmts = sqlparse.split('select "foo\nbar"')
    assert len(stmts) == 1
    assert stmts[0] == 'select "foo\nbar"'

    stmts = sqlparse.split("select 'foo\n\bar'")
    assert len(stmts) == 1
    assert stmts[0] == "select 'foo\n\bar'"


def test_split_mysql_handler_for(load_file):
    # see issue581
    stmts = sqlparse.split(load_file('mysql_handler.sql'))
    assert len(stmts) == 2
sqlparse-0.4.4/tests/test_tokenize.py
import types
from io import StringIO

import pytest

import sqlparse
from sqlparse import lexer
from sqlparse import sql, tokens as T


def test_tokenize_simple():
    s = 'select * from foo;'
    stream = lexer.tokenize(s)
    assert isinstance(stream, types.GeneratorType)
    tokens = list(stream)
    assert len(tokens) == 8
    assert len(tokens[0]) == 2
    assert tokens[0] == (T.Keyword.DML, 'select')
    assert tokens[-1] == (T.Punctuation, ';')


def test_tokenize_backticks():
    s = '`foo`.`bar`'
    tokens = list(lexer.tokenize(s))
    assert len(tokens) == 3
    assert tokens[0] == (T.Name, '`foo`')


@pytest.mark.parametrize('s', ['foo\nbar\n', 'foo\rbar\r',
                               'foo\r\nbar\r\n', 'foo\r\nbar\n'])
def test_tokenize_linebreaks(s):
    # issue1
    tokens = lexer.tokenize(s)
    assert ''.join(str(x[1]) for x in tokens) == s


def test_tokenize_inline_keywords():
    # issue 7
    s = "create created_foo"
    tokens = list(lexer.tokenize(s))
    assert len(tokens) == 3
    assert tokens[0][0] == T.Keyword.DDL
    assert tokens[2][0] == T.Name
    assert tokens[2][1] == 'created_foo'
    s = "enddate"
    tokens = list(lexer.tokenize(s))
    assert len(tokens) == 1
    assert tokens[0][0] == T.Name
    s = "join_col"
    tokens = list(lexer.tokenize(s))
    assert len(tokens) == 1
    assert tokens[0][0] == T.Name
    s = "left join_col"
    tokens = list(lexer.tokenize(s))
    assert len(tokens) == 3
    assert tokens[2][0] == T.Name
    assert tokens[2][1] == 'join_col'


def test_tokenize_negative_numbers():
    s = "values(-1)"
    tokens = list(lexer.tokenize(s))
    assert len(tokens) == 4
    assert tokens[2][0] == T.Number.Integer
    assert tokens[2][1] == '-1'


def test_token_str():
    token = sql.Token(None, 'FoO')
    assert str(token) == 'FoO'


def test_token_repr():
    token = sql.Token(T.Keyword, 'foo')
    tst = "