pax_global_header comment=35dfaa0a95e8abb1cdceb8e449f9590905dca439

mistletoe-1.3.0/.github/workflows/python-package.yml

# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python

name: Python package

env:
  package_name: mistletoe

on:
  push:
    branches: [ "master" ]
  pull_request:
    branches: [ "master" ]

jobs:
  build:
    # As long as we want to support Python 3.6-, we need to stick to an older version of Ubuntu,
    # see .
    runs-on: ubuntu-20.04
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.5", "3.6", "3.7", "3.8", "3.9", "3.10", "3.11"]

    steps:
    - uses: actions/checkout@v3
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v4
      with:
        python-version: ${{ matrix.python-version }}
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        python -m pip install flake8 pytest parameterized
        if [ -f requirements.txt ]; then python -m pip install -r requirements.txt; fi
    - name: Lint with flake8
      run: |
        # See https://www.flake8rules.com for the list of the rules.
        # stop the build if there are Python syntax errors or undefined names
        flake8 . --count --select=E9,F63,F7,F82,W605 --show-source --statistics
        # exit-zero treats all errors as warnings
        flake8 . --count --exit-zero --statistics
    - name: Test with pytest
      id: unit_tests
      run: |
        pytest
    - name: Test CommonMark compliance
      if: ${{ success() || steps.unit_tests.conclusion == 'failure' }}
      run: |
        python -m test.specification --ignore-known

  coverage:
    needs: build
    runs-on: ubuntu-latest
    env:
      python_version: 3.11

    steps:
    - uses: actions/checkout@v3
    - name: Set up Python ${{ env.python_version }}
      uses: actions/setup-python@v4
      with:
        python-version: ${{ env.python_version }}
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        python -m pip install pytest parameterized
        # note: the following also installs "coverage"
        python -m pip install coveralls
        if [ -f requirements.txt ]; then python -m pip install -r requirements.txt; fi
    - name: Get coverage report
      run: |
        coverage run --source=${package_name} --append -m pytest
        coverage run --source=${package_name} --append -m test.specification --ignore-known
        # quick local report output to console:
        coverage report
    - name: Upload report to Coveralls
      # documentation for GitHub setup: https://coveralls-python.readthedocs.io/en/latest/usage/configuration.html#github-actions-support
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      run: |
        coveralls --service=github

mistletoe-1.3.0/.gitignore

__pycache__/
*.swp
out.html
venv/
.coverage
htmlcov
.pylintrc
*.egg-info
dist
build/
.DS_Store
*.vim
.vscode/

mistletoe-1.3.0/CONTRIBUTING.md

Contributing

You've seen mistletoe: it branches off in all directions, bringing people together. We would love to see what you can make of mistletoe, and which direction you would take it in. Or maybe you can discover some [Nargles][nargles], which, by the way, totally exist.

The following instructions serve as guidelines, and you should use your best judgement when employing them.

## Getting started

Refer to the [README][readme] for install instructions. Since you're going to mess with the code, it's preferred that you clone the repo directly.

## Things you can do

### Introducing new features

It is suggested that you **open an issue first** before working on new features. Include your reasons, use case, and maybe plans for implementation. That way, we have a better idea of what you'll be working on, and can hopefully avoid collision. Your pull request may also get merged much faster.

There is a contrib directory (and Python package) for software that has been contributed to the project, but which isn't maintained by the core developers. This is a good place to put things like renderers for new formats.

### Fixing bugs

Before you post an issue, try narrowing the problem down to the smallest component possible. For example, if an `InlineCode` token is not parsed correctly, include only the paragraph that introduces the error, not the entire document.

You might find mistletoe's interactive mode handy when tracking down bugs. Type in your input, and you immediately see how mistletoe handles it. I created it just for this purpose. To use it, run `mistletoe` (or `python3 mistletoe`) in your shell without arguments.

Markdown is a very finicky document format to parse, so if something does not work as intended, it's probably my fault and not yours.

### Writing documentation

The creator might not be the best person to write documentation; the users, knowing all the pain points, have a better idea of actual use cases and possible things that can go wrong.
Write docstrings or comments for functions that are missing them. mistletoe generally follows the [Google Python Style Guide][style-guide] to format comments.

## Writing code

### Atomic commits

* minimal cosmetic changes are fine to mix in with your commits, but try to feel guilty when you do that, and if it's not too big of a hassle, break them into two commits.
* similarly, if a pull request contains bigger, independent areas of change, it may be a good idea to split it into multiple pull requests.

### Commit messages

* give clear, instructive commit messages. [Conventional Commits][conv-commits] is the preferred way to structure a commit message.
* here is an example commit message for a fix that references a numbered issue: `fix: avoid infinite loop when parsing specific Footnotes (#124)`.
* find 5 minutes of your time to add important non-obvious details to the message body, like WHY or HOW. This can tremendously reduce the time necessary to investigate future issues, and helps newcomers get a better understanding of the project code. (Yet, this should not serve as a replacement for proper documentation or inline comments.)

### Style guide

Here's the obligatory [PEP8][pep-8] link, but here's a much shorter list of things to be aware of:

* mistletoe uses `CamelCase` for class names, `snake_case` for functions and methods;
* mistletoe follows the eighty-character rule: if you find your line to be too lengthy, try giving variable names to expressions, and break it up that way. That said, it's okay to go over the character limit occasionally.
* mistletoe uses four spaces instead of a tab to indent. For vim users, include `set ts=4 sw=4 ai et` in your `.vimrc`.
* recommended Python tooling:
  * [Black][black-formatter] as the code formatter
  * [flake8][flake8] as the linter (style checker)

Apart from that, stay consistent with the coding style around you.
But don't get bogged down by this: if you have a genius idea, I'd love to clean up for you; write down your genius idea first.

## Get in touch

I tweet [@mi_before_yu][twitter]. Also yell at me over [email][email].

[nargles]: http://harrypotter.wikia.com/wiki/Nargle
[readme]: README.md
[wiki]: https://github.com/miyuchina/mistletoe/wiki
[style-guide]: https://google.github.io/styleguide/pyguide.html
[pep-8]: https://www.python.org/dev/peps/pep-0008/
[twitter]: https://twitter.com/mi_before_yu
[email]: mailto:hello@afteryu.me
[conv-commits]: https://www.conventionalcommits.org/
[black-formatter]: https://black.readthedocs.io/
[flake8]: https://flake8.pycqa.org/

mistletoe-1.3.0/LICENSE

The MIT License

Copyright 2017 Mi Yu

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
mistletoe-1.3.0/MANIFEST.in

include LICENSE

mistletoe-1.3.0/README.md

mistletoe

[![Build Status][build-badge]][github-actions] [![Coverage Status][cover-badge]][coveralls] [![PyPI][pypi-badge]][pypi] [![is wheel][wheel-badge]][pypi]

mistletoe is a Markdown parser in pure Python, designed to be fast, spec-compliant and fully customizable.

Apart from being the fastest CommonMark-compliant Markdown parser implementation in pure Python, mistletoe also supports easy definitions of custom tokens. Parsing Markdown into an abstract syntax tree also allows us to swap out renderers for different output formats, without touching any of the core components.

Remember to spell mistletoe in lowercase!

Features
--------

* **Fast**: mistletoe is the fastest implementation of CommonMark in Python. See the [performance][performance] section for details.
* **Spec-compliant**: CommonMark is [a useful, high-quality project][oilshell]. mistletoe follows the [CommonMark specification][commonmark] to resolve ambiguities during parsing. Outputs are predictable and well-defined.
* **Extensible**: Strikethrough and tables are supported natively, and custom block-level and span-level tokens can easily be added. Writing a new renderer for mistletoe is a relatively trivial task. You can even write [a Lisp][scheme] in it.
Output formats
--------------

Renderers for the following "core" output formats exist within the mistletoe main package:

* HTML
* LaTeX
* AST (Abstract Syntax Tree; handy for debugging the parsing process)
* Markdown (can be used to reflow the text, or make other types of automated changes to Markdown documents)

Renderers for the following output formats can be found in the [contrib][contrib] package:

* HTML with MathJax (_mathjax.py_)
* HTML with code highlighting (using Pygments) (_pygments\_renderer.py_)
* HTML with TOC (for programmatic use) (_toc\_renderer.py_)
* HTML with support for GitHub wiki links (_github\_wiki.py_)
* Jira Markdown (_jira\_renderer.py_)
* XWiki Syntax (_xwiki20\_renderer.py_)
* Scheme (_scheme.py_)

Installation
------------

mistletoe is tested for Python 3.5 and above. Install mistletoe with pip:

```sh
pip3 install mistletoe
```

Alternatively, clone the repo:

```sh
git clone https://github.com/miyuchina/mistletoe.git
cd mistletoe
pip3 install -e .
```

This installs mistletoe in "editable" mode (because of the `-e` option). That means that any changes made to the source code become visible immediately; Python only creates a link to the specified directory (`.`) instead of copying the files to the standard packages folder.

See the [contributing][contributing] doc for how to contribute to mistletoe.

Usage
-----

### Usage from Python

Here's how you can use mistletoe in a Python script:

```python
import mistletoe

with open('foo.md', 'r') as fin:
    rendered = mistletoe.markdown(fin)
```

`mistletoe.markdown()` uses mistletoe's default settings: allowing HTML mixins and rendering to HTML. The function also accepts an additional argument, `renderer`.
To produce LaTeX output:

```python
import mistletoe
from mistletoe.latex_renderer import LaTeXRenderer

with open('foo.md', 'r') as fin:
    rendered = mistletoe.markdown(fin, LaTeXRenderer)
```

To reflow the text in a Markdown document with a max line length of 20 characters:

```python
import mistletoe
from mistletoe.markdown_renderer import MarkdownRenderer

with open('dev-guide.md', 'r') as fin:
    with MarkdownRenderer(max_line_length=20) as renderer:
        print(renderer.render(mistletoe.Document(fin)))
```

Finally, here's how you would manually specify extra tokens via a renderer. In the following example, we use `HtmlRenderer` to render the AST. The renderer itself adds `HtmlBlock` and `HtmlSpan` tokens to the parsing process. The result should be equal to the output obtained from the first example above.

```python
from mistletoe import Document, HtmlRenderer

with open('foo.md', 'r') as fin:
    with HtmlRenderer() as renderer:  # or: `with HtmlRenderer(AnotherToken1, AnotherToken2) as renderer:`
        doc = Document(fin)  # parse the lines into AST
        rendered = renderer.render(doc)  # render the AST
        # internal lists of tokens to be parsed are automatically reset when exiting this `with` block
```

**Important**: As can be seen from the example above, the parsing phase is currently tightly connected with initiation and closing of a renderer. Therefore, you should never call `Document(...)` outside of a `with ... as renderer` block, unless you know what you are doing.

### Usage from command-line

pip installation enables mistletoe's command-line utility. Type the following directly into your shell:

```sh
mistletoe foo.md
```

This will transpile `foo.md` into HTML, and dump the output to stdout. To save the HTML, direct the output into a file:

```sh
mistletoe foo.md > out.html
```

You can use a different renderer by including the full path to the renderer class after a `-r` or `--renderer` flag.
For example, to transpile into LaTeX:

```sh
mistletoe foo.md --renderer mistletoe.latex_renderer.LaTeXRenderer
```

and similarly for a renderer in the contrib package:

```sh
mistletoe foo.md --renderer mistletoe.contrib.jira_renderer.JiraRenderer
```

### mistletoe interactive mode

Running `mistletoe` without specifying a file will land you in interactive mode. Like Python's REPL, interactive mode allows you to test how your Markdown will be interpreted by mistletoe:

```html
mistletoe [version 0.7.2] (interactive)
Type Ctrl-D to complete input, or Ctrl-C to exit.
>>> some **bold** text
... and some *italics*
...
<p>some <strong>bold</strong> text and some <em>italics</em></p>
>>>
```

The interactive mode also accepts the `--renderer` flag:

```latex
mistletoe [version 0.7.2] (interactive)
Type Ctrl-D to complete input, or Ctrl-C to exit.
Using renderer: LaTeXRenderer
>>> some **bold** text
... and some *italics*
...
\documentclass{article}
\begin{document}
some \textbf{bold} text and some \textit{italics}
\end{document}
>>>
```

Who uses mistletoe?
-------------------

mistletoe is used by projects with various target audiences. You can find some concrete projects in the "Used by" section on [Libraries.io][libraries-mistletoe], but this is definitely not a complete list. Also, a list of [Dependents][github-dependents] is tracked by GitHub directly.

### Run mistletoe from CopyQ

One notable example is running mistletoe as a Markdown converter from the advanced clipboard manager called [CopyQ][copyq]. One just needs to install the [Convert Markdown to ...][copyq-convert-md] custom script command and then run this command on any selected Markdown text.

Why mistletoe?
--------------

"For fun," says David Beazley.

Further reading
---------------

* [Performance][performance]
* [Developer's Guide](dev-guide.md)

Copyright & License
-------------------

* mistletoe's logo uses artwork by [Freepik][icon], under [CC BY 3.0][cc-by].
* mistletoe is released under [MIT][license].
[build-badge]: https://img.shields.io/github/actions/workflow/status/miyuchina/mistletoe/python-package.yml?style=flat-square
[cover-badge]: https://img.shields.io/coveralls/miyuchina/mistletoe.svg?style=flat-square
[pypi-badge]: https://img.shields.io/pypi/v/mistletoe.svg?style=flat-square
[wheel-badge]: https://img.shields.io/pypi/wheel/mistletoe.svg?style=flat-square
[github-actions]: https://github.com/miyuchina/mistletoe/actions/workflows/python-package.yml
[coveralls]: https://coveralls.io/github/miyuchina/mistletoe?branch=master
[pypi]: https://pypi.python.org/pypi/mistletoe
[mistune]: https://github.com/lepture/mistune
[python-markdown]: https://github.com/waylan/Python-Markdown
[python-markdown2]: https://github.com/trentm/python-markdown2
[commonmark-py]: https://github.com/rtfd/CommonMark-py
[performance]: performance.md
[oilshell]: https://www.oilshell.org/blog/2018/02/14.html
[commonmark]: https://spec.commonmark.org/
[contrib]: https://github.com/miyuchina/mistletoe/tree/master/mistletoe/contrib
[scheme]: https://github.com/miyuchina/mistletoe/blob/master/mistletoe/contrib/scheme.py
[contributing]: CONTRIBUTING.md
[icon]: https://www.freepik.com
[cc-by]: https://creativecommons.org/licenses/by/3.0/us/
[license]: LICENSE
[pythonpath]: https://stackoverflow.com/questions/16107526/how-to-flexibly-change-pythonpath
[libraries-mistletoe]: https://libraries.io/pypi/mistletoe
[copyq]: https://hluk.github.io/CopyQ/
[copyq-convert-md]: https://github.com/hluk/copyq-commands/tree/master/Global#convert-markdown-to-
[github-dependents]: https://github.com/miyuchina/mistletoe/network/dependents

mistletoe-1.3.0/cutting-a-release.md

Cutting a Release

For maintainers, here are the basic steps when creating a new release of mistletoe.

* set a release version & commit [chore: version ](https://github.com/miyuchina/mistletoe/commit/72e35ff22e823083915ed0327c5f479afec539fa)
* publish artifacts of the release
  * official documentation: [Packaging Python Projects](https://packaging.python.org/en/latest/tutorials/packaging-projects/)
  * install / upgrade the build tool: `$ python -m pip install --upgrade build`
  * make sure there are no old leftovers in the local "dist" folder
  * build the Wheel artifact ("dist/*.whl"): `$ python -m build`
  * upload the distribution archives to PyPI:
    * install / upgrade Twine: `$ python -m pip install --upgrade twine`
    * if unsure, upload to the test PyPI and/or test locally - see [docs](https://packaging.python.org/en/latest/tutorials/packaging-projects/#uploading-the-distribution-archives)
    * do the upload to PyPI: `$ python -m twine upload dist/*`
    * for the username, use `__token__`; for the password, use your token from PyPI (see the docs again for how to do that)
  * check that you can install locally what you uploaded: `$ python -m pip install mistletoe`
* [create the release in GitHub](https://github.com/miyuchina/mistletoe/releases/new)
  * attach the "dist/*.whl" from the previous step to the release (drag & drop) (source code archives are attached automatically)
  * publish the release, let GitHub create a new Git tag automatically
* commit [chore: next dev version](https://github.com/miyuchina/mistletoe/commit/d91f21a487b72529a584b8958bffaed864dd67d7)

mistletoe-1.3.0/dev-guide.md

Developer's Guide

This document describes usage of mistletoe and its API from the developer's point of view.

Understanding the AST and the tokens
------------------------------------

When a markdown document gets parsed by mistletoe, the result is represented as an _abstract syntax tree (AST)_, stored in an instance of `Document`. This object contains a hierarchy of all the various _tokens_ which were recognized during the parsing process, for example, `Paragraph`, `Heading`, and `RawText`.

The tokens which represent a line or a block of lines in the input markdown are called _block tokens_. Examples include `List`, `Paragraph`, `ThematicBreak`, and also the `Document` itself.

The tokens which represent the actual content within a block are called _span tokens_, or, with CommonMark terminology, _inline tokens_. In this category you will find tokens like `RawText`, `Link`, and `Emphasis`.

Block tokens may have block tokens, span tokens, or no tokens at all as children in the AST; this depends on the type of token. Span tokens may *only* have span tokens as children.
In order to see what exactly gets parsed, one can simply use the `AstRenderer` on a given markdown input, for example:

```sh
mistletoe text.md --renderer mistletoe.ast_renderer.AstRenderer
```

Say that the input file contains for example:

```markdown
# Heading 1

text

# Heading 2

[link](https://www.example.com)
```

Then we will get this JSON output from the AST renderer:

```json
{
  "type": "Document",
  "footnotes": {},
  "line_number": 1,
  "children": [
    {
      "type": "Heading",
      "line_number": 1,
      "level": 1,
      "children": [
        {
          "type": "RawText",
          "content": "Heading 1"
        }
      ]
    },
    {
      "type": "Paragraph",
      "line_number": 3,
      "children": [
        {
          "type": "RawText",
          "content": "text"
        }
      ]
    },
    {
      "type": "Heading",
      "line_number": 5,
      "level": 1,
      "children": [
        {
          "type": "RawText",
          "content": "Heading 2"
        }
      ]
    },
    {
      "type": "Paragraph",
      "line_number": 7,
      "children": [
        {
          "type": "Link",
          "target": "https://www.example.com",
          "title": "",
          "children": [
            {
              "type": "RawText",
              "content": "link"
            }
          ]
        }
      ]
    }
  ]
}
```

### Line numbers

mistletoe records the starting line of all block tokens that it encounters during parsing and stores it as the `line_number` attribute of each token. (This feature is not available for span tokens yet.)

Rendering
---------

Sometimes all you need is the information from the AST. But more often, you'll want to take that information and turn it into some other format like HTML. This is called _rendering_. mistletoe provides a set of built-in renderers for different formats, and it's also possible to define your own renderer.

When passing an AST to a renderer, the tree is recursively traversed and methods corresponding to individual token types get called on the renderer in order to create the output in the desired format.

Creating a custom token and renderer
------------------------------------

Here's an example of how to add GitHub-style wiki links to the parsing process, and provide a renderer for this new token.
### A new token

GitHub wiki links are span-level tokens, meaning that they reside inline, and don't really look like chunky paragraphs. To write a new span-level token, all we need to do is make a subclass of `SpanToken`:

```python
from mistletoe.span_token import SpanToken

class GithubWiki(SpanToken):
    pass
```

mistletoe uses regular expressions to search for span-level tokens in the parsing process. As a refresher, GitHub wiki looks something like this: `[[alternative text | target]]`. We define a class variable, `pattern`, that stores the compiled regex:

```python
class GithubWiki(SpanToken):
    pattern = re.compile(r"\[\[ *(.+?) *\| *(.+?) *\]\]")

    def __init__(self, match):
        pass
```

The regex will be picked up by `SpanToken.find`, which is used by the tokenizer to find all tokens of its kind in the document. If regexes are too limited for your use case, consider overriding the `find` method; it should return a list of all token occurrences.

Three other class variables are available for our custom token class, and their default values are shown below:

```python
class SpanToken:
    parse_group = 1
    parse_inner = True
    precedence = 5
```

Note that alternative text can also contain other span-level tokens. For example, `[[*alt*|link]]` is a GitHub link with an `Emphasis` token as its child. To parse child tokens, `parse_inner` should be set to `True` (the default value in this case), and `parse_group` should correspond to the match group in which child tokens might occur (also the default value, 1, in this case). Once these two class variables are set correctly, the `GithubWiki.children` attribute will automatically be set to the list of child tokens. Note that there is no need to manually set this attribute, unlike previous versions of mistletoe.

Lastly, the `SpanToken` constructor takes a regex match object as its argument. We can simply store off the `target` attribute from `match_obj.group(2)`.
```python
from mistletoe.span_token import SpanToken

class GithubWiki(SpanToken):
    pattern = re.compile(r"\[\[ *(.+?) *\| *(.+?) *\]\]")

    def __init__(self, match_obj):
        self.target = match_obj.group(2)
```

There you go: a new token in 5 lines of code.

### Side note about precedence

Normally there is no need to override the `precedence` value of a custom token. The default value is the same as `InlineCode`, `AutoLink` and `HtmlSpan`, which means that whichever token comes first will be parsed. In our case:

```markdown
`code with [[ text` | link ]]
```

... will be parsed as:

```html
<p><code>code with [[ text</code> | link ]]</p>
```

If we set `GithubWiki.precedence = 6`, we have:

```html
<p>`code with <a href="link">text`</a></p>
```

### A new renderer

Adding a custom token to the parsing process usually involves a lot of nasty implementation details. Fortunately, mistletoe takes care of most of them for you. Simply passing your custom token class to `super().__init__()` does the trick:

```python
from mistletoe.html_renderer import HtmlRenderer

class GithubWikiRenderer(HtmlRenderer):
    def __init__(self):
        super().__init__(GithubWiki)
```

We then only need to tell mistletoe how to render our new token:

```python
def render_github_wiki(self, token):
    template = '<a href="{target}">{inner}</a>'
    target = token.target
    inner = self.render_inner(token)
    return template.format(target=target, inner=inner)
```

Cleaning up, we have our new renderer class:

```python
from mistletoe.html_renderer import HtmlRenderer, escape_url

class GithubWikiRenderer(HtmlRenderer):
    def __init__(self):
        super().__init__(GithubWiki)

    def render_github_wiki(self, token):
        template = '<a href="{target}">{inner}</a>'
        target = escape_url(token.target)
        inner = self.render_inner(token)
        return template.format(target=target, inner=inner)
```

### Take it for a spin?

It is preferred that all of mistletoe's renderers be used as context managers. This is to ensure that your custom tokens are cleaned up properly, so that you can parse other Markdown documents with different token types in the same program.
```python
from mistletoe import Document
from contrib.github_wiki import GithubWikiRenderer

with open('foo.md', 'r') as fin:
    with GithubWikiRenderer() as renderer:
        rendered = renderer.render(Document(fin))
```

For more info, take a look at the `base_renderer` module in mistletoe. The docstrings might give you a more granular idea of customizing mistletoe to your needs.

Markdown to Markdown parsing-and-rendering
------------------------------------------

Suppose you have some Markdown that you want to process and then output as Markdown again. Thanks to the text-like nature of Markdown, it is often possible to do this with text search-and-replace tools... but not always. For example, if you want to replace a text fragment in the plain text, but not in the embedded code samples, then the search-and-replace approach won't work.

In this case you can use mistletoe's `MarkdownRenderer`:

1. Parse Markdown to an AST (usually held in a `Document` token).
2. Make modifications to the AST.
3. Render back to Markdown using `MarkdownRenderer.render()`.

Here is an example of how you can replace text in selected parts of the AST:

```python
import mistletoe
from mistletoe.block_token import BlockToken, Heading, Paragraph, SetextHeading
from mistletoe.markdown_renderer import MarkdownRenderer
from mistletoe.span_token import InlineCode, RawText, SpanToken

def update_text(token: SpanToken):
    """Update the text contents of a span token and its children.
    `InlineCode` tokens are left unchanged."""
    if isinstance(token, RawText):
        token.content = token.content.replace("mistletoe", "The Amazing mistletoe")

    if not isinstance(token, InlineCode) and hasattr(token, "children"):
        for child in token.children:
            update_text(child)

def update_block(token: BlockToken):
    """Update the text contents of paragraphs and headings within this block,
    and recursively within its children."""
    if isinstance(token, (Paragraph, SetextHeading, Heading)):
        for child in token.children:
            update_text(child)

    for child in token.children:
        if isinstance(child, BlockToken):
            update_block(child)

with open("README.md", "r") as fin:
    with MarkdownRenderer() as renderer:
        document = mistletoe.Document(fin)
        update_block(document)
        md = renderer.render(document)
        print(md)
```

The `MarkdownRenderer` can also reflow the text in the document to a given maximum line length. And it can do so while preserving the formatting of code blocks and other tokens where line breaks matter. To use this feature, specify a `max_line_length` parameter in the call to the `MarkdownRenderer` constructor.

mistletoe-1.3.0/docs/

mistletoe-1.3.0/docs/CNAME

mistletoe.afteryu.me

mistletoe-1.3.0/docs/README.md

DEPRECATED: This folder used to be the source for project pages. It is not used anymore though. This folder's content is updated by running `make docs`.
mistletoe-1.3.0/docs/__init__.py

from mistletoe import Document, HtmlRenderer, __version__

INCLUDE = {'README.md': 'index.html', 'CONTRIBUTING.md': 'contributing.html'}
METADATA = """ mistletoe{} """

class DocRenderer(HtmlRenderer):
    def render_link(self, token):
        return super().render_link(self._replace_link(token))

    def render_document(self, token, name="README.md"):
        pattern = "{}{}"
        self.footnotes.update(token.footnotes)
        for filename, new_link in getattr(self, 'files', {}).items():
            for k, v in self.footnotes.items():
                if v == filename:
                    self.footnotes[k] = new_link
        subtitle = ' | {}'.format('version ' + __version__ if name == 'README.md' else name.split('.')[0].lower())
        return pattern.format(METADATA.format(subtitle), self.render_inner(token))

    def _replace_link(self, token):
        token.target = getattr(self, 'files', {}).get(token.target, token.target)
        return token

def build(files=None):
    files = files or INCLUDE
    for f in files:
        with open(f, 'r', encoding='utf-8') as fin:
            rendered_file = 'docs/' + files[f]
            with open(rendered_file, 'w+', encoding='utf-8') as fout:
                with DocRenderer() as renderer:
                    renderer.files = files
                    print(renderer.render_document(Document(fin), f), file=fout)

mistletoe-1.3.0/docs/__main__.py

import sys
from docs import build

build(sys.argv[1:] if len(sys.argv) > 1 else None)

mistletoe-1.3.0/docs/contributing.html

mistletoe | contributing

Contributing

You've seen mistletoe: it branches off in all directions, bringing people together. We would love to see what you can make of mistletoe, and which direction you would take it in. Or maybe you can discover some Nargles, which, by the way, totally exist.

The following instructions serve as guidelines, and you should use your best judgement when employing them.

Getting started

Refer to the README for install instructions. Since you're going to mess with the code, it's preferred that you clone the repo directly.

Check back on the dev branch regularly to avoid redoing work that others might have done. The master branch is updated only when features on the dev branch are stabilized somewhat.

Things you can do

Introducing new features

It is suggested that you open an issue first before working on new features. Include your reasons, use case, and maybe plans for implementation. That way, we have a better idea of what you'll be working on, and can hopefully avoid collision. Your pull request may also get merged much faster.

Fixing bugs

Before you post an issue, try narrowing the problem down to the smallest component possible. For example, if an InlineCode token is not parsed correctly, include only the paragraph that introduces the error, not the entire document.

You might find mistletoe's interactive mode handy when tracking down bugs. Type in your input, and you immediately see how mistletoe handles it. I created it just for this purpose. To use it, run mistletoe (or python3 mistletoe) in your shell without arguments.

Markdown is a very finicky document format to parse, so if something does not work as intended, it's probably my fault and not yours.

Writing documentation

The creator might not be the best person to write documentation; the users, who know all the pain points, have a better idea of actual use cases and the things that can go wrong.

Go to the mistletoe wiki and write up your own topic. Alternatively, write docstrings or comments for functions that are missing them. mistletoe generally follows the Google Python Style Guide to format comments.

Writing code

Commit messages

| Emoji | Description |
| :---: | :------------------------------ |
| 📚 | Update documentation. |
| 🐎 | Performance improvements. |
| 💡 | New features. |
| 🐛 | Bug fixes. |
| 🚨 | Under construction. |
| ☕️ | Refactoring / cosmetic changes. |
| 🌎 | Internationalization. |

Style guide

Here's the obligatory PEP8 link, but here's a much shorter list of things to be aware of:

Apart from that, stay consistent with the coding style around you. But don't get bogged down by this: if you have a genius idea, I'd love to clean up after you; write down your genius idea first.

Get in touch

I tweet @mi_before_yu. Also yell at me over email.

mistletoe-1.3.0/docs/index.html000066400000000000000000000316651455324047100164650ustar00rootroot00000000000000 mistletoe | version 0.5.2

mistletoe


mistletoe is a Markdown parser in pure Python, designed to be fast, modular and fully customizable.

mistletoe is not simply a Markdown-to-HTML transpiler. It is designed, from the start, to parse Markdown into an abstract syntax tree. You can swap out renderers for different output formats, without touching any of the core components.

Remember to spell mistletoe in lowercase!

Features

Installation

mistletoe requires Python 3.3 and above, including Python 3.7, the current development branch. It is also tested on PyPy 5.8.0. Install mistletoe with pip:

pip3 install mistletoe

Alternatively, clone the repo:

git clone https://github.com/miyuchina/mistletoe.git
cd mistletoe
pip3 install -e .

See the contributing doc for how to contribute to mistletoe.

Usage

Basic usage

Here's how you can use mistletoe in a Python script:

import mistletoe

with open('foo.md', 'r') as fin:
    rendered = mistletoe.markdown(fin)

mistletoe.markdown() uses mistletoe's default settings: allowing HTML mixins and rendering to HTML. The function also accepts an additional argument renderer. To produce LaTeX output:

import mistletoe
from mistletoe.latex_renderer import LaTeXRenderer

with open('foo.md', 'r') as fin:
    rendered = mistletoe.markdown(fin, LaTeXRenderer)

Finally, here's how you would manually specify extra tokens and a renderer for mistletoe. In the following example, we use HtmlRenderer to render the AST, which adds HtmlBlock and HtmlSpan to the normal parsing process.

from mistletoe import Document, HtmlRenderer

with open('foo.md', 'r') as fin:
    with HtmlRenderer() as renderer:
        rendered = renderer.render(Document(fin))

From the command-line

pip installation enables mistletoe's command-line utility. Type the following directly into your shell:

mistletoe foo.md

This will transpile foo.md into HTML, and dump the output to stdout. To save the HTML, direct the output into a file:

mistletoe foo.md > out.html

You can pass in custom renderers by including the full path to your renderer class after a -r or --renderer flag:

mistletoe foo.md --renderer custom_renderer.CustomRenderer

Running mistletoe without specifying a file will land you in interactive mode. Like Python's REPL, interactive mode allows you to test how your Markdown will be interpreted by mistletoe:

mistletoe [version 0.5.2] (interactive)
Type Ctrl-D to complete input, or Ctrl-C to exit.
>>> some **bold text**
... and some *italics*
... ^D
<html>
<body>
<p>some <strong>bold text</strong> and some <em>italics</em></p>
</body>
</html>
>>>

The interactive mode also accepts the --renderer flag.

Performance

mistletoe is the fastest Markdown parser implementation available in pure Python; that is, on par with mistune. Try the benchmarks yourself by running:

python3 test/benchmark.py

One of the significant bottlenecks of mistletoe compared to mistune, however, is function call overhead. Because mistletoe, unlike mistune, splits its functionality into modules, function lookups can take significantly longer than in mistune.

To boost the performance further, it is suggested to use PyPy with mistletoe. Benchmark results show that on PyPy, mistletoe is about twice as fast as mistune:

$ pypy3 test/benchmark.py mistune mistletoe
Test document: test/samples/syntax.md
Test iterations: 1000
Running tests with mistune, mistletoe...
========================================
mistune: 13.524028996936977
mistletoe: 6.477352762129158

The above result was achieved on PyPy 5.8.0-beta0, on a 13-inch Retina MacBook Pro (Early 2015).

Developer's Guide

Here's an example to add GitHub-style wiki links to the parsing process, and provide a renderer for this new token.

A new token

GitHub wiki links are span-level tokens, meaning that they reside inline, and don't really look like chunky paragraphs. To write a new span-level token, all we need to do is make a subclass of SpanToken:

from mistletoe.span_token import SpanToken

class GithubWiki(SpanToken):
    pass

mistletoe uses regular expressions to search for span-level tokens in the parsing process. As a refresher, GitHub wiki looks something like this: [[alternative text | target]]. We define a class variable, pattern, that stores the compiled regex:

class GithubWiki(SpanToken):
    pattern = re.compile(r"\[\[ *(.+?) *\| *(.+?) *\]\]")
    def __init__(self, match_obj):
        pass

For spiritual guidance on regexes, refer to xkcd classics. For an actual representation of this author parsing Markdown with regexes, refer to this brilliant meme by Greg Hendershott.

mistletoe's span-level tokenizer will search for our pattern. When it finds a match, it will pass the match object as an argument to our constructor. We have defined our regex so that the first match group is the alternative text, and the second one is the link target.

Note that alternative text can also contain other span-level tokens. For example, [[*alt*|link]] is a GitHub link with an Emphasis token as its child. To parse child tokens, simply pass match_obj to the super constructor (which assumes children to be in match_obj.group(1)), and save off all the additional attributes we need:

from mistletoe.span_token import SpanToken

class GithubWiki(SpanToken):
    pattern = re.compile(r"\[\[ *(.+?) *\| *(.+?) *\]\]")
    def __init__(self, match_obj):
        super().__init__(match_obj)
        self.target = match_obj.group(2)

There you go: a new token in 7 lines of code.

A new renderer

Adding a custom token to the parsing process usually involves a lot of nasty implementation details. Fortunately, mistletoe takes care of most of them for you. Simply passing your custom token class to super().__init__() does the trick:

from mistletoe.html_renderer import HtmlRenderer

class GithubWikiRenderer(HtmlRenderer):
    def __init__(self):
        super().__init__(GithubWiki)

We then only need to tell mistletoe how to render our new token:

def render_github_wiki(self, token):
    template = '<a href="{target}">{inner}</a>'
    target = token.target
    inner = self.render_inner(token)
    return template.format(target=target, inner=inner)

Cleaning up, we have our new renderer class:

from mistletoe.html_renderer import HtmlRenderer, escape_url

class GithubWikiRenderer(HtmlRenderer):
    def __init__(self):
        super().__init__(GithubWiki)

    def render_github_wiki(self, token):
        template = '<a href="{target}">{inner}</a>'
        target = escape_url(token.target)
        inner = self.render_inner(token)
        return template.format(target=target, inner=inner)

Take it for a spin?

It is preferred that all mistletoe's renderers be used as context managers. This is to ensure that your custom tokens are cleaned up properly, so that you can parse other Markdown documents with different token types in the same program.

from mistletoe import Document
from contrib.github_wiki import GithubWikiRenderer

with open('foo.md', 'r') as fin:
    with GithubWikiRenderer() as renderer:
        rendered = renderer.render(Document(fin))

For more info, take a look at the base_renderer module in mistletoe. The docstrings might give you a more granular idea of customizing mistletoe to your needs.

Why mistletoe?

For me, the question becomes: why not mistune? My original motivation really has nothing to do with starting a competition. Here's a list of reasons I created mistletoe in the first place:

Here are two things mistune inspired mistletoe to do:

Here are two things mistletoe does differently from mistune:

The implications of these are quite profound, and there's no definite this-is-better-than-that answer. Mistune is near perfect if one wants what it provides: I have used mistune extensively in the past, and had a great experience. If you want more control, however, give mistletoe a try.

Copyright & License

mistletoe-1.3.0/docs/style.css000066400000000000000000000012771455324047100163360ustar00rootroot00000000000000body { width: 60%; margin: 2em auto; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Symbol"; line-height: 1.5; color: #24292e; } h1, h2 { border-bottom: 1px solid #eaecef; padding-bottom: 0.3em; } a { color: #0366d6; text-decoration: none; } code { padding: 0.2em 0.4em; margin: 0; background-color: rgba(27,31,35,0.05); border-radius: 3px; font-family: "SFMono-Regular", Consolas, "Liberation Mono", Menlo, Courier, monospace; font-size: 85%; } @media screen and (max-width: 1000px) { body { width: 75%; } } @media screen and (max-width: 700px) { body { width: 90%; } } mistletoe-1.3.0/makefile000066400000000000000000000006571455324047100152350ustar00rootroot00000000000000PYTHON_EXEC=python3 .PHONY: run test coverage integration benchmark docs run: ${PYTHON_EXEC} -m mistletoe test: ${PYTHON_EXEC} -m unittest coverage: . venv/bin/activate && \ ${PYTHON_EXEC} -m coverage run -m unittest && \ coverage report && \ deactivate integration: ./test/test_ci.sh 1 benchmark: ${PYTHON_EXEC} test/benchmark.py specification: ${PYTHON_EXEC} -m test.specification docs: ${PYTHON_EXEC} -m docs mistletoe-1.3.0/mistletoe/000077500000000000000000000000001455324047100155325ustar00rootroot00000000000000mistletoe-1.3.0/mistletoe/__init__.py000066400000000000000000000014511455324047100176440ustar00rootroot00000000000000""" Make mistletoe easier to import. 
""" __version__ = "1.3.0" __all__ = ['html_renderer', 'ast_renderer', 'block_token', 'block_tokenizer', 'span_token', 'span_tokenizer'] from mistletoe.block_token import Document from mistletoe.html_renderer import HtmlRenderer # import the old name for backwards compatibility: from mistletoe.html_renderer import HTMLRenderer # noqa: F401 def markdown(iterable, renderer=HtmlRenderer): """ Converts markdown input to the output supported by the given renderer. If no renderer is supplied, ``HtmlRenderer`` is used. Note that extra token types supported by the given renderer are automatically (and temporarily) added to the parsing process. """ with renderer() as renderer: return renderer.render(Document(iterable)) mistletoe-1.3.0/mistletoe/__main__.py000066400000000000000000000003331455324047100176230ustar00rootroot00000000000000""" Make mistletoe runnable as a script with default settings. """ import sys from mistletoe import cli def main(): """ Entry point. """ cli.main(sys.argv[1:]) if __name__ == "__main__": main() mistletoe-1.3.0/mistletoe/ast_renderer.py000066400000000000000000000027021455324047100205620ustar00rootroot00000000000000""" Abstract syntax tree renderer for mistletoe. """ import json from mistletoe.base_renderer import BaseRenderer class AstRenderer(BaseRenderer): def render(self, token): """ Returns the string representation of the AST. Overrides super().render. Delegates the logic to get_ast. """ return json.dumps(get_ast(token), indent=2) + '\n' def __getattr__(self, name): return lambda token: '' def get_ast(token): """ Recursively unrolls token attributes into dictionaries (token.children into lists). Returns: a dictionary of token's attributes. """ node = {} # Python 3.6 uses [ordered dicts] [1]. # Put in 'type' entry first to make the final tree format somewhat # similar to [MDAST] [2]. 
# # [1]: https://docs.python.org/3/whatsnew/3.6.html # [2]: https://github.com/syntax-tree/mdast node['type'] = token.__class__.__name__ for attrname in ['content', 'footnotes']: if attrname in vars(token): node[attrname] = getattr(token, attrname) for attrname in token.repr_attributes: node[attrname] = getattr(token, attrname) if 'header' in vars(token): node['header'] = get_ast(getattr(token, 'header')) if 'children' in vars(token): node['children'] = [get_ast(child) for child in token.children] return node ASTRenderer = AstRenderer """ Deprecated name of the `AstRenderer` class. """ mistletoe-1.3.0/mistletoe/base_renderer.py000066400000000000000000000163121455324047100207070ustar00rootroot00000000000000""" Base class for renderers. """ import re from mistletoe import block_token, span_token class BaseRenderer(object): """ Base class for renderers. All renderers should ... * ... define all render functions specified in self.render_map; * ... be a context manager (by inheriting __enter__ and __exit__); Custom renderers could ... * ... add additional tokens into the parsing process by passing custom tokens to super().__init__(); * ... add additional render functions by appending to self.render_map; Usage: Suppose SomeRenderer inherits BaseRenderer, and fin is the input file. The syntax looks something like this: >>> from mistletoe import Document >>> from some_renderer import SomeRenderer >>> with SomeRenderer() as renderer: ... rendered = renderer.render(Document(fin)) See mistletoe.html_renderer for an implementation example. Naming conventions: * The keys of self.render_map should exactly match the class name of tokens; * Render function names should be of form: "render_" + the "snake-case" form of token's class name. Attributes: render_map (dict): maps tokens to their corresponding render functions. _extras (list): a list of custom tokens to be added to the parsing process. 
""" _parse_name = re.compile(r"([A-Z][a-z]+|[A-Z]+(?![a-z]))") def __init__(self, *extras, **kwargs): self.render_map = { 'Strong': self.render_strong, 'Emphasis': self.render_emphasis, 'InlineCode': self.render_inline_code, 'RawText': self.render_raw_text, 'Strikethrough': self.render_strikethrough, 'Image': self.render_image, 'Link': self.render_link, 'AutoLink': self.render_auto_link, 'EscapeSequence': self.render_escape_sequence, 'Heading': self.render_heading, 'SetextHeading': self.render_heading, 'Quote': self.render_quote, 'Paragraph': self.render_paragraph, 'CodeFence': self.render_block_code, 'BlockCode': self.render_block_code, 'List': self.render_list, 'ListItem': self.render_list_item, 'Table': self.render_table, 'TableRow': self.render_table_row, 'TableCell': self.render_table_cell, 'ThematicBreak': self.render_thematic_break, 'LineBreak': self.render_line_break, 'Document': self.render_document, } self._extras = extras for token in extras: if issubclass(token, span_token.SpanToken): token_module = span_token else: token_module = block_token token_module.add_token(token) render_func = getattr(self, self._cls_to_func(token.__name__)) self.render_map[token.__name__] = render_func self.footnotes = {} def render(self, token): """ Grabs the class name from input token and finds its corresponding render function. Basically a janky way to do polymorphism. Arguments: token: whose __class__.__name__ is in self.render_map. """ return self.render_map[token.__class__.__name__](token) def render_inner(self, token) -> str: """ Recursively renders child tokens. Joins the rendered strings with no space in between. If newlines / spaces are needed between tokens, add them in their respective templates, or override this function in the renderer subclass, so that whitespace won't seem to appear magically for anyone reading your program. Arguments: token: a branch node who has children attribute. 
""" return ''.join(map(self.render, token.children)) def __enter__(self): """ Make renderer classes into context managers. """ return self def __exit__(self, exception_type, exception_val, traceback): """ Make renderer classes into context managers. Reset block_token._token_types and span_token._token_types. """ block_token.reset_tokens() span_token.reset_tokens() @classmethod def _cls_to_func(cls, cls_name): snake = '_'.join(map(str.lower, cls._parse_name.findall(cls_name))) return 'render_{}'.format(snake) @staticmethod def _tokens_from_module(module): """ Helper method; takes a module and returns a list of all token classes specified in module.__all__. Useful when custom tokens are defined in a separate module. """ return [getattr(module, name) for name in module.__all__] def render_raw_text(self, token) -> str: """ Default render method for RawText. Simply return token.content. """ return token.content def render_strong(self, token: span_token.Strong) -> str: return self.render_inner(token) def render_emphasis(self, token: span_token.Emphasis) -> str: return self.render_inner(token) def render_inline_code(self, token: span_token.InlineCode) -> str: return self.render_inner(token) def render_strikethrough(self, token: span_token.Strikethrough) -> str: return self.render_inner(token) def render_image(self, token: span_token.Image) -> str: return self.render_inner(token) def render_link(self, token: span_token.Link) -> str: return self.render_inner(token) def render_auto_link(self, token: span_token.AutoLink) -> str: return self.render_inner(token) def render_escape_sequence(self, token: span_token.EscapeSequence) -> str: return self.render_inner(token) def render_line_break(self, token: span_token.LineBreak) -> str: return self.render_inner(token) def render_heading(self, token: block_token.Heading) -> str: return self.render_inner(token) def render_quote(self, token: block_token.Quote) -> str: return self.render_inner(token) def render_paragraph(self, token: 
block_token.Paragraph) -> str: return self.render_inner(token) def render_block_code(self, token: block_token.BlockCode) -> str: return self.render_inner(token) def render_list(self, token: block_token.List) -> str: return self.render_inner(token) def render_list_item(self, token: block_token.ListItem) -> str: return self.render_inner(token) def render_table(self, token: block_token.Table) -> str: return self.render_inner(token) def render_table_cell(self, token: block_token.TableCell) -> str: return self.render_inner(token) def render_table_row(self, token: block_token.TableRow) -> str: return self.render_inner(token) def render_thematic_break(self, token: block_token.ThematicBreak) -> str: return self.render_inner(token) def render_document(self, token: block_token.Document) -> str: return self.render_inner(token) mistletoe-1.3.0/mistletoe/block_token.py000066400000000000000000001147741455324047100204140ustar00rootroot00000000000000""" Built-in block-level token classes. """ import re from itertools import zip_longest import mistletoe.block_tokenizer as tokenizer from mistletoe import token, span_token from mistletoe.core_tokens import ( follows, shift_whitespace, whitespace, is_control_char, normalize_label, ) """ Tokens to be included in the parsing process, in the order specified. """ __all__ = ['BlockCode', 'Heading', 'Quote', 'CodeFence', 'ThematicBreak', 'List', 'Table', 'Footnote', 'Paragraph'] def tokenize(lines): """ A wrapper around block_tokenizer.tokenize. Pass in all block-level token constructors as arguments to block_tokenizer.tokenize. Doing so (instead of importing block_token module in block_tokenizer) avoids cyclic dependency issues, and allows for future injections of custom token classes. _token_types variable is at the bottom of this module. See also: block_tokenizer.tokenize, span_token.tokenize_inner. 
""" return tokenizer.tokenize(lines, _token_types) def add_token(token_cls, position=0): """ Allows external manipulation of the parsing process. This function is usually called in BaseRenderer.__enter__. Arguments: token_cls (SpanToken): token to be included in the parsing process. position (int): the position for the token class to be inserted into. """ _token_types.insert(position, token_cls) def remove_token(token_cls): """ Allows external manipulation of the parsing process. This function is usually called in BaseRenderer.__exit__. Arguments: token_cls (BlockToken): token to be removed from the parsing process. """ _token_types.remove(token_cls) def reset_tokens(): """ Resets global _token_types to all token classes in __all__. """ global _token_types _token_types = [globals()[cls_name] for cls_name in __all__] class BlockToken(token.Token): """ Base class for block-level tokens. Recursively parse inner tokens. Naming conventions: * lines denotes a list of (possibly unparsed) input lines, and is commonly used as the argument name for constructors. * BlockToken.children is a list with all the inner tokens (thus if a token has children attribute, it is not a leaf node; if a token calls span_token.tokenize_inner, it is the boundary between span-level tokens and block-level tokens); * BlockToken.start takes a line from the document as argument, and returns a boolean representing whether that line marks the start of the current token. Every subclass of BlockToken must define a start function (see block_tokenizer.tokenize). * BlockToken.read takes the rest of the lines in the document as an iterator (including the start line), and consumes all the lines that should be read into this token. Default to stop at an empty line. Note that BlockToken.read does not have to return a list of lines. 
Because the return value of this function will be directly passed into the token constructor, we can return any relevant parsing information, sometimes even ready-made tokens, into the constructor. See block_tokenizer.tokenize. If BlockToken.read returns None, the read result is ignored, but the token class is responsible for resetting the iterator to a previous state. See block_tokenizer.FileWrapper.get_pos, block_tokenizer.FileWrapper.set_pos. Attributes: children (list): inner tokens. line_number (int): starting line (1-based). """ repr_attributes = ("line_number",) def __init__(self, lines, tokenize_func): self.children = tokenize_func(lines) def __contains__(self, text): return any(text in child for child in self.children) @staticmethod def read(lines): line_buffer = [next(lines)] for line in lines: if line == '\n': break line_buffer.append(line) return line_buffer class Document(BlockToken): """ Document token. This is a container block token. Its children are block tokens - container or leaf ones. Attributes: footnotes (dictionary): link reference definitions. """ def __init__(self, lines): if isinstance(lines, str): lines = lines.splitlines(keepends=True) lines = [line if line.endswith('\n') else '{}\n'.format(line) for line in lines] self.footnotes = {} self.line_number = 1 token._root_node = self self.children = tokenize(lines) token._root_node = None class Heading(BlockToken): """ ATX heading token. (["### some heading ###\\n"]) This is a leaf block token. Its children are inline (span) tokens. Attributes: level (int): heading level. 
""" repr_attributes = BlockToken.repr_attributes + ("level",) pattern = re.compile(r' {0,3}(#{1,6})(?:\n|\s+?(.*?)(\n|\s+?#+\s*?$))') level = 0 content = '' def __init__(self, match): self.level, content, self.closing_sequence = match super().__init__(content, span_token.tokenize_inner) @classmethod def start(cls, line): match_obj = cls.pattern.match(line) if match_obj is None: return False cls.level = len(match_obj.group(1)) cls.content = (match_obj.group(2) or '').strip() if set(cls.content) == {'#'}: cls.content = '' cls.closing_sequence = (match_obj.group(3) or '').strip() return True @classmethod def check_interrupts_paragraph(cls, lines): return cls.start(lines.peek()) @classmethod def read(cls, lines): next(lines) return cls.level, cls.content, cls.closing_sequence class SetextHeading(BlockToken): """ Setext heading token. This is a leaf block token. Its children are inline (span) tokens. Not included in the parsing process, but called by Paragraph.__new__. Attributes: level (int): heading level. """ repr_attributes = BlockToken.repr_attributes + ("level",) def __init__(self, lines): self.underline = lines.pop().rstrip() self.level = 1 if self.underline.endswith('=') else 2 content = '\n'.join([line.strip() for line in lines]) super().__init__(content, span_token.tokenize_inner) @classmethod def start(cls, line): raise NotImplementedError() @classmethod def read(cls, lines): raise NotImplementedError() class Quote(BlockToken): """ Block quote token. (["> # heading\\n", "> paragraph\\n"]) This is a container block token. Its children are block tokens - container or leaf ones. """ def __init__(self, parse_buffer): # span-level tokenizing happens here. 
self.children = tokenizer.make_tokens(parse_buffer) @staticmethod def start(line): stripped = line.lstrip(' ') if len(line) - len(stripped) > 3: return False return stripped.startswith('>') @classmethod def check_interrupts_paragraph(cls, lines): return cls.start(lines.peek()) @classmethod def read(cls, lines): # first line line = cls.convert_leading_tabs(next(lines).lstrip()).split('>', 1)[1] if len(line) > 0 and line[0] == ' ': line = line[1:] line_buffer = [line] start_line = lines.line_number() # set booleans in_code_fence = CodeFence.start(line) in_block_code = BlockCode.start(line) blank_line = line.strip() == '' # following lines next_line = lines.peek() breaking_tokens = [t for t in _token_types if hasattr(t, 'check_interrupts_paragraph') and not t == Quote] while (next_line is not None and next_line.strip() != '' and not any(token_type.check_interrupts_paragraph(lines) for token_type in breaking_tokens)): stripped = cls.convert_leading_tabs(next_line.lstrip()) prepend = 0 if stripped[0] == '>': # has leader, not lazy continuation prepend += 1 if stripped[1] == ' ': prepend += 1 stripped = stripped[prepend:] in_code_fence = CodeFence.start(stripped) in_block_code = BlockCode.start(stripped) blank_line = stripped.strip() == '' line_buffer.append(stripped) elif in_code_fence or in_block_code or blank_line: # not paragraph continuation text break else: # lazy continuation, preserve whitespace line_buffer.append(next_line) next(lines) next_line = lines.peek() # parse child block tokens Paragraph.parse_setext = False parse_buffer = tokenizer.tokenize_block(line_buffer, _token_types, start_line=start_line) Paragraph.parse_setext = True return parse_buffer @staticmethod def convert_leading_tabs(string): string = string.replace('>\t', ' ', 1) count = 0 for i, c in enumerate(string): if c == '\t': count += 4 elif c == ' ': count += 1 else: break if i == 0: return string return '>' + ' ' * count + string[i:] class Paragraph(BlockToken): """ Paragraph token. 
(["some\\n", "continuous\\n", "lines\\n"]) This is a leaf block token. Its children are inline (span) tokens. """ setext_pattern = re.compile(r' {0,3}(=|-)+ *$') parse_setext = True # can be disabled by Quote def __new__(cls, lines): if not isinstance(lines, list): # setext heading token, return directly return lines return super().__new__(cls) def __init__(self, lines): content = ''.join([line.lstrip() for line in lines]).strip() super().__init__(content, span_token.tokenize_inner) @staticmethod def start(line): return line.strip() != '' @classmethod def read(cls, lines): line_buffer = [next(lines)] next_line = lines.peek() breaking_tokens = [t for t in _token_types if hasattr(t, 'check_interrupts_paragraph') and not t == ThematicBreak] while (next_line is not None and next_line.strip() != ''): # check if a paragraph-breaking token starts on the next line. # (except ThematicBreak, because these can be confused with Setext underlines.) if any(token_type.check_interrupts_paragraph(lines) for token_type in breaking_tokens): break # check if the paragraph being parsed is in fact a Setext heading if cls.parse_setext and cls.is_setext_heading(next_line): line_buffer.append(next(lines)) return SetextHeading(line_buffer) # finish the check for paragraph-breaking tokens with the special case: ThematicBreak if ThematicBreak.check_interrupts_paragraph(lines): break line_buffer.append(next(lines)) next_line = lines.peek() return line_buffer @classmethod def is_setext_heading(cls, line): return cls.setext_pattern.match(line) class BlockCode(BlockToken): """ Indented code block token. This is a leaf block token with a single child of type span_token.RawText. Attributes: language (str): always the empty string. 
""" repr_attributes = BlockToken.repr_attributes + ("language",) def __init__(self, lines): self.language = '' self.children = (span_token.RawText(''.join(lines).strip('\n') + '\n'),) @property def content(self): """Returns the code block content.""" return self.children[0].content @staticmethod def start(line): return line.replace('\t', ' ', 1).startswith(' ') @classmethod def read(cls, lines): line_buffer = [] trailing_blanks = 0 for line in lines: if line.strip() == '': line_buffer.append(line.lstrip(' ') if len(line) < 5 else line[4:]) trailing_blanks = trailing_blanks + 1 if line == '\n' else 0 continue if not line.replace('\t', ' ', 1).startswith(' '): lines.backstep() break line_buffer.append(cls.strip(line)) trailing_blanks = 0 for _ in range(trailing_blanks): line_buffer.pop() lines.backstep() return line_buffer @staticmethod def strip(string): count = 0 for i, c in enumerate(string): if c == '\t': return string[i + 1:] elif c == ' ': count += 1 else: break if count == 4: return string[i + 1:] return string class CodeFence(BlockToken): """ Fenced code block token. (["```sh\\n", "rm -rf /", ..., "```"]) This is a leaf block token with a single child of type span_token.RawText. Attributes: language (str): language of code block (default to empty). 
""" repr_attributes = BlockToken.repr_attributes + ("language",) pattern = re.compile(r'( {0,3})(`{3,}|~{3,})( *(\S*)[^\n]*)') _open_info = None def __init__(self, match): lines, open_info = match self.indentation = open_info[0] self.delimiter = open_info[1] self.info_string = open_info[2] self.language = span_token.EscapeSequence.strip(open_info[3]) self.children = (span_token.RawText(''.join(lines)),) @property def content(self): """Returns the code block content.""" return self.children[0].content @classmethod def start(cls, line): match_obj = cls.pattern.match(line) if not match_obj: return False prepend, leader, info_string, lang = match_obj.groups() # info strings for backtick code blocks may not contain backticks, # but info strings for tilde code blocks may contain both tildes and backticks. if leader[0] == '`' and '`' in info_string: return False cls._open_info = len(prepend), leader, info_string, lang return True @classmethod def check_interrupts_paragraph(cls, lines): return cls.start(lines.peek()) @classmethod def read(cls, lines): next(lines) line_buffer = [] for line in lines: stripped_line = line.lstrip(' ') diff = len(line) - len(stripped_line) if (stripped_line.startswith(cls._open_info[1]) and len(stripped_line.split(maxsplit=1)) == 1 and diff < 4): break if diff > cls._open_info[0]: stripped_line = ' ' * (diff - cls._open_info[0]) + stripped_line line_buffer.append(stripped_line) return line_buffer, cls._open_info class List(BlockToken): """ List token. This is a container block token. Its children are ListItem tokens. Attributes: loose (bool): whether the list is loose. start (NoneType or int): None if unordered, starting number if ordered. 
""" repr_attributes = BlockToken.repr_attributes + ("loose", "start") pattern = re.compile(r' {0,3}(?:\d{0,9}[.)]|[+\-*])(?:[ \t]*$|[ \t]+)') def __init__(self, matches): self.children = [ListItem(*match) for match in matches] self.loose = any(item.loose for item in self.children) leader = self.children[0].leader self.start = None if len(leader) != 1: self.start = int(leader[:-1]) @classmethod def start(cls, line): return cls.pattern.match(line) @classmethod def check_interrupts_paragraph(cls, lines): # to break a paragraph, the first line may not be empty (beyond the list marker), # and the list must either be unordered or start from 1. marker_tuple = ListItem.parse_marker(lines.peek()) if (marker_tuple is not None): _, _, leader, content = marker_tuple if not content.strip() == '': return not leader[0].isdigit() or leader in ['1.', '1)'] return False @classmethod def read(cls, lines): leader = None next_marker = None matches = [] while True: anchor = lines.get_pos() output, next_marker = ListItem.read(lines, next_marker) item_leader = output[3] if leader is None: leader = item_leader elif not cls.same_marker_type(leader, item_leader): lines.set_pos(anchor) break matches.append(output) if next_marker is None: break if matches: # Only consider the last list item loose if there's more than one element last_parse_buffer = matches[-1][0] last_parse_buffer.loose = len(last_parse_buffer) > 1 and last_parse_buffer.loose return matches @staticmethod def same_marker_type(leader, other): if len(leader) == 1: return leader == other return leader[:-1].isdigit() and other[:-1].isdigit() and leader[-1] == other[-1] class ListItem(BlockToken): """ List item token. This is a container block token. Its children are block tokens - container or leaf ones. Not included in the parsing process, but called by List. Attributes: leader (string): a bullet list marker or an ordered list marker. indentation (int): spaces before the leader. 
prepend (int): the start position of the content, i.e., the indentation required for continuation lines. loose (bool): whether the list is loose. """ repr_attributes = BlockToken.repr_attributes + ("leader", "indentation", "prepend", "loose") pattern = re.compile(r'( {0,3})(\d{0,9}[.)]|[+\-*])($|\s+)') continuation_pattern = re.compile(r'([ \t]*)(\S.*\n|\n)') def __init__(self, parse_buffer, indentation, prepend, leader, line_number=None): self.line_number = line_number self.leader = leader self.indentation = indentation self.prepend = prepend self.children = tokenizer.make_tokens(parse_buffer) self.loose = parse_buffer.loose @classmethod def parse_continuation(cls, line, prepend): """ Returns content (i.e. the line with the prepend stripped off) iff the line is a valid continuation line for a list item with the given prepend length, otherwise None. Note that the list item may still continue even if this test doesn't pass due to lazy continuation. """ match_obj = cls.continuation_pattern.match(line) if match_obj is None: return None if match_obj.group(2) == '\n': return '\n' expanded_spaces = match_obj.group(1).expandtabs(4) return expanded_spaces[prepend:] + match_obj.group(2) if len(expanded_spaces) >= prepend else None @classmethod def parse_marker(cls, line): """ Returns a tuple (prepend, leader, content) iff the line has a valid leader and at least one space separating leader and content, or if the content is empty, in which case there need not be any spaces. The return value is None if the line doesn't have a valid marker. The leader is a bullet list marker, or an ordered list marker. The indentation is spaces before the leader. The prepend is the start position of the content, i.e., the indentation required for continuation lines. 
""" match_obj = cls.pattern.match(line) if match_obj is None: return None indentation = len(match_obj.group(1)) prepend = len(match_obj.group(0).expandtabs(4)) leader = match_obj.group(2) content = line[match_obj.end(0):] n_spaces = prepend - match_obj.end(2) if n_spaces > 4: # if there are more than 4 spaces after the leader, we treat them as part of the content # with the exception of the first (marker separator) space. prepend -= n_spaces - 1 content = ' ' * (n_spaces - 1) + content return indentation, prepend, leader, content @classmethod def read(cls, lines, prev_marker=None): next_marker = None line_buffer = [] # first line line = next(lines) start_line = lines.line_number() next_line = lines.peek() indentation, prepend, leader, content = prev_marker if prev_marker else cls.parse_marker(line) if content.strip() == '': # item starting with a blank line: look for the next non-blank line prepend = indentation + len(leader) + 1 blanks = 1 while next_line is not None and next_line.strip() == '': blanks += 1 next(lines) next_line = lines.peek() # if the line following the list marker is also empty, then this is an empty # list item. 
if blanks > 1: parse_buffer = tokenizer.ParseBuffer() parse_buffer.loose = True next_marker = cls.parse_marker(next_line) if next_line is not None else None return (parse_buffer, indentation, prepend, leader, start_line), next_marker else: line_buffer.append(content) # loop over the following lines, looking for the end of the list item breaking_tokens = [t for t in _token_types if hasattr(t, 'check_interrupts_paragraph') and not t == List] newline_count = 0 while True: if next_line is None: # list item ends here because we have reached the end of content if newline_count: lines.backstep() del line_buffer[-newline_count:] break continuation = cls.parse_continuation(next_line, prepend) if not continuation: # the line doesn't have the indentation to show that it belongs to # the list item, but it should be included anyway by lazy continuation... # ...unless it's the start of another token if any(token_type.check_interrupts_paragraph(lines) for token_type in breaking_tokens): if newline_count: lines.backstep() del line_buffer[-newline_count:] break # ...or it's a new list item marker_info = cls.parse_marker(next_line) if marker_info is not None: next_marker = marker_info break # ...or the line above it was blank if newline_count: lines.backstep() del line_buffer[-newline_count:] break continuation = next_line line_buffer.append(continuation) newline_count = newline_count + 1 if continuation == '\n' else 0 next(lines) next_line = lines.peek() # block-level tokens are parsed here, so that footnotes can be # recognized before span-level parsing. parse_buffer = tokenizer.tokenize_block(line_buffer, _token_types, start_line=start_line) return (parse_buffer, indentation, prepend, leader, start_line), next_marker class Table(BlockToken): """ Table token. See its GFM definition at . This is a container block token. Its children are TableRow tokens. Class attributes: interrupt_paragraph: indicates whether tables should interrupt paragraphs during parsing. The default is true. 
Attributes: header: header row (TableRow). column_align (list): align options for each column (default to [None]). """ repr_attributes = BlockToken.repr_attributes + ("column_align",) interrupt_paragraph = True _column_align = r':?-+:?' column_align_pattern = re.compile(_column_align) delimiter_row_pattern = re.compile(r'\s*\|?\s*' + _column_align + r'\s*(\|\s*' + _column_align + r'\s*)*\|?\s*') def __init__(self, match): lines, start_line = match # note: the following condition is currently always true, because read() guarantees the presence of the delimiter row if '-' in lines[1]: self.column_align = [self.parse_align(column) for column in self.split_delimiter(lines[1])] self.header = TableRow(lines[0], self.column_align, start_line) self.children = [TableRow(line, self.column_align, start_line + offset) for offset, line in enumerate(lines[2:], start=2)] else: self.column_align = [None] self.children = [TableRow(line, line_number=start_line + offset) for offset, line in enumerate(lines)] @classmethod def split_delimiter(cls, delimiter_row): """ Helper function; returns a list of align options. Args: delimiter (str): e.g.: "| :--- | :---: | ---: |\n" Returns: a list of align options (None, 0 or 1). """ return cls.column_align_pattern.findall(delimiter_row) @staticmethod def parse_align(column): """ Helper function; returns align option from cell content. Returns: None if align = left; 0 if align = center; 1 if align = right. 
""" return (0 if column[0] == ':' else 1) if column[-1] == ':' else None @staticmethod def start(line): return '|' in line @classmethod def check_interrupts_paragraph(cls, lines): if not cls.interrupt_paragraph: return False anchor = lines.get_pos() result = cls.read(lines) lines.set_pos(anchor) return result @classmethod def read(cls, lines): anchor = lines.get_pos() line_buffer = [next(lines)] start_line = lines.line_number() while lines.peek() is not None and '|' in lines.peek(): line_buffer.append(next(lines)) if len(line_buffer) < 2 or not cls.delimiter_row_pattern.fullmatch(line_buffer[1]): lines.set_pos(anchor) return None return line_buffer, start_line class TableRow(BlockToken): """ Table row token. Supports escaped pipes in table cells (for primary use within code spans). This is a container block token. Its children are TableCell tokens. Should only be called by Table.__init__(). Attributes: row_align (list): align options for each column (default to [None]). """ repr_attributes = BlockToken.repr_attributes + ("row_align",) # Note: Python regex requires fixed-length look-behind, # so we cannot use a more precise alternative: r"(?= 0: return dest_end + eol_pos + 1, (label, dest, "", dest_type, None) else: return None _, title_end, title = match_info # optional spaces or tabs. final line ending. line_end = title_end while line_end < len(string): if string[line_end] == '\n': title_delimiter = string[title_start] if title_start < title_end else None return line_end + 1, (label, dest, title, dest_type, title_delimiter) elif string[line_end] in whitespace: line_end += 1 else: break # non-whitespace found on the same line as the title, making it invalid. # if there was a line break following the destination, # we still have a valid link reference definition. otherwise not. 
eol_pos = string[dest_end:title_start].find("\n") if eol_pos >= 0: return dest_end + eol_pos + 1, (label, dest, "", dest_type, None) else: return None @classmethod def match_link_label(cls, string, offset): """ Matches: up to three spaces, "[", label, "]". """ start = -1 escaped = False for i, c in enumerate(string[offset:], start=offset): if escaped: escaped = False elif c == '\\': escaped = True elif c == '[': if start == -1: start = i else: return None elif c == ']': label = string[start + 1:i] if label.strip() != '': return start, i + 1, label return None # only spaces allowed before the opening bracket if start == -1 and not (c == " " and i - offset < 3): return None return None @classmethod def match_link_dest(cls, string, offset): if string[offset] == '<': escaped = False for i, c in enumerate(string[offset + 1:], start=offset + 1): if c == '\\' and not escaped: escaped = True elif c == '\n' or (c == '<' and not escaped): return None elif c == '>' and not escaped: return offset, i + 1, string[offset + 1:i] elif escaped: escaped = False return None else: escaped = False count = 0 for i, c in enumerate(string[offset:], start=offset): if c == '\\' and not escaped: escaped = True elif c in whitespace: break elif not escaped: if c == '(': count += 1 elif c == ')': count -= 1 elif is_control_char(c): return None elif escaped: escaped = False if count != 0: return None return offset, i, string[offset:i] @classmethod def match_link_title(cls, string, offset): if offset == len(string): return None if string[offset] == '"': closing = '"' elif string[offset] == "'": closing = "'" elif string[offset] == '(': closing = ')' else: return None escaped = False for i, c in enumerate(string[offset + 1:], start=offset + 1): if c == '\\' and not escaped: escaped = True elif c == closing and not escaped: return offset, i + 1, string[offset + 1:i] elif escaped: escaped = False return None @staticmethod def append_footnotes(matches, root): for key, dest, title, *_ in matches: key = 
normalize_label(key) dest = span_token.EscapeSequence.strip(dest.strip()) title = span_token.EscapeSequence.strip(title) if key not in root.footnotes: root.footnotes[key] = dest, title class ThematicBreak(BlockToken): """ Thematic break token (a.k.a. horizontal rule.) This is a leaf block token without children. """ pattern = re.compile(r' {0,3}(?:([-_*])\s*?)(?:\1\s*?){2,}$') def __init__(self, lines): self.line = lines[0].strip('\n') @classmethod def start(cls, line): return cls.pattern.match(line) @classmethod def check_interrupts_paragraph(cls, lines): return cls.start(lines.peek()) @staticmethod def read(lines): return [next(lines)] class HtmlBlock(BlockToken): """ Block-level HTML token. This is a leaf block token with a single child of type span_token.RawText, which holds the raw HTML content. """ _end_cond = None multiblock = re.compile(r'<(pre|script|style|textarea)[ >\n]') predefined = re.compile(r'<\/?(.+?)(?:\/?>|[ \n])') custom_tag = re.compile(r'(?:' + '|'.join((span_token._open_tag, span_token._closing_tag)) + r')\s*$') def __init__(self, lines): self.children = (span_token.RawText(''.join(lines).rstrip('\n')),) @property def content(self): """Returns the raw HTML content.""" return self.children[0].content @classmethod def start(cls, line): stripped = line.lstrip() if len(line) - len(stripped) >= 4: return False # rule 1: HTML tags designed to contain literal content, allow newlines in block match_obj = cls.multiblock.match(stripped) if match_obj is not None: cls._end_cond = ''.format(match_obj.group(1).casefold()) return 1 # rule 2: html comment tags, allow newlines in block if stripped.startswith('' return 2 # rule 3: tags that starts with ' return 3 # rule 4: tags that starts with ' return 4 # rule 5: CDATA declaration, allow newlines in block if stripped.startswith('' return 5 # rule 6: predefined tags (see html_token._tags), read until newline match_obj = cls.predefined.match(stripped) if match_obj is not None and match_obj.group(1).casefold() 
in span_token._tags: cls._end_cond = None return 6 # rule 7: custom tags, read until newline match_obj = cls.custom_tag.match(stripped) if match_obj is not None: cls._end_cond = None return 7 return False @classmethod def check_interrupts_paragraph(cls, lines): html_block = cls.start(lines.peek()) return html_block and html_block != 7 @classmethod def read(cls, lines): # note: stop condition can trigger on the starting line line_buffer = [] for line in lines: line_buffer.append(line) if cls._end_cond is not None: if cls._end_cond in line.casefold(): break elif line.strip() == '': line_buffer.pop() lines.backstep() break return line_buffer HTMLBlock = HtmlBlock """ Deprecated name of the `HtmlBlock` class. """ _token_types = [] reset_tokens() mistletoe-1.3.0/mistletoe/block_tokenizer.py000066400000000000000000000062731455324047100213000ustar00rootroot00000000000000""" Block-level tokenizer for mistletoe. """ class FileWrapper: def __init__(self, lines, start_line=1): self.lines = lines if isinstance(lines, list) else list(lines) self.start_line = start_line self._index = -1 self._anchor = 0 def __next__(self): if self._index + 1 < len(self.lines): self._index += 1 return self.lines[self._index] raise StopIteration def __iter__(self): return self def __repr__(self): return repr(self.lines[self._index + 1:]) def get_pos(self): """Returns the current reading position. 
The result is an opaque value which can be passed to `set_pos`.""" return self._index def set_pos(self, pos): """Sets the current reading position.""" self._index = pos def anchor(self): """@deprecated use `get_pos` instead""" self._anchor = self.get_pos() def reset(self): """@deprecated use `set_pos` instead""" self.set_pos(self._anchor) def peek(self): if self._index + 1 < len(self.lines): return self.lines[self._index + 1] return None def backstep(self): if self._index != -1: self._index -= 1 def line_number(self): return self.start_line + self._index def tokenize(iterable, token_types): """ Searches for token_types in iterable. Args: iterable (list): user input lines to be parsed. token_types (list): a list of block-level token constructors. Returns: block-level token instances. """ return make_tokens(tokenize_block(iterable, token_types)) def tokenize_block(iterable, token_types, start_line=1): """ Returns a list of tuples (token_type, read_result, line_number). Footnotes are parsed here, but span-level parsing has not started yet. """ lines = FileWrapper(iterable, start_line=start_line) parse_buffer = ParseBuffer() line = lines.peek() while line is not None: for token_type in token_types: if token_type.start(line): line_number = lines.line_number() + 1 result = token_type.read(lines) if result is not None: parse_buffer.append((token_type, result, line_number)) break else: # unmatched newlines next(lines) parse_buffer.loose = True line = lines.peek() return parse_buffer def make_tokens(parse_buffer): """ Takes a list of tuples (token_type, read_result, line_number), applies token_type(read_result), and sets the line_number attribute. Footnotes are already parsed before this point, and span-level parsing is started here. 
""" tokens = [] for token_type, result, line_number in parse_buffer: token = token_type(result) if token is not None: token.line_number = line_number tokens.append(token) return tokens class ParseBuffer(list): """ A wrapper around builtin list, so that setattr(list, 'loose') is legal. """ def __init__(self, *args): super().__init__(*args) self.loose = False mistletoe-1.3.0/mistletoe/cli.py000066400000000000000000000053611455324047100166600ustar00rootroot00000000000000import sys import mistletoe from argparse import ArgumentParser version_str = 'mistletoe [version {}]'.format(mistletoe.__version__) def main(args): namespace = parse(args) if namespace.filenames: convert(namespace.filenames, namespace.renderer) else: interactive(namespace.renderer) def convert(filenames, renderer): for filename in filenames: convert_file(filename, renderer) def convert_file(filename, renderer): """ Parse a Markdown file and dump the output to stdout. """ try: with open(filename, 'r', encoding='utf-8') as fin: rendered = mistletoe.markdown(fin, renderer) sys.stdout.buffer.write(rendered.encode()) except OSError: sys.exit('Cannot open file "{}".'.format(filename)) def interactive(renderer): """ Parse user input, dump to stdout, rinse and repeat. Python REPL style. """ _import_readline() _print_heading(renderer) contents = [] more = False while True: try: prompt, more = ('... 
', True) if more else ('>>> ', True) contents.append(input(prompt) + '\n') except EOFError: print('\n' + mistletoe.markdown(contents, renderer), end='') more = False contents = [] except KeyboardInterrupt: print('\nExiting.') break def parse(args): parser = ArgumentParser() parser.add_argument('-r', '--renderer', type=_import, default='mistletoe.HtmlRenderer', help='specify an importable renderer class') parser.add_argument('-v', '--version', action='version', version=version_str) parser.add_argument('filenames', nargs='*', help='specify an optional list of files to convert') return parser.parse_args(args) def _import(arg): import importlib try: cls_name, path = map(lambda s: s[::-1], arg[::-1].split('.', 1)) module = importlib.import_module(path) return getattr(module, cls_name) except ValueError: sys.exit('[error] please supply full path to your custom renderer.') except ImportError: sys.exit('[error] cannot import module "{}".'.format(path)) except AttributeError: sys.exit('[error] cannot find renderer "{}" from module "{}".'.format(cls_name, path)) def _import_readline(): try: import readline # noqa: F401 except ImportError: print('[warning] readline library not available.') def _print_heading(renderer): print('{} (interactive)'.format(version_str)) print('Type Ctrl-D to complete input, or Ctrl-C to exit.') if renderer is not mistletoe.HtmlRenderer: print('Using renderer: {}'.format(renderer.__name__)) mistletoe-1.3.0/mistletoe/contrib/000077500000000000000000000000001455324047100171725ustar00rootroot00000000000000mistletoe-1.3.0/mistletoe/contrib/Makefile000066400000000000000000000003141455324047100206300ustar00rootroot00000000000000 INSTALL_DIR=$(HOME)/bin install: install-md2jira install-md2jira: echo "[contrib] installing: md2jira" cp md2jira.py $(INSTALL_DIR)/md2jira chmod uog+x $(INSTALL_DIR)/md2jira echo "[contrib] done" 
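The `_import` helper in `cli.py` above resolves a dotted path such as `mistletoe.HtmlRenderer` by splitting on the *last* dot (via a double string reversal) and importing the module part with `importlib`. A minimal standalone sketch of that pattern, using a stdlib class so it runs without mistletoe installed (`load_class` is an illustrative name, not part of the library):

```python
import importlib

def load_class(dotted_path):
    # Split "package.module.ClassName" on the last dot, mirroring cli._import:
    # reverse the string, split once on '.', then reverse both parts back.
    cls_name, module_path = map(lambda s: s[::-1], dotted_path[::-1].split('.', 1))
    module = importlib.import_module(module_path)
    return getattr(module, cls_name)

print(load_class('collections.OrderedDict'))  # the class object from the stdlib
```

As in `_import`, a path without any dot raises `ValueError` from the single-split unpacking, which the CLI turns into a friendly error message.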
mistletoe-1.3.0/mistletoe/contrib/__init__.py000066400000000000000000000000001455324047100212710ustar00rootroot00000000000000mistletoe-1.3.0/mistletoe/contrib/github_wiki.py000066400000000000000000000015411455324047100220520ustar00rootroot00000000000000""" GitHub Wiki support for mistletoe. """ import re from mistletoe.span_token import SpanToken from mistletoe.html_renderer import HtmlRenderer __all__ = ['GithubWiki', 'GithubWikiRenderer'] class GithubWiki(SpanToken): pattern = re.compile(r"\[\[ *(.+?) *\| *(.+?) *\]\]") def __init__(self, match): self.target = match.group(2) class GithubWikiRenderer(HtmlRenderer): def __init__(self, **kwargs): """ Args: **kwargs: additional parameters to be passed to the ancestor's constructor. """ super().__init__(GithubWiki, **kwargs) def render_github_wiki(self, token): template = '<a href="{target}">{inner}</a>' target = self.escape_url(token.target) inner = self.render_inner(token) return template.format(target=target, inner=inner) mistletoe-1.3.0/mistletoe/contrib/jira_renderer.py000066400000000000000000000203631455324047100223630ustar00rootroot00000000000000# Copyright 2018 Tile, Inc. All Rights Reserved. # # The MIT License # # Permission is hereby granted, free of charge, to any person obtaining a copy of # this software and associated documentation files (the "Software"), to deal in # the Software without restriction, including without limitation the rights to # use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do # so, subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE # SOFTWARE. # from itertools import chain from mistletoe import block_token, span_token from mistletoe.base_renderer import BaseRenderer import re class JiraRenderer(BaseRenderer): """ JIRA renderer class. See mistletoe.base_renderer module for more info. """ def __init__(self, *extras): """ Args: extras (list): allows subclasses to add even more custom tokens. """ self.listTokens = [] self.lastChildOfQuotes = [] super().__init__(*chain([block_token.HtmlBlock, span_token.HtmlSpan], extras)) def render_strong(self, token): template = '*{}*' return template.format(self.render_inner(token)) def render_emphasis(self, token): template = '_{}_' return template.format(self.render_inner(token)) def render_inline_code(self, token): template = '{{{{{}}}}}' return template.format(self.render_inner(token)) def render_strikethrough(self, token): template = '-{}-' return template.format(self.render_inner(token)) def render_image(self, token): template = '!{src}!' self.render_inner(token) return template.format(src=token.src) def render_link(self, token): template = '[{inner}|{target}{title}]' inner = self.render_inner(token) target = escape_url(token.target) if token.title: title = '|{}'.format(token.title) else: title = '' return template.format(inner=inner, target=target, title=title) def render_auto_link(self, token): template = '[{target}]' target = escape_url(token.target) return template.format(target=target) def render_escape_sequence(self, token): return self.render_inner(token) def render_raw_text(self, token, escape=True): if escape: def repl(match): return '\\' + match.group(0) # The following regex tries to find special chars that are one of the following: # 1. the whole string (typically in an EscapeSequence) # 2. 
just after a non-whitespace # 3. just before a non-whitespace re_esc_chars = r'[{}\[\]\-*_+^~]' re_find = r'(^{esc_chars}$)|((?<=\S)({esc_chars}))|(({esc_chars})(?=\S))'.format(esc_chars=re_esc_chars) return re.sub(re_find, repl, token.content) else: return token.content @staticmethod def render_html_span(token): return token.content def render_heading(self, token): template = 'h{level}. {inner}' inner = self.render_inner(token) return template.format(level=token.level, inner=inner) + self._block_eol(token) def render_quote(self, token): self.lastChildOfQuotes.append(token.children[-1]) inner = self.render_inner(token) del (self.lastChildOfQuotes[-1]) if len(token.children) == 1 and isinstance(token.children[0], block_token.Paragraph): template = 'bq. {inner}' + self._block_eol(token)[0:-1] else: template = '{{quote}}\n{inner}{{quote}}' + self._block_eol(token) return template.format(inner=inner) def render_paragraph(self, token): return '{}'.format(self.render_inner(token)) + self._block_eol(token) def render_block_code(self, token): template = '{{code{attr}}}\n{inner}{{code}}' + self._block_eol(token) if token.language: attr = ':{}'.format(token.language) else: attr = '' inner = self.render_raw_text(token.children[0], False) return template.format(attr=attr, inner=inner) def render_list(self, token): inner = self.render_inner(token) return inner + self._block_eol(token)[0:-1] def render_list_item(self, token): template = '{prefix} {inner}' prefix = ''.join(self.listTokens) result = template.format(prefix=prefix, inner=self.render_inner(token)) return result def render_inner(self, token): if isinstance(token, block_token.List): if token.start: self.listTokens.append('#') else: self.listTokens.append('*') rendered = [self.render(child) for child in token.children] if isinstance(token, block_token.List): del (self.listTokens[-1]) return ''.join(rendered) def render_table(self, token): # This is actually gross and I wonder if there's a better way to do it. 
# # The primary difficulty seems to be passing down alignment options to # reach individual cells. template = '{inner}\n' if hasattr(token, 'header'): head_template = '{inner}' header = token.header head_inner = self.render_table_row(header, True) head_rendered = head_template.format(inner=head_inner) else: head_rendered = '' body_template = '{inner}' body_inner = self.render_inner(token) body_rendered = body_template.format(inner=body_inner) return template.format(inner=head_rendered + body_rendered) def render_table_row(self, token, is_header=False): if is_header: template = '{inner}||\n' else: template = '{inner}|\n' inner = ''.join([self.render_table_cell(child, is_header) for child in token.children]) return template.format(inner=inner) def render_table_cell(self, token, in_header=False): if in_header: template = '||{inner}' else: template = '|{inner}' inner = self.render_inner(token) if inner == '': inner = ' ' return template.format(inner=inner) @staticmethod def render_thematic_break(token): return '----\n' @staticmethod def render_line_break(token): # Note: In Jira, outputting just '\n' instead of '\\\n' should be usually sufficient as well. # It is not clear when it wouldn't be sufficient though, so we use the longer variant for sure. return ' ' if token.soft else '\\\\\n' @staticmethod def render_html_block(token): return token.content def render_document(self, token): self.footnotes.update(token.footnotes) return self.render_inner(token) def _block_eol(self, token): """ Jira syntax is very limited when it comes to lists: whenever we put an empty line anywhere in a list, it gets terminated and there seems to be no workaround for this. Also to have blocks like paragraphs really vertically separated, we need to put an empty line between them. This function handles these two cases. 
""" return ( "\n" if len(self.listTokens) > 0 or (len(self.lastChildOfQuotes) > 0 and token is self.lastChildOfQuotes[-1]) else "\n\n" ) def escape_url(raw): """ Escape urls to prevent code injection craziness. (Hopefully.) """ from urllib.parse import quote return quote(raw, safe='/#:()*?=%@+,&;') JIRARenderer = JiraRenderer """ Deprecated name of the `JiraRenderer` class. """ mistletoe-1.3.0/mistletoe/contrib/mathjax.py000066400000000000000000000023621455324047100212030ustar00rootroot00000000000000""" Provides MathJax support for rendering Markdown with LaTeX to html. """ from mistletoe.html_renderer import HtmlRenderer from mistletoe.latex_renderer import LaTeXRenderer class MathJaxRenderer(HtmlRenderer, LaTeXRenderer): def __init__(self, **kwargs): """ Args: **kwargs: additional parameters to be passed to the ancestors' constructors. """ super().__init__(**kwargs) mathjax_src = '\n' def render_math(self, token): """ Convert single dollar sign enclosed math expressions to the ``\\(...\\)`` syntax, to support the default MathJax settings which ignore single dollar signs as described at https://docs.mathjax.org/en/latest/basic/mathematics.html#tex-and-latex-input. """ if token.content.startswith('$$'): return self.render_raw_text(token) return '\\({}\\)'.format(self.render_raw_text(token).strip('$')) def render_document(self, token): """ Append CDN link for MathJax to the end of . """ return super().render_document(token) + self.mathjax_src mistletoe-1.3.0/mistletoe/contrib/md2jira.py000077500000000000000000000067331455324047100211100ustar00rootroot00000000000000#!/usr/bin/env python3 # Copyright 2018 Tile, Inc. 
# # The MIT License # # Permission is hereby granted, free of charge, to any person obtaining a copy of # this software and associated documentation files (the "Software"), to deal in # the Software without restriction, including without limitation the rights to # use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies # of the Software, and to permit persons to whom the Software is furnished to do # so, subject to the following conditions: # # The above copyright notice and this permission notice shall be included in all # copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE # SOFTWARE. # import os import sys import getopt import mistletoe from mistletoe.contrib.jira_renderer import JiraRenderer usageString = '%s ' % os.path.basename(sys.argv[0]) helpString = """ Convert Markdown (CommonMark) to JIRA wiki markup -h, --help help -v, --version version -o , --output= output file, use '-' for stdout (default: stdout) If no input file is specified, stdin is used. """ """ Command-line utility to convert Markdown (CommonMark) to JIRA markup. 
JIRA markup spec: https://jira.atlassian.com/secure/WikiRendererHelpAction.jspa?section=all CommonMark spec: http://spec.commonmark.org/0.30/#introduction """ class CommandLineParser: def __init__(self): try: optlist, args = getopt.getopt(sys.argv[1:], 'hvo:', ['help', 'version', 'output=']) except getopt.GetoptError as err: sys.stderr.write(err.msg + '\n') sys.stderr.write(usageString + '\n') sys.exit(1) app = MarkdownToJira() app.run(optlist, args) class MarkdownToJira: def __init__(self): self.version = "1.0.2" self.options = {} self.options['output'] = '-' def run(self, optlist, args): for o, i in optlist: if o in ('-h', '--help'): sys.stderr.write(usageString + '\n') sys.stderr.write(helpString + '\n') sys.exit(1) elif o in ('-v', '--version'): sys.stdout.write('%s\n' % self.version) sys.exit(0) elif o in ('-o', '--output'): self.options['output'] = i if len(args) < 1: sys.stderr.write(usageString + '\n') sys.exit(1) with open(args[0], 'r', encoding='utf-8') if len(args) == 1 else sys.stdin as infile: rendered = mistletoe.markdown(infile, JiraRenderer) if self.options['output'] == '-': sys.stdout.write(rendered) else: with open(self.options['output'], 'w', encoding='utf-8') as outfile: outfile.write(rendered) MarkdownToJIRA = MarkdownToJira """ Deprecated name of the `MarkdownToJira` class. 
""" if __name__ == '__main__': CommandLineParser() mistletoe-1.3.0/mistletoe/contrib/pygments_renderer.py000066400000000000000000000033371455324047100233060ustar00rootroot00000000000000from mistletoe import HtmlRenderer from pygments import highlight from pygments.formatters.html import HtmlFormatter from pygments.lexers import get_lexer_by_name as get_lexer, guess_lexer from pygments.styles import get_style_by_name as get_style from pygments.util import ClassNotFound class PygmentsRenderer(HtmlRenderer): formatter = HtmlFormatter() formatter.noclasses = True def __init__(self, *extras, style='default', fail_on_unsupported_language=False, **kwargs): """ Args: extras (list): allows subclasses to add even more custom tokens. style (str): short name of the style to be used by Pygments' `HtmlFormatter`, see `pygments.styles.get_style_by_name()`. fail_on_unsupported_language (bool): whether to let Pygments' `ClassNotFound` be thrown when there is an unsupported language found on a code block. If `False`, then language is guessed instead of throwing the error. **kwargs: additional parameters to be passed to the ancestor's constructor. 
""" super().__init__(*extras, **kwargs) self.formatter.style = get_style(style) self.fail_on_unsupported_language = fail_on_unsupported_language def render_block_code(self, token): code = token.content lexer = None if token.language: try: lexer = get_lexer(token.language) except ClassNotFound as err: if self.fail_on_unsupported_language: raise err if lexer is None: lexer = guess_lexer(code) return highlight(code, lexer, self.formatter) mistletoe-1.3.0/mistletoe/contrib/scheme.py000066400000000000000000000120041455324047100210050ustar00rootroot00000000000000import re from collections import ChainMap from functools import reduce from mistletoe.base_renderer import BaseRenderer from mistletoe import span_token, block_token from mistletoe.core_tokens import MatchObj class Program(block_token.BlockToken): def __init__(self, lines): self.children = span_token.tokenize_inner(''.join([line.strip() for line in lines])) class Expr(span_token.SpanToken): @classmethod def find(cls, string): matches = [] start = [] for i, c in enumerate(string): if c == '(': start.append(i) elif c == ')': pos = start.pop() end_pos = i + 1 content = string[pos + 1:i] matches.append(MatchObj(pos, end_pos, (pos + 1, i, content))) return matches def __repr__(self): return ''.format(self.children) class Number(span_token.SpanToken): pattern = re.compile(r"(\d+)") parse_inner = False def __init__(self, match): self.number = eval(match.group(0)) def __repr__(self): return ''.format(self.number) class Variable(span_token.SpanToken): pattern = re.compile(r"([^\s()]+)") parse_inner = False def __init__(self, match): self.name = match.group(0) def __repr__(self): return ''.format(self.name) class Whitespace(span_token.SpanToken): parse_inner = False def __new__(self, _): return None class Procedure: def __init__(self, expr_token, body, env): self.params = [child.name for child in expr_token.children] self.body = body self.env = env class Scheme(BaseRenderer): def __init__(self): self.render_map = { 
"Program": self.render_program, "Expr": self.render_expr, "Number": self.render_number, "Variable": self.render_variable, } block_token._token_types = [] span_token._token_types = [Expr, Number, Variable, Whitespace] self.env = ChainMap({ "define": self.define, "lambda": lambda expr_token, *body: Procedure(expr_token, body, self.env), "+": lambda x, y: self.render(x) + self.render(y), "-": lambda x, y: self.render(x) - self.render(y), "*": lambda x, y: self.render(x) * self.render(y), "/": lambda x, y: self.render(x) / self.render(y), "<": lambda x, y: self.render(x) < self.render(y), ">": lambda x, y: self.render(x) > self.render(y), "<=": lambda x, y: self.render(x) <= self.render(y), ">=": lambda x, y: self.render(x) >= self.render(y), "=": lambda x, y: self.render(x) == self.render(y), "true": True, "false": False, "cons": lambda x, y: (self.render(x), self.render(y)), "car": lambda pair: self.render(pair)[0], "cdr": lambda pair: self.render(pair)[1], "and": lambda *args: all(map(self.render, args)), "or": lambda *args: any(map(self.render, args)), "not": lambda x: not self.render(x), "if": lambda cond, true, false: self.render(true) if self.render(cond) else self.render(false), "cond": self.cond, "null": None, "null?": lambda x: self.render(x) is None, "list": lambda *args: reduce(lambda x, y: (y, x), map(self.render, reversed(args)), None), "display": lambda *args: print(*map(self.render, args)), }) def render_program(self, token): return self.render_inner(token) def render_inner(self, token): result = None for child in token.children: result = self.render(child) return result def render_expr(self, token): proc, *args = token.children proc = self.render(proc) return self.apply(proc, args) if isinstance(proc, Procedure) else proc(*args) def render_number(self, token): return token.number def render_variable(self, token): return self.env[token.name] def define(self, *args): if len(args) == 2: name_token, val_token = args self.env[name_token.name] = 
self.render(val_token) else: name_token, expr_token, *body = args self.env[name_token.name] = Procedure(expr_token, body, self.env) def cond(self, *exprs): for expr in exprs: test, value = expr.children if test == 'else' and 'else' not in self.env: return self.render(value) if self.render(test): return self.render(value) def apply(self, proc, args): old_env = self.env self.env = proc.env.new_child() try: for param, arg in zip(proc.params, args): self.env[param] = self.render(arg) result = None for expr in proc.body: result = self.render(expr) finally: self.env = old_env return result mistletoe-1.3.0/mistletoe/contrib/toc_renderer.py000066400000000000000000000047001455324047100222200ustar00rootroot00000000000000""" Table of contents support for mistletoe. See `if __name__ == '__main__'` section for sample usage. """ import re from mistletoe.html_renderer import HtmlRenderer from mistletoe import block_token class TocRenderer(HtmlRenderer): """ Extends HtmlRenderer class for table of contents support. """ def __init__(self, *extras, depth=5, omit_title=True, filter_conds=[], **kwargs): """ Args: extras (list): allows subclasses to add even more custom tokens. depth (int): the maximum level of heading to be included in TOC. omit_title (bool): whether to ignore tokens where token.level == 1. filter_conds (list): when any of these functions evaluate to true, current heading will not be included. **kwargs: additional parameters to be passed to the ancestor's constructor. """ super().__init__(*extras, **kwargs) self._headings = [] self.depth = depth self.omit_title = omit_title self.filter_conds = filter_conds @property def toc(self): """ Returns table of contents as a block_token.List instance. 
""" def get_indent(level): if self.omit_title: level -= 1 return ' ' * 4 * (level - 1) def build_list_item(heading): level, content = heading template = '{indent}- {content}\n' return template.format(indent=get_indent(level), content=content) lines = [build_list_item(heading) for heading in self._headings] items = block_token.tokenize(lines) return items[0] def render_heading(self, token): """ Overrides super().render_heading; stores rendered heading first, then returns it. """ rendered = super().render_heading(token) content = self.parse_rendered_heading(rendered) if not (self.omit_title and token.level == 1 or token.level > self.depth or any(cond(content) for cond in self.filter_conds)): self._headings.append((token.level, content)) return rendered @staticmethod def parse_rendered_heading(rendered): """ Helper method; converts rendered heading to plain text. """ return re.sub(r'<.+?>', '', rendered) TOCRenderer = TocRenderer """ Deprecated name of the `TocRenderer` class. """ mistletoe-1.3.0/mistletoe/contrib/xwiki20_renderer.py000066400000000000000000000210611455324047100227270ustar00rootroot00000000000000from itertools import chain from mistletoe import block_token, span_token from mistletoe.base_renderer import BaseRenderer class XWiki20Renderer(BaseRenderer): """ XWiki syntax 2.0 renderer class. See mistletoe.base_renderer module for more info. """ def __init__(self, *extras): """ Args: extras (list): allows subclasses to add even more custom tokens. 
""" self.listTokens = [] self.lastChildOfQuotes = [] self.firstChildOfListItems = [] localExtras = [block_token.HtmlBlock, span_token.HtmlSpan, span_token.XWikiBlockMacroStart, span_token.XWikiBlockMacroEnd] super().__init__(*chain(localExtras, extras)) def render_strong(self, token): template = '**{}**' return template.format(self.render_inner(token)) def render_emphasis(self, token): template = '//{}//' return template.format(self.render_inner(token)) def render_inline_code(self, token): # Note: XWiki also offers preformatted text syntax ('##{}##') as a shorter alternative. # We would have to escape the raw text when using it. template = '{{{{code}}}}{}{{{{/code}}}}' return template.format(self.render_raw_text(token.children[0], False)) def render_strikethrough(self, token): template = '--{}--' return template.format(self.render_inner(token)) def render_image(self, token): template = '[[image:{src}]]' self.render_inner(token) return template.format(src=token.src) def render_link(self, token): template = '[[{inner}>>{target}]]' target = escape_url(token.target) inner = self.render_inner(token) return template.format(target=target, inner=inner) def render_auto_link(self, token): template = '[[{target}]]' target = escape_url(token.target) return template.format(target=target) def render_escape_sequence(self, token): return '~' + self.render_inner(token) def render_raw_text(self, token, escape=True): return (token.content.replace('~', '~~') # Note: It's probably better to leave potential XWiki macros as-is, i. e. 
don't escape their markers: # .replace('{{', '~{{').replace('}}', '~}}') .replace('[[', '~[[').replace(']]', '~]]') .replace('**', '~**').replace('//', '~//') .replace('##', '~##').replace('--', '~--') ) if escape else token.content def render_x_wiki_block_macro_start(self, token): return token.content + '\n' def render_x_wiki_block_macro_end(self, token): return '\n' + token.content def render_html_span(self, token): # XXX: HtmlSpan parses (contains) only individual opening and closing tags # => no easy way to wrap the whole HTML code into {{html}} like this: # # template = '{{{{html wiki="true"}}}}{}{{{{/html}}}}' # return template.format(token.content) # # => Users must do this themselves after the conversion. return token.content def render_html_block(self, token): template = '{{{{html wiki="true"}}}}\n{}\n{{{{/html}}}}' + self._block_eol(token) return template.format(token.content) def render_heading(self, token): template = '{level} {inner} {level}' inner = self.render_inner(token) return template.format(level='=' * token.level, inner=inner) + self._block_eol(token) def render_quote(self, token): self.lastChildOfQuotes.append(token.children[-1]) inner = self.render_inner(token) del (self.lastChildOfQuotes[-1]) return ( "".join( map( lambda line: ">{}{}".format( "" if line.startswith(">") else " ", line ), inner.splitlines(keepends=True), ) ) + self._block_eol(token)[0:-1] ) def render_paragraph(self, token): return '{}'.format(self.render_inner(token)) + self._block_eol(token) def render_block_code(self, token): template = '{{{{code{attr}}}}}\n{inner}{{{{/code}}}}' + self._block_eol(token) if token.language: attr = ' language="{}"'.format(token.language) else: attr = '' inner = self.render_raw_text(token.children[0], False) return template.format(attr=attr, inner=inner) def render_list(self, token): inner = self.render_inner(token) return inner + self._block_eol(token)[0:-1] def render_list_item(self, token): template = '{prefix} {inner}\n' prefix = 
''.join(self.listTokens) if '1' in self.listTokens: prefix += '.' self.firstChildOfListItems.append(token.children[0]) inner = self.render_inner(token) del (self.firstChildOfListItems[-1]) result = template.format(prefix=prefix, inner=inner.rstrip()) return result def render_inner(self, token): if isinstance(token, block_token.List): if token.start: self.listTokens.append('1') else: self.listTokens.append('*') rendered = [self.render(child) for child in token.children] wrap = False if isinstance(token, block_token.BlockToken) and len(token.children) > 1: # test what follows after the 1st child of this block token - wrap it to a XWiki block right after the 1st child if necessary for child in token.children[1:]: if isinstance(token, block_token.ListItem) and not isinstance(child, block_token.List): # Note: Nested list within a list item is OK, because it does its own wrapping if necessary. wrap = True break if isinstance(token, (block_token.TableCell)) and isinstance(child, block_token.BlockToken): # Note: By-design, Markdown doesn't support multiple lines in one cell, but they can be enforced by using HTML. # See e. g. https://stackoverflow.com/questions/19950648/how-to-write-lists-inside-a-markdown-table. wrap = True break if isinstance(token, block_token.List): del (self.listTokens[-1]) return (''.join(rendered) if not wrap else '{head}(((\n{tail}\n)))\n'.format(head=rendered[0].rstrip(), tail=''.join(rendered[1:]).rstrip())) def render_table(self, token): # Copied from JiraRenderer... # # This is actually gross and I wonder if there's a better way to do it. # # The primary difficulty seems to be passing down alignment options to # reach individual cells. 
template = '{inner}\n' if hasattr(token, 'header'): head_template = '{inner}' header = token.header head_inner = self.render_table_row(header, True) head_rendered = head_template.format(inner=head_inner) else: head_rendered = '' body_template = '{inner}' body_inner = self.render_inner(token) body_rendered = body_template.format(inner=body_inner) return template.format(inner=head_rendered + body_rendered) def render_table_row(self, token, is_header=False): if is_header: template = '{inner}\n' else: template = '{inner}\n' inner = ''.join([self.render_table_cell(child, is_header) for child in token.children]) return template.format(inner=inner) def render_table_cell(self, token, in_header=False): if in_header: template = '|={inner}' else: template = '|{inner}' inner = self.render_inner(token) return template.format(inner=inner) @staticmethod def render_thematic_break(token): return '----\n' @staticmethod def render_line_break(token): return ' ' if token.soft else '\n' def render_document(self, token): self.footnotes.update(token.footnotes) return self.render_inner(token) def _block_eol(self, token): return ('\n' if ((len(self.firstChildOfListItems) > 0 and token is self.firstChildOfListItems[-1]) or (len(self.lastChildOfQuotes) > 0 and token is self.lastChildOfQuotes[-1])) else '\n\n') def escape_url(raw): """ Escape urls to prevent code injection craziness. (Hopefully.) 
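For illustration, the effect of the `quote()` call used here: characters in the safe list pass through unchanged, everything else is percent-encoded.

```python
from urllib.parse import quote

safe = '/#:()*?=%@+,&;'  # the same safe list as in escape_url() here
escaped = quote('https://example.com/a page?q=1&lang=en', safe=safe)
# The space becomes %20; '?', '=', and '&' are left alone.
```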
""" from urllib.parse import quote return quote(raw, safe='/#:()*?=%@+,&;') mistletoe-1.3.0/mistletoe/core_tokens.py000066400000000000000000000416731455324047100204320ustar00rootroot00000000000000import re import sys from unicodedata import category whitespace = {' ', '\t', '\n', '\x0b', '\x0c', '\r'} unicode_whitespace = {'\t', '\n', '\x0b', '\x0c', '\r', '\x1c', '\x1d', '\x1e', '\x1f', ' ', '\x85', '\xa0', '\u1680', '\u2000', '\u2001', '\u2002', '\u2003', '\u2004', '\u2005', '\u2006', '\u2007', '\u2008', '\u2009', '\u200a', '\u2028', '\u2029', '\u202f', '\u205f', '\u3000'} # punctuation: _ASCII and Unicode punctuation characters_ as defined at # and # unicode_chrs = (chr(i) for i in range(sys.maxunicode + 1)) punctuation = set.union( {'!', '"', '#', '$', '%', '&', '\'', '(', ')', '*', '+', ',', '-', '.', '/', ':', ';', '<', '=', '>', '?', '@', '[', '\\', ']', '^', '_', '`', '{', '|', '}', '~'}, {c for c in unicode_chrs if category(c).startswith("P")}, ) code_pattern = re.compile(r"(?= 2 and opener.number >= 2 else 1 start = opener.end - n end = closer.start + n match = MatchObj(start, end, (start + n, end - n, string[start + n:end - n])) match.type = 'Strong' if n == 2 else 'Emphasis' match.delimiter = string[start] matches.append(match) # remove all delimiters in between del delimiters[open_pos + 1:curr_pos] curr_pos -= curr_pos - open_pos - 1 # remove appropriate number of chars from delimiters if not opener.remove(n, left=False): delimiters.remove(opener) curr_pos -= 1 if not closer.remove(n, left=True): delimiters.remove(closer) curr_pos -= 1 if curr_pos < 0: curr_pos = 0 else: bottom = curr_pos - 1 if curr_pos > 1 else None if closer.type[0] == '*': star_bottom = bottom else: underscore_bottom = bottom if not closer.open: delimiters.remove(closer) else: curr_pos += 1 curr_pos = next_closer(curr_pos, delimiters) del delimiters[stack_bottom:] def match_link_image(string, offset, delimiter, root=None): image = delimiter.type == '![' start = delimiter.start 
text_start = start + delimiter.number text_end = offset text = string[text_start:text_end] # inline link if follows(string, offset, '('): # link destination match_info = match_link_dest(string, offset + 1) if match_info is not None: dest_start, dest_end, dest = match_info # link title match_info = match_link_title(string, dest_end) if match_info is not None: title_start, title_end, title = match_info # assert closing paren paren_index = shift_whitespace(string, title_end) if paren_index < len(string) and string[paren_index] == ')': end = paren_index + 1 match = MatchObj(start, end, (text_start, text_end, text), (dest_start, dest_end, dest), (title_start, title_end, title)) match.type = 'Link' if not image else 'Image' match.dest_type = "angle_uri" if dest_start < dest_end and string[dest_start] == "<" else "uri" match.title_delimiter = string[title_start] if title_start < title_end else None return match # footnote link if follows(string, offset, '['): # full footnote link: [label][dest] result = match_link_label(string, offset + 1, root) if result: match_info, (dest, title) = result end = match_info[1] match = MatchObj(start, end, (text_start, text_end, text), (-1, -1, dest), (-1, -1, title)) match.type = 'Link' if not image else 'Image' match.label = match_info[2] match.dest_type = "full" return match ref = get_link_label(text, root) if ref: # compact (collapsed) footnote link: [dest][] if follows(string, offset + 1, ']'): dest, title = ref end = offset + 3 match = MatchObj(start, end, (text_start, text_end, text), (-1, -1, dest), (-1, -1, title)) match.type = 'Link' if not image else 'Image' match.dest_type = "collapsed" return match return None # shortcut footnote link: [dest] ref = get_link_label(text, root) if ref: dest, title = ref end = offset + 1 match = MatchObj(start, end, (text_start, text_end, text), (-1, -1, dest), (-1, -1, title)) match.type = 'Link' if not image else 'Image' match.dest_type = "shortcut" return match return None def 
match_link_dest(string, offset): offset = shift_whitespace(string, offset + 1) if offset == len(string): return None if string[offset] == '<': escaped = False for i, c in enumerate(string[offset + 1:], start=offset + 1): if c == '\\' and not escaped: escaped = True elif c == '\n' or (c == '<' and not escaped): return None elif c == '>' and not escaped: return offset, i + 1, string[offset + 1:i] elif escaped: escaped = False return None else: escaped = False count = 1 for i, c in enumerate(string[offset:], start=offset): if c == '\\' and not escaped: escaped = True elif c in whitespace: return offset, i, string[offset:i] elif not escaped: if c == '(': count += 1 elif c == ')': count -= 1 elif is_control_char(c): return None elif escaped: escaped = False if count == 0: return offset, i, string[offset:i] return None def match_link_title(string, offset): offset = shift_whitespace(string, offset) if offset == len(string): return None if string[offset] == ')': return offset, offset, '' if string[offset] == '"': closing = '"' elif string[offset] == "'": closing = "'" elif string[offset] == '(': closing = ')' else: return None escaped = False for i, c in enumerate(string[offset + 1:], start=offset + 1): if c == '\\' and not escaped: escaped = True elif c == closing and not escaped: return offset, i + 1, string[offset + 1:i] elif escaped: escaped = False return None def match_link_label(string, offset, root=None): start = -1 end = -1 escaped = False for i, c in enumerate(string[offset:], start=offset): if c == '\\' and not escaped: escaped = True elif c == '[' and not escaped: if start == -1: start = i else: return None elif c == ']' and not escaped: end = i label = string[start + 1:end] match_info = start, end + 1, label if label.strip() != '': ref = root.footnotes.get(normalize_label(label), None) if ref is not None: return match_info, ref return None return None elif escaped: escaped = False return None def get_link_label(text, root): """ Normalize and look up `text` 
among the footnotes. Returns (destination, title) if successful, otherwise None. """ if not root: return None escaped = False for c in text: if c == '\\' and not escaped: escaped = True elif (c == '[' or c == ']') and not escaped: return None elif escaped: escaped = False if text.strip() != '': return root.footnotes.get(normalize_label(text), None) return None def normalize_label(text): return ' '.join(text.split()).casefold() def next_closer(curr_pos, delimiters): for i, delimiter in enumerate(delimiters[curr_pos:], start=curr_pos or 0): if hasattr(delimiter, 'close') and delimiter.close: return i return None def matching_opener(curr_pos, delimiters, bottom): if curr_pos > 0: curr_delimiter = delimiters[curr_pos] index = curr_pos - 1 for delimiter in delimiters[curr_pos - 1:bottom:-1]: if (hasattr(delimiter, 'open') and delimiter.open and delimiter.closed_by(curr_delimiter)): return index index -= 1 return None def is_opener(start, end, string): if string[start] == '*': return is_left_delimiter(start, end, string) is_right = is_right_delimiter(start, end, string) return (is_left_delimiter(start, end, string) and (not is_right or (is_right and preceded_by(start, string, punctuation)))) def is_closer(start, end, string): if string[start] == '*': return is_right_delimiter(start, end, string) is_left = is_left_delimiter(start, end, string) return (is_right_delimiter(start, end, string) and (not is_left or (is_left and succeeded_by(end, string, punctuation)))) def is_left_delimiter(start, end, string): return (not succeeded_by(end, string, unicode_whitespace) and (not succeeded_by(end, string, punctuation) or preceded_by(start, string, punctuation) or preceded_by(start, string, unicode_whitespace))) def is_right_delimiter(start, end, string): return (not preceded_by(start, string, unicode_whitespace) and (not preceded_by(start, string, punctuation) or succeeded_by(end, string, unicode_whitespace) or succeeded_by(end, string, punctuation))) def preceded_by(start, 
string, charset): preceding_char = string[start - 1] if start > 0 else ' ' return preceding_char in charset def succeeded_by(end, string, charset): succeeding_char = string[end] if end < len(string) else ' ' return succeeding_char in charset def is_control_char(char): return ord(char) < 32 or ord(char) == 127 def follows(string, index, char): return index + 1 < len(string) and string[index + 1] == char def shift_whitespace(string, index): for i, c in enumerate(string[index:], start=index): if c not in whitespace: return i return len(string) def deactivate_delimiters(delimiters, index, delimiter_type): for delimiter in delimiters[:index]: if delimiter.type == delimiter_type: delimiter.active = False class Delimiter: def __init__(self, start, end, string): self.type = string[start:end] self.number = end - start self.active = True self.start = start self.end = end if self.type.startswith(('*', '_')): self.open = is_opener(start, end, string) self.close = is_closer(start, end, string) def remove(self, n, left=True): if self.number - n == 0: return False if left: self.start = self.start + n self.number = self.end - self.start self.type = self.type[n:] return True self.end = self.end - n self.number = self.end - self.start self.type = self.type[:n] return True def closed_by(self, other): if self.type[0] != other.type[0]: return False if self.open and self.close or other.open and other.close: # if either of the delimiters can both open and close emphasis, then additional # restrictions apply: the sum of the lengths of the delimiter runs # containing the opening and closing delimiters must not be a multiple of 3 # unless both lengths are multiples of 3. 
return ((self.number + other.number) % 3 != 0 or (self.number % 3 == 0 and other.number % 3 == 0)) return True def __repr__(self): if not self.type.startswith(('*', '_')): return ''.format(repr(self.type), self.active) return ''.format(repr(self.type), self.active, self.open, self.close) class MatchObj: def __init__(self, start, end, *fields): self._start = start self._end = end self.fields = fields def start(self, n=0): if n == 0: return self._start return self.fields[n - 1][0] def end(self, n=0): if n == 0: return self._end return self.fields[n - 1][1] def group(self, n=0): if n == 0: return ''.join([field[2] for field in self.fields]) return self.fields[n - 1][2] def __repr__(self): return ''.format(self.fields, self._start, self._end) mistletoe-1.3.0/mistletoe/html_renderer.py000066400000000000000000000226141455324047100207430ustar00rootroot00000000000000""" HTML renderer for mistletoe. """ import html from itertools import chain from urllib.parse import quote from mistletoe import block_token from mistletoe import span_token from mistletoe.block_token import HtmlBlock from mistletoe.span_token import HtmlSpan from mistletoe.base_renderer import BaseRenderer class HtmlRenderer(BaseRenderer): """ HTML renderer class. See mistletoe.base_renderer module for more info. """ def __init__( self, *extras, html_escape_double_quotes=False, html_escape_single_quotes=False, process_html_tokens=True, **kwargs ): """ Args: extras (list): allows subclasses to add even more custom tokens. html_escape_double_quotes (bool): whether to also escape double quotes when HTML-escaping rendered text. html_escape_single_quotes (bool): whether to also escape single quotes when HTML-escaping rendered text. process_html_tokens (bool): whether to include HTML tokens in the processing. If `False`, HTML markup will be treated as plain text: e.g. input ``
``<br>`` will be rendered as ``&lt;br&gt;``.
            **kwargs: additional parameters to be passed to the ancestor's
                constructor.
        """
        self._suppress_ptag_stack = [False]
        final_extras = chain((HtmlBlock, HtmlSpan) if process_html_tokens else (), extras)
        super().__init__(*final_extras, **kwargs)
        self.html_escape_double_quotes = html_escape_double_quotes
        self.html_escape_single_quotes = html_escape_single_quotes

    def __exit__(self, *args):
        super().__exit__(*args)

    def render_to_plain(self, token) -> str:
        if hasattr(token, 'children'):
            inner = [self.render_to_plain(child) for child in token.children]
            return ''.join(inner)
        return html.escape(token.content)

    def render_strong(self, token: span_token.Strong) -> str:
        template = '<strong>{}</strong>'
        return template.format(self.render_inner(token))

    def render_emphasis(self, token: span_token.Emphasis) -> str:
        template = '<em>{}</em>'
        return template.format(self.render_inner(token))

    def render_inline_code(self, token: span_token.InlineCode) -> str:
        template = '<code>{}</code>'
        inner = self.escape_html_text(token.children[0].content)
        return template.format(inner)

    def render_strikethrough(self, token: span_token.Strikethrough) -> str:
        template = '<del>{}</del>'
        return template.format(self.render_inner(token))

    def render_image(self, token: span_token.Image) -> str:
        template = '<img src="{}" alt="{}"{} />'
        if token.title:
            title = ' title="{}"'.format(html.escape(token.title))
        else:
            title = ''
        return template.format(token.src, self.render_to_plain(token), title)

    def render_link(self, token: span_token.Link) -> str:
        template = '<a href="{target}"{title}>{inner}</a>'
        target = self.escape_url(token.target)
        if token.title:
            title = ' title="{}"'.format(html.escape(token.title))
        else:
            title = ''
        inner = self.render_inner(token)
        return template.format(target=target, title=title, inner=inner)

    def render_auto_link(self, token: span_token.AutoLink) -> str:
        template = '<a href="{target}">{inner}</a>'
        if token.mailto:
            target = 'mailto:{}'.format(token.target)
        else:
            target = self.escape_url(token.target)
        inner = self.render_inner(token)
        return template.format(target=target, inner=inner)

    def render_escape_sequence(self, token: span_token.EscapeSequence) -> str:
        return self.render_inner(token)

    def render_raw_text(self, token: span_token.RawText) -> str:
        return self.escape_html_text(token.content)

    @staticmethod
    def render_html_span(token: span_token.HtmlSpan) -> str:
        return token.content

    def render_heading(self, token: block_token.Heading) -> str:
        template = '<h{level}>{inner}</h{level}>'
        inner = self.render_inner(token)
        return template.format(level=token.level, inner=inner)

    def render_quote(self, token: block_token.Quote) -> str:
        elements = ['<blockquote>']
        self._suppress_ptag_stack.append(False)
        elements.extend([self.render(child) for child in token.children])
        self._suppress_ptag_stack.pop()
        elements.append('</blockquote>')
        return '\n'.join(elements)

    def render_paragraph(self, token: block_token.Paragraph) -> str:
        if self._suppress_ptag_stack[-1]:
            return '{}'.format(self.render_inner(token))
        return '<p>{}</p>'.format(self.render_inner(token))

    def render_block_code(self, token: block_token.BlockCode) -> str:
        template = '<pre><code{attr}>{inner}</code></pre>'
        if token.language:
            attr = ' class="{}"'.format('language-{}'.format(html.escape(token.language)))
        else:
            attr = ''
        inner = self.escape_html_text(token.content)
        return template.format(attr=attr, inner=inner)

    def render_list(self, token: block_token.List) -> str:
        template = '<{tag}{attr}>\n{inner}\n</{tag}>'
        if token.start is not None:
            tag = 'ol'
            attr = ' start="{}"'.format(token.start) if token.start != 1 else ''
        else:
            tag = 'ul'
            attr = ''
        self._suppress_ptag_stack.append(not token.loose)
        inner = '\n'.join([self.render(child) for child in token.children])
        self._suppress_ptag_stack.pop()
        return template.format(tag=tag, attr=attr, inner=inner)

    def render_list_item(self, token: block_token.ListItem) -> str:
        if len(token.children) == 0:
            return '<li></li>'
        inner = '\n'.join([self.render(child) for child in token.children])
        inner_template = '\n{}\n'
        if self._suppress_ptag_stack[-1]:
            if token.children[0].__class__.__name__ == 'Paragraph':
                inner_template = inner_template[1:]
            if token.children[-1].__class__.__name__ == 'Paragraph':
                inner_template = inner_template[:-1]
        return '<li>{}</li>'.format(inner_template.format(inner))

    def render_table(self, token: block_token.Table) -> str:
        # This is actually gross and I wonder if there's a better way to do it.
        #
        # The primary difficulty seems to be passing down alignment options to
        # reach individual cells.
        template = '<table>\n{inner}</table>'
        if hasattr(token, 'header'):
            head_template = '<thead>\n{inner}</thead>\n'
            head_inner = self.render_table_row(token.header, is_header=True)
            head_rendered = head_template.format(inner=head_inner)
        else:
            head_rendered = ''
        body_template = '<tbody>\n{inner}</tbody>\n'
        body_inner = self.render_inner(token)
        body_rendered = body_template.format(inner=body_inner)
        return template.format(inner=head_rendered + body_rendered)

    def render_table_row(self, token: block_token.TableRow, is_header=False) -> str:
        template = '<tr>\n{inner}</tr>\n'
        inner = ''.join([self.render_table_cell(child, is_header)
                         for child in token.children])
        return template.format(inner=inner)

    def render_table_cell(self, token: block_token.TableCell, in_header=False) -> str:
        template = '<{tag}{attr}>{inner}</{tag}>\n'
        tag = 'th' if in_header else 'td'
        if token.align is None:
            align = 'left'
        elif token.align == 0:
            align = 'center'
        elif token.align == 1:
            align = 'right'
        attr = ' align="{}"'.format(align)
        inner = self.render_inner(token)
        return template.format(tag=tag, attr=attr, inner=inner)

    @staticmethod
    def render_thematic_break(token: block_token.ThematicBreak) -> str:
        return '<hr />'

    @staticmethod
    def render_line_break(token: span_token.LineBreak) -> str:
        return '\n' if token.soft else '<br />
    \n' @staticmethod def render_html_block(token: block_token.HtmlBlock) -> str: return token.content def render_document(self, token: block_token.Document) -> str: self.footnotes.update(token.footnotes) inner = '\n'.join([self.render(child) for child in token.children]) return '{}\n'.format(inner) if inner else '' def escape_html_text(self, s: str) -> str: """ Like `html.escape()`, but this looks into the current rendering options to decide which of the quotes (double, single, or both) to escape. Intended for escaping text content. To escape content of an attribute, simply call `html.escape()`. """ s = s.replace("&", "&") # Must be done first! s = s.replace("<", "<") s = s.replace(">", ">") if self.html_escape_double_quotes: s = s.replace('"', """) if self.html_escape_single_quotes: s = s.replace('\'', "'") return s @staticmethod def escape_url(raw: str) -> str: """ Escape urls to prevent code injection craziness. (Hopefully.) """ return html.escape(quote(raw, safe='/#:()*?=%@+,&;')) HTMLRenderer = HtmlRenderer """ Deprecated name of the `HtmlRenderer` class. """ mistletoe-1.3.0/mistletoe/latex_renderer.py000066400000000000000000000165321455324047100211160ustar00rootroot00000000000000""" LaTeX renderer for mistletoe. """ import string from itertools import chain from urllib.parse import quote import mistletoe.latex_token as latex_token from mistletoe.base_renderer import BaseRenderer # (customizable) delimiters for inline code verb_delimiters = string.punctuation + string.digits for delimiter in '*': # remove invalid delimiters verb_delimiters.replace(delimiter, '') for delimiter in reversed('|!"\'=+'): # start with most common delimiters verb_delimiters = delimiter + verb_delimiters.replace(delimiter, '') class LaTeXRenderer(BaseRenderer): def __init__(self, *extras, **kwargs): """ Args: extras (list): allows subclasses to add even more custom tokens. **kwargs: additional parameters to be passed to the ancestor's constructor. 
""" tokens = self._tokens_from_module(latex_token) self.packages = {} self.verb_delimiters = verb_delimiters super().__init__(*chain(tokens, extras), **kwargs) def render_strong(self, token): return '\\textbf{{{}}}'.format(self.render_inner(token)) def render_emphasis(self, token): return '\\textit{{{}}}'.format(self.render_inner(token)) def render_inline_code(self, token): content = self.render_raw_text(token.children[0], escape=False) # search for delimiter not present in content for delimiter in self.verb_delimiters: if delimiter not in content: break if delimiter in content: # no delimiter found raise RuntimeError('Unable to find delimiter for verb macro') template = '\\verb{delimiter}{content}{delimiter}' return template.format(delimiter=delimiter, content=content) def render_strikethrough(self, token): self.packages['ulem'] = ['normalem'] return '\\sout{{{}}}'.format(self.render_inner(token)) def render_image(self, token): self.packages['graphicx'] = [] return '\n\\includegraphics{{{}}}\n'.format(token.src) def render_link(self, token): self.packages['hyperref'] = [] template = '\\href{{{target}}}{{{inner}}}' inner = self.render_inner(token) return template.format(target=self.escape_url(token.target), inner=inner) def render_auto_link(self, token): self.packages['hyperref'] = [] return '\\url{{{}}}'.format(self.escape_url(token.target)) def render_math(self, token): self.packages['amsmath'] = [] self.packages['amsfonts'] = [] self.packages['amssymb'] = [] return token.content def render_escape_sequence(self, token): return self.render_inner(token) def render_raw_text(self, token, escape=True): return (token.content.replace('$', '\\$').replace('#', '\\#') .replace('{', '\\{').replace('}', '\\}') .replace('&', '\\&').replace('_', '\\_') .replace('%', '\\%') ) if escape else token.content def render_heading(self, token): inner = self.render_inner(token) if token.level == 1: return '\n\\section{{{}}}\n'.format(inner) elif token.level == 2: return 
'\n\\subsection{{{}}}\n'.format(inner) return '\n\\subsubsection{{{}}}\n'.format(inner) def render_quote(self, token): self.packages['csquotes'] = [] template = '\\begin{{displayquote}}\n{inner}\\end{{displayquote}}\n' return template.format(inner=self.render_inner(token)) def render_paragraph(self, token): return '\n{}\n'.format(self.render_inner(token)) def render_block_code(self, token): self.packages['listings'] = [] template = ('\n\\begin{{lstlisting}}[language={}]\n' '{}' '\\end{{lstlisting}}\n') inner = self.render_raw_text(token.children[0], False) return template.format(token.language, inner) def render_list(self, token): self.packages['listings'] = [] template = '\\begin{{{tag}}}\n{inner}\\end{{{tag}}}\n' tag = 'enumerate' if token.start is not None else 'itemize' inner = self.render_inner(token) return template.format(tag=tag, inner=inner) def render_list_item(self, token): inner = self.render_inner(token) return '\\item {}\n'.format(inner) def render_table(self, token): def render_align(column_align): if column_align != [None]: cols = [get_align(col) for col in token.column_align] return '{{{}}}'.format(' '.join(cols)) else: return '' def get_align(col): if col is None: return 'l' elif col == 0: return 'c' elif col == 1: return 'r' raise RuntimeError('Unrecognized align option: ' + col) template = ('\\begin{{tabular}}{align}\n' '{inner}' '\\end{{tabular}}\n') if hasattr(token, 'header'): head_template = '{inner}\\hline\n' head_inner = self.render_table_row(token.header) head_rendered = head_template.format(inner=head_inner) else: head_rendered = '' inner = self.render_inner(token) align = render_align(token.column_align) return template.format(inner=head_rendered + inner, align=align) def render_table_row(self, token): cells = [self.render(child) for child in token.children] return ' & '.join(cells) + ' \\\\\n' def render_table_cell(self, token): return self.render_inner(token) @staticmethod def render_thematic_break(token): return '\\hrulefill\n' 
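The delimiter search performed by `render_inline_code` above (scan candidate delimiters in order, pick the first one that does not occur in the code content) is easy to exercise in isolation. The following standalone sketch is my own illustration, not part of the mistletoe package; `pick_verb_delimiter` is a hypothetical helper name:

```python
import string

# Candidates mirror verb_delimiters above: most common delimiters first,
# then the rest of the punctuation/digit pool (duplicates are harmless).
def pick_verb_delimiter(content, candidates='|!"\'=+' + string.punctuation):
    # Return a LaTeX \verb macro using the first delimiter absent from content.
    for delimiter in candidates:
        if delimiter not in content:
            return '\\verb{d}{c}{d}'.format(d=delimiter, c=content)
    raise RuntimeError('Unable to find delimiter for verb macro')

print(pick_verb_delimiter('a | b'))   # '|' occurs in the content, so '!' is chosen: \verb!a | b!
print(pick_verb_delimiter('plain'))   # first candidate works: \verb|plain|
```

Since `|` appears in the first content string, the first candidate is rejected and `!` is chosen, mirroring the ordering built into `verb_delimiters`.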
@staticmethod def render_line_break(token): return '\n' if token.soft else '\\newline\n' def render_packages(self): pattern = '\\usepackage{options}{{{package}}}\n' return ''.join(pattern.format(options=options or '', package=package) for package, options in self.packages.items()) def render_document(self, token): template = ('\\documentclass{{article}}\n' '{packages}' '\\begin{{document}}\n' '{inner}' '\\end{{document}}\n') self.footnotes.update(token.footnotes) return template.format(inner=self.render_inner(token), packages=self.render_packages()) @staticmethod def escape_url(raw: str) -> str: """ Quote unsafe chars in urls & escape as needed for LaTeX's hyperref. %-escapes all characters that are neither in the unreserved chars ("always safe" as per RFC 2396 or RFC 3986) nor in the chars set '/#:()*?=%@+,&;' Subsequently, LaTeX-escapes '%' and '#' for hyperref's \\url{} to also work if used within macros like \\multicolumn. if \\url{} with urls containing '%' or '#' is used outside of multicolumn-macros, they work regardless of whether these characters are escaped, and the result remains the same (at least for pdflatex from TeX Live 2019). """ quoted_url = quote(raw, safe='/#:()*?=%@+,&;') return quoted_url.replace('%', '\\%') \ .replace('#', '\\#') mistletoe-1.3.0/mistletoe/latex_token.py000066400000000000000000000003121455324047100204150ustar00rootroot00000000000000import re import mistletoe.span_token as span_token __all__ = ['Math'] class Math(span_token.SpanToken): pattern = re.compile(r'(\${1,2})([^$]+?)\1') parse_inner = False parse_group = 0 mistletoe-1.3.0/mistletoe/markdown_renderer.py000066400000000000000000000472711455324047100216270ustar00rootroot00000000000000""" Markdown renderer for mistletoe. """ import re from itertools import chain from typing import Iterable, Sequence from mistletoe import block_token, span_token, token from mistletoe.base_renderer import BaseRenderer class BlankLine(block_token.BlockToken): """ Blank line token. 
Represents a single blank line. This is a leaf block token without children. """ pattern = re.compile(r"\s*\n$") def __init__(self, _): self.children = [] @classmethod def start(cls, line): return cls.pattern.match(line) @classmethod def read(cls, lines): return [next(lines)] class LinkReferenceDefinition(span_token.SpanToken): """ Link reference definition. ([label]: dest "title") Not included in the parsing process, but called by `LinkReferenceDefinitionBlock`. Attributes: label (str): link label, used in link references. dest (str): link target. title (str): link title (default to empty). """ repr_attributes = ("label", "dest", "title") def __init__(self, match): self.label, self.dest, self.title, self.dest_type, self.title_delimiter = match class LinkReferenceDefinitionBlock(block_token.Footnote): """ A sequence of link reference definitions. This is a leaf block token. Its children are link reference definition tokens. This class inherits from `Footnote` and modifies the behavior of the constructor, to keep the tokens in the AST. """ def __new__(cls, *args, **kwargs): obj = object.__new__(cls) obj.__init__(*args, **kwargs) return obj def __init__(self, matches): self.children = [LinkReferenceDefinition(match) for match in matches] class Fragment: """ Markdown fragment. Used when rendering trees of span tokens into flat sequences. May carry additional data in addition to the text. Attributes: text (str): markdown fragment. """ def __init__(self, text: str, **extras): self.text = text self.__dict__.update(extras) class MarkdownRenderer(BaseRenderer): """ Markdown renderer. Designed to make as "clean" a roundtrip as possible, markdown -> parsing -> rendering -> markdown, except for nonessential whitespace. Except when rendering with word wrapping enabled. Includes `HtmlBlock` and `HtmlSpan` tokens in the parsing. 
""" _whitespace = re.compile(r"\s+") def __init__( self, *extras, max_line_length: int = None, normalize_whitespace=False ): """ Args: extras (list): allows subclasses to add even more custom tokens. max_line_length (int): if specified, the document is word wrapped to the specified line length when rendered. Otherwise the formatting from the original (parsed) document is retained as much as possible. normalize_whitespace (bool): if `False`, the renderer will try to preserve as much whitespace as it currently can. For example, you can use this flag to control whether to replace the original spacing after every list item leader with just 1 space. """ block_token.remove_token(block_token.Footnote) super().__init__( *chain( ( block_token.HtmlBlock, span_token.HtmlSpan, BlankLine, LinkReferenceDefinitionBlock, ), extras, ) ) self.render_map["SetextHeading"] = self.render_setext_heading self.render_map["CodeFence"] = self.render_fenced_code_block self.render_map[ "LinkReferenceDefinition" ] = self.render_link_reference_definition self.max_line_length = max_line_length self.normalize_whitespace = normalize_whitespace def render(self, token: token.Token) -> str: """ Renders the tree of tokens rooted at the given token into markdown. """ if isinstance(token, block_token.BlockToken): lines = self.render_map[token.__class__.__name__]( token, max_line_length=self.max_line_length ) else: lines = self.span_to_lines([token], max_line_length=self.max_line_length) return "".join(map(lambda line: line + "\n", lines)) # rendering of span/inline tokens. # rendered into sequences of Fragments. 
def render_raw_text(self, token) -> Iterable[Fragment]: yield Fragment(token.content, wordwrap=True) def render_strong(self, token: span_token.Strong) -> Iterable[Fragment]: return self.embed_span(Fragment(token.delimiter * 2), token.children) def render_emphasis(self, token: span_token.Emphasis) -> Iterable[Fragment]: return self.embed_span(Fragment(token.delimiter), token.children) def render_inline_code(self, token: span_token.InlineCode) -> Iterable[Fragment]: return self.embed_span( Fragment(token.delimiter + token.padding), token.children, Fragment(token.padding + token.delimiter) ) def render_strikethrough( self, token: span_token.Strikethrough ) -> Iterable[Fragment]: return self.embed_span(Fragment("~~"), token.children) def render_image(self, token: span_token.Image) -> Iterable[Fragment]: yield Fragment("!") yield from self.render_link_or_image(token, token.src) def render_link(self, token: span_token.Link) -> Iterable[Fragment]: return self.render_link_or_image(token, token.target) def render_link_or_image( self, token: span_token.SpanToken, target: str ) -> Iterable[Fragment]: yield from self.embed_span( Fragment("["), token.children, Fragment("]"), ) if token.dest_type == "uri" or token.dest_type == "angle_uri": # "[" description "](" dest_part [" " title] ")" yield Fragment("(") dest_part = "<" + target + ">" if token.dest_type == "angle_uri" else target yield Fragment(dest_part) if token.title: yield from ( Fragment(" ", wordwrap=True), Fragment(token.title_delimiter), Fragment(token.title, wordwrap=True), Fragment( ")" if token.title_delimiter == "(" else token.title_delimiter, ), ) yield Fragment(")") elif token.dest_type == "full": # "[" description "][" label "]" yield from ( Fragment("["), Fragment(token.label, wordwrap=True), Fragment("]"), ) elif token.dest_type == "collapsed": # "[" description "][]" yield Fragment("[]") else: # "[" description "]" pass def render_auto_link(self, token: span_token.AutoLink) -> Iterable[Fragment]: yield 
Fragment("<" + token.children[0].content + ">") def render_escape_sequence( self, token: span_token.EscapeSequence ) -> Iterable[Fragment]: yield Fragment("\\" + token.children[0].content) def render_line_break(self, token: span_token.LineBreak) -> Iterable[Fragment]: yield Fragment( token.content + "\n", wordwrap=token.soft, hard_line_break=not token.soft ) def render_html_span(self, token: span_token.HtmlSpan) -> Iterable[Fragment]: yield Fragment(token.content) def render_link_reference_definition( self, token: LinkReferenceDefinition ) -> Iterable[Fragment]: yield from ( Fragment("["), Fragment(token.label, wordwrap=True), Fragment("]: ", wordwrap=True), Fragment( "<" + token.dest + ">" if token.dest_type == "angle_uri" else token.dest, ), ) if token.title: yield from ( Fragment(" ", wordwrap=True), Fragment(token.title_delimiter), Fragment(token.title, wordwrap=True), Fragment( ")" if token.title_delimiter == "(" else token.title_delimiter, ), ) # rendering of block tokens. # rendered into sequences of lines (strings), to be joined by newlines. def render_document( self, token: block_token.Document, max_line_length: int ) -> Iterable[str]: return self.blocks_to_lines(token.children, max_line_length=max_line_length) def render_heading( self, token: block_token.Heading, max_line_length: int ) -> Iterable[str]: # note: no word wrapping, because atx headings always fit on a single line. 
line = "#" * token.level
        text = next(self.span_to_lines(token.children, max_line_length=None), "")
        if text:
            line += " " + text
        if token.closing_sequence:
            line += " " + token.closing_sequence
        return [line]

    def render_setext_heading(
        self, token: block_token.SetextHeading, max_line_length: int
    ) -> Iterable[str]:
        yield from self.span_to_lines(token.children, max_line_length=max_line_length)
        yield token.underline

    def render_quote(
        self, token: block_token.Quote, max_line_length: int
    ) -> Iterable[str]:
        max_child_line_length = max_line_length - 2 if max_line_length else None
        lines = self.blocks_to_lines(
            token.children, max_line_length=max_child_line_length
        )
        return self.prefix_lines(lines or [""], "> ")

    def render_paragraph(
        self, token: block_token.Paragraph, max_line_length: int
    ) -> Iterable[str]:
        return self.span_to_lines(token.children, max_line_length=max_line_length)

    def render_block_code(
        self, token: block_token.BlockCode, max_line_length: int
    ) -> Iterable[str]:
        lines = token.content[:-1].split("\n")
        return self.prefix_lines(lines, "    ")

    def render_fenced_code_block(
        self, token: block_token.BlockCode, max_line_length: int
    ) -> Iterable[str]:
        indentation = " " * token.indentation
        yield indentation + token.delimiter + token.info_string
        yield from self.prefix_lines(
            token.content[:-1].split("\n"), indentation
        )
        yield indentation + token.delimiter

    def render_list(
        self, token: block_token.List, max_line_length: int
    ) -> Iterable[str]:
        return self.blocks_to_lines(token.children, max_line_length=max_line_length)

    def render_list_item(
        self, token: block_token.ListItem, max_line_length: int
    ) -> Iterable[str]:
        indentation = len(token.leader) + 1 if self.normalize_whitespace else token.prepend - token.indentation
        max_child_line_length = (
            max_line_length - indentation if max_line_length else None
        )
        lines = self.blocks_to_lines(
            token.children, max_line_length=max_child_line_length
        )
        return self.prefix_lines(
            list(lines) or [""],
            token.leader + " " * (indentation -
len(token.leader)), " " * indentation ) def render_table( self, token: block_token.Table, max_line_length: int ) -> Iterable[str]: # note: column widths are not preserved; they are automatically adjusted to fit the contents. content = [self.table_row_to_text(token.header), []] content.extend(self.table_row_to_text(row) for row in token.children) col_widths = self.calculate_table_column_widths(content) content[1] = self.table_separator_line_to_text(col_widths, token.column_align) return [ self.table_row_to_line(col_text, col_widths, token.column_align) for col_text in content ] def render_thematic_break( self, token: block_token.ThematicBreak, max_line_length: int ) -> Iterable[str]: return [token.line] def render_html_block( self, token: block_token.HtmlBlock, max_line_length: int ) -> Iterable[str]: return token.content.split("\n") def render_link_reference_definition_block( self, token: LinkReferenceDefinitionBlock, max_line_length: int ) -> Iterable[str]: # each link reference definition starts on a new line for child in token.children: yield from self.span_to_lines([child], max_line_length=max_line_length) def render_blank_line( self, token: BlankLine, max_line_length: int ) -> Iterable[str]: return [""] # helper methods def embed_span( self, leader: Fragment, tokens: Iterable[span_token.SpanToken], trailer: Fragment = None, ) -> Iterable[Fragment]: """ Makes fragments from `tokens` and embeds within a leader and a trailer. The trailer defaults to the same as the leader. """ yield leader yield from self.make_fragments(tokens) yield trailer or leader def blocks_to_lines( self, tokens: Iterable[block_token.BlockToken], max_line_length: int ) -> Iterable[str]: """ Renders a sequence of block tokens into a sequence of lines. 
""" for token in tokens: # noqa: F402 yield from self.render_map[token.__class__.__name__]( token, max_line_length=max_line_length ) def span_to_lines( self, tokens: Iterable[span_token.SpanToken], max_line_length: int ) -> Iterable[str]: """ Renders a sequence of span (inline) tokens into a sequence of lines. """ fragments = self.make_fragments(tokens) return self.fragments_to_lines(fragments, max_line_length=max_line_length) def make_fragments(self, tokens: Iterable[span_token.SpanToken] ) -> Iterable[Fragment]: """ Renders a sequence of span (inline) tokens into a sequence of Fragments. """ return chain.from_iterable( [self.render_map[token.__class__.__name__](token) for token in tokens] ) @classmethod def fragments_to_lines( cls, fragments: Iterable[Fragment], max_line_length: int = None ) -> Iterable[str]: """ Renders a sequence of Fragments into lines. With word wrapping, if a `max_line_length` is given, or else following the original text flow as closely as possible. """ current_line = "" if not max_line_length: # plain rendering: merge all fragments and split on newlines for fragment in fragments: if "\n" in fragment.text: lines = fragment.text.split("\n") yield current_line + lines[0] for inner_line in lines[1:-1]: yield inner_line current_line = lines[-1] else: current_line += fragment.text else: # render with word wrapping for word in cls.make_words(fragments): if word == "\n": # hard line break yield current_line current_line = "" continue if not current_line: # first word on an empty line: accept and continue current_line = word continue # try to fit the word on the current line. 
# if it doesn't fit, flush the line and start on the next test = current_line + " " + word if len(test) <= max_line_length: current_line = test else: yield current_line current_line = word if current_line: yield current_line @classmethod def make_words(cls, fragments: Iterable[Fragment]) -> Iterable[str]: """ Aggregates and splits a sequence of Fragments into words, i.e., strings which do not contain breakable spaces or line breaks. The exception is hard line breaks, which are represented by the string `\n`. """ word = "" for fragment in fragments: if getattr(fragment, "wordwrap", False): first = True for item in cls._whitespace.split(fragment.text): if first: word += item first = False else: if word: yield word word = item elif getattr(fragment, "hard_line_break", False): yield from (word + fragment.text[:-1], "\n") word = "" else: word += fragment.text if word: yield word @classmethod def prefix_lines( cls, lines: Iterable[str], first_line_prefix: str, following_line_prefix: str = None, ) -> Iterable[str]: """ Prepends a prefix string to a sequence of lines. The first line may have a different prefix from the following lines. """ following_line_prefix = following_line_prefix or first_line_prefix is_first_line = True for line in lines: if is_first_line: prefixed = first_line_prefix + line is_first_line = False else: prefixed = following_line_prefix + line yield prefixed if not prefixed.isspace() else "" def table_row_to_text(self, row) -> Sequence[str]: """ Renders each table cell on a table row to text. No word wrapping. """ return [next(self.span_to_lines(col.children, max_line_length=None), "") for col in row.children] @classmethod def calculate_table_column_widths(cls, col_text) -> Sequence[int]: """ Calculates column widths for a table. 
""" MINIMUM_COLUMN_WIDTH = 3 col_widths = [] for row in col_text: while len(col_widths) < len(row): col_widths.append(MINIMUM_COLUMN_WIDTH) for index, text in enumerate(row): col_widths[index] = max(col_widths[index], len(text)) return col_widths @classmethod def table_separator_line_to_text(cls, col_widths, col_align) -> Sequence[str]: """ Creates the text for the line separating header from contents in a table given column widths and alignments. Note: uses dashes for left justified columns, not a colon followed by dashes. """ separator_text = [] for index, width in enumerate(col_widths): align = col_align[index] if index < len(col_align) else None sep = ":" if align == 0 else "-" sep += "-" * (width - 2) sep += ":" if align == 0 or align == 1 else "-" separator_text.append(sep) return separator_text @classmethod def table_row_to_line(cls, col_text, col_widths, col_align) -> str: """ Pads/aligns the text for a table row and add the borders (pipe characters). """ padded_text = [] for index, width in enumerate(col_widths): text = col_text[index] if index < len(col_text) else "" align = col_align[index] if index < len(col_align) else None if align is None: padded_text.append("{0: <{w}}".format(text, w=width)) elif align == 0: padded_text.append("{0: ^{w}}".format(text, w=width)) else: padded_text.append("{0: >{w}}".format(text, w=width)) return "".join(("| ", " | ".join(padded_text), " |")) mistletoe-1.3.0/mistletoe/span_token.py000066400000000000000000000240211455324047100202440ustar00rootroot00000000000000""" Built-in span-level token classes. """ import html import re import mistletoe.span_tokenizer as tokenizer from mistletoe import core_tokens, token """ Tokens to be included in the parsing process, in the order specified. """ __all__ = ['EscapeSequence', 'Strikethrough', 'AutoLink', 'CoreTokens', 'InlineCode', 'LineBreak', 'RawText'] def tokenize_inner(content): """ A wrapper around span_tokenizer.tokenize. 
Pass in all span-level token constructors as arguments to span_tokenizer.tokenize. Doing so (instead of importing span_token module in span_tokenizer) avoids cyclic dependency issues, and allows for future injections of custom token classes. _token_types variable is at the bottom of this module. See also: span_tokenizer.tokenize, block_token.tokenize. """ return tokenizer.tokenize(content, _token_types) def add_token(token_cls, position=1): """ Allows external manipulation of the parsing process. This function is called in BaseRenderer.__enter__. Arguments: token_cls (SpanToken): token to be included in the parsing process. """ _token_types.insert(position, token_cls) def remove_token(token_cls): """ Allows external manipulation of the parsing process. This function is called in BaseRenderer.__exit__. Arguments: token_cls (SpanToken): token to be removed from the parsing process. """ _token_types.remove(token_cls) def reset_tokens(): """ Resets global _token_types to all token classes in __all__. """ global _token_types _token_types = [globals()[cls_name] for cls_name in __all__] class SpanToken(token.Token): parse_inner = True parse_group = 1 precedence = 5 def __init__(self, match): if not self.parse_inner: self.content = match.group(self.parse_group) def __contains__(self, text): if hasattr(self, 'children'): return any(text in child for child in self.children) return text in self.content @classmethod def find(cls, string): return cls.pattern.finditer(string) class CoreTokens(SpanToken): """ Represents core tokens (Strong, Emphasis, Image, Link) during the early stage of parsing. Replaced with objects of the proper classes in the final stage of parsing. """ precedence = 3 def __new__(self, match): return globals()[match.type](match) @classmethod def find(cls, string): return core_tokens.find_core_tokens(string, token._root_node) class Strong(SpanToken): """ Strong token. ("**some text**") This is an inline token. Its children are inline (span) tokens. 
One of the core tokens.
    """

    def __init__(self, match):
        self.delimiter = match.delimiter


class Emphasis(SpanToken):
    """
    Emphasis token. ("*some text*")

    This is an inline token. Its children are inline (span) tokens.
    One of the core tokens.
    """

    def __init__(self, match):
        self.delimiter = match.delimiter


class InlineCode(SpanToken):
    """
    Inline code token. ("`some code`")

    This is an inline token with a single child of type RawText.
    """
    pattern = re.compile(r"(?<!\\|`)(?:\\\\)*(`+)(?!`)(.+?)(?<!`)\1(?!`)", re.DOTALL)
    parse_inner = False
    parse_group = 2

    def __init__(self, match):
        content = match.group(self.parse_group)
        self.delimiter = match.group(1)
        self.padding = ''
        # CommonMark: strip one space from both ends if the content starts
        # and ends with a space but is not entirely made up of spaces.
        if content.startswith(' ') and content.endswith(' ') and content.strip():
            content = content[1:-1]
            self.padding = ' '
        self.children = (RawText(content),)

    @classmethod
    def find(cls, string):
        matches = core_tokens._code_matches
        core_tokens._code_matches = []
        return matches


class AutoLink(SpanToken):
    """
    Autolink token. ("<http://www.google.com>")

    This is an inline token with a single child of type RawText.

    Attributes:
        children (list): a single RawText node for the link target.
        target (str): link target.
        mailto (bool): true iff the target looks like an email address,
            but does not have the "mailto:" prefix.
    """
    repr_attributes = ("target", "mailto")
    pattern = re.compile(r"(?<!\\)<([A-Za-z][A-Za-z0-9+.-]{1,31}:[^ <>]*?|[A-Za-z0-9.!#$%&'*+/=?^_`{|}~-]+@[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?(?:\.[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?)*)>")
    parse_inner = False

    def __init__(self, match):
        content = match.group(self.parse_group)
        self.children = (RawText(content),)
        self.target = content
        self.mailto = '@' in self.target and 'mailto' not in self.target.casefold()


class EscapeSequence(SpanToken):
    """
    Escape sequence token. ("\\\\*")

    This is an inline token with a single child of type RawText.

    Attributes:
        children (iterator): a single RawText node containing the escaped
            character.
    """
    pattern = re.compile(r"\\([!\"#$%&'()*+,-./:;<=>?@\[\\\]^_`{|}~])")
    parse_inner = False
    precedence = 2

    def __init__(self, match):
        self.children = (RawText(match.group(self.parse_group)),)

    @classmethod
    def strip(cls, string):
        return html.unescape(cls.pattern.sub(r'\1', string))


class LineBreak(SpanToken):
    """
    Line break token: hard or soft.

    This is an inline token without children.

    Attributes:
        soft (bool): true if this is a soft line break.
    """
    repr_attributes = ("soft",)
    pattern = re.compile(r'( *|\\)\n')
    parse_inner = False
    parse_group = 0

    def __init__(self, match):
        self.content = match.group(1)
        self.soft = not self.content.startswith(('  ', '\\'))


class RawText(SpanToken):
    """
    Raw text token.

    This is an inline token without children.

    RawText is the only token that accepts a string for its constructor,
    instead of a match object. Also, all recursions should bottom out here.
    """
    def __init__(self, content):
        self.content = content


_tags = {'address', 'article', 'aside', 'base', 'basefont', 'blockquote',
         'body', 'caption', 'center', 'col', 'colgroup', 'dd', 'details',
         'dialog', 'dir', 'div', 'dl', 'dt', 'fieldset', 'figcaption',
         'figure', 'footer', 'form', 'frame', 'frameset', 'h1', 'h2', 'h3',
         'h4', 'h5', 'h6', 'head', 'header', 'hr', 'html', 'iframe', 'legend',
         'li', 'link', 'main', 'menu', 'menuitem', 'meta', 'nav', 'noframes',
         'ol', 'optgroup', 'option', 'p', 'param', 'section', 'source',
         'summary', 'table', 'tbody', 'td', 'tfoot', 'th', 'thead', 'title',
         'tr', 'track', 'ul'}

_tag = r'[A-Za-z][A-Za-z0-9-]*'  # noqa: E221
_attrs = r'(?:\s+[A-Za-z_:][A-Za-z0-9_.:-]*(?:\s*=\s*(?:[^\s"\'=<>`]+|\'[^\']*?\'|"[^\"]*?"))?)*'

_open_tag = r'(?<!\\)<' + _tag + _attrs + r'\s*/?>'  # noqa: E221
_closing_tag = r'(?<!\\)</' + _tag + r'\s*>'
_comment = r'(?<!\\)<!--(?!>|->)(?:(?!--).)+?(?<!-)-->'  # noqa: E221
_instruction = r'(?<!\\)<\?.+?\?>'
_declaration = r'(?<!\\)<![A-Z].+?>'
_cdata = r'(?<!\\)<!\[CDATA\[.+?\]\]>'  # noqa: E221


class HtmlSpan(SpanToken):
    """
    Span-level HTML token.

    This is an inline token without children.

    Attributes:
        content (str): the raw HTML content.
    """
    pattern = re.compile('|'.join([_open_tag, _closing_tag, _comment,
                                   _instruction, _declaration, _cdata]),
                         re.DOTALL)
    parse_inner = False
    parse_group = 0


HTMLSpan = HtmlSpan
"""
Deprecated name of the `HtmlSpan` class.
"""


# Note: The following XWiki tokens are based on the XWiki Syntax 2.0
# (or above; 1.0 was deprecated years ago already).

class XWikiBlockMacroStart(SpanToken):
    """
    A "block" macro opening tag.
("{{macroName}}") We want to keep it on a separate line instead of "soft" merging it with the *following* line. """ pattern = re.compile(r'(?{{/macroName}}") We want to keep it on a separate line instead of "soft" merging it with the *preceding* line. """ pattern = re.compile(r'^(?:\s*)(\{\{/\w+\}\})', re.MULTILINE) parse_inner = False parse_group = 1 _token_types = [] reset_tokens() mistletoe-1.3.0/mistletoe/span_tokenizer.py000066400000000000000000000075131455324047100211450ustar00rootroot00000000000000""" Inline tokenizer for mistletoe. """ import html import re # replacement for html._charref which matches only entitydefs ending with ';', # according to the CommonMark spec. _markdown_charref = re.compile(r'&(#[0-9]{1,7};' r'|#[xX][0-9a-fA-F]{1,6};' r'|[^\t\n\f <&#;]{1,32};)') _stdlib_charref = html._charref def tokenize(string, token_types): try: html._charref = _markdown_charref *token_types, fallback_token = token_types tokens = find_tokens(string, token_types, fallback_token) token_buffer = [] if tokens: prev = tokens[0] for curr in tokens[1:]: prev = eval_tokens(prev, curr, token_buffer) token_buffer.append(prev) return make_tokens(token_buffer, 0, len(string), string, fallback_token) finally: html._charref = _stdlib_charref def find_tokens(string, token_types, fallback_token): tokens = [] for token_type in token_types: for m in token_type.find(string): tokens.append(ParseToken(m.start(), m.end(), m, string, token_type, fallback_token)) return sorted(tokens) def eval_tokens(x, y, token_buffer): r = relation(x, y) if r == 0: token_buffer.append(x) return y if r == 1: return x if x.cls.precedence >= y.cls.precedence else y if r == 2: x.append_child(y) return x return x def eval_new_child(parent, child): last_child = parent.children[-1] r = relation(last_child, child) if r == 0: parent.children.append(child) elif r == 1 and last_child.cls.precedence < child.cls.precedence: parent.children[-1] = child elif r == 2: last_child.append_child(child) def relation(x, 
y):
    if x.end <= y.start:
        return 0  # x precedes y
    if x.end >= y.end:
        if x.parse_start <= y.start and x.parse_end >= y.end:
            return 2  # x contains y
        if x.parse_end <= y.start:
            return 3  # ignore y
    return 1  # x intersects y


def make_tokens(tokens, start, end, string, fallback_token):
    result = []
    prev_end = start
    for token in tokens:
        if token.start > prev_end:
            t = fallback_token(html.unescape(string[prev_end:token.start]))
            if t is not None:
                result.append(t)
        t = token.make()
        if t is not None:
            result.append(t)
        prev_end = token.end
    if prev_end != end:
        result.append(fallback_token(html.unescape(string[prev_end:end])))
    return result


class ParseToken:
    def __init__(self, start, end, match, string, cls, fallback_token):
        self.start = start
        self.end = end
        self.parse_start = match.start(cls.parse_group)
        self.parse_end = match.end(cls.parse_group)
        self.match = match
        self.string = string
        self.cls = cls
        self.fallback_token = fallback_token
        self.children = []

    def append_child(self, child):
        if self.cls.parse_inner:
            if not self.children:
                self.children.append(child)
            else:
                eval_new_child(self, child)

    def make(self):
        if not self.cls.parse_inner:
            return self.cls(self.match)
        children = make_tokens(self.children, self.parse_start,
                               self.parse_end, self.string,
                               self.fallback_token)
        token = self.cls(self.match)
        token.children = children
        return token

    def __lt__(self, other):
        return self.start < other.start

    def __repr__(self):
        pattern = '<ParseToken span=({},{}) parse_span=({},{}) cls={} children={}>'
        return pattern.format(self.start, self.end,
                              self.parse_start, self.parse_end,
                              repr(self.cls.__name__),
                              self.children)

mistletoe-1.3.0/mistletoe/token.py

"""
Base token class.
"""


"""
Stores a reference to the current document (root) token during parsing.

Footnotes are stored in the document token by accessing this reference.
"""
_root_node = None


def _short_repr(value):
    """
    Return a shortened ``repr`` output of value for use in ``__repr__``
    methods.
""" if isinstance(value, str): chars = len(value) threshold = 30 if chars > threshold: return "{0!r}...+{1}".format(value[:threshold], chars - threshold) return repr(value) class Token: """ Base token class. `Token` has two subclasses: * `block_token.BlockToken`, for all block level tokens. A block level token is text which occupies the entire horizontal width of the "page" and is offset for the surrounding sibling block with line breaks. * `span_token.SpanToken`, for all span-level (or inline-level) tokens. A span-level token appears inside the flow of the text lines without any surrounding line break. Custom ``__repr__`` methods in subclasses: The default ``__repr__`` implementation outputs the number of child tokens (from the attribute ``children``) if applicable, and the ``content`` attribute if applicable. If any additional attributes should be included in the ``__repr__`` output, this can be specified by setting the class attribute ``repr_attributes`` to a tuple containing the attribute names to be output. """ repr_attributes = () def __repr__(self): output = "<{}.{}".format( self.__class__.__module__, self.__class__.__name__ ) if "children" in vars(self): count = len(self.children) if count == 1: output += " with 1 child" else: output += " with {} children".format(count) if "content" in vars(self): output += " content=" + _short_repr(self.content) for attrname in self.repr_attributes: attrvalue = getattr(self, attrname) output += " {}={}".format(attrname, _short_repr(attrvalue)) output += " at {:#x}>".format(id(self)) return output mistletoe-1.3.0/mistletoe/utils.py000066400000000000000000000024371455324047100172520ustar00rootroot00000000000000from collections import namedtuple TraverseResult = namedtuple('TraverseResult', ['node', 'parent', 'depth']) def traverse(source, klass=None, depth=None, include_source=False): """Traverse the syntax tree, recursively yielding children. 
Args: source: The source syntax token klass: filter children by a certain token class depth (int): The depth to recurse into the tree include_source (bool): whether to first yield the source element (provided it passes any given ``klass`` filter) Yields: A container for an element, its parent and depth """ current_depth = 0 if include_source and (klass is None or isinstance(source, klass)): yield TraverseResult(source, None, current_depth) next_children = [(source, c) for c in getattr(source, 'children', [])] while next_children and (depth is None or current_depth < depth): current_depth += 1 new_children = [] for parent, child in next_children: if klass is None or isinstance(child, klass): yield TraverseResult(child, parent, current_depth) new_children.extend( [(child, c) for c in getattr(child, 'children', [])] ) next_children = new_children mistletoe-1.3.0/performance.md000066400000000000000000000101111455324047100163420ustar00rootroot00000000000000

Performance
===========

mistletoe is the fastest CommonMark-compliant implementation in Python, even though the competing implementations have improved their performance notably since the first benchmarks were run in 2017.

The benchmark
-------------

Try the benchmarks yourself by running:

```sh
$ python3 test/benchmark.py  # all results in seconds
Test document: test/samples/syntax.md
Test iterations: 1000
Running tests with markdown, mistune, commonmark, mistletoe...
==============================================================
markdown: 36,1091867333333   # v3.3.4 from Feb 24, 2021
mistune: 10,7586129666667    # v0.8.4 from Oct 30, 2019
commonmark: 43,4187382666667 # v0.9.1 from Oct 04, 2019
mistletoe: 33,2067990666667  # v0.8.0 from Oct 09, 2021

# run with Python 3.7.5 on MS Windows 10
```

We notice that Mistune is the fastest Markdown parser, and by a good margin, which demands some explanation.

mistletoe's biggest performance penalty comes from stringently following the CommonMark spec, which outlines a highly context-sensitive grammar for Markdown. Mistune takes a simpler approach to the lexing and parsing process, but this means that it cannot handle more complex cases, e.g., precedence of different types of tokens, escaping rules, etc.

To see why this might be important to you, consider the following Markdown input ([example 392][example-392] from the CommonMark spec):

```markdown
***foo** bar*
```

The natural interpretation is:

```html

<p><em><strong>foo</strong> bar</em></p>

```

... and it is indeed the output of Python-Markdown, Commonmark-py and mistletoe. Mistune (version 0.8.3) greedily parses the first two asterisks in the first delimiter run as a strong-emphasis opener, the second delimiter run as its closer, but does not know what to do with the remaining asterisk in between:

```html

<p><strong>*foo</strong> bar*</p>

```

The implication of this runs deeper, and it is not simply a matter of dogmatically following an external spec. By adopting a more flexible parsing algorithm, mistletoe allows us to specify a precedence level for each token class, including custom ones that you might write in the future. Code spans, for example, have a higher precedence level than emphasis, so

```markdown
*foo `bar* baz`
```

... is parsed as:

```html

<p>*foo <code>bar* baz</code></p>

```

... whereas Mistune parses this as:

```html

<p><em>foo `bar</em> baz`</p>

```

Of course, it is not *impossible* for Mistune to modify its behavior and parse these two examples correctly, through more sophisticated regexes or some other means. It is nevertheless *highly likely* that, once Mistune implements all the necessary context checks, it will suffer from the same performance penalties.

Contextual analysis is why the other implementations are slower; the lack thereof is why mistune enjoys stellar performance among similar parser implementations, and also the source of the limitations that come with these performance benefits. If you want an implementation that focuses on raw speed, mistune remains a solid choice. If you need a spec-compliant and readily extensible implementation, however, mistletoe is still marginally faster than Python-Markdown, and significantly faster than CommonMark-py.

Use PyPy for better performance
-------------------------------

Another bottleneck of mistletoe compared to mistune is function call overhead. Because mistletoe, unlike mistune, splits its functionality into modules, function lookups take significantly longer than in mistune. To boost performance further, it is suggested to run mistletoe under PyPy. Benchmark results show that on PyPy, mistletoe's performance improves significantly:

```sh
$ pypy3 test/benchmark.py mistune mistletoe
Test document: test/samples/syntax.md
Test iterations: 1000
Running tests with mistune, mistletoe...
======================================== mistune: 9.720779 mistletoe: 21.984181399999997 # run with PyPy 3.7-v7.3.7 on MS Windows 10 ``` [example-392]: https://spec.commonmark.org/0.28/#example-392 mistletoe-1.3.0/requirements.txt000066400000000000000000000001441455324047100170100ustar00rootroot00000000000000# optional Pygments>=2.11.2,<=2.16.* # 2.11.2 is the last version which supports the old Python 3.5 mistletoe-1.3.0/resources/000077500000000000000000000000001455324047100155375ustar00rootroot00000000000000mistletoe-1.3.0/resources/logo.svg000066400000000000000000000203501455324047100172200ustar00rootroot00000000000000 mistletoe-1.3.0/setup.py000066400000000000000000000025321455324047100152410ustar00rootroot00000000000000from setuptools import setup import mistletoe setup( name='mistletoe', version=mistletoe.__version__, description='A fast, extensible Markdown parser in pure Python.', url='https://github.com/miyuchina/mistletoe', author='Mi Yu', author_email='hello@afteryu.me', license='MIT', packages=['mistletoe', 'mistletoe.contrib'], entry_points={'console_scripts': ['mistletoe = mistletoe.__main__:main']}, classifiers=[ 'Development Status :: 5 - Production/Stable', 'Intended Audience :: Developers', 'License :: OSI Approved :: MIT License', 'Programming Language :: Python :: 3', 'Programming Language :: Python :: 3.5', 'Programming Language :: Python :: 3.6', 'Programming Language :: Python :: 3.7', 'Programming Language :: Python :: 3.8', 'Programming Language :: Python :: 3.9', 'Programming Language :: Python :: 3.10', 'Programming Language :: Python :: 3.11', 'Programming Language :: Python :: Implementation :: CPython', 'Programming Language :: Python :: Implementation :: PyPy', 'Topic :: Software Development :: Libraries :: Python Modules', 'Topic :: Text Processing :: Markup :: Markdown', ], keywords='markdown lexer parser development', python_requires='~=3.5', zip_safe=False, ) 
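One subtlety worth highlighting from `mistletoe/span_tokenizer.py` above: during tokenization it temporarily swaps the stdlib's `html._charref` regex for a stricter one, because CommonMark only recognizes entity references that are terminated by `;`. A minimal stdlib-only sketch of the effect; the regex is copied from that module, and the patch/restore pattern mirrors what `tokenize()` does:

```python
import html
import re

# Stricter charref regex (copied from mistletoe/span_tokenizer.py):
# entity references must end with ';' to be recognized, per CommonMark.
markdown_charref = re.compile(r'&(#[0-9]{1,7};'
                              r'|#[xX][0-9a-fA-F]{1,6};'
                              r'|[^\t\n\f <&#;]{1,32};)')

sample = '&amp; and &amp'

# Stock stdlib behavior: both forms are unescaped, with or without ';'.
loose = html.unescape(sample)

# Patch html._charref around the call and restore it afterwards,
# exactly like tokenize() does; html.unescape() looks the regex up
# at call time, which is what makes this patch effective.
saved = html._charref
try:
    html._charref = markdown_charref
    strict = html.unescape(sample)
finally:
    html._charref = saved

print(loose)   # '& and &'
print(strict)  # '& and &amp' -- the bare '&amp' is left untouched
```

The `try`/`finally` restore matters: without it, an exception during parsing would leave the stdlib regex permanently replaced for every other user of the `html` module in the process.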
mistletoe-1.3.0/test/000077500000000000000000000000001455324047100145045ustar00rootroot00000000000000mistletoe-1.3.0/test/__init__.py000066400000000000000000000000001455324047100166030ustar00rootroot00000000000000mistletoe-1.3.0/test/base_test.py000066400000000000000000000032671455324047100170370ustar00rootroot00000000000000""" Base classes for tests. """ from unittest import TestCase from mistletoe.block_token import Document class BaseRendererTest(TestCase): """ Base class for tests of renderers. """ def setUp(self): self.maxDiff = None def markdownResultTest(self, markdown, expected): output = self.renderer.render(Document(markdown)) self.assertEqual(output, expected) def filesBasedTest(func): """ Note: Use this as a decorator on a test function with an empty body. This is a realization of the "convention over configuration" practice. You only need to define a unique ``sampleOutputExtension`` within your test case setup, in addition to the ``renderer`` under the test of course. Runs the current renderer against input parsed from a file and asserts that the renderer output is equal to content stored in another file. Both the "input" and "expected output" files need to have the same ``filename`` that is extracted from the decorated test function name. 
""" def wrapper(self): # take the string after the last '__' in function name filename = func.__name__ filename = filename.split('__', 1)[1] # parse input markdown, call render on it and check the output with open('test/samples/{}.md'.format(filename), 'r') as fin: output = self.renderer.render(Document(fin)) with open('test/samples/{}.{}'.format(filename, self.sampleOutputExtension), 'r') as expectedFin: expected = ''.join(expectedFin) self.assertEqual(output, expected) return wrapper mistletoe-1.3.0/test/benchmark.py000066400000000000000000000035171455324047100170160ustar00rootroot00000000000000#!/usr/bin/env python3 # -*- coding: utf-8 -*- import sys from importlib import import_module from time import perf_counter TEST_FILE = 'test/samples/syntax.md' TIMES = 1000 def benchmark(package_name): def decorator(func): def inner(): try: package = import_module(package_name) except ImportError: return 'not available.' start = perf_counter() for i in range(TIMES): func(package) end = perf_counter() return end - start return inner return decorator @benchmark('markdown') def run_markdown(package): with open(TEST_FILE, 'r', encoding='utf-8') as fin: return package.markdown(fin.read(), extensions=['fenced_code', 'tables']) @benchmark('mistune') def run_mistune(package): with open(TEST_FILE, 'r', encoding='utf-8') as fin: return package.markdown(fin.read()) @benchmark('commonmark') def run_commonmark(package): with open(TEST_FILE, 'r', encoding='utf-8') as fin: return package.commonmark(fin.read()) @benchmark('mistletoe') def run_mistletoe(package): with open(TEST_FILE, 'r', encoding='utf-8') as fin: return package.markdown(fin) def run(package_name): print(package_name, end=': ') print(globals()['run_{}'.format(package_name.lower())]()) def run_all(package_names): prompt = 'Running tests with {}...'.format(', '.join(package_names)) print(prompt) print('=' * len(prompt)) for package_name in package_names: run(package_name) def main(*args): print('Test document: 
{}'.format(TEST_FILE)) print('Test iterations: {}'.format(TIMES)) if args[1:]: run_all(args[1:]) else: run_all(['markdown', 'mistune', 'commonmark', 'mistletoe']) if __name__ == '__main__': main(*sys.argv) mistletoe-1.3.0/test/bump_version.sh000066400000000000000000000005241455324047100175510ustar00rootroot00000000000000#!/usr/bin/env bash set -e OLD_VERSION=$1 NEW_VERSION=$2 if [[ $OLD_VERSION == "" || $NEW_VERSION == "" ]]; then echo "[Error] Missing version numbers." exit 1 fi FILES="README.md mistletoe/__init__.py" for FILE in $FILES; do sed -e "s/$OLD_VERSION/$NEW_VERSION/g" "$FILE" > "$FILE.tmp" mv $FILE{.tmp,} done git diff mistletoe-1.3.0/test/samples/000077500000000000000000000000001455324047100161505ustar00rootroot00000000000000mistletoe-1.3.0/test/samples/__init__.py000066400000000000000000000000001455324047100202470ustar00rootroot00000000000000mistletoe-1.3.0/test/samples/basic_blocks.jira000066400000000000000000000012721455324047100214370ustar00rootroot00000000000000h1. Line endings are important not only in Markdown... h2. Soft break Line continues - this is "soft break". h2. Hard break Line.\\ Another Line because the previous ends with at least two spaces. Line.\\ Another line because the previous ends with a "\". h2. Other block elements ||Column A||Column B||Column C|| |A1|B1|C1| |A2|B2|C2| |A3|B3|C3| A paragraph under a table. ---- A paragraph under a horizontal line. h2. Code blocks Java code: {code:java} public class Main { public static void main(String[] args) { System.out.println("Hello World!"); } } {code} Log file: {code} 2020-07-05 10:20:55 ... 2020-07-05 10:20:56 ... ... 2020-07-05 10:21:03 ... {code} mistletoe-1.3.0/test/samples/basic_blocks.md000066400000000000000000000013241455324047100211100ustar00rootroot00000000000000# Line endings are important not only in Markdown... ## Soft break Line continues - this is "soft break". ## Hard break Line. Another Line because the previous ends with at least two spaces. 
Line.\ Another line because the previous ends with a "\\". ## Other block elements Column A | Column B | Column C ---------|----------|--------- A1 | B1 | C1 A2 | B2 | C2 A3 | B3 | C3 A paragraph under a table. ---- A paragraph under a horizontal line. ## Code blocks Java code: ```java public class Main { public static void main(String[] args) { System.out.println("Hello World!"); } } ``` Log file: 2020-07-05 10:20:55 ... 2020-07-05 10:20:56 ... ... 2020-07-05 10:21:03 ... mistletoe-1.3.0/test/samples/basic_blocks.xwiki20000066400000000000000000000013171455324047100220070ustar00rootroot00000000000000= Line endings are important not only in Markdown... = == Soft break == Line continues - this is "soft break". == Hard break == Line. Another Line because the previous ends with at least two spaces. Line. Another line because the previous ends with a "~\". == Other block elements == |=Column A|=Column B|=Column C |A1|B1|C1 |A2|B2|C2 |A3|B3|C3 A paragraph under a table. ---- A paragraph under a horizontal line. == Code blocks == Java code: {{code language="java"}} public class Main { public static void main(String[] args) { System.out.println("Hello World!"); } } {{/code}} Log file: {{code}} 2020-07-05 10:20:55 ... 2020-07-05 10:20:56 ... ... 2020-07-05 10:21:03 ... {{/code}} mistletoe-1.3.0/test/samples/jquery.md000066400000000000000000000314751455324047100200230ustar00rootroot00000000000000[jQuery](https://jquery.com/) — New Wave JavaScript ================================================== Contribution Guides -------------------------------------- In the spirit of open source software development, jQuery always encourages community code contribution. To help you get started and before you jump into writing code, be sure to read these important contribution guidelines thoroughly: 1. [Getting Involved](https://contribute.jquery.org/) 2. [Core Style Guide](https://contribute.jquery.org/style-guide/js/) 3. 
[Writing Code for jQuery Foundation Projects](https://contribute.jquery.org/code/) Environments in which to use jQuery -------------------------------------- - [Browser support](https://jquery.com/browser-support/) - jQuery also supports Node, browser extensions, and other non-browser environments. What you need to build your own jQuery -------------------------------------- In order to build jQuery, you need to have the latest Node.js/npm and git 1.7 or later. Earlier versions might work, but are not supported. For Windows, you have to download and install [git](https://git-scm.com/downloads) and [Node.js](https://nodejs.org/en/download/). OS X users should install [Homebrew](http://brew.sh/). Once Homebrew is installed, run `brew install git` to install git, and `brew install node` to install Node.js. Linux/BSD users should use their appropriate package managers to install git and Node.js, or build from source if you swing that way. Easy-peasy. How to build your own jQuery ---------------------------- Clone a copy of the main jQuery git repo by running: ```bash git clone git://github.com/jquery/jquery.git ``` Enter the jquery directory and run the build script: ```bash cd jquery && npm run build ``` The built version of jQuery will be put in the `dist/` subdirectory, along with the minified copy and associated map file. If you want to create custom build or help with jQuery development, it would be better to install [grunt command line interface](https://github.com/gruntjs/grunt-cli) as a global package: ``` npm install -g grunt-cli ``` Make sure you have `grunt` installed by testing: ``` grunt -V ``` Now by running the `grunt` command, in the jquery directory, you can build a full version of jQuery, just like with an `npm run build` command: ``` grunt ``` There are many other tasks available for jQuery Core: ``` grunt -help ``` ### Modules Special builds can be created that exclude subsets of jQuery functionality. 
This allows for smaller custom builds when the builder is certain that those parts of jQuery are not being used. For example, an app that only used JSONP for `$.ajax()` and did not need to calculate offsets or positions of elements could exclude the offset and ajax/xhr modules. Any module may be excluded except for `core`, and `selector`. To exclude a module, pass its path relative to the `src` folder (without the `.js` extension). Some example modules that can be excluded are: - **ajax**: All AJAX functionality: `$.ajax()`, `$.get()`, `$.post()`, `$.ajaxSetup()`, `.load()`, transports, and ajax event shorthands such as `.ajaxStart()`. - **ajax/xhr**: The XMLHTTPRequest AJAX transport only. - **ajax/script**: The `\nokay\n", "html": "\n

    okay

    \n", "example": 170, "start_line": 2756, "end_line": 2770, "section": "HTML blocks" }, { "markdown": "\n", "html": "\n", "example": 171, "start_line": 2775, "end_line": 2791, "section": "HTML blocks" }, { "markdown": "\nh1 {color:red;}\n\np {color:blue;}\n\nokay\n", "html": "\nh1 {color:red;}\n\np {color:blue;}\n\n

    okay

    \n", "example": 172, "start_line": 2795, "end_line": 2811, "section": "HTML blocks" }, { "markdown": "\n\nfoo\n", "html": "\n\nfoo\n", "example": 173, "start_line": 2818, "end_line": 2828, "section": "HTML blocks" }, { "markdown": ">
    \n> foo\n\nbar\n", "html": "
    \n
    \nfoo\n
    \n

    bar

    \n", "example": 174, "start_line": 2831, "end_line": 2842, "section": "HTML blocks" }, { "markdown": "-
    \n- foo\n", "html": "
      \n
    • \n
      \n
    • \n
    • foo
    • \n
    \n", "example": 175, "start_line": 2845, "end_line": 2855, "section": "HTML blocks" }, { "markdown": "\n*foo*\n", "html": "\n

    foo

    \n", "example": 176, "start_line": 2860, "end_line": 2866, "section": "HTML blocks" }, { "markdown": "*bar*\n*baz*\n", "html": "*bar*\n

    baz

    \n", "example": 177, "start_line": 2869, "end_line": 2875, "section": "HTML blocks" }, { "markdown": "1. *bar*\n", "html": "1. *bar*\n", "example": 178, "start_line": 2881, "end_line": 2889, "section": "HTML blocks" }, { "markdown": "\nokay\n", "html": "\n

    okay

    \n", "example": 179, "start_line": 2894, "end_line": 2906, "section": "HTML blocks" }, { "markdown": "';\n\n?>\nokay\n", "html": "';\n\n?>\n

    okay

    \n", "example": 180, "start_line": 2912, "end_line": 2926, "section": "HTML blocks" }, { "markdown": "\n", "html": "\n", "example": 181, "start_line": 2931, "end_line": 2935, "section": "HTML blocks" }, { "markdown": "\nokay\n", "html": "\n

    okay

    \n", "example": 182, "start_line": 2940, "end_line": 2968, "section": "HTML blocks" }, { "markdown": " \n\n \n", "html": " \n
    <!-- foo -->\n
    \n", "example": 183, "start_line": 2974, "end_line": 2982, "section": "HTML blocks" }, { "markdown": "
    \n\n
    \n", "html": "
    \n
    <div>\n
    \n", "example": 184, "start_line": 2985, "end_line": 2993, "section": "HTML blocks" }, { "markdown": "Foo\n
    \nbar\n
    \n", "html": "

    Foo

    \n
    \nbar\n
    \n", "example": 185, "start_line": 2999, "end_line": 3009, "section": "HTML blocks" }, { "markdown": "
    \nbar\n
    \n*foo*\n", "html": "
    \nbar\n
    \n*foo*\n", "example": 186, "start_line": 3016, "end_line": 3026, "section": "HTML blocks" }, { "markdown": "Foo\n\nbaz\n", "html": "

    Foo\n\nbaz

    \n", "example": 187, "start_line": 3031, "end_line": 3039, "section": "HTML blocks" }, { "markdown": "
    \n\n*Emphasized* text.\n\n
    \n", "html": "
    \n

    Emphasized text.

    \n
    \n", "example": 188, "start_line": 3072, "end_line": 3082, "section": "HTML blocks" }, { "markdown": "
    \n*Emphasized* text.\n
    \n", "html": "
    \n*Emphasized* text.\n
    \n", "example": 189, "start_line": 3085, "end_line": 3093, "section": "HTML blocks" }, { "markdown": "\n\n\n\n\n\n\n\n
    \nHi\n
    \n", "html": "\n\n\n\n
    \nHi\n
    \n", "example": 190, "start_line": 3107, "end_line": 3127, "section": "HTML blocks" }, { "markdown": "\n\n \n\n \n\n \n\n
    \n Hi\n
    \n", "html": "\n \n
    <td>\n  Hi\n</td>\n
    \n \n
    \n", "example": 191, "start_line": 3134, "end_line": 3155, "section": "HTML blocks" }, { "markdown": "[foo]: /url \"title\"\n\n[foo]\n", "html": "

    foo

    \n", "example": 192, "start_line": 3183, "end_line": 3189, "section": "Link reference definitions" }, { "markdown": " [foo]: \n /url \n 'the title' \n\n[foo]\n", "html": "

    foo

    \n", "example": 193, "start_line": 3192, "end_line": 3200, "section": "Link reference definitions" }, { "markdown": "[Foo*bar\\]]:my_(url) 'title (with parens)'\n\n[Foo*bar\\]]\n", "html": "

    Foo*bar]

    \n", "example": 194, "start_line": 3203, "end_line": 3209, "section": "Link reference definitions" }, { "markdown": "[Foo bar]:\n\n'title'\n\n[Foo bar]\n", "html": "

    Foo bar

    \n", "example": 195, "start_line": 3212, "end_line": 3220, "section": "Link reference definitions" }, { "markdown": "[foo]: /url '\ntitle\nline1\nline2\n'\n\n[foo]\n", "html": "

    foo

    \n", "example": 196, "start_line": 3225, "end_line": 3239, "section": "Link reference definitions" }, { "markdown": "[foo]: /url 'title\n\nwith blank line'\n\n[foo]\n", "html": "

    [foo]: /url 'title

    \n

    with blank line'

    \n

    [foo]

    \n", "example": 197, "start_line": 3244, "end_line": 3254, "section": "Link reference definitions" }, { "markdown": "[foo]:\n/url\n\n[foo]\n", "html": "

    foo

    \n", "example": 198, "start_line": 3259, "end_line": 3266, "section": "Link reference definitions" }, { "markdown": "[foo]:\n\n[foo]\n", "html": "

    [foo]:

    \n

    [foo]

    \n", "example": 199, "start_line": 3271, "end_line": 3278, "section": "Link reference definitions" }, { "markdown": "[foo]: <>\n\n[foo]\n", "html": "

    foo

    \n", "example": 200, "start_line": 3283, "end_line": 3289, "section": "Link reference definitions" }, { "markdown": "[foo]: (baz)\n\n[foo]\n", "html": "

    [foo]: (baz)

    \n

    [foo]

    \n", "example": 201, "start_line": 3294, "end_line": 3301, "section": "Link reference definitions" }, { "markdown": "[foo]: /url\\bar\\*baz \"foo\\\"bar\\baz\"\n\n[foo]\n", "html": "

    foo

    \n", "example": 202, "start_line": 3307, "end_line": 3313, "section": "Link reference definitions" }, { "markdown": "[foo]\n\n[foo]: url\n", "html": "

    foo

    \n", "example": 203, "start_line": 3318, "end_line": 3324, "section": "Link reference definitions" }, { "markdown": "[foo]\n\n[foo]: first\n[foo]: second\n", "html": "

    foo

    \n", "example": 204, "start_line": 3330, "end_line": 3337, "section": "Link reference definitions" }, { "markdown": "[FOO]: /url\n\n[Foo]\n", "html": "

    Foo

    \n", "example": 205, "start_line": 3343, "end_line": 3349, "section": "Link reference definitions" }, { "markdown": "[ΑΓΩ]: /φου\n\n[αγω]\n", "html": "

    αγω

    \n", "example": 206, "start_line": 3352, "end_line": 3358, "section": "Link reference definitions" }, { "markdown": "[foo]: /url\n", "html": "", "example": 207, "start_line": 3367, "end_line": 3370, "section": "Link reference definitions" }, { "markdown": "[\nfoo\n]: /url\nbar\n", "html": "

    bar

    \n", "example": 208, "start_line": 3375, "end_line": 3382, "section": "Link reference definitions" }, { "markdown": "[foo]: /url \"title\" ok\n", "html": "

    [foo]: /url "title" ok

    \n", "example": 209, "start_line": 3388, "end_line": 3392, "section": "Link reference definitions" }, { "markdown": "[foo]: /url\n\"title\" ok\n", "html": "

    "title" ok

    \n", "example": 210, "start_line": 3397, "end_line": 3402, "section": "Link reference definitions" }, { "markdown": " [foo]: /url \"title\"\n\n[foo]\n", "html": "
    [foo]: /url "title"\n
    \n

    [foo]

    \n", "example": 211, "start_line": 3408, "end_line": 3416, "section": "Link reference definitions" }, { "markdown": "```\n[foo]: /url\n```\n\n[foo]\n", "html": "
    [foo]: /url\n
    \n

    [foo]

    \n", "example": 212, "start_line": 3422, "end_line": 3432, "section": "Link reference definitions" }, { "markdown": "Foo\n[bar]: /baz\n\n[bar]\n", "html": "

    Foo\n[bar]: /baz

    \n

    [bar]

    \n", "example": 213, "start_line": 3437, "end_line": 3446, "section": "Link reference definitions" }, { "markdown": "# [Foo]\n[foo]: /url\n> bar\n", "html": "

    Foo

    \n
    \n

    bar

    \n
    \n", "example": 214, "start_line": 3452, "end_line": 3461, "section": "Link reference definitions" }, { "markdown": "[foo]: /url\nbar\n===\n[foo]\n", "html": "

    bar

    \n

    foo

    \n", "example": 215, "start_line": 3463, "end_line": 3471, "section": "Link reference definitions" }, { "markdown": "[foo]: /url\n===\n[foo]\n", "html": "

    ===\nfoo

    \n", "example": 216, "start_line": 3473, "end_line": 3480, "section": "Link reference definitions" }, { "markdown": "[foo]: /foo-url \"foo\"\n[bar]: /bar-url\n \"bar\"\n[baz]: /baz-url\n\n[foo],\n[bar],\n[baz]\n", "html": "

    foo,\nbar,\nbaz

    \n", "example": 217, "start_line": 3486, "end_line": 3499, "section": "Link reference definitions" }, { "markdown": "[foo]\n\n> [foo]: /url\n", "html": "

    foo

    \n
    \n
    \n", "example": 218, "start_line": 3507, "end_line": 3515, "section": "Link reference definitions" }, { "markdown": "aaa\n\nbbb\n", "html": "

    aaa

    \n

    bbb

    \n", "example": 219, "start_line": 3529, "end_line": 3536, "section": "Paragraphs" }, { "markdown": "aaa\nbbb\n\nccc\nddd\n", "html": "

    aaa\nbbb

    \n

    ccc\nddd

    \n", "example": 220, "start_line": 3541, "end_line": 3552, "section": "Paragraphs" }, { "markdown": "aaa\n\n\nbbb\n", "html": "

    aaa

    \n

    bbb

    \n", "example": 221, "start_line": 3557, "end_line": 3565, "section": "Paragraphs" }, { "markdown": " aaa\n bbb\n", "html": "

    aaa\nbbb

    \n", "example": 222, "start_line": 3570, "end_line": 3576, "section": "Paragraphs" }, { "markdown": "aaa\n bbb\n ccc\n", "html": "

    aaa\nbbb\nccc

    \n", "example": 223, "start_line": 3582, "end_line": 3590, "section": "Paragraphs" }, { "markdown": " aaa\nbbb\n", "html": "

    aaa\nbbb

    \n", "example": 224, "start_line": 3596, "end_line": 3602, "section": "Paragraphs" }, { "markdown": " aaa\nbbb\n", "html": "
    aaa\n
    \n

    bbb

    \n", "example": 225, "start_line": 3605, "end_line": 3612, "section": "Paragraphs" }, { "markdown": "aaa \nbbb \n", "html": "

    aaa
    \nbbb

    \n", "example": 226, "start_line": 3619, "end_line": 3625, "section": "Paragraphs" }, { "markdown": " \n\naaa\n \n\n# aaa\n\n \n", "html": "

    aaa

    \n

    aaa

    \n", "example": 227, "start_line": 3636, "end_line": 3648, "section": "Blank lines" }, { "markdown": "> # Foo\n> bar\n> baz\n", "html": "
    \n

    Foo

    \n

    bar\nbaz

    \n
    \n", "example": 228, "start_line": 3704, "end_line": 3714, "section": "Block quotes" }, { "markdown": "># Foo\n>bar\n> baz\n", "html": "
    \n

    Foo

    \n

    bar\nbaz

    \n
    \n", "example": 229, "start_line": 3719, "end_line": 3729, "section": "Block quotes" }, { "markdown": " > # Foo\n > bar\n > baz\n", "html": "
    \n

    Foo

    \n

    bar\nbaz

    \n
    \n", "example": 230, "start_line": 3734, "end_line": 3744, "section": "Block quotes" }, { "markdown": " > # Foo\n > bar\n > baz\n", "html": "
    > # Foo\n> bar\n> baz\n
    \n", "example": 231, "start_line": 3749, "end_line": 3758, "section": "Block quotes" }, { "markdown": "> # Foo\n> bar\nbaz\n", "html": "
    \n

    Foo

    \n

    bar\nbaz

    \n
    \n", "example": 232, "start_line": 3764, "end_line": 3774, "section": "Block quotes" }, { "markdown": "> bar\nbaz\n> foo\n", "html": "
    \n

    bar\nbaz\nfoo

    \n
    \n", "example": 233, "start_line": 3780, "end_line": 3790, "section": "Block quotes" }, { "markdown": "> foo\n---\n", "html": "
    \n

    foo

    \n
    \n
    \n", "example": 234, "start_line": 3804, "end_line": 3812, "section": "Block quotes" }, { "markdown": "> - foo\n- bar\n", "html": "
    \n
      \n
    • foo
    • \n
    \n
    \n
      \n
    • bar
    • \n
    \n", "example": 235, "start_line": 3824, "end_line": 3836, "section": "Block quotes" }, { "markdown": "> foo\n bar\n", "html": "
    \n
    foo\n
    \n
    \n
    bar\n
    \n", "example": 236, "start_line": 3842, "end_line": 3852, "section": "Block quotes" }, { "markdown": "> ```\nfoo\n```\n", "html": "
    \n
    \n
    \n

    foo

    \n
    \n", "example": 237, "start_line": 3855, "end_line": 3865, "section": "Block quotes" }, { "markdown": "> foo\n - bar\n", "html": "
    \n

    foo\n- bar

    \n
    \n", "example": 238, "start_line": 3871, "end_line": 3879, "section": "Block quotes" }, { "markdown": ">\n", "html": "
    \n
    \n", "example": 239, "start_line": 3895, "end_line": 3900, "section": "Block quotes" }, { "markdown": ">\n> \n> \n", "html": "
    \n
    \n", "example": 240, "start_line": 3903, "end_line": 3910, "section": "Block quotes" }, { "markdown": ">\n> foo\n> \n", "html": "
    \n

    foo

    \n
    \n", "example": 241, "start_line": 3915, "end_line": 3923, "section": "Block quotes" }, { "markdown": "> foo\n\n> bar\n", "html": "
    \n

    foo

    \n
    \n
    \n

    bar

    \n
    \n", "example": 242, "start_line": 3928, "end_line": 3939, "section": "Block quotes" }, { "markdown": "> foo\n> bar\n", "html": "
    \n

    foo\nbar

    \n
    \n", "example": 243, "start_line": 3950, "end_line": 3958, "section": "Block quotes" }, { "markdown": "> foo\n>\n> bar\n", "html": "
    \n

    foo

    \n

    bar

    \n
    \n", "example": 244, "start_line": 3963, "end_line": 3972, "section": "Block quotes" }, { "markdown": "foo\n> bar\n", "html": "

    foo

    \n
    \n

    bar

    \n
    \n", "example": 245, "start_line": 3977, "end_line": 3985, "section": "Block quotes" }, { "markdown": "> aaa\n***\n> bbb\n", "html": "
    \n

    aaa

    \n
    \n
    \n
    \n

    bbb

    \n
    \n", "example": 246, "start_line": 3991, "end_line": 4003, "section": "Block quotes" }, { "markdown": "> bar\nbaz\n", "html": "
    \n

    bar\nbaz

    \n
    \n", "example": 247, "start_line": 4009, "end_line": 4017, "section": "Block quotes" }, { "markdown": "> bar\n\nbaz\n", "html": "
    \n

    bar

    \n
    \n

    baz

    \n", "example": 248, "start_line": 4020, "end_line": 4029, "section": "Block quotes" }, { "markdown": "> bar\n>\nbaz\n", "html": "
    \n

    bar

    \n
    \n

    baz

    \n", "example": 249, "start_line": 4032, "end_line": 4041, "section": "Block quotes" }, { "markdown": "> > > foo\nbar\n", "html": "
    \n
    \n
    \n

    foo\nbar

    \n
    \n
    \n
    \n", "example": 250, "start_line": 4048, "end_line": 4060, "section": "Block quotes" }, { "markdown": ">>> foo\n> bar\n>>baz\n", "html": "
    \n
    \n
    \n

    foo\nbar\nbaz

    \n
    \n
    \n
    \n", "example": 251, "start_line": 4063, "end_line": 4077, "section": "Block quotes" }, { "markdown": "> code\n\n> not code\n", "html": "
    \n
    code\n
    \n
    \n
    \n

    not code

    \n
    \n", "example": 252, "start_line": 4085, "end_line": 4097, "section": "Block quotes" }, { "markdown": "A paragraph\nwith two lines.\n\n indented code\n\n> A block quote.\n", "html": "

    A paragraph\nwith two lines.

    \n
    indented code\n
    \n
    \n

    A block quote.

    \n
    \n", "example": 253, "start_line": 4139, "end_line": 4154, "section": "List items" }, { "markdown": "1. A paragraph\n with two lines.\n\n indented code\n\n > A block quote.\n", "html": "
      \n
    1. \n

      A paragraph\nwith two lines.

      \n
      indented code\n
      \n
      \n

      A block quote.

      \n
      \n
    2. \n
    \n", "example": 254, "start_line": 4161, "end_line": 4180, "section": "List items" }, { "markdown": "- one\n\n two\n", "html": "
      \n
    • one
    • \n
    \n

    two

    \n", "example": 255, "start_line": 4194, "end_line": 4203, "section": "List items" }, { "markdown": "- one\n\n two\n", "html": "
      \n
    • \n

      one

      \n

      two

      \n
    • \n
    \n", "example": 256, "start_line": 4206, "end_line": 4217, "section": "List items" }, { "markdown": " - one\n\n two\n", "html": "
      \n
    • one
    • \n
    \n
     two\n
    \n", "example": 257, "start_line": 4220, "end_line": 4230, "section": "List items" }, { "markdown": " - one\n\n two\n", "html": "
      \n
    • \n

      one

      \n

      two

      \n
    • \n
    \n", "example": 258, "start_line": 4233, "end_line": 4244, "section": "List items" }, { "markdown": " > > 1. one\n>>\n>> two\n", "html": "
    \n
    \n
      \n
    1. \n

      one

      \n

      two

      \n
    2. \n
    \n
    \n
    \n", "example": 259, "start_line": 4255, "end_line": 4270, "section": "List items" }, { "markdown": ">>- one\n>>\n > > two\n", "html": "
    \n
    \n
      \n
    • one
    • \n
    \n

    two

    \n
    \n
    \n", "example": 260, "start_line": 4282, "end_line": 4295, "section": "List items" }, { "markdown": "-one\n\n2.two\n", "html": "

    -one

    \n

    2.two

    \n", "example": 261, "start_line": 4301, "end_line": 4308, "section": "List items" }, { "markdown": "- foo\n\n\n bar\n", "html": "
      \n
    • \n

      foo

      \n

      bar

      \n
    • \n
    \n", "example": 262, "start_line": 4314, "end_line": 4326, "section": "List items" }, { "markdown": "1. foo\n\n ```\n bar\n ```\n\n baz\n\n > bam\n", "html": "
      \n
    1. \n

      foo

      \n
      bar\n
      \n

      baz

      \n
      \n

      bam

      \n
      \n
    2. \n
    \n", "example": 263, "start_line": 4331, "end_line": 4353, "section": "List items" }, { "markdown": "- Foo\n\n bar\n\n\n baz\n", "html": "
      \n
    • \n

      Foo

      \n
      bar\n\n\nbaz\n
      \n
    • \n
    \n", "example": 264, "start_line": 4359, "end_line": 4377, "section": "List items" }, { "markdown": "123456789. ok\n", "html": "
      \n
    1. ok
    2. \n
    \n", "example": 265, "start_line": 4381, "end_line": 4387, "section": "List items" }, { "markdown": "1234567890. not ok\n", "html": "

    1234567890. not ok

    \n", "example": 266, "start_line": 4390, "end_line": 4394, "section": "List items" }, { "markdown": "0. ok\n", "html": "
      \n
    1. ok
    2. \n
    \n", "example": 267, "start_line": 4399, "end_line": 4405, "section": "List items" }, { "markdown": "003. ok\n", "html": "
      \n
    1. ok
    2. \n
    \n", "example": 268, "start_line": 4408, "end_line": 4414, "section": "List items" }, { "markdown": "-1. not ok\n", "html": "

    -1. not ok

    \n", "example": 269, "start_line": 4419, "end_line": 4423, "section": "List items" }, { "markdown": "- foo\n\n bar\n", "html": "
      \n
    • \n

      foo

      \n
      bar\n
      \n
    • \n
    \n", "example": 270, "start_line": 4442, "end_line": 4454, "section": "List items" }, { "markdown": " 10. foo\n\n bar\n", "html": "
      \n
    1. \n

      foo

      \n
      bar\n
      \n
    2. \n
    \n", "example": 271, "start_line": 4459, "end_line": 4471, "section": "List items" }, { "markdown": " indented code\n\nparagraph\n\n more code\n", "html": "
    indented code\n
    \n

    paragraph

    \n
    more code\n
    \n", "example": 272, "start_line": 4478, "end_line": 4490, "section": "List items" }, { "markdown": "1. indented code\n\n paragraph\n\n more code\n", "html": "
      \n
    1. \n
      indented code\n
      \n

      paragraph

      \n
      more code\n
      \n
    2. \n
    \n", "example": 273, "start_line": 4493, "end_line": 4509, "section": "List items" }, { "markdown": "1. indented code\n\n paragraph\n\n more code\n", "html": "
      \n
    1. \n
       indented code\n
      \n

      paragraph

      \n
      more code\n
      \n
    2. \n
    \n", "example": 274, "start_line": 4515, "end_line": 4531, "section": "List items" }, { "markdown": " foo\n\nbar\n", "html": "

    foo

    \n

    bar

    \n", "example": 275, "start_line": 4542, "end_line": 4549, "section": "List items" }, { "markdown": "- foo\n\n bar\n", "html": "
      \n
    • foo
    • \n
    \n

    bar

    \n", "example": 276, "start_line": 4552, "end_line": 4561, "section": "List items" }, { "markdown": "- foo\n\n bar\n", "html": "
      \n
    • \n

      foo

      \n

      bar

      \n
    • \n
    \n", "example": 277, "start_line": 4569, "end_line": 4580, "section": "List items" }, { "markdown": "-\n foo\n-\n ```\n bar\n ```\n-\n baz\n", "html": "
      \n
    • foo
    • \n
    • \n
      bar\n
      \n
    • \n
    • \n
      baz\n
      \n
    • \n
    \n", "example": 278, "start_line": 4596, "end_line": 4617, "section": "List items" }, { "markdown": "- \n foo\n", "html": "
      \n
    • foo
    • \n
    \n", "example": 279, "start_line": 4622, "end_line": 4629, "section": "List items" }, { "markdown": "-\n\n foo\n", "html": "
      \n
    • \n
    \n

    foo

    \n", "example": 280, "start_line": 4636, "end_line": 4645, "section": "List items" }, { "markdown": "- foo\n-\n- bar\n", "html": "
      \n
    • foo
    • \n
    • \n
    • bar
    • \n
    \n", "example": 281, "start_line": 4650, "end_line": 4660, "section": "List items" }, { "markdown": "- foo\n- \n- bar\n", "html": "
      \n
    • foo
    • \n
    • \n
    • bar
    • \n
    \n", "example": 282, "start_line": 4665, "end_line": 4675, "section": "List items" }, { "markdown": "1. foo\n2.\n3. bar\n", "html": "
      \n
    1. foo
    2. \n
    3. \n
    4. bar
    5. \n
    \n", "example": 283, "start_line": 4680, "end_line": 4690, "section": "List items" }, { "markdown": "*\n", "html": "
      \n
    • \n
    \n", "example": 284, "start_line": 4695, "end_line": 4701, "section": "List items" }, { "markdown": "foo\n*\n\nfoo\n1.\n", "html": "

    foo\n*

    \n

    foo\n1.

    \n", "example": 285, "start_line": 4705, "end_line": 4716, "section": "List items" }, { "markdown": " 1. A paragraph\n with two lines.\n\n indented code\n\n > A block quote.\n", "html": "
      \n
    1. \n

      A paragraph\nwith two lines.

      \n
      indented code\n
      \n
      \n

      A block quote.

      \n
      \n
    2. \n
    \n", "example": 286, "start_line": 4727, "end_line": 4746, "section": "List items" }, { "markdown": " 1. A paragraph\n with two lines.\n\n indented code\n\n > A block quote.\n", "html": "
      \n
    1. \n

      A paragraph\nwith two lines.

      \n
      indented code\n
      \n
      \n

      A block quote.

      \n
      \n
    2. \n
    \n", "example": 287, "start_line": 4751, "end_line": 4770, "section": "List items" }, { "markdown": " 1. A paragraph\n with two lines.\n\n indented code\n\n > A block quote.\n", "html": "
      \n
    1. \n

      A paragraph\nwith two lines.

      \n
      indented code\n
      \n
      \n

      A block quote.

      \n
      \n
    2. \n
    \n", "example": 288, "start_line": 4775, "end_line": 4794, "section": "List items" }, { "markdown": " 1. A paragraph\n with two lines.\n\n indented code\n\n > A block quote.\n", "html": "
    1.  A paragraph\n    with two lines.\n\n        indented code\n\n    > A block quote.\n
    \n", "example": 289, "start_line": 4799, "end_line": 4814, "section": "List items" }, { "markdown": " 1. A paragraph\nwith two lines.\n\n indented code\n\n > A block quote.\n", "html": "
      \n
    1. \n

      A paragraph\nwith two lines.

      \n
      indented code\n
      \n
      \n

      A block quote.

      \n
      \n
    2. \n
    \n", "example": 290, "start_line": 4829, "end_line": 4848, "section": "List items" }, { "markdown": " 1. A paragraph\n with two lines.\n", "html": "
      \n
    1. A paragraph\nwith two lines.
    2. \n
    \n", "example": 291, "start_line": 4853, "end_line": 4861, "section": "List items" }, { "markdown": "> 1. > Blockquote\ncontinued here.\n", "html": "
    \n
      \n
    1. \n
      \n

      Blockquote\ncontinued here.

      \n
      \n
    2. \n
    \n
    \n", "example": 292, "start_line": 4866, "end_line": 4880, "section": "List items" }, { "markdown": "> 1. > Blockquote\n> continued here.\n", "html": "
    \n
      \n
    1. \n
      \n

      Blockquote\ncontinued here.

      \n
      \n
    2. \n
    \n
    \n", "example": 293, "start_line": 4883, "end_line": 4897, "section": "List items" }, { "markdown": "- foo\n - bar\n - baz\n - boo\n", "html": "
      \n
    • foo\n
        \n
      • bar\n
          \n
        • baz\n
            \n
          • boo
          • \n
          \n
        • \n
        \n
      • \n
      \n
    • \n
    \n", "example": 294, "start_line": 4911, "end_line": 4932, "section": "List items" }, { "markdown": "- foo\n - bar\n - baz\n - boo\n", "html": "
      \n
    • foo
    • \n
    • bar
    • \n
    • baz
    • \n
    • boo
    • \n
    \n", "example": 295, "start_line": 4937, "end_line": 4949, "section": "List items" }, { "markdown": "10) foo\n - bar\n", "html": "
      \n
    1. foo\n
        \n
      • bar
      • \n
      \n
    2. \n
    \n", "example": 296, "start_line": 4954, "end_line": 4965, "section": "List items" }, { "markdown": "10) foo\n - bar\n", "html": "
      \n
    1. foo
    2. \n
    \n
      \n
    • bar
    • \n
    \n", "example": 297, "start_line": 4970, "end_line": 4980, "section": "List items" }, { "markdown": "- - foo\n", "html": "
      \n
    • \n
        \n
      • foo
      • \n
      \n
    • \n
    \n", "example": 298, "start_line": 4985, "end_line": 4995, "section": "List items" }, { "markdown": "1. - 2. foo\n", "html": "
      \n
    1. \n
        \n
      • \n
          \n
        1. foo
        2. \n
        \n
      • \n
      \n
    2. \n
    \n", "example": 299, "start_line": 4998, "end_line": 5012, "section": "List items" }, { "markdown": "- # Foo\n- Bar\n ---\n baz\n", "html": "
      \n
    • \n

      Foo

      \n
    • \n
    • \n

      Bar

      \nbaz
    • \n
    \n", "example": 300, "start_line": 5017, "end_line": 5031, "section": "List items" }, { "markdown": "- foo\n- bar\n+ baz\n", "html": "
      \n
    • foo
    • \n
    • bar
    • \n
    \n
      \n
    • baz
    • \n
    \n", "example": 301, "start_line": 5253, "end_line": 5265, "section": "Lists" }, { "markdown": "1. foo\n2. bar\n3) baz\n", "html": "
      \n
    1. foo
    2. \n
    3. bar
    4. \n
    \n
      \n
    1. baz
    2. \n
    \n", "example": 302, "start_line": 5268, "end_line": 5280, "section": "Lists" }, { "markdown": "Foo\n- bar\n- baz\n", "html": "

    Foo

    \n
      \n
    • bar
    • \n
    • baz
    • \n
    \n", "example": 303, "start_line": 5287, "end_line": 5297, "section": "Lists" }, { "markdown": "The number of windows in my house is\n14. The number of doors is 6.\n", "html": "

    The number of windows in my house is\n14. The number of doors is 6.

    \n", "example": 304, "start_line": 5364, "end_line": 5370, "section": "Lists" }, { "markdown": "The number of windows in my house is\n1. The number of doors is 6.\n", "html": "

    The number of windows in my house is

    \n
      \n
    1. The number of doors is 6.
    2. \n
    \n", "example": 305, "start_line": 5374, "end_line": 5382, "section": "Lists" }, { "markdown": "- foo\n\n- bar\n\n\n- baz\n", "html": "
      \n
    • \n

      foo

      \n
    • \n
    • \n

      bar

      \n
    • \n
    • \n

      baz

      \n
    • \n
    \n", "example": 306, "start_line": 5388, "end_line": 5407, "section": "Lists" }, { "markdown": "- foo\n - bar\n - baz\n\n\n bim\n", "html": "
      \n
    • foo\n
        \n
      • bar\n
          \n
        • \n

          baz

          \n

          bim

          \n
        • \n
        \n
      • \n
      \n
    • \n
    \n", "example": 307, "start_line": 5409, "end_line": 5431, "section": "Lists" }, { "markdown": "- foo\n- bar\n\n\n\n- baz\n- bim\n", "html": "
      \n
    • foo
    • \n
    • bar
    • \n
    \n\n
      \n
    • baz
    • \n
    • bim
    • \n
    \n", "example": 308, "start_line": 5439, "end_line": 5457, "section": "Lists" }, { "markdown": "- foo\n\n notcode\n\n- foo\n\n\n\n code\n", "html": "
      \n
    • \n

      foo

      \n

      notcode

      \n
    • \n
    • \n

      foo

      \n
    • \n
    \n\n
    code\n
    \n", "example": 309, "start_line": 5460, "end_line": 5483, "section": "Lists" }, { "markdown": "- a\n - b\n - c\n - d\n - e\n - f\n- g\n", "html": "
      \n
    • a
    • \n
    • b
    • \n
    • c
    • \n
    • d
    • \n
    • e
    • \n
    • f
    • \n
    • g
    • \n
    \n", "example": 310, "start_line": 5491, "end_line": 5509, "section": "Lists" }, { "markdown": "1. a\n\n 2. b\n\n 3. c\n", "html": "
      \n
    1. \n

      a

      \n
    2. \n
    3. \n

      b

      \n
    4. \n
    5. \n

      c

      \n
    6. \n
    \n", "example": 311, "start_line": 5512, "end_line": 5530, "section": "Lists" }, { "markdown": "- a\n - b\n - c\n - d\n - e\n", "html": "
      \n
    • a
    • \n
    • b
    • \n
    • c
    • \n
    • d\n- e
    • \n
    \n", "example": 312, "start_line": 5536, "end_line": 5550, "section": "Lists" }, { "markdown": "1. a\n\n 2. b\n\n 3. c\n", "html": "
      \n
    1. \n

      a

      \n
    2. \n
    3. \n

      b

      \n
    4. \n
    \n
    3. c\n
    \n", "example": 313, "start_line": 5556, "end_line": 5573, "section": "Lists" }, { "markdown": "- a\n- b\n\n- c\n", "html": "
      \n
    • \n

      a

      \n
    • \n
    • \n

      b

      \n
    • \n
    • \n

      c

      \n
    • \n
    \n", "example": 314, "start_line": 5579, "end_line": 5596, "section": "Lists" }, { "markdown": "* a\n*\n\n* c\n", "html": "
      \n
    • \n

      a

      \n
    • \n
    • \n
    • \n

      c

      \n
    • \n
    \n", "example": 315, "start_line": 5601, "end_line": 5616, "section": "Lists" }, { "markdown": "- a\n- b\n\n c\n- d\n", "html": "
      \n
    • \n

      a

      \n
    • \n
    • \n

      b

      \n

      c

      \n
    • \n
    • \n

      d

      \n
    • \n
    \n", "example": 316, "start_line": 5623, "end_line": 5642, "section": "Lists" }, { "markdown": "- a\n- b\n\n [ref]: /url\n- d\n", "html": "
      \n
    • \n

      a

      \n
    • \n
    • \n

      b

      \n
    • \n
    • \n

      d

      \n
    • \n
    \n", "example": 317, "start_line": 5645, "end_line": 5663, "section": "Lists" }, { "markdown": "- a\n- ```\n b\n\n\n ```\n- c\n", "html": "
      \n
    • a
    • \n
    • \n
      b\n\n\n
      \n
    • \n
    • c
    • \n
    \n", "example": 318, "start_line": 5668, "end_line": 5687, "section": "Lists" }, { "markdown": "- a\n - b\n\n c\n- d\n", "html": "
      \n
    • a\n
        \n
      • \n

        b

        \n

        c

        \n
      • \n
      \n
    • \n
    • d
    • \n
    \n", "example": 319, "start_line": 5694, "end_line": 5712, "section": "Lists" }, { "markdown": "* a\n > b\n >\n* c\n", "html": "
      \n
    • a\n
      \n

      b

      \n
      \n
    • \n
    • c
    • \n
    \n", "example": 320, "start_line": 5718, "end_line": 5732, "section": "Lists" }, { "markdown": "- a\n > b\n ```\n c\n ```\n- d\n", "html": "
      \n
    • a\n
      \n

      b

      \n
      \n
      c\n
      \n
    • \n
    • d
    • \n
    \n", "example": 321, "start_line": 5738, "end_line": 5756, "section": "Lists" }, { "markdown": "- a\n", "html": "
      \n
    • a
    • \n
    \n", "example": 322, "start_line": 5761, "end_line": 5767, "section": "Lists" }, { "markdown": "- a\n - b\n", "html": "
      \n
    • a\n
        \n
      • b
      • \n
      \n
    • \n
    \n", "example": 323, "start_line": 5770, "end_line": 5781, "section": "Lists" }, { "markdown": "1. ```\n foo\n ```\n\n bar\n", "html": "
      \n
    1. \n
      foo\n
      \n

      bar

      \n
    2. \n
    \n", "example": 324, "start_line": 5787, "end_line": 5801, "section": "Lists" }, { "markdown": "* foo\n * bar\n\n baz\n", "html": "
      \n
    • \n

      foo

      \n
        \n
      • bar
      • \n
      \n

      baz

      \n
    • \n
    \n", "example": 325, "start_line": 5806, "end_line": 5821, "section": "Lists" }, { "markdown": "- a\n - b\n - c\n\n- d\n - e\n - f\n", "html": "
      \n
    • \n

      a

      \n
        \n
      • b
      • \n
      • c
      • \n
      \n
    • \n
    • \n

      d

      \n
        \n
      • e
      • \n
      • f
      • \n
      \n
    • \n
    \n", "example": 326, "start_line": 5824, "end_line": 5849, "section": "Lists" }, { "markdown": "`hi`lo`\n", "html": "

    hilo`

    \n", "example": 327, "start_line": 5858, "end_line": 5862, "section": "Inlines" }, { "markdown": "`foo`\n", "html": "

    foo

    \n", "example": 328, "start_line": 5890, "end_line": 5894, "section": "Code spans" }, { "markdown": "`` foo ` bar ``\n", "html": "

    foo ` bar

    \n", "example": 329, "start_line": 5901, "end_line": 5905, "section": "Code spans" }, { "markdown": "` `` `\n", "html": "

    ``

    \n", "example": 330, "start_line": 5911, "end_line": 5915, "section": "Code spans" }, { "markdown": "` `` `\n", "html": "

    ``

    \n", "example": 331, "start_line": 5919, "end_line": 5923, "section": "Code spans" }, { "markdown": "` a`\n", "html": "

    a

    \n", "example": 332, "start_line": 5928, "end_line": 5932, "section": "Code spans" }, { "markdown": "` b `\n", "html": "

     b 

    \n", "example": 333, "start_line": 5937, "end_line": 5941, "section": "Code spans" }, { "markdown": "` `\n` `\n", "html": "

     \n

    \n", "example": 334, "start_line": 5945, "end_line": 5951, "section": "Code spans" }, { "markdown": "``\nfoo\nbar \nbaz\n``\n", "html": "

    foo bar baz

    \n", "example": 335, "start_line": 5956, "end_line": 5964, "section": "Code spans" }, { "markdown": "``\nfoo \n``\n", "html": "

    foo

    \n", "example": 336, "start_line": 5966, "end_line": 5972, "section": "Code spans" }, { "markdown": "`foo bar \nbaz`\n", "html": "

    foo bar baz

    \n", "example": 337, "start_line": 5977, "end_line": 5982, "section": "Code spans" }, { "markdown": "`foo\\`bar`\n", "html": "

    foo\\bar`

    \n", "example": 338, "start_line": 5994, "end_line": 5998, "section": "Code spans" }, { "markdown": "``foo`bar``\n", "html": "

    foo`bar

    \n", "example": 339, "start_line": 6005, "end_line": 6009, "section": "Code spans" }, { "markdown": "` foo `` bar `\n", "html": "

    foo `` bar

    \n", "example": 340, "start_line": 6011, "end_line": 6015, "section": "Code spans" }, { "markdown": "*foo`*`\n", "html": "

    *foo*

    \n", "example": 341, "start_line": 6023, "end_line": 6027, "section": "Code spans" }, { "markdown": "[not a `link](/foo`)\n", "html": "

    [not a link](/foo)

    \n", "example": 342, "start_line": 6032, "end_line": 6036, "section": "Code spans" }, { "markdown": "``\n", "html": "

    <a href="">`

    \n", "example": 343, "start_line": 6042, "end_line": 6046, "section": "Code spans" }, { "markdown": "
    `\n", "html": "

    `

    \n", "example": 344, "start_line": 6051, "end_line": 6055, "section": "Code spans" }, { "markdown": "``\n", "html": "

    <http://foo.bar.baz>`

    \n", "example": 345, "start_line": 6060, "end_line": 6064, "section": "Code spans" }, { "markdown": "`\n", "html": "

    http://foo.bar.`baz`

    \n", "example": 346, "start_line": 6069, "end_line": 6073, "section": "Code spans" }, { "markdown": "```foo``\n", "html": "

    ```foo``

    \n", "example": 347, "start_line": 6079, "end_line": 6083, "section": "Code spans" }, { "markdown": "`foo\n", "html": "

    `foo

    \n", "example": 348, "start_line": 6086, "end_line": 6090, "section": "Code spans" }, { "markdown": "`foo``bar``\n", "html": "

    `foobar

    \n", "example": 349, "start_line": 6095, "end_line": 6099, "section": "Code spans" }, { "markdown": "*foo bar*\n", "html": "

    foo bar

    \n", "example": 350, "start_line": 6312, "end_line": 6316, "section": "Emphasis and strong emphasis" }, { "markdown": "a * foo bar*\n", "html": "

    a * foo bar*

    \n", "example": 351, "start_line": 6322, "end_line": 6326, "section": "Emphasis and strong emphasis" }, { "markdown": "a*\"foo\"*\n", "html": "

    a*"foo"*

    \n", "example": 352, "start_line": 6333, "end_line": 6337, "section": "Emphasis and strong emphasis" }, { "markdown": "* a *\n", "html": "

    * a *

    \n", "example": 353, "start_line": 6342, "end_line": 6346, "section": "Emphasis and strong emphasis" }, { "markdown": "foo*bar*\n", "html": "

    foobar

    \n", "example": 354, "start_line": 6351, "end_line": 6355, "section": "Emphasis and strong emphasis" }, { "markdown": "5*6*78\n", "html": "

    5678

    \n", "example": 355, "start_line": 6358, "end_line": 6362, "section": "Emphasis and strong emphasis" }, { "markdown": "_foo bar_\n", "html": "

    foo bar

    \n", "example": 356, "start_line": 6367, "end_line": 6371, "section": "Emphasis and strong emphasis" }, { "markdown": "_ foo bar_\n", "html": "

    _ foo bar_

    \n", "example": 357, "start_line": 6377, "end_line": 6381, "section": "Emphasis and strong emphasis" }, { "markdown": "a_\"foo\"_\n", "html": "

    a_"foo"_

    \n", "example": 358, "start_line": 6387, "end_line": 6391, "section": "Emphasis and strong emphasis" }, { "markdown": "foo_bar_\n", "html": "

    foo_bar_

    \n", "example": 359, "start_line": 6396, "end_line": 6400, "section": "Emphasis and strong emphasis" }, { "markdown": "5_6_78\n", "html": "

    5_6_78

    \n", "example": 360, "start_line": 6403, "end_line": 6407, "section": "Emphasis and strong emphasis" }, { "markdown": "пристаням_стремятся_\n", "html": "

    пристаням_стремятся_

    \n", "example": 361, "start_line": 6410, "end_line": 6414, "section": "Emphasis and strong emphasis" }, { "markdown": "aa_\"bb\"_cc\n", "html": "

    aa_"bb"_cc

    \n", "example": 362, "start_line": 6420, "end_line": 6424, "section": "Emphasis and strong emphasis" }, { "markdown": "foo-_(bar)_\n", "html": "

    foo-(bar)

    \n", "example": 363, "start_line": 6431, "end_line": 6435, "section": "Emphasis and strong emphasis" }, { "markdown": "_foo*\n", "html": "

    _foo*

    \n", "example": 364, "start_line": 6443, "end_line": 6447, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo bar *\n", "html": "

    *foo bar *

    \n", "example": 365, "start_line": 6453, "end_line": 6457, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo bar\n*\n", "html": "

    *foo bar\n*

    \n", "example": 366, "start_line": 6462, "end_line": 6468, "section": "Emphasis and strong emphasis" }, { "markdown": "*(*foo)\n", "html": "

    *(*foo)

    \n", "example": 367, "start_line": 6475, "end_line": 6479, "section": "Emphasis and strong emphasis" }, { "markdown": "*(*foo*)*\n", "html": "

    (foo)

    \n", "example": 368, "start_line": 6485, "end_line": 6489, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo*bar\n", "html": "

    foobar

    \n", "example": 369, "start_line": 6494, "end_line": 6498, "section": "Emphasis and strong emphasis" }, { "markdown": "_foo bar _\n", "html": "

    _foo bar _

    \n", "example": 370, "start_line": 6507, "end_line": 6511, "section": "Emphasis and strong emphasis" }, { "markdown": "_(_foo)\n", "html": "

    _(_foo)

    \n", "example": 371, "start_line": 6517, "end_line": 6521, "section": "Emphasis and strong emphasis" }, { "markdown": "_(_foo_)_\n", "html": "

    (foo)

    \n", "example": 372, "start_line": 6526, "end_line": 6530, "section": "Emphasis and strong emphasis" }, { "markdown": "_foo_bar\n", "html": "

    _foo_bar

    \n", "example": 373, "start_line": 6535, "end_line": 6539, "section": "Emphasis and strong emphasis" }, { "markdown": "_пристаням_стремятся\n", "html": "

    _пристаням_стремятся

    \n", "example": 374, "start_line": 6542, "end_line": 6546, "section": "Emphasis and strong emphasis" }, { "markdown": "_foo_bar_baz_\n", "html": "

    foo_bar_baz

    \n", "example": 375, "start_line": 6549, "end_line": 6553, "section": "Emphasis and strong emphasis" }, { "markdown": "_(bar)_.\n", "html": "

    (bar).

    \n", "example": 376, "start_line": 6560, "end_line": 6564, "section": "Emphasis and strong emphasis" }, { "markdown": "**foo bar**\n", "html": "

    foo bar

    \n", "example": 377, "start_line": 6569, "end_line": 6573, "section": "Emphasis and strong emphasis" }, { "markdown": "** foo bar**\n", "html": "

    ** foo bar**

    \n", "example": 378, "start_line": 6579, "end_line": 6583, "section": "Emphasis and strong emphasis" }, { "markdown": "a**\"foo\"**\n", "html": "

    a**"foo"**

    \n", "example": 379, "start_line": 6590, "end_line": 6594, "section": "Emphasis and strong emphasis" }, { "markdown": "foo**bar**\n", "html": "

    foobar

    \n", "example": 380, "start_line": 6599, "end_line": 6603, "section": "Emphasis and strong emphasis" }, { "markdown": "__foo bar__\n", "html": "

    foo bar

    \n", "example": 381, "start_line": 6608, "end_line": 6612, "section": "Emphasis and strong emphasis" }, { "markdown": "__ foo bar__\n", "html": "

    __ foo bar__

    \n", "example": 382, "start_line": 6618, "end_line": 6622, "section": "Emphasis and strong emphasis" }, { "markdown": "__\nfoo bar__\n", "html": "

    __\nfoo bar__

    \n", "example": 383, "start_line": 6626, "end_line": 6632, "section": "Emphasis and strong emphasis" }, { "markdown": "a__\"foo\"__\n", "html": "

    a__"foo"__

    \n", "example": 384, "start_line": 6638, "end_line": 6642, "section": "Emphasis and strong emphasis" }, { "markdown": "foo__bar__\n", "html": "

    foo__bar__

    \n", "example": 385, "start_line": 6647, "end_line": 6651, "section": "Emphasis and strong emphasis" }, { "markdown": "5__6__78\n", "html": "

    5__6__78

    \n", "example": 386, "start_line": 6654, "end_line": 6658, "section": "Emphasis and strong emphasis" }, { "markdown": "пристаням__стремятся__\n", "html": "

    пристаням__стремятся__

    \n", "example": 387, "start_line": 6661, "end_line": 6665, "section": "Emphasis and strong emphasis" }, { "markdown": "__foo, __bar__, baz__\n", "html": "

    foo, bar, baz

    \n", "example": 388, "start_line": 6668, "end_line": 6672, "section": "Emphasis and strong emphasis" }, { "markdown": "foo-__(bar)__\n", "html": "

    foo-(bar)

    \n", "example": 389, "start_line": 6679, "end_line": 6683, "section": "Emphasis and strong emphasis" }, { "markdown": "**foo bar **\n", "html": "

    **foo bar **

    \n", "example": 390, "start_line": 6692, "end_line": 6696, "section": "Emphasis and strong emphasis" }, { "markdown": "**(**foo)\n", "html": "

    **(**foo)

    \n", "example": 391, "start_line": 6705, "end_line": 6709, "section": "Emphasis and strong emphasis" }, { "markdown": "*(**foo**)*\n", "html": "

    (foo)

    \n", "example": 392, "start_line": 6715, "end_line": 6719, "section": "Emphasis and strong emphasis" }, { "markdown": "**Gomphocarpus (*Gomphocarpus physocarpus*, syn.\n*Asclepias physocarpa*)**\n", "html": "

    Gomphocarpus (Gomphocarpus physocarpus, syn.\nAsclepias physocarpa)

    \n", "example": 393, "start_line": 6722, "end_line": 6728, "section": "Emphasis and strong emphasis" }, { "markdown": "**foo \"*bar*\" foo**\n", "html": "

    foo "bar" foo

    \n", "example": 394, "start_line": 6731, "end_line": 6735, "section": "Emphasis and strong emphasis" }, { "markdown": "**foo**bar\n", "html": "

    foobar

    \n", "example": 395, "start_line": 6740, "end_line": 6744, "section": "Emphasis and strong emphasis" }, { "markdown": "__foo bar __\n", "html": "

    __foo bar __

    \n", "example": 396, "start_line": 6752, "end_line": 6756, "section": "Emphasis and strong emphasis" }, { "markdown": "__(__foo)\n", "html": "

    __(__foo)

    \n", "example": 397, "start_line": 6762, "end_line": 6766, "section": "Emphasis and strong emphasis" }, { "markdown": "_(__foo__)_\n", "html": "

    (foo)

    \n", "example": 398, "start_line": 6772, "end_line": 6776, "section": "Emphasis and strong emphasis" }, { "markdown": "__foo__bar\n", "html": "

    __foo__bar

    \n", "example": 399, "start_line": 6781, "end_line": 6785, "section": "Emphasis and strong emphasis" }, { "markdown": "__пристаням__стремятся\n", "html": "

    __пристаням__стремятся

    \n", "example": 400, "start_line": 6788, "end_line": 6792, "section": "Emphasis and strong emphasis" }, { "markdown": "__foo__bar__baz__\n", "html": "

    foo__bar__baz

    \n", "example": 401, "start_line": 6795, "end_line": 6799, "section": "Emphasis and strong emphasis" }, { "markdown": "__(bar)__.\n", "html": "

    (bar).

    \n", "example": 402, "start_line": 6806, "end_line": 6810, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo [bar](/url)*\n", "html": "

    foo bar

    \n", "example": 403, "start_line": 6818, "end_line": 6822, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo\nbar*\n", "html": "

    foo\nbar

    \n", "example": 404, "start_line": 6825, "end_line": 6831, "section": "Emphasis and strong emphasis" }, { "markdown": "_foo __bar__ baz_\n", "html": "

    foo bar baz

    \n", "example": 405, "start_line": 6837, "end_line": 6841, "section": "Emphasis and strong emphasis" }, { "markdown": "_foo _bar_ baz_\n", "html": "

    foo bar baz

    \n", "example": 406, "start_line": 6844, "end_line": 6848, "section": "Emphasis and strong emphasis" }, { "markdown": "__foo_ bar_\n", "html": "

    foo bar

    \n", "example": 407, "start_line": 6851, "end_line": 6855, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo *bar**\n", "html": "

    foo bar

    \n", "example": 408, "start_line": 6858, "end_line": 6862, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo **bar** baz*\n", "html": "

    foo bar baz

    \n", "example": 409, "start_line": 6865, "end_line": 6869, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo**bar**baz*\n", "html": "

    foobarbaz

    \n", "example": 410, "start_line": 6871, "end_line": 6875, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo**bar*\n", "html": "

    foo**bar

    \n", "example": 411, "start_line": 6895, "end_line": 6899, "section": "Emphasis and strong emphasis" }, { "markdown": "***foo** bar*\n", "html": "

    foo bar

    \n", "example": 412, "start_line": 6908, "end_line": 6912, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo **bar***\n", "html": "

    foo bar

    \n", "example": 413, "start_line": 6915, "end_line": 6919, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo**bar***\n", "html": "

    foobar

    \n", "example": 414, "start_line": 6922, "end_line": 6926, "section": "Emphasis and strong emphasis" }, { "markdown": "foo***bar***baz\n", "html": "

    foobarbaz

    \n", "example": 415, "start_line": 6933, "end_line": 6937, "section": "Emphasis and strong emphasis" }, { "markdown": "foo******bar*********baz\n", "html": "

    foobar***baz

    \n", "example": 416, "start_line": 6939, "end_line": 6943, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo **bar *baz* bim** bop*\n", "html": "

    foo bar baz bim bop

    \n", "example": 417, "start_line": 6948, "end_line": 6952, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo [*bar*](/url)*\n", "html": "

    foo bar

    \n", "example": 418, "start_line": 6955, "end_line": 6959, "section": "Emphasis and strong emphasis" }, { "markdown": "** is not an empty emphasis\n", "html": "

    ** is not an empty emphasis

    \n", "example": 419, "start_line": 6964, "end_line": 6968, "section": "Emphasis and strong emphasis" }, { "markdown": "**** is not an empty strong emphasis\n", "html": "

    **** is not an empty strong emphasis

    \n", "example": 420, "start_line": 6971, "end_line": 6975, "section": "Emphasis and strong emphasis" }, { "markdown": "**foo [bar](/url)**\n", "html": "

    foo bar

    \n", "example": 421, "start_line": 6984, "end_line": 6988, "section": "Emphasis and strong emphasis" }, { "markdown": "**foo\nbar**\n", "html": "

    foo\nbar

    \n", "example": 422, "start_line": 6991, "end_line": 6997, "section": "Emphasis and strong emphasis" }, { "markdown": "__foo _bar_ baz__\n", "html": "

    foo bar baz

    \n", "example": 423, "start_line": 7003, "end_line": 7007, "section": "Emphasis and strong emphasis" }, { "markdown": "__foo __bar__ baz__\n", "html": "

    foo bar baz

    \n", "example": 424, "start_line": 7010, "end_line": 7014, "section": "Emphasis and strong emphasis" }, { "markdown": "____foo__ bar__\n", "html": "

    foo bar

    \n", "example": 425, "start_line": 7017, "end_line": 7021, "section": "Emphasis and strong emphasis" }, { "markdown": "**foo **bar****\n", "html": "

    foo bar

    \n", "example": 426, "start_line": 7024, "end_line": 7028, "section": "Emphasis and strong emphasis" }, { "markdown": "**foo *bar* baz**\n", "html": "

    foo bar baz

    \n", "example": 427, "start_line": 7031, "end_line": 7035, "section": "Emphasis and strong emphasis" }, { "markdown": "**foo*bar*baz**\n", "html": "

    foobarbaz

    \n", "example": 428, "start_line": 7038, "end_line": 7042, "section": "Emphasis and strong emphasis" }, { "markdown": "***foo* bar**\n", "html": "

    foo bar

    \n", "example": 429, "start_line": 7045, "end_line": 7049, "section": "Emphasis and strong emphasis" }, { "markdown": "**foo *bar***\n", "html": "

    foo bar

    \n", "example": 430, "start_line": 7052, "end_line": 7056, "section": "Emphasis and strong emphasis" }, { "markdown": "**foo *bar **baz**\nbim* bop**\n", "html": "

    foo bar baz\nbim bop

    \n", "example": 431, "start_line": 7061, "end_line": 7067, "section": "Emphasis and strong emphasis" }, { "markdown": "**foo [*bar*](/url)**\n", "html": "

    foo bar

    \n", "example": 432, "start_line": 7070, "end_line": 7074, "section": "Emphasis and strong emphasis" }, { "markdown": "__ is not an empty emphasis\n", "html": "

    __ is not an empty emphasis

    \n", "example": 433, "start_line": 7079, "end_line": 7083, "section": "Emphasis and strong emphasis" }, { "markdown": "____ is not an empty strong emphasis\n", "html": "

    ____ is not an empty strong emphasis

    \n", "example": 434, "start_line": 7086, "end_line": 7090, "section": "Emphasis and strong emphasis" }, { "markdown": "foo ***\n", "html": "

    foo ***

    \n", "example": 435, "start_line": 7096, "end_line": 7100, "section": "Emphasis and strong emphasis" }, { "markdown": "foo *\\**\n", "html": "

    foo *

    \n", "example": 436, "start_line": 7103, "end_line": 7107, "section": "Emphasis and strong emphasis" }, { "markdown": "foo *_*\n", "html": "

    foo _

    \n", "example": 437, "start_line": 7110, "end_line": 7114, "section": "Emphasis and strong emphasis" }, { "markdown": "foo *****\n", "html": "

    foo *****

    \n", "example": 438, "start_line": 7117, "end_line": 7121, "section": "Emphasis and strong emphasis" }, { "markdown": "foo **\\***\n", "html": "

    foo *

    \n", "example": 439, "start_line": 7124, "end_line": 7128, "section": "Emphasis and strong emphasis" }, { "markdown": "foo **_**\n", "html": "

    foo _

    \n", "example": 440, "start_line": 7131, "end_line": 7135, "section": "Emphasis and strong emphasis" }, { "markdown": "**foo*\n", "html": "

    *foo

    \n", "example": 441, "start_line": 7142, "end_line": 7146, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo**\n", "html": "

    foo*

    \n", "example": 442, "start_line": 7149, "end_line": 7153, "section": "Emphasis and strong emphasis" }, { "markdown": "***foo**\n", "html": "

    *foo

    \n", "example": 443, "start_line": 7156, "end_line": 7160, "section": "Emphasis and strong emphasis" }, { "markdown": "****foo*\n", "html": "

    ***foo

    \n", "example": 444, "start_line": 7163, "end_line": 7167, "section": "Emphasis and strong emphasis" }, { "markdown": "**foo***\n", "html": "

    foo*

    \n", "example": 445, "start_line": 7170, "end_line": 7174, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo****\n", "html": "

    foo***

    \n", "example": 446, "start_line": 7177, "end_line": 7181, "section": "Emphasis and strong emphasis" }, { "markdown": "foo ___\n", "html": "

    foo ___

    \n", "example": 447, "start_line": 7187, "end_line": 7191, "section": "Emphasis and strong emphasis" }, { "markdown": "foo _\\__\n", "html": "

    foo _

    \n", "example": 448, "start_line": 7194, "end_line": 7198, "section": "Emphasis and strong emphasis" }, { "markdown": "foo _*_\n", "html": "

    foo *

    \n", "example": 449, "start_line": 7201, "end_line": 7205, "section": "Emphasis and strong emphasis" }, { "markdown": "foo _____\n", "html": "

    foo _____

    \n", "example": 450, "start_line": 7208, "end_line": 7212, "section": "Emphasis and strong emphasis" }, { "markdown": "foo __\\___\n", "html": "

    foo _

    \n", "example": 451, "start_line": 7215, "end_line": 7219, "section": "Emphasis and strong emphasis" }, { "markdown": "foo __*__\n", "html": "

    foo *

    \n", "example": 452, "start_line": 7222, "end_line": 7226, "section": "Emphasis and strong emphasis" }, { "markdown": "__foo_\n", "html": "

    _foo

    \n", "example": 453, "start_line": 7229, "end_line": 7233, "section": "Emphasis and strong emphasis" }, { "markdown": "_foo__\n", "html": "

    foo_

    \n", "example": 454, "start_line": 7240, "end_line": 7244, "section": "Emphasis and strong emphasis" }, { "markdown": "___foo__\n", "html": "

    _foo

    \n", "example": 455, "start_line": 7247, "end_line": 7251, "section": "Emphasis and strong emphasis" }, { "markdown": "____foo_\n", "html": "

    ___foo

    \n", "example": 456, "start_line": 7254, "end_line": 7258, "section": "Emphasis and strong emphasis" }, { "markdown": "__foo___\n", "html": "

    foo_

    \n", "example": 457, "start_line": 7261, "end_line": 7265, "section": "Emphasis and strong emphasis" }, { "markdown": "_foo____\n", "html": "

    foo___

    \n", "example": 458, "start_line": 7268, "end_line": 7272, "section": "Emphasis and strong emphasis" }, { "markdown": "**foo**\n", "html": "

    foo

    \n", "example": 459, "start_line": 7278, "end_line": 7282, "section": "Emphasis and strong emphasis" }, { "markdown": "*_foo_*\n", "html": "

    foo

    \n", "example": 460, "start_line": 7285, "end_line": 7289, "section": "Emphasis and strong emphasis" }, { "markdown": "__foo__\n", "html": "

    foo

    \n", "example": 461, "start_line": 7292, "end_line": 7296, "section": "Emphasis and strong emphasis" }, { "markdown": "_*foo*_\n", "html": "

    foo

    \n", "example": 462, "start_line": 7299, "end_line": 7303, "section": "Emphasis and strong emphasis" }, { "markdown": "****foo****\n", "html": "

    foo

    \n", "example": 463, "start_line": 7309, "end_line": 7313, "section": "Emphasis and strong emphasis" }, { "markdown": "____foo____\n", "html": "

    foo

    \n", "example": 464, "start_line": 7316, "end_line": 7320, "section": "Emphasis and strong emphasis" }, { "markdown": "******foo******\n", "html": "

    foo

    \n", "example": 465, "start_line": 7327, "end_line": 7331, "section": "Emphasis and strong emphasis" }, { "markdown": "***foo***\n", "html": "

    foo

    \n", "example": 466, "start_line": 7336, "end_line": 7340, "section": "Emphasis and strong emphasis" }, { "markdown": "_____foo_____\n", "html": "

    foo

    \n", "example": 467, "start_line": 7343, "end_line": 7347, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo _bar* baz_\n", "html": "

    foo _bar baz_

    \n", "example": 468, "start_line": 7352, "end_line": 7356, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo __bar *baz bim__ bam*\n", "html": "

    foo bar *baz bim bam

    \n", "example": 469, "start_line": 7359, "end_line": 7363, "section": "Emphasis and strong emphasis" }, { "markdown": "**foo **bar baz**\n", "html": "

    **foo bar baz

    \n", "example": 470, "start_line": 7368, "end_line": 7372, "section": "Emphasis and strong emphasis" }, { "markdown": "*foo *bar baz*\n", "html": "

    *foo bar baz

    \n", "example": 471, "start_line": 7375, "end_line": 7379, "section": "Emphasis and strong emphasis" }, { "markdown": "*[bar*](/url)\n", "html": "

    *bar*

    \n", "example": 472, "start_line": 7384, "end_line": 7388, "section": "Emphasis and strong emphasis" }, { "markdown": "_foo [bar_](/url)\n", "html": "

    _foo bar_

    \n", "example": 473, "start_line": 7391, "end_line": 7395, "section": "Emphasis and strong emphasis" }, { "markdown": "*\n", "html": "

    *

    \n", "example": 474, "start_line": 7398, "end_line": 7402, "section": "Emphasis and strong emphasis" }, { "markdown": "**\n", "html": "

    **

    \n", "example": 475, "start_line": 7405, "end_line": 7409, "section": "Emphasis and strong emphasis" }, { "markdown": "__\n", "html": "

    __

    \n", "example": 476, "start_line": 7412, "end_line": 7416, "section": "Emphasis and strong emphasis" }, { "markdown": "*a `*`*\n", "html": "

    a *

    \n", "example": 477, "start_line": 7419, "end_line": 7423, "section": "Emphasis and strong emphasis" }, { "markdown": "_a `_`_\n", "html": "

    a _

    \n", "example": 478, "start_line": 7426, "end_line": 7430, "section": "Emphasis and strong emphasis" }, { "markdown": "**a\n", "html": "

    **ahttp://foo.bar/?q=**

    \n", "example": 479, "start_line": 7433, "end_line": 7437, "section": "Emphasis and strong emphasis" }, { "markdown": "__a\n", "html": "

    __ahttp://foo.bar/?q=__

    \n", "example": 480, "start_line": 7440, "end_line": 7444, "section": "Emphasis and strong emphasis" }, { "markdown": "[link](/uri \"title\")\n", "html": "

    link

    \n", "example": 481, "start_line": 7528, "end_line": 7532, "section": "Links" }, { "markdown": "[link](/uri)\n", "html": "

    link

    \n", "example": 482, "start_line": 7538, "end_line": 7542, "section": "Links" }, { "markdown": "[](./target.md)\n", "html": "

    \n", "example": 483, "start_line": 7544, "end_line": 7548, "section": "Links" }, { "markdown": "[link]()\n", "html": "

    link

    \n", "example": 484, "start_line": 7551, "end_line": 7555, "section": "Links" }, { "markdown": "[link](<>)\n", "html": "

    link

    \n", "example": 485, "start_line": 7558, "end_line": 7562, "section": "Links" }, { "markdown": "[]()\n", "html": "

    \n", "example": 486, "start_line": 7565, "end_line": 7569, "section": "Links" }, { "markdown": "[link](/my uri)\n", "html": "

    [link](/my uri)

    \n", "example": 487, "start_line": 7574, "end_line": 7578, "section": "Links" }, { "markdown": "[link](
    )\n", "html": "

    link

    \n", "example": 488, "start_line": 7580, "end_line": 7584, "section": "Links" }, { "markdown": "[link](foo\nbar)\n", "html": "

    [link](foo\nbar)

    \n", "example": 489, "start_line": 7589, "end_line": 7595, "section": "Links" }, { "markdown": "[link]()\n", "html": "

    [link]()

    \n", "example": 490, "start_line": 7597, "end_line": 7603, "section": "Links" }, { "markdown": "[a]()\n", "html": "

    a

    \n", "example": 491, "start_line": 7608, "end_line": 7612, "section": "Links" }, { "markdown": "[link]()\n", "html": "

    [link](<foo>)

    \n", "example": 492, "start_line": 7616, "end_line": 7620, "section": "Links" }, { "markdown": "[a](\n[a](c)\n", "html": "

    [a](<b)c\n[a](<b)c>\n[a](c)

    \n", "example": 493, "start_line": 7625, "end_line": 7633, "section": "Links" }, { "markdown": "[link](\\(foo\\))\n", "html": "

    link

    \n", "example": 494, "start_line": 7637, "end_line": 7641, "section": "Links" }, { "markdown": "[link](foo(and(bar)))\n", "html": "

    link

    \n", "example": 495, "start_line": 7646, "end_line": 7650, "section": "Links" }, { "markdown": "[link](foo(and(bar))\n", "html": "

    [link](foo(and(bar))

    \n", "example": 496, "start_line": 7655, "end_line": 7659, "section": "Links" }, { "markdown": "[link](foo\\(and\\(bar\\))\n", "html": "

    link

    \n", "example": 497, "start_line": 7662, "end_line": 7666, "section": "Links" }, { "markdown": "[link]()\n", "html": "

    link

    \n", "example": 498, "start_line": 7669, "end_line": 7673, "section": "Links" }, { "markdown": "[link](foo\\)\\:)\n", "html": "

    link

    \n", "example": 499, "start_line": 7679, "end_line": 7683, "section": "Links" }, { "markdown": "[link](#fragment)\n\n[link](http://example.com#fragment)\n\n[link](http://example.com?foo=3#frag)\n", "html": "

    link

    \n

    link

    \n

    link

    \n", "example": 500, "start_line": 7688, "end_line": 7698, "section": "Links" }, { "markdown": "[link](foo\\bar)\n", "html": "

    link

    \n", "example": 501, "start_line": 7704, "end_line": 7708, "section": "Links" }, { "markdown": "[link](foo%20bä)\n", "html": "

    link

    \n", "example": 502, "start_line": 7720, "end_line": 7724, "section": "Links" }, { "markdown": "[link](\"title\")\n", "html": "

    link

    \n", "example": 503, "start_line": 7731, "end_line": 7735, "section": "Links" }, { "markdown": "[link](/url \"title\")\n[link](/url 'title')\n[link](/url (title))\n", "html": "

    link\nlink\nlink

    \n", "example": 504, "start_line": 7740, "end_line": 7748, "section": "Links" }, { "markdown": "[link](/url \"title \\\""\")\n", "html": "

    link

    \n", "example": 505, "start_line": 7754, "end_line": 7758, "section": "Links" }, { "markdown": "[link](/url \"title\")\n", "html": "

    link

    \n", "example": 506, "start_line": 7765, "end_line": 7769, "section": "Links" }, { "markdown": "[link](/url \"title \"and\" title\")\n", "html": "

    [link](/url "title "and" title")

    \n", "example": 507, "start_line": 7774, "end_line": 7778, "section": "Links" }, { "markdown": "[link](/url 'title \"and\" title')\n", "html": "

    link

    \n", "example": 508, "start_line": 7783, "end_line": 7787, "section": "Links" }, { "markdown": "[link]( /uri\n \"title\" )\n", "html": "

    link

    \n", "example": 509, "start_line": 7808, "end_line": 7813, "section": "Links" }, { "markdown": "[link] (/uri)\n", "html": "

    [link] (/uri)

    \n", "example": 510, "start_line": 7819, "end_line": 7823, "section": "Links" }, { "markdown": "[link [foo [bar]]](/uri)\n", "html": "

    link [foo [bar]]

    \n", "example": 511, "start_line": 7829, "end_line": 7833, "section": "Links" }, { "markdown": "[link] bar](/uri)\n", "html": "

    [link] bar](/uri)

    \n", "example": 512, "start_line": 7836, "end_line": 7840, "section": "Links" }, { "markdown": "[link [bar](/uri)\n", "html": "

    [link bar

    \n", "example": 513, "start_line": 7843, "end_line": 7847, "section": "Links" }, { "markdown": "[link \\[bar](/uri)\n", "html": "

    link [bar

    \n", "example": 514, "start_line": 7850, "end_line": 7854, "section": "Links" }, { "markdown": "[link *foo **bar** `#`*](/uri)\n", "html": "

    link foo bar #

    \n", "example": 515, "start_line": 7859, "end_line": 7863, "section": "Links" }, { "markdown": "[![moon](moon.jpg)](/uri)\n", "html": "

    \"moon\"

    \n", "example": 516, "start_line": 7866, "end_line": 7870, "section": "Links" }, { "markdown": "[foo [bar](/uri)](/uri)\n", "html": "

    [foo bar](/uri)

    \n", "example": 517, "start_line": 7875, "end_line": 7879, "section": "Links" }, { "markdown": "[foo *[bar [baz](/uri)](/uri)*](/uri)\n", "html": "

    [foo [bar baz](/uri)](/uri)

    \n", "example": 518, "start_line": 7882, "end_line": 7886, "section": "Links" }, { "markdown": "![[[foo](uri1)](uri2)](uri3)\n", "html": "

    \"[foo](uri2)\"

    \n", "example": 519, "start_line": 7889, "end_line": 7893, "section": "Links" }, { "markdown": "*[foo*](/uri)\n", "html": "

    *foo*

    \n", "example": 520, "start_line": 7899, "end_line": 7903, "section": "Links" }, { "markdown": "[foo *bar](baz*)\n", "html": "

    foo *bar

    \n", "example": 521, "start_line": 7906, "end_line": 7910, "section": "Links" }, { "markdown": "*foo [bar* baz]\n", "html": "

    foo [bar baz]

    \n", "example": 522, "start_line": 7916, "end_line": 7920, "section": "Links" }, { "markdown": "[foo \n", "html": "

    [foo

    \n", "example": 523, "start_line": 7926, "end_line": 7930, "section": "Links" }, { "markdown": "[foo`](/uri)`\n", "html": "

    [foo](/uri)

    \n", "example": 524, "start_line": 7933, "end_line": 7937, "section": "Links" }, { "markdown": "[foo\n", "html": "

    [foohttp://example.com/?search=](uri)

    \n", "example": 525, "start_line": 7940, "end_line": 7944, "section": "Links" }, { "markdown": "[foo][bar]\n\n[bar]: /url \"title\"\n", "html": "

    foo

    \n", "example": 526, "start_line": 7978, "end_line": 7984, "section": "Links" }, { "markdown": "[link [foo [bar]]][ref]\n\n[ref]: /uri\n", "html": "

    link [foo [bar]]

    \n", "example": 527, "start_line": 7993, "end_line": 7999, "section": "Links" }, { "markdown": "[link \\[bar][ref]\n\n[ref]: /uri\n", "html": "

    link [bar

    \n", "example": 528, "start_line": 8002, "end_line": 8008, "section": "Links" }, { "markdown": "[link *foo **bar** `#`*][ref]\n\n[ref]: /uri\n", "html": "

    link foo bar #

    \n", "example": 529, "start_line": 8013, "end_line": 8019, "section": "Links" }, { "markdown": "[![moon](moon.jpg)][ref]\n\n[ref]: /uri\n", "html": "

    \"moon\"

    \n", "example": 530, "start_line": 8022, "end_line": 8028, "section": "Links" }, { "markdown": "[foo [bar](/uri)][ref]\n\n[ref]: /uri\n", "html": "

    [foo bar]ref

    \n", "example": 531, "start_line": 8033, "end_line": 8039, "section": "Links" }, { "markdown": "[foo *bar [baz][ref]*][ref]\n\n[ref]: /uri\n", "html": "

    [foo bar baz]ref

    \n", "example": 532, "start_line": 8042, "end_line": 8048, "section": "Links" }, { "markdown": "*[foo*][ref]\n\n[ref]: /uri\n", "html": "

    *foo*

    \n", "example": 533, "start_line": 8057, "end_line": 8063, "section": "Links" }, { "markdown": "[foo *bar][ref]*\n\n[ref]: /uri\n", "html": "

    foo *bar*

    \n", "example": 534, "start_line": 8066, "end_line": 8072, "section": "Links" }, { "markdown": "[foo \n\n[ref]: /uri\n", "html": "

    [foo

    \n", "example": 535, "start_line": 8078, "end_line": 8084, "section": "Links" }, { "markdown": "[foo`][ref]`\n\n[ref]: /uri\n", "html": "

    [foo][ref]

    \n", "example": 536, "start_line": 8087, "end_line": 8093, "section": "Links" }, { "markdown": "[foo\n\n[ref]: /uri\n", "html": "

    [foohttp://example.com/?search=][ref]

    \n", "example": 537, "start_line": 8096, "end_line": 8102, "section": "Links" }, { "markdown": "[foo][BaR]\n\n[bar]: /url \"title\"\n", "html": "

    foo

    \n", "example": 538, "start_line": 8107, "end_line": 8113, "section": "Links" }, { "markdown": "[ẞ]\n\n[SS]: /url\n", "html": "

    \n", "example": 539, "start_line": 8118, "end_line": 8124, "section": "Links" }, { "markdown": "[Foo\n bar]: /url\n\n[Baz][Foo bar]\n", "html": "

    Baz

    \n", "example": 540, "start_line": 8130, "end_line": 8137, "section": "Links" }, { "markdown": "[foo] [bar]\n\n[bar]: /url \"title\"\n", "html": "

    [foo] bar

    \n", "example": 541, "start_line": 8143, "end_line": 8149, "section": "Links" }, { "markdown": "[foo]\n[bar]\n\n[bar]: /url \"title\"\n", "html": "

    [foo]\nbar

    \n", "example": 542, "start_line": 8152, "end_line": 8160, "section": "Links" }, { "markdown": "[foo]: /url1\n\n[foo]: /url2\n\n[bar][foo]\n", "html": "

    bar

    \n", "example": 543, "start_line": 8193, "end_line": 8201, "section": "Links" }, { "markdown": "[bar][foo\\!]\n\n[foo!]: /url\n", "html": "

    [bar][foo!]

    \n", "example": 544, "start_line": 8208, "end_line": 8214, "section": "Links" }, { "markdown": "[foo][ref[]\n\n[ref[]: /uri\n", "html": "

    [foo][ref[]

    \n

    [ref[]: /uri

    \n", "example": 545, "start_line": 8220, "end_line": 8227, "section": "Links" }, { "markdown": "[foo][ref[bar]]\n\n[ref[bar]]: /uri\n", "html": "

    [foo][ref[bar]]

    \n

    [ref[bar]]: /uri

    \n", "example": 546, "start_line": 8230, "end_line": 8237, "section": "Links" }, { "markdown": "[[[foo]]]\n\n[[[foo]]]: /url\n", "html": "

    [[[foo]]]

    \n

    [[[foo]]]: /url

    \n", "example": 547, "start_line": 8240, "end_line": 8247, "section": "Links" }, { "markdown": "[foo][ref\\[]\n\n[ref\\[]: /uri\n", "html": "

    foo

    \n", "example": 548, "start_line": 8250, "end_line": 8256, "section": "Links" }, { "markdown": "[bar\\\\]: /uri\n\n[bar\\\\]\n", "html": "

    bar\\

    \n", "example": 549, "start_line": 8261, "end_line": 8267, "section": "Links" }, { "markdown": "[]\n\n[]: /uri\n", "html": "

    []

    \n

    []: /uri

    \n", "example": 550, "start_line": 8273, "end_line": 8280, "section": "Links" }, { "markdown": "[\n ]\n\n[\n ]: /uri\n", "html": "

    [\n]

    \n

    [\n]: /uri

    \n", "example": 551, "start_line": 8283, "end_line": 8294, "section": "Links" }, { "markdown": "[foo][]\n\n[foo]: /url \"title\"\n", "html": "

    foo

    \n", "example": 552, "start_line": 8306, "end_line": 8312, "section": "Links" }, { "markdown": "[*foo* bar][]\n\n[*foo* bar]: /url \"title\"\n", "html": "

    foo bar

    \n", "example": 553, "start_line": 8315, "end_line": 8321, "section": "Links" }, { "markdown": "[Foo][]\n\n[foo]: /url \"title\"\n", "html": "

    Foo

    \n", "example": 554, "start_line": 8326, "end_line": 8332, "section": "Links" }, { "markdown": "[foo] \n[]\n\n[foo]: /url \"title\"\n", "html": "

    foo\n[]

    \n", "example": 555, "start_line": 8339, "end_line": 8347, "section": "Links" }, { "markdown": "[foo]\n\n[foo]: /url \"title\"\n", "html": "

    foo

    \n", "example": 556, "start_line": 8359, "end_line": 8365, "section": "Links" }, { "markdown": "[*foo* bar]\n\n[*foo* bar]: /url \"title\"\n", "html": "

    foo bar

    \n", "example": 557, "start_line": 8368, "end_line": 8374, "section": "Links" }, { "markdown": "[[*foo* bar]]\n\n[*foo* bar]: /url \"title\"\n", "html": "

    [foo bar]

    \n", "example": 558, "start_line": 8377, "end_line": 8383, "section": "Links" }, { "markdown": "[[bar [foo]\n\n[foo]: /url\n", "html": "

    [[bar foo

    \n", "example": 559, "start_line": 8386, "end_line": 8392, "section": "Links" }, { "markdown": "[Foo]\n\n[foo]: /url \"title\"\n", "html": "

    Foo

    \n", "example": 560, "start_line": 8397, "end_line": 8403, "section": "Links" }, { "markdown": "[foo] bar\n\n[foo]: /url\n", "html": "

    foo bar

    \n", "example": 561, "start_line": 8408, "end_line": 8414, "section": "Links" }, { "markdown": "\\[foo]\n\n[foo]: /url \"title\"\n", "html": "

    [foo]

    \n", "example": 562, "start_line": 8420, "end_line": 8426, "section": "Links" }, { "markdown": "[foo*]: /url\n\n*[foo*]\n", "html": "

    *foo*

    \n", "example": 563, "start_line": 8432, "end_line": 8438, "section": "Links" }, { "markdown": "[foo][bar]\n\n[foo]: /url1\n[bar]: /url2\n", "html": "

    foo

    \n", "example": 564, "start_line": 8444, "end_line": 8451, "section": "Links" }, { "markdown": "[foo][]\n\n[foo]: /url1\n", "html": "

    foo

    \n", "example": 565, "start_line": 8453, "end_line": 8459, "section": "Links" }, { "markdown": "[foo]()\n\n[foo]: /url1\n", "html": "

    foo

    \n", "example": 566, "start_line": 8463, "end_line": 8469, "section": "Links" }, { "markdown": "[foo](not a link)\n\n[foo]: /url1\n", "html": "

    foo(not a link)

    \n", "example": 567, "start_line": 8471, "end_line": 8477, "section": "Links" }, { "markdown": "[foo][bar][baz]\n\n[baz]: /url\n", "html": "

    [foo]bar

    \n", "example": 568, "start_line": 8482, "end_line": 8488, "section": "Links" }, { "markdown": "[foo][bar][baz]\n\n[baz]: /url1\n[bar]: /url2\n", "html": "

    foobaz

    \n", "example": 569, "start_line": 8494, "end_line": 8501, "section": "Links" }, { "markdown": "[foo][bar][baz]\n\n[baz]: /url1\n[foo]: /url2\n", "html": "

    [foo]bar

    \n", "example": 570, "start_line": 8507, "end_line": 8514, "section": "Links" }, { "markdown": "![foo](/url \"title\")\n", "html": "

    \"foo\"

    \n", "example": 571, "start_line": 8530, "end_line": 8534, "section": "Images" }, { "markdown": "![foo *bar*]\n\n[foo *bar*]: train.jpg \"train & tracks\"\n", "html": "

    \"foo

    \n", "example": 572, "start_line": 8537, "end_line": 8543, "section": "Images" }, { "markdown": "![foo ![bar](/url)](/url2)\n", "html": "

    \"foo

    \n", "example": 573, "start_line": 8546, "end_line": 8550, "section": "Images" }, { "markdown": "![foo [bar](/url)](/url2)\n", "html": "

    \"foo

    \n", "example": 574, "start_line": 8553, "end_line": 8557, "section": "Images" }, { "markdown": "![foo *bar*][]\n\n[foo *bar*]: train.jpg \"train & tracks\"\n", "html": "

    \"foo

    \n", "example": 575, "start_line": 8567, "end_line": 8573, "section": "Images" }, { "markdown": "![foo *bar*][foobar]\n\n[FOOBAR]: train.jpg \"train & tracks\"\n", "html": "

    \"foo

    \n", "example": 576, "start_line": 8576, "end_line": 8582, "section": "Images" }, { "markdown": "![foo](train.jpg)\n", "html": "

    \"foo\"

    \n", "example": 577, "start_line": 8585, "end_line": 8589, "section": "Images" }, { "markdown": "My ![foo bar](/path/to/train.jpg \"title\" )\n", "html": "

    My \"foo

    \n", "example": 578, "start_line": 8592, "end_line": 8596, "section": "Images" }, { "markdown": "![foo]()\n", "html": "

    \"foo\"

    \n", "example": 579, "start_line": 8599, "end_line": 8603, "section": "Images" }, { "markdown": "![](/url)\n", "html": "

    \"\"

    \n", "example": 580, "start_line": 8606, "end_line": 8610, "section": "Images" }, { "markdown": "![foo][bar]\n\n[bar]: /url\n", "html": "

    \"foo\"

    \n", "example": 581, "start_line": 8615, "end_line": 8621, "section": "Images" }, { "markdown": "![foo][bar]\n\n[BAR]: /url\n", "html": "

    \"foo\"

    \n", "example": 582, "start_line": 8624, "end_line": 8630, "section": "Images" }, { "markdown": "![foo][]\n\n[foo]: /url \"title\"\n", "html": "

    \"foo\"

    \n", "example": 583, "start_line": 8635, "end_line": 8641, "section": "Images" }, { "markdown": "![*foo* bar][]\n\n[*foo* bar]: /url \"title\"\n", "html": "

    \"foo

    \n", "example": 584, "start_line": 8644, "end_line": 8650, "section": "Images" }, { "markdown": "![Foo][]\n\n[foo]: /url \"title\"\n", "html": "

    \"Foo\"

    \n", "example": 585, "start_line": 8655, "end_line": 8661, "section": "Images" }, { "markdown": "![foo] \n[]\n\n[foo]: /url \"title\"\n", "html": "

    \"foo\"\n[]

    \n", "example": 586, "start_line": 8667, "end_line": 8675, "section": "Images" }, { "markdown": "![foo]\n\n[foo]: /url \"title\"\n", "html": "

    \"foo\"

    \n", "example": 587, "start_line": 8680, "end_line": 8686, "section": "Images" }, { "markdown": "![*foo* bar]\n\n[*foo* bar]: /url \"title\"\n", "html": "

    \"foo

    \n", "example": 588, "start_line": 8689, "end_line": 8695, "section": "Images" }, { "markdown": "![[foo]]\n\n[[foo]]: /url \"title\"\n", "html": "

    ![[foo]]

    \n

    [[foo]]: /url "title"

    \n", "example": 589, "start_line": 8700, "end_line": 8707, "section": "Images" }, { "markdown": "![Foo]\n\n[foo]: /url \"title\"\n", "html": "

    \"Foo\"

    \n", "example": 590, "start_line": 8712, "end_line": 8718, "section": "Images" }, { "markdown": "!\\[foo]\n\n[foo]: /url \"title\"\n", "html": "

    ![foo]

    \n", "example": 591, "start_line": 8724, "end_line": 8730, "section": "Images" }, { "markdown": "\\![foo]\n\n[foo]: /url \"title\"\n", "html": "

    !foo

    \n", "example": 592, "start_line": 8736, "end_line": 8742, "section": "Images" }, { "markdown": "\n", "html": "

    http://foo.bar.baz

    \n", "example": 593, "start_line": 8769, "end_line": 8773, "section": "Autolinks" }, { "markdown": "\n", "html": "

    http://foo.bar.baz/test?q=hello&id=22&boolean

    \n", "example": 594, "start_line": 8776, "end_line": 8780, "section": "Autolinks" }, { "markdown": "\n", "html": "

    irc://foo.bar:2233/baz

    \n", "example": 595, "start_line": 8783, "end_line": 8787, "section": "Autolinks" }, { "markdown": "\n", "html": "

    MAILTO:FOO@BAR.BAZ

    \n", "example": 596, "start_line": 8792, "end_line": 8796, "section": "Autolinks" }, { "markdown": "\n", "html": "

    a+b+c:d

    \n", "example": 597, "start_line": 8804, "end_line": 8808, "section": "Autolinks" }, { "markdown": "\n", "html": "

    made-up-scheme://foo,bar

    \n", "example": 598, "start_line": 8811, "end_line": 8815, "section": "Autolinks" }, { "markdown": "\n", "html": "

    http://../

    \n", "example": 599, "start_line": 8818, "end_line": 8822, "section": "Autolinks" }, { "markdown": "\n", "html": "

    localhost:5001/foo

    \n", "example": 600, "start_line": 8825, "end_line": 8829, "section": "Autolinks" }, { "markdown": "\n", "html": "

    <http://foo.bar/baz bim>

    \n", "example": 601, "start_line": 8834, "end_line": 8838, "section": "Autolinks" }, { "markdown": "\n", "html": "

    http://example.com/\\[\\

    \n", "example": 602, "start_line": 8843, "end_line": 8847, "section": "Autolinks" }, { "markdown": "\n", "html": "

    foo@bar.example.com

    \n", "example": 603, "start_line": 8865, "end_line": 8869, "section": "Autolinks" }, { "markdown": "\n", "html": "

    foo+special@Bar.baz-bar0.com

    \n", "example": 604, "start_line": 8872, "end_line": 8876, "section": "Autolinks" }, { "markdown": "\n", "html": "

    <foo+@bar.example.com>

    \n", "example": 605, "start_line": 8881, "end_line": 8885, "section": "Autolinks" }, { "markdown": "<>\n", "html": "

    <>

    \n", "example": 606, "start_line": 8890, "end_line": 8894, "section": "Autolinks" }, { "markdown": "< http://foo.bar >\n", "html": "

    < http://foo.bar >

    \n", "example": 607, "start_line": 8897, "end_line": 8901, "section": "Autolinks" }, { "markdown": "\n", "html": "

    <m:abc>

    \n", "example": 608, "start_line": 8904, "end_line": 8908, "section": "Autolinks" }, { "markdown": "\n", "html": "

    <foo.bar.baz>

    \n", "example": 609, "start_line": 8911, "end_line": 8915, "section": "Autolinks" }, { "markdown": "http://example.com\n", "html": "

    http://example.com

    \n", "example": 610, "start_line": 8918, "end_line": 8922, "section": "Autolinks" }, { "markdown": "foo@bar.example.com\n", "html": "

    foo@bar.example.com

    \n", "example": 611, "start_line": 8925, "end_line": 8929, "section": "Autolinks" }, { "markdown": "\n", "html": "

    \n", "example": 612, "start_line": 9006, "end_line": 9010, "section": "Raw HTML" }, { "markdown": "\n", "html": "

    \n", "example": 613, "start_line": 9015, "end_line": 9019, "section": "Raw HTML" }, { "markdown": "\n", "html": "

    \n", "example": 614, "start_line": 9024, "end_line": 9030, "section": "Raw HTML" }, { "markdown": "\n", "html": "

    \n", "example": 615, "start_line": 9035, "end_line": 9041, "section": "Raw HTML" }, { "markdown": "Foo \n", "html": "

    Foo

    \n", "example": 616, "start_line": 9046, "end_line": 9050, "section": "Raw HTML" }, { "markdown": "<33> <__>\n", "html": "

    <33> <__>

    \n", "example": 617, "start_line": 9055, "end_line": 9059, "section": "Raw HTML" }, { "markdown": "
    \n", "html": "

    <a h*#ref="hi">

    \n", "example": 618, "start_line": 9064, "end_line": 9068, "section": "Raw HTML" }, { "markdown": "
    \n", "html": "

    <a href="hi'> <a href=hi'>

    \n", "example": 619, "start_line": 9073, "end_line": 9077, "section": "Raw HTML" }, { "markdown": "< a><\nfoo>\n\n", "html": "

    < a><\nfoo><bar/ >\n<foo bar=baz\nbim!bop />

    \n", "example": 620, "start_line": 9082, "end_line": 9092, "section": "Raw HTML" }, { "markdown": "
    \n", "html": "

    <a href='bar'title=title>

    \n", "example": 621, "start_line": 9097, "end_line": 9101, "section": "Raw HTML" }, { "markdown": "
    \n", "html": "

    \n", "example": 622, "start_line": 9106, "end_line": 9110, "section": "Raw HTML" }, { "markdown": "\n", "html": "

    </a href="foo">

    \n", "example": 623, "start_line": 9115, "end_line": 9119, "section": "Raw HTML" }, { "markdown": "foo \n", "html": "

    foo

    \n", "example": 624, "start_line": 9124, "end_line": 9130, "section": "Raw HTML" }, { "markdown": "foo \n", "html": "

    foo <!-- not a comment -- two hyphens -->

    \n", "example": 625, "start_line": 9133, "end_line": 9137, "section": "Raw HTML" }, { "markdown": "foo foo -->\n\nfoo \n", "html": "

    foo <!--> foo -->

    \n

    foo <!-- foo--->

    \n", "example": 626, "start_line": 9142, "end_line": 9149, "section": "Raw HTML" }, { "markdown": "foo \n", "html": "

    foo

    \n", "example": 627, "start_line": 9154, "end_line": 9158, "section": "Raw HTML" }, { "markdown": "foo \n", "html": "

    foo

    \n", "example": 628, "start_line": 9163, "end_line": 9167, "section": "Raw HTML" }, { "markdown": "foo &<]]>\n", "html": "

    foo &<]]>

    \n", "example": 629, "start_line": 9172, "end_line": 9176, "section": "Raw HTML" }, { "markdown": "foo \n", "html": "

    foo

    \n", "example": 630, "start_line": 9182, "end_line": 9186, "section": "Raw HTML" }, { "markdown": "foo \n", "html": "

    foo

    \n", "example": 631, "start_line": 9191, "end_line": 9195, "section": "Raw HTML" }, { "markdown": "\n", "html": "

    <a href=""">

    \n", "example": 632, "start_line": 9198, "end_line": 9202, "section": "Raw HTML" }, { "markdown": "foo \nbaz\n", "html": "

    foo
    \nbaz

    \n", "example": 633, "start_line": 9212, "end_line": 9218, "section": "Hard line breaks" }, { "markdown": "foo\\\nbaz\n", "html": "

    foo
    \nbaz

    \n", "example": 634, "start_line": 9224, "end_line": 9230, "section": "Hard line breaks" }, { "markdown": "foo \nbaz\n", "html": "

    foo
    \nbaz

    \n", "example": 635, "start_line": 9235, "end_line": 9241, "section": "Hard line breaks" }, { "markdown": "foo \n bar\n", "html": "

    foo
    \nbar

    \n", "example": 636, "start_line": 9246, "end_line": 9252, "section": "Hard line breaks" }, { "markdown": "foo\\\n bar\n", "html": "

    foo
    \nbar

    \n", "example": 637, "start_line": 9255, "end_line": 9261, "section": "Hard line breaks" }, { "markdown": "*foo \nbar*\n", "html": "

    foo
    \nbar

    \n", "example": 638, "start_line": 9267, "end_line": 9273, "section": "Hard line breaks" }, { "markdown": "*foo\\\nbar*\n", "html": "

    foo
    \nbar

    \n", "example": 639, "start_line": 9276, "end_line": 9282, "section": "Hard line breaks" }, { "markdown": "`code \nspan`\n", "html": "

    code span

    \n", "example": 640, "start_line": 9287, "end_line": 9292, "section": "Hard line breaks" }, { "markdown": "`code\\\nspan`\n", "html": "

    code\\ span

    \n", "example": 641, "start_line": 9295, "end_line": 9300, "section": "Hard line breaks" }, { "markdown": "
    \n", "html": "

    \n", "example": 642, "start_line": 9305, "end_line": 9311, "section": "Hard line breaks" }, { "markdown": "\n", "html": "

    \n", "example": 643, "start_line": 9314, "end_line": 9320, "section": "Hard line breaks" }, { "markdown": "foo\\\n", "html": "

    foo\\

    \n", "example": 644, "start_line": 9327, "end_line": 9331, "section": "Hard line breaks" }, { "markdown": "foo \n", "html": "

    foo

    \n", "example": 645, "start_line": 9334, "end_line": 9338, "section": "Hard line breaks" }, { "markdown": "### foo\\\n", "html": "

    foo\\

    \n", "example": 646, "start_line": 9341, "end_line": 9345, "section": "Hard line breaks" }, { "markdown": "### foo \n", "html": "

    foo

    \n", "example": 647, "start_line": 9348, "end_line": 9352, "section": "Hard line breaks" }, { "markdown": "foo\nbaz\n", "html": "

    foo\nbaz

    \n", "example": 648, "start_line": 9363, "end_line": 9369, "section": "Soft line breaks" }, { "markdown": "foo \n baz\n", "html": "

    foo\nbaz

    \n", "example": 649, "start_line": 9375, "end_line": 9381, "section": "Soft line breaks" }, { "markdown": "hello $.;'there\n", "html": "

    hello $.;'there

    \n", "example": 650, "start_line": 9395, "end_line": 9399, "section": "Textual content" }, { "markdown": "Foo χρῆν\n", "html": "

    Foo χρῆν

    \n", "example": 651, "start_line": 9402, "end_line": 9406, "section": "Textual content" }, { "markdown": "Multiple spaces\n", "html": "

    Multiple spaces

    \n", "example": 652, "start_line": 9411, "end_line": 9415, "section": "Textual content" } ]mistletoe-1.3.0/test/specification/commonmark.py000066400000000000000000000116571455324047100220530ustar00rootroot00000000000000import re import sys import json from mistletoe import Document, HtmlRenderer from traceback import print_tb from argparse import ArgumentParser KNOWN = [] """ Examples (their numbers) from the specification which are known to fail in mistletoe. """ def run_tests(test_entries, start=None, end=None, quiet=False, verbose=False, known=False): if known: print('ignoring tests:', ', '.join(map(str, KNOWN)) + '\n') start = start or 0 end = end or sys.maxsize results = [run_test(test_entry, quiet) for test_entry in test_entries if test_entry['example'] >= start and test_entry['example'] <= end and (not known or test_entry['example'] not in KNOWN)] if verbose: print_failure_in_sections(results) fails = len(list(filter(lambda x: not x[0], results))) if fails: print('failed:', fails) print(' total:', len(results)) else: print('All tests passing.') return not fails def run_test(test_entry, quiet=False): test_case = test_entry['markdown'].splitlines(keepends=True) try: with HtmlRenderer(html_escape_double_quotes=True) as renderer: output = renderer.render(Document(test_case)) success = test_entry['html'] == output if not success and not quiet: print_test_entry(test_entry, output) return success, test_entry['section'] except Exception as exception: if not quiet: print_exception(exception, test_entry) return False, test_entry['section'] def load_tests(specfile): with open(specfile, 'r', encoding='utf-8') as fin: return json.load(fin) def locate_section(section, tests): start = None end = None for test in tests: if re.search(section, test['section'], re.IGNORECASE): if start is None: start = test['example'] elif start is not None and end is None: end = test['example'] - 1 return start, end if start: return start, tests[-1]['example'] - 1 raise 
RuntimeError("Section '{}' not found, aborting.".format(section)) def print_exception(exception, test_entry): print_test_entry(test_entry, '-- exception --', fout=sys.stderr) print(exception.__class__.__name__ + ':', exception, file=sys.stderr) print('Traceback: ', file=sys.stderr) print_tb(exception.__traceback__) def print_test_entry(test_entry, output, fout=sys.stdout): print('example: ', repr(test_entry['example']), file=fout) print('markdown:', repr(test_entry['markdown']), file=fout) print('html: ', repr(test_entry['html']), file=fout) print('output: ', repr(output), file=fout) print(file=fout) def print_failure_in_sections(results): section = results[0][1] failed = 0 total = 0 for result in results: if section != result[1]: if failed: section_str = "Failed in section '{}':".format(section) result_str = "{:>3} / {:>3}".format(failed, total) print('{:70} {}'.format(section_str, result_str)) section = result[1] failed = 0 total = 0 if not result[0]: failed += 1 total += 1 if failed: section_str = "Failed in section '{}':".format(section) result_str = "{:>3} / {:>3}".format(failed, total) print('{:70} {}'.format(section_str, result_str)) print() def main(): parser = ArgumentParser(description="Custom script for running Commonmark tests.") parser.add_argument('start', type=int, nargs='?', default=None, help="Run tests starting from this position.") parser.add_argument('end', type=int, nargs='?', default=None, help="Run tests until this position.") parser.add_argument('-v', '--verbose', dest='verbose', action='store_true', help="Output failure count in every section.") parser.add_argument('-q', '--quiet', dest='quiet', action='store_true', help="Suppress failed test entry output.") parser.add_argument('-s', '--section', dest='section', default=None, help="Only run tests in specified section.") parser.add_argument('-f', '--file', dest='tests', type=load_tests, default='test/specification/commonmark.json', help="Specify alternative specfile to run.") 
parser.add_argument('-n', '--ignore-known', dest='known', action='store_true', help="Ignore tests entries that are known to fail.") args = parser.parse_args() start = args.start end = args.end verbose = args.verbose quiet = args.quiet tests = args.tests known = args.known if args.section is not None: start, end = locate_section(args.section, tests) if not run_tests(tests, start, end, quiet, verbose, known): sys.exit(1) if __name__ == '__main__': main() mistletoe-1.3.0/test/specification/spec.sh000077500000000000000000000003271455324047100206170ustar00rootroot00000000000000#!/usr/bin/env bash set -e VERSION="0.30" URL="https://spec.commonmark.org/$VERSION/spec.json" function main { echo "Using version $VERSION..." curl -k -o commonmark.json $URL echo "Done." } main mistletoe-1.3.0/test/test_ast_renderer.py000066400000000000000000000104011455324047100205660ustar00rootroot00000000000000import unittest from mistletoe import Document, ast_renderer class TestAstRenderer(unittest.TestCase): def test(self): self.maxDiff = None d = Document([ '# heading 1\n', '\n', 'hello\n', 'world\n', ]) output = ast_renderer.get_ast(d) expected = {'type': 'Document', 'footnotes': {}, 'line_number': 1, 'children': [{ 'type': 'Heading', 'level': 1, 'line_number': 1, 'children': [{ 'type': 'RawText', 'content': 'heading 1' }] }, { 'type': 'Paragraph', 'line_number': 3, 'children': [{ 'type': 'RawText', 'content': 'hello' }, { 'type': 'LineBreak', 'soft': True, 'content': '' }, { 'type': 'RawText', 'content': 'world' }] }]} self.assertEqual(output, expected) def test_footnotes(self): self.maxDiff = None d = Document([ '[bar][baz]\n', '\n', '[baz]: spam\n', ]) expected = {'type': 'Document', 'footnotes': {'baz': ('spam', '')}, 'line_number': 1, 'children': [{ 'type': 'Paragraph', 'line_number': 1, 'children': [{ 'type': 'Link', 'target': 'spam', 'title': '', 'children': [{ 'type': 'RawText', 'content': 'bar' }] }] }]} output = ast_renderer.get_ast(d) self.assertEqual(output, expected) def 
test_table(self): self.maxDiff = None d = Document([ "| A | B |\n", "| --- | --- |\n", "| 1 | 2 |\n", ]) expected = { "type": "Document", "footnotes": {}, 'line_number': 1, "children": [{ "type": "Table", "column_align": [None, None], 'line_number': 1, "header": { "type": "TableRow", "row_align": [None, None], 'line_number': 1, "children": [{ "type": "TableCell", "align": None, 'line_number': 1, "children": [{ "type": "RawText", "content": "A", }]}, { "type": "TableCell", "align": None, 'line_number': 1, "children": [{ "type": "RawText", "content": "B", }] }], }, "children": [{ "type": "TableRow", "row_align": [None, None], 'line_number': 3, "children": [{ "type": "TableCell", "align": None, 'line_number': 3, "children": [{ "type": "RawText", "content": "1", }]}, { "type": "TableCell", "align": None, 'line_number': 3, "children": [{ "type": "RawText", "content": "2", }] }], }], }], } output = ast_renderer.get_ast(d) self.assertEqual(output, expected) mistletoe-1.3.0/test/test_block_token.py000066400000000000000000000705131455324047100204150ustar00rootroot00000000000000import unittest from unittest.mock import call, patch from parameterized import parameterized from mistletoe import block_token, block_tokenizer, span_token class TestToken(unittest.TestCase): def setUp(self): self.addCleanup(lambda: span_token._token_types.__setitem__(-1, span_token.RawText)) patcher = patch('mistletoe.span_token.RawText') self.mock = patcher.start() span_token._token_types[-1] = self.mock self.addCleanup(patcher.stop) def _test_match(self, token_cls, lines, arg, **kwargs): token = next(iter(block_token.tokenize(lines))) self.assertIsInstance(token, token_cls) self._test_token(token, arg, **kwargs) def _test_token(self, token, arg, **kwargs): for attr, value in kwargs.items(): self.assertEqual(getattr(token, attr), value) self.mock.assert_any_call(arg) class TestAtxHeading(TestToken): def test_match(self): lines = ['### heading 3\n'] arg = 'heading 3' 
self._test_match(block_token.Heading, lines, arg, level=3) def test_children_with_enclosing_hashes(self): lines = ['# heading 3 ##### \n'] arg = 'heading 3' self._test_match(block_token.Heading, lines, arg, level=1) def test_not_heading(self): lines = ['####### paragraph\n'] arg = '####### paragraph' self._test_match(block_token.Paragraph, lines, arg) def test_heading_in_paragraph(self): lines = ['foo\n', '# heading\n', 'bar\n'] token1, token2, token3 = block_token.tokenize(lines) self.assertIsInstance(token1, block_token.Paragraph) self.assertIsInstance(token2, block_token.Heading) self.assertIsInstance(token3, block_token.Paragraph) class TestSetextHeading(TestToken): def test_match(self): lines = ['some heading\n', '---\n'] arg = 'some heading' self._test_match(block_token.SetextHeading, lines, arg, level=2) def test_next(self): lines = ['some\n', 'heading\n', '---\n', '\n', 'foobar\n'] tokens = iter(block_token.tokenize(lines)) self.assertIsInstance(next(tokens), block_token.SetextHeading) self.assertIsInstance(next(tokens), block_token.Paragraph) self.mock.assert_has_calls([call('some'), call('heading'), call('foobar')]) with self.assertRaises(StopIteration): next(tokens) class TestQuote(unittest.TestCase): def test_match(self): with patch('mistletoe.block_token.Paragraph'): token = next(iter(block_token.tokenize(['> line 1\n', '> line 2\n']))) self.assertIsInstance(token, block_token.Quote) def test_lazy_continuation(self): with patch('mistletoe.block_token.Paragraph'): token = next(iter(block_token.tokenize(['> line 1\n', 'line 2\n']))) self.assertIsInstance(token, block_token.Quote) class TestCodeFence(TestToken): def test_match_fenced_code(self): lines = ['```sh\n', 'rm dir\n', 'mkdir test\n', '```\n'] arg = 'rm dir\nmkdir test\n' self._test_match(block_token.CodeFence, lines, arg, language='sh') def test_match_fenced_code_with_tilde(self): lines = ['~~~sh\n', 'rm dir\n', 'mkdir test\n', '~~~\n'] arg = 'rm dir\nmkdir test\n' 
self._test_match(block_token.CodeFence, lines, arg, language='sh') def test_not_match_fenced_code_when_only_inline_code(self): lines = ['`~` is called tilde'] token = next(iter(block_token.tokenize(lines))) self.assertIsInstance(token, block_token.Paragraph) token1 = token.children[0] self.assertIsInstance(token1, span_token.InlineCode) self.mock.assert_has_calls([call('~'), call(' is called tilde')]) def test_mixed_code_fence(self): lines = ['~~~markdown\n', '```sh\n', 'some code\n', '```\n', '~~~\n'] arg = '```sh\nsome code\n```\n' self._test_match(block_token.CodeFence, lines, arg, language='markdown') def test_fence_code_lazy_continuation(self): lines = ['```sh\n', 'rm dir\n', '\n', 'mkdir test\n', '```\n'] arg = 'rm dir\n\nmkdir test\n' self._test_match(block_token.CodeFence, lines, arg, language='sh') def test_no_wrapping_newlines_code_fence(self): lines = ['```\n', 'hey', '```\n', 'paragraph\n'] arg = 'hey' self._test_match(block_token.CodeFence, lines, arg, language='') def test_unclosed_code_fence(self): lines = ['```\n', 'hey'] arg = 'hey' self._test_match(block_token.CodeFence, lines, arg, language='') def test_code_fence_with_backticks_and_tildes_in_the_info_string(self): lines = ['~~~ aa ``` ~~~\n', 'foo\n', '~~~\n'] arg = 'foo\n' self._test_match(block_token.CodeFence, lines, arg, language='aa') class TestBlockCode(TestToken): def test_parse_indented_code(self): lines = [' rm dir\n', ' mkdir test\n'] arg = 'rm dir\nmkdir test\n' self._test_match(block_token.BlockCode, lines, arg, language='') def test_parse_indented_code_with_blank_lines(self): lines = [' chunk1\n', '\n', ' chunk2\n', ' \n', ' \n', ' \n', ' chunk3\n'] arg = 'chunk1\n\nchunk2\n\n\n\nchunk3\n' self._test_match(block_token.BlockCode, lines, arg, language='') class TestParagraph(TestToken): def setUp(self): super().setUp() block_token.add_token(block_token.HtmlBlock) self.addCleanup(block_token.reset_tokens) def test_parse(self): lines = ['some\n', 'continuous\n', 'lines\n'] arg = 'some' 
self._test_match(block_token.Paragraph, lines, arg) def test_read(self): lines = ['this\n', '```\n', 'is some\n', '```\n', 'code\n'] try: token1, token2, token3 = block_token.tokenize(lines) except ValueError as e: raise AssertionError("Token number mismatch.") from e self.assertIsInstance(token1, block_token.Paragraph) self.assertIsInstance(token2, block_token.CodeFence) self.assertIsInstance(token3, block_token.Paragraph) def test_parse_interrupting_block_tokens(self): interrupting_blocks = [ '***\n', # thematic break '## atx\n', # ATX heading '
    \n', # HTML block type 6 '> block quote\n', '1. list\n', '``` fenced code block\n', ('| table |\n', '| ----- |\n', '| row |\n'), ] for block in interrupting_blocks: lines = ['Paragraph 1\n', *block] try: token1, token2 = block_token.tokenize(lines) except ValueError as e: raise AssertionError("Token number mismatch. Lines: '{}'".format(lines)) from e self.assertIsInstance(token1, block_token.Paragraph) self.assertNotIsInstance(token2, block_token.Paragraph) def test_parse_non_interrupting_block_tokens(self): lines = [ 'Paragraph 1\n', '2. list\n', # list doesn't start from 1 ' indented text\n', # code block '\n', # HTML block type 7 '\n', 'Paragraph 2\n' ] try: token1, token2 = block_token.tokenize(lines) except ValueError as e: raise AssertionError("Token number mismatch.") from e self.assertIsInstance(token1, block_token.Paragraph) self.assertIsInstance(token2, block_token.Paragraph) def test_parse_setext_heading(self): lines = [ 'Two line\n', 'heading\n', '---\n' ] try: token1, = block_token.tokenize(lines) except ValueError as e: raise AssertionError("Token number mismatch.") from e self.assertIsInstance(token1, block_token.SetextHeading) class TestListItem(unittest.TestCase): def test_parse_marker(self): lines = ['- foo\n', ' * bar\n', ' + baz\n', '1. item 1\n', '2) item 2\n', '123456789. item x\n', '*\n'] for line in lines: self.assertTrue(block_token.ListItem.parse_marker(line)) bad_lines = ['> foo\n', '1item 1\n', '2| item 2\n', '1234567890. 
item x\n', ' * too many spaces\n'] for line in bad_lines: self.assertFalse(block_token.ListItem.parse_marker(line)) def test_tokenize(self): lines = [' - foo\n', ' bar\n', '\n', ' baz\n'] token1, token2 = next(iter(block_token.tokenize(lines))).children[0].children self.assertIsInstance(token1, block_token.Paragraph) self.assertTrue('foo' in token1) self.assertIsInstance(token2, block_token.BlockCode) def test_sublist(self): lines = ['- foo\n', ' - bar\n'] token1, token2 = block_token.tokenize(lines)[0].children[0].children self.assertIsInstance(token1, block_token.Paragraph) self.assertIsInstance(token2, block_token.List) def test_deep_list(self): lines = ['- foo\n', ' - bar\n', ' - baz\n'] ptoken, ltoken = block_token.tokenize(lines)[0].children[0].children self.assertIsInstance(ptoken, block_token.Paragraph) self.assertIsInstance(ltoken, block_token.List) self.assertTrue('foo' in ptoken) ptoken, ltoken = ltoken.children[0].children self.assertIsInstance(ptoken, block_token.Paragraph) self.assertTrue('bar' in ptoken) self.assertIsInstance(ltoken, block_token.List) self.assertTrue('baz' in ltoken) def test_loose_list(self): lines = ['- foo\n', ' ~~~\n', ' bar\n', ' \n', ' baz\n' ' ~~~\n'] list_item = block_token.tokenize(lines)[0].children[0] self.assertEqual(list_item.loose, False) def test_tight_list(self): lines = ['- foo\n', '\n', '# bar\n'] list_item = block_token.tokenize(lines)[0].children[0] self.assertEqual(list_item.loose, False) def test_tabbed_list_items(self): # according to the CommonMark spec: # in contexts where spaces help to define block structure, tabs behave as if they # were replaced by spaces with a tab stop of 4 characters. 
lines = ['title\n', '*\ttabbed item long line\n', '\n', # break lazy continuation ' continuation 1\n', '* second list item\n', '\n', # break lazy continuation '\tcontinuation 2\n'] tokens = block_token.tokenize(lines) self.assertEqual(len(tokens), 2) self.assertIsInstance(tokens[0], block_token.Paragraph) self.assertIsInstance(tokens[1], block_token.List) self.assertTrue('tabbed item long line' in tokens[1].children[0]) self.assertTrue('continuation 1' in tokens[1].children[0]) self.assertTrue('second list item' in tokens[1].children[1]) self.assertTrue('continuation 2' in tokens[1].children[1]) def test_list_items_starting_with_blank_line(self): lines = ['-\n', ' foo\n', '-\n', ' ```\n', ' bar\n', ' ```\n', '-\n', ' baz\n'] tokens = block_token.tokenize(lines) self.assertEqual(len(tokens), 1) self.assertIsInstance(tokens[0], block_token.List) self.assertIsInstance(tokens[0].children[0].children[0], block_token.Paragraph) self.assertIsInstance(tokens[0].children[1].children[0], block_token.CodeFence) self.assertIsInstance(tokens[0].children[2].children[0], block_token.BlockCode) self.assertTrue('foo' in tokens[0].children[0].children[0]) self.assertEqual('bar\n', tokens[0].children[1].children[0].children[0].content) self.assertEqual('baz\n', tokens[0].children[2].children[0].children[0].content) def test_a_list_item_may_begin_with_at_most_one_blank_line(self): lines = ['-\n', '\n', ' foo\n'] tokens = block_token.tokenize(lines) self.assertEqual(len(tokens), 2) self.assertIsInstance(tokens[0], block_token.List) self.assertIsInstance(tokens[1], block_token.Paragraph) self.assertTrue('foo' in tokens[1].children[0]) def test_empty_list_item_in_the_middle(self): lines = ['* a\n', '*\n', '\n', '* c\n'] tokens = block_token.tokenize(lines) self.assertEqual(len(tokens), 1) self.assertIsInstance(tokens[0], block_token.List) self.assertEqual(len(tokens[0].children), 3) self.assertTrue(tokens[0].loose) def test_list_with_code_block(self): lines = ['1. 
indented code\n', '\n', ' paragraph\n', '\n', ' more code\n'] tokens = block_token.tokenize(lines) self.assertEqual(len(tokens), 1) self.assertIsInstance(tokens[0], block_token.List) self.assertEqual(len(tokens[0].children), 1) self.assertIsInstance(tokens[0].children[0].children[0], block_token.BlockCode) self.assertEqual(' indented code\n', tokens[0].children[0].children[0].children[0].content) self.assertIsInstance(tokens[0].children[0].children[1], block_token.Paragraph) self.assertIsInstance(tokens[0].children[0].children[2], block_token.BlockCode) class TestList(unittest.TestCase): def test_different_markers(self): lines = ['- foo\n', '* bar\n', '1. baz\n', '2) spam\n'] l1, l2, l3, l4 = block_token.tokenize(lines) self.assertIsInstance(l1, block_token.List) self.assertTrue('foo' in l1) self.assertIsInstance(l2, block_token.List) self.assertTrue('bar' in l2) self.assertIsInstance(l3, block_token.List) self.assertTrue('baz' in l3) self.assertIsInstance(l4, block_token.List) self.assertTrue('spam' in l4) def test_sublist(self): lines = ['- foo\n', ' + bar\n'] token, = block_token.tokenize(lines) self.assertIsInstance(token, block_token.List) class TestTable(unittest.TestCase): def test_parse_align(self): test_func = block_token.Table.parse_align self.assertEqual(test_func(':------'), None) self.assertEqual(test_func(':-----:'), 0) self.assertEqual(test_func('------:'), 1) def test_parse_delimiter(self): def test_func(s): return block_token.Table.split_delimiter(s) self.assertEqual(list(test_func('|-| :--- | :---: | ---:|\n')), ['-', ':---', ':---:', '---:']) @parameterized.expand([ ('| --- | --- | --- |\n'), ('| - | - | - |\n'), ('|-|-|-- \n'), ]) def test_match(self, delimiter_line): lines = ['| header 1 | header 2 | header 3 |\n', delimiter_line, '| cell 1 | cell 2 | cell 3 |\n', '| more 1 | more 2 | more 3 |\n'] with patch('mistletoe.block_token.TableRow') as mock: token, = block_token.tokenize(lines) self.assertIsInstance(token, block_token.Table) 
self.assertTrue(hasattr(token, 'header')) self.assertEqual(token.column_align, [None, None, None]) token.children calls = [call(line, [None, None, None], line_number) for line_number, line in enumerate(lines, start=1) if line_number != 2] mock.assert_has_calls(calls) def test_easy_table(self): lines = ['header 1 | header 2\n', ' ---: | :---\n', ' cell 1 | cell 2\n'] with patch('mistletoe.block_token.TableRow') as mock: token, = block_token.tokenize(lines) self.assertIsInstance(token, block_token.Table) self.assertTrue(hasattr(token, 'header')) self.assertEqual(token.column_align, [1, None]) token.children calls = [call(line, [1, None], line_number) for line_number, line in enumerate(lines, start=1) if line_number != 2] mock.assert_has_calls(calls) def test_not_easy_table(self): lines = ['not header 1 | not header 2\n', 'foo | bar\n'] token, = block_token.tokenize(lines) self.assertIsInstance(token, block_token.Paragraph) def test_interrupt_paragraph_option(self): lines = [ 'Paragraph 1\n', '| table |\n', '| ----- |\n', '| row |\n', ] try: block_token.Table.interrupt_paragraph = False token, = block_token.tokenize(lines) except ValueError as e: raise AssertionError("Token number mismatch.") from e finally: block_token.Table.interrupt_paragraph = True self.assertIsInstance(token, block_token.Paragraph) class TestTableRow(unittest.TestCase): def test_match(self): with patch('mistletoe.block_token.TableCell') as mock: line = '| cell 1 | cell 2 |\n' token = block_token.TableRow(line, line_number=10) self.assertEqual(token.row_align, [None]) mock.assert_has_calls([call('cell 1', None, 10), call('cell 2', None, 10)]) def test_easy_table_row(self): with patch('mistletoe.block_token.TableCell') as mock: line = 'cell 1 | cell 2\n' token = block_token.TableRow(line, line_number=10) self.assertEqual(token.row_align, [None]) mock.assert_has_calls([call('cell 1', None, 10), call('cell 2', None, 10)]) def test_short_row(self): with patch('mistletoe.block_token.TableCell') as 
mock: line = '| cell 1 |\n' token = block_token.TableRow(line, [None, None], 10) self.assertEqual(token.row_align, [None, None]) mock.assert_has_calls([call('cell 1', None, 10), call('', None, 10)]) def test_escaped_pipe_in_cell(self): with patch('mistletoe.block_token.TableCell') as mock: line = '| pipe: `\\|` | cell 2\n' token = block_token.TableRow(line, line_number=10, row_align=[None, None]) self.assertEqual(token.row_align, [None, None]) mock.assert_has_calls([call('pipe: `|`', None, 10), call('cell 2', None, 10)]) @unittest.skip('Even GitHub fails in here, workaround: always put a space before `|`') def test_not_really_escaped_pipe_in_cell(self): with patch('mistletoe.block_token.TableCell') as mock: line = '|ending with a \\\\|cell 2\n' token = block_token.TableRow(line, [None, None], 10) self.assertEqual(token.row_align, [None, None]) mock.assert_has_calls([call('ending with a \\\\', None, 10), call('cell 2', None, 10)]) class TestTableCell(TestToken): def test_match(self): token = block_token.TableCell('cell 2', line_number=13) self._test_token(token, 'cell 2', line_number=13, align=None) class TestFootnote(unittest.TestCase): def test_parse_simple(self): lines = ['[key 1]: value1\n', '[key 2]: value2\n'] token = block_token.Document(lines) self.assertEqual(token.footnotes, {"key 1": ("value1", ""), "key 2": ("value2", "")}) def test_parse_with_title(self): lines = ['[key 1]: value1 "title1"\n', '[key 2]: value2\n', '"title2"\n'] token = block_token.Document(lines) self.assertEqual(token.footnotes, {"key 1": ("value1", "title1"), "key 2": ("value2", "title2")}) def test_parse_with_space_in_every_part(self): lines = ['[Foo bar]:\n', '\n', '\'my title\'\n'] token = block_token.Document(lines) self.assertEqual(set(token.footnotes.values()), set({("my url", "my title")})) def test_parse_title_must_be_separated_from_link_destination(self): lines = ['[foo]: (baz)\n'] token = block_token.Document(lines) self.assertEqual(set(token.footnotes.values()), 
                         set({("bar", "baz")}))

        lines = ['[foo]: (baz)\n']
        token = block_token.Document(lines)
        self.assertEqual(len(token.footnotes), 0)

    # this tests an edge case, it shouldn't occur in normal documents:
    # "[key 2]" is part of the paragraph above it, because a link reference
    # definition cannot interrupt a paragraph.
    def test_footnote_followed_by_paragraph(self):
        lines = ['[key 1]: value1\n',
                 'something1\n',
                 '[key 2]: value2\n',
                 'something2\n',
                 '\n',
                 '[key 3]: value3\r\n',  # '\r', or any other whitespace may follow on the same line
                 'something3\n']
        token = block_token.Document(lines)
        self.assertEqual(token.footnotes, {"key 1": ("value1", ""),
                                           "key 3": ("value3", "")})
        self.assertEqual(len(token.children), 2)
        self.assertIsInstance(token.children[0], block_token.Paragraph)
        # children: something1, , [key 2]: value2, , something2
        self.assertEqual(len(token.children[0].children), 5)
        self.assertEqual(token.children[0].children[2].content, "[key 2]: value2")
        self.assertEqual(token.children[1].children[0].content, "something3")

    def test_content_after_title_not_allowed(self):
        lines = ['[foo]: /url\n',
                 '"title" ok\n']
        token = block_token.Document(lines)
        self.assertEqual(token.footnotes, {"foo": ("/url", "")})
        self.assertEqual(len(token.children), 1)
        self.assertIsInstance(token.children[0], block_token.Paragraph)
        self.assertEqual(token.children[0].children[0].content, "\"title\" ok")

    def test_footnotes_may_not_have_too_much_leading_space(self):
        lines = ['   [link]: /bla\n',
                 '    [i-am-block-actually]: /foo\n',
                 'paragraph\n',
                 '\n',
                 '\t[i-am-block-too]: /foo\n']
        token = block_token.Document(lines)
        self.assertEqual(token.footnotes, {"link": ("/bla", "")})
        self.assertEqual(len(token.children), 3)
        self.assertIsInstance(token.children[0], block_token.BlockCode)
        self.assertEqual(token.children[0].children[0].content, "[i-am-block-actually]: /foo\n")
        self.assertIsInstance(token.children[1], block_token.Paragraph)
        self.assertEqual(token.children[1].children[0].content, "paragraph")
        self.assertIsInstance(token.children[2], block_token.BlockCode)
        self.assertEqual(token.children[2].children[0].content, "[i-am-block-too]: /foo\n")

    def test_parse_opening_bracket_as_paragraph(self):
        # ... and no error is raised
        lines = ['[\n']
        token = block_token.Document(lines)
        self.assertEqual(len(token.footnotes), 0)
        self.assertEqual(len(token.children), 1)
        self.assertIsInstance(token.children[0], block_token.Paragraph)
        self.assertEqual(token.children[0].children[0].content, '[')

    def test_parse_opening_brackets_as_paragraph(self):
        # ... and no lines are skipped
        lines = ['[\n', '[ \n', ']\n']
        token = block_token.Document(lines)
        self.assertEqual(len(token.footnotes), 0)
        self.assertEqual(len(token.children), 1)
        para = token.children[0]
        self.assertIsInstance(para, block_token.Paragraph)
        self.assertEqual(len(para.children), 5,
                         'expected: RawText, LineBreak, RawText, LineBreak, RawText')
        self.assertEqual(para.children[0].content, '[')


class TestDocument(unittest.TestCase):
    def test_store_footnote(self):
        lines = ['[key 1]: value1\n',
                 '[key 2]: value2\n']
        document = block_token.Document(lines)
        self.assertEqual(document.footnotes['key 1'], ('value1', ''))
        self.assertEqual(document.footnotes['key 2'], ('value2', ''))

    def test_auto_splitlines(self):
        lines = "some\ncontinual\nlines\n"
        document = block_token.Document(lines)
        self.assertIsInstance(document.children[0], block_token.Paragraph)
        self.assertEqual(len(document.children), 1)


class TestThematicBreak(unittest.TestCase):
    def test_match(self):
        def test_case(line):
            token = next(iter(block_token.tokenize([line])))
            self.assertIsInstance(token, block_token.ThematicBreak)
        cases = ['---\n', '* * *\n', '_ _ _\n']
        for case in cases:
            test_case(case)


class TestContains(unittest.TestCase):
    def test_contains(self):
        lines = ['# heading\n', '\n', 'paragraph\n', 'with\n', '`code`\n']
        token = block_token.Document(lines)
        self.assertTrue('heading' in token)
        self.assertTrue('code' in token)
        self.assertFalse('foo' in token)


class TestHtmlBlock(unittest.TestCase):
    def setUp(self):
        block_token.add_token(block_token.HtmlBlock)
        self.addCleanup(block_token.reset_tokens)

    def test_textarea_block_may_contain_blank_lines(self):
        lines = ['<textarea>\n', '\n', '</textarea>\n']
        document = block_token.Document(lines)
        tokens = document.children
        self.assertEqual(1, len(tokens))
        self.assertIsInstance(tokens[0], block_token.HtmlBlock)


class TestLeafBlockTokenContentProperty(unittest.TestCase):
    def setUp(self):
        block_token.add_token(block_token.HtmlBlock)
        self.addCleanup(block_token.reset_tokens)

    def test_code_fence(self):
        lines = ['```\n', 'line 1\n', 'line 2\n', '```\n']
        document = block_token.Document(lines)
        tokens = document.children
        self.assertEqual(1, len(tokens))
        self.assertIsInstance(tokens[0], block_token.CodeFence)
        # option 1: direct access to the content
        self.assertEqual('line 1\nline 2\n', tokens[0].children[0].content)
        # option 2: using property getter to access the content
        self.assertEqual('line 1\nline 2\n', tokens[0].content)

    def test_block_code(self):
        lines = ['    line 1\n', '    line 2\n']
        document = block_token.Document(lines)
        tokens = document.children
        self.assertEqual(1, len(tokens))
        self.assertIsInstance(tokens[0], block_token.BlockCode)
        # option 1: direct access to the content
        self.assertEqual('line 1\nline 2\n', tokens[0].children[0].content)
        # option 2: using property getter to access the content
        self.assertEqual('line 1\nline 2\n', tokens[0].content)

    def test_html_block(self):
        lines = ['<div>\n',
                 'text\n'
                 '</div>\n']
        document = block_token.Document(lines)
        tokens = document.children
        self.assertEqual(1, len(tokens))
        self.assertIsInstance(tokens[0], block_token.HtmlBlock)
        # option 1: direct access to the content
        self.assertEqual(''.join(lines).strip(), tokens[0].children[0].content)
        # option 2: using property getter to access the content
        self.assertEqual(''.join(lines).strip(), tokens[0].content)


class TestFileWrapper(unittest.TestCase):
    def test_get_set_pos(self):
        lines = [
            "# heading\n",
            "somewhat interesting\n",
            "content\n",
        ]
        wrapper = block_tokenizer.FileWrapper(lines)
        assert next(wrapper) == "# heading\n"
        anchor = wrapper.get_pos()
        assert next(wrapper) == "somewhat interesting\n"
        wrapper.set_pos(anchor)
        assert next(wrapper) == "somewhat interesting\n"

    def test_anchor_reset(self):
        lines = [
            "# heading\n",
            "somewhat interesting\n",
            "content\n",
        ]
        wrapper = block_tokenizer.FileWrapper(lines)
        assert next(wrapper) == "# heading\n"
        wrapper.anchor()
        assert next(wrapper) == "somewhat interesting\n"
        wrapper.reset()
        assert next(wrapper) == "somewhat interesting\n"

mistletoe-1.3.0/test/test_ci.sh

#!/usr/bin/env bash

set -e

function main {
    if [[ "$1" == "" ]]; then
        echo "[Error] Specify how far you want to go back."
        exit 1
    fi
    CURR_BRANCH="$(get_current_branch)"
    git checkout --quiet HEAD~$1
    render_to_file "out2.html"
    OLD_SHA=$(get_sha "out2.html")
    git checkout --quiet "$CURR_BRANCH"
    render_to_file "out.html"
    NEW_SHA=$(get_sha "out.html")
    if [[ "$OLD_SHA" == "$NEW_SHA" ]]; then
        cleanup
    else
        get_diff
    fi
}

function get_current_branch {
    git rev-parse --abbrev-ref HEAD
}

function render_to_file {
    python3 -m mistletoe "test/samples/syntax.md" > "$1"
}

function get_sha {
    md5 -q "$1"
}

function cleanup {
    echo "All good."
    rm out2.html
}

function get_diff {
    echo "Diff exists; prompting for review..."
    diff out.html out2.html | view -
}

main $1

mistletoe-1.3.0/test/test_cli.py

from unittest import TestCase
from unittest.mock import call, patch, sentinel, mock_open, Mock

from mistletoe import cli


class TestCli(TestCase):
    @patch('mistletoe.cli.parse', return_value=Mock(filenames=[], renderer=sentinel.Renderer))
    @patch('mistletoe.cli.interactive')
    def test_main_to_interactive(self, mock_interactive, mock_parse):
        cli.main(None)
        mock_interactive.assert_called_with(sentinel.Renderer)

    @patch('mistletoe.cli.parse', return_value=Mock(filenames=['foo.md'], renderer=sentinel.Renderer))
    @patch('mistletoe.cli.convert')
    def test_main_to_convert(self, mock_convert, mock_parse):
        cli.main(None)
        mock_convert.assert_called_with(['foo.md'], sentinel.Renderer)

    @patch('importlib.import_module', return_value=Mock(Renderer=sentinel.RendererCls))
    def test_parse_renderer(self, mock_import_module):
        namespace = cli.parse(['-r', 'foo.Renderer'])
        mock_import_module.assert_called_with('foo')
        self.assertEqual(namespace.renderer, sentinel.RendererCls)

    def test_parse_filenames(self):
        filenames = ['foo.md', 'bar.md']
        namespace = cli.parse(filenames)
        self.assertEqual(namespace.filenames, filenames)

    @patch('mistletoe.cli.convert_file')
    def test_convert(self, mock_convert_file):
        filenames = ['foo', 'bar']
        cli.convert(filenames, sentinel.RendererCls)
        calls = [call(filename, sentinel.RendererCls) for filename in filenames]
        mock_convert_file.assert_has_calls(calls)

    @patch('mistletoe.markdown', return_value='rendered text')
    @patch('sys.stdout.buffer.write')
    @patch('builtins.open', new_callable=mock_open)
    def test_convert_file_success(self, mock_open_, mock_write, mock_markdown):
        filename = 'foo'
        cli.convert_file(filename, sentinel.RendererCls)
        mock_open_.assert_called_with(filename, 'r', encoding='utf-8')
        mock_write.assert_called_with('rendered text'.encode())

    @patch('builtins.open', side_effect=OSError)
    @patch('sys.exit')
    def test_convert_file_fail(self, mock_exit, mock_open_):
        filename = 'foo'
        cli.convert_file(filename, sentinel.RendererCls)
        mock_open_.assert_called_with(filename, 'r', encoding='utf-8')
        mock_exit.assert_called_with('Cannot open file "foo".')

    @patch('mistletoe.cli._import_readline')
    @patch('mistletoe.cli._print_heading')
    @patch('mistletoe.markdown', return_value='rendered text')
    @patch('builtins.print')
    def test_interactive(self, mock_print, mock_markdown,
                         mock_print_heading, mock_import_readline):
        def MockInputFactory(return_values):
            _counter = -1

            def mock_input(prompt=''):
                nonlocal _counter
                _counter += 1
                if _counter < len(return_values):
                    return return_values[_counter]
                elif _counter == len(return_values):
                    raise EOFError
                else:
                    raise KeyboardInterrupt
            return mock_input

        return_values = ['foo', 'bar', 'baz']
        with patch('builtins.input', MockInputFactory(return_values)):
            cli.interactive(sentinel.RendererCls)

        mock_import_readline.assert_called_with()
        mock_print_heading.assert_called_with(sentinel.RendererCls)
        mock_markdown.assert_called_with(['foo\n', 'bar\n', 'baz\n'], sentinel.RendererCls)
        calls = [call('\nrendered text', end=''), call('\nExiting.')]
        mock_print.assert_has_calls(calls)

    @patch('importlib.import_module', return_value=Mock(Renderer=sentinel.RendererCls))
    def test_import_success(self, mock_import_module):
        self.assertEqual(sentinel.RendererCls, cli._import('foo.Renderer'))

    @patch('sys.exit')
    def test_import_incomplete_path(self, mock_exit):
        cli._import('foo')
        error_msg = '[error] please supply full path to your custom renderer.'
        mock_exit.assert_called_with(error_msg)

    @patch('importlib.import_module', side_effect=ImportError)
    @patch('sys.exit')
    def test_import_module_error(self, mock_exit, mock_import_module):
        cli._import('foo.Renderer')
        mock_exit.assert_called_with('[error] cannot import module "foo".')

    @patch('importlib.import_module', return_value=Mock(spec=[]))
    @patch('sys.exit')
    def test_import_class_error(self, mock_exit, mock_import_module):
        cli._import('foo.Renderer')
        error_msg = '[error] cannot find renderer "Renderer" from module "foo".'
        mock_exit.assert_called_with(error_msg)

    @patch('builtins.__import__')
    @patch('builtins.print')
    def test_import_readline_success(self, mock_print, mock_import):
        cli._import_readline()
        mock_print.assert_not_called()

    @patch('builtins.__import__', side_effect=ImportError)
    @patch('builtins.print')
    def test_import_readline_fail(self, mock_print, mock_import):
        cli._import_readline()
        mock_print.assert_called_with('[warning] readline library not available.')

    @patch('builtins.print')
    def test_print_heading(self, mock_print):
        cli._print_heading(Mock(__name__='Renderer'))
        version = cli.mistletoe.__version__
        msgs = ['mistletoe [version {}] (interactive)'.format(version),
                'Type Ctrl-D to complete input, or Ctrl-C to exit.',
                'Using renderer: Renderer']
        calls = [call(msg) for msg in msgs]
        mock_print.assert_has_calls(calls)

mistletoe-1.3.0/test/test_contrib/__init__.py

mistletoe-1.3.0/test/test_contrib/test_github_wiki.py

from unittest import TestCase, mock

from mistletoe import span_token, Document, token
from mistletoe.span_token import tokenize_inner
from mistletoe.contrib.github_wiki import GithubWiki, GithubWikiRenderer


class TestGithubWiki(TestCase):
    def setUp(self):
        token._root_node = Document([])
        self.renderer = GithubWikiRenderer()
        self.renderer.__enter__()
        self.addCleanup(self.renderer.__exit__, None, None, None)

    def test_parse(self):
        MockRawText = mock.Mock()
        RawText = span_token._token_types.pop()
        span_token._token_types.append(MockRawText)
        try:
            tokens = tokenize_inner('text with [[wiki | target]]')
            token = tokens[1]
            self.assertIsInstance(token, GithubWiki)
            self.assertEqual(token.target, 'target')
            MockRawText.assert_has_calls([mock.call('text with '), mock.call('wiki')])
        finally:
            span_token._token_types[-1] = RawText

    def test_render(self):
        token = next(iter(tokenize_inner('[[wiki|target]]')))
        output = '<a href="target">wiki</a>'
        self.assertEqual(self.renderer.render(token), output)

mistletoe-1.3.0/test/test_contrib/test_jira_renderer.py

# Copyright 2018 Tile, Inc.  All Rights Reserved.
#
# The MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of
# this software and associated documentation files (the "Software"), to deal in
# the Software without restriction, including without limitation the rights to
# use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
# of the Software, and to permit persons to whom the Software is furnished to do
# so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
from test.base_test import BaseRendererTest
from mistletoe.span_token import tokenize_inner
from mistletoe.contrib.jira_renderer import JiraRenderer
import random
import string

filesBasedTest = BaseRendererTest.filesBasedTest


class TestJiraRenderer(BaseRendererTest):
    def setUp(self):
        super().setUp()
        self.renderer = JiraRenderer()
        self.renderer.__enter__()
        self.addCleanup(self.renderer.__exit__, None, None, None)
        self.sampleOutputExtension = 'jira'

    def genRandomString(self, n, hasWhitespace=False):
        source = string.ascii_letters + string.digits
        if hasWhitespace:
            source = source + ' \t'
        result = ''.join(random.SystemRandom().choice(source) for _ in range(n))
        return result

    def textFormatTest(self, inputTemplate, outputTemplate):
        input = self.genRandomString(80, False)
        token = next(iter(tokenize_inner(inputTemplate.format(input))))
        output = self.renderer.render(token)
        expected = outputTemplate.format(input)
        self.assertEqual(output, expected)

    def test_escape_simple(self):
        self.textFormatTest('---fancy text---', '\\-\\-\\-fancy text\\-\\-\\-')

    def test_escape_single_chars(self):
        self.textFormatTest('**fancy \\*@\\* text**', '*fancy \\*@\\* text*')

    def test_escape_none_when_whitespaces(self):
        self.textFormatTest('obj = {{ a: (b * c) + d }}', 'obj = {{ a: (b * c) + d }}')

    def test_escape_in_inline_code(self):
        # Note: Jira puts inline code into "{{...}}" as seen in this test.
        self.textFormatTest('**code: `a = b + c;// [1]`**', '*code: {{{{a = b + c;// \\[1\\]}}}}*')

    def test_escape_link(self):
        # Note: There seems to be no way to escape a plain text URL in Jira.
        self.textFormatTest('http://www.example.com', 'http://www.example.com')

    def test_render_strong(self):
        self.textFormatTest('**a{}**', '*a{}*')

    def test_render_emphasis(self):
        self.textFormatTest('*a{}*', '_a{}_')

    def test_render_inline_code(self):
        self.textFormatTest('`a{}b`', '{{{{a{}b}}}}')

    def test_render_strikethrough(self):
        self.textFormatTest('~~{}~~', '-{}-')

    def test_render_image(self):
        token = next(iter(tokenize_inner('![image](foo.jpg)')))
        output = self.renderer.render(token)
        expected = '!foo.jpg!'
        self.assertEqual(output, expected)

    def test_render_footnote_image(self):
        # token = next(tokenize_inner('![image]\n\n[image]: foo.jpg'))
        # output = self.renderer.render(token)
        # expected = '!foo.jpg!'
        # self.assertEqual(output, expected)
        pass

    def test_render_link(self):
        url = 'http://{0}.{1}.{2}'.format(self.genRandomString(5), self.genRandomString(5), self.genRandomString(3))
        body = self.genRandomString(80, True)
        token = next(iter(tokenize_inner('[{body}]({url})'.format(url=url, body=body))))
        output = self.renderer.render(token)
        expected = '[{body}|{url}]'.format(url=url, body=body)
        self.assertEqual(output, expected)

    def test_render_link_with_title(self):
        url = 'http://{0}.{1}.{2}'.format(self.genRandomString(5), self.genRandomString(5), self.genRandomString(3))
        body = self.genRandomString(80, True)
        title = self.genRandomString(20, True)
        token = next(iter(tokenize_inner('[{body}]({url} "{title}")'.format(url=url, body=body, title=title))))
        output = self.renderer.render(token)
        expected = '[{body}|{url}|{title}]'.format(url=url, body=body, title=title)
        self.assertEqual(output, expected)

    def test_render_footnote_link(self):
        pass

    def test_render_auto_link(self):
        url = 'http://{0}.{1}.{2}'.format(self.genRandomString(5), self.genRandomString(5), self.genRandomString(3))
        token = next(iter(tokenize_inner('<{url}>'.format(url=url))))
        output = self.renderer.render(token)
        expected = '[{url}]'.format(url=url)
        self.assertEqual(output, expected)

    def test_render_escape_sequence(self):
        pass

    def test_render_html_span(self):
        pass

    def test_render_heading(self):
        pass

    def test_render_quote(self):
        pass

    def test_render_paragraph(self):
        pass

    def test_render_block_code(self):
        markdown = """\
```java
public static void main(String[] args) {
    // a = 1 * 2;
}
```
"""
        expected = """\
{code:java}
public static void main(String[] args) {
    // a = 1 * 2;
}
{code}
"""
        self.markdownResultTest(markdown, expected)

    def test_render_list(self):
        pass

    def test_render_list_item(self):
        pass

    def test_render_inner(self):
        pass

    def test_render_table(self):
        pass

    def test_render_table_row(self):
        pass

    def test_render_table_cell(self):
        pass

    def test_render_thematic_break(self):
        pass

    def test_render_html_block(self):
        pass

    def test_render_document(self):
        pass

    def test_table_header(self):
        markdown = """\
| header row   |
|--------------|
| first cell   |
"""
        expected = """\
||header row||
|first cell|
"""
        self.markdownResultTest(markdown, expected)

    def test_table_empty_cell(self):
        """
        Empty cells need to have a space in them, see .
        """
        markdown = """\
| A | B | C |
|-----------|
| 1 |   | 3 |
"""
        expected = """\
||A||B||C||
|1| |3|
"""
        self.markdownResultTest(markdown, expected)

    @filesBasedTest
    def test_render__basic_blocks(self):
        pass

    @filesBasedTest
    def test_render__lists(self):
        pass

    @filesBasedTest
    def test_render__quotes(self):
        pass

mistletoe-1.3.0/test/test_contrib/test_mathjax.py

import unittest

from mistletoe import Document
from mistletoe.contrib.mathjax import MathJaxRenderer


class TestMathJaxRenderer(unittest.TestCase):
    mathjax_src = '\n'

    def test_render_html(self):
        with MathJaxRenderer() as renderer:
            token = Document(['# heading 1\n', 'paragraph\n'])
            output = renderer.render(token)
            target = '<h1>heading 1</h1>\n<p>paragraph</p>\n'
            target += self.mathjax_src
            self.assertEqual(output, target)

    def test_render_math(self):
        with MathJaxRenderer() as renderer:
            raw = ['math displayed as a block:\n',
                   '$$ \\sum_{i=1}^{\\infty} \\frac{1}{i^p} $$\n',
                   'math displayed in-line: $ 2^x $\n']
            token = Document(raw)
            output = renderer.render(token)
            target = ('<p>math displayed as a block:\n'
                      '$$ \\sum_{i=1}^{\\infty} \\frac{1}{i^p} $$\n'
                      'math displayed in-line: \\( 2^x \\)</p>\n')
            target += self.mathjax_src
            self.assertEqual(output, target)

mistletoe-1.3.0/test/test_contrib/test_pygments_renderer.py

import unittest

from mistletoe import Document
from mistletoe.contrib.pygments_renderer import PygmentsRenderer
from parameterized import parameterized
from pygments.util import ClassNotFound


class TestPygmentsRenderer(unittest.TestCase):
    @parameterized.expand([(True,), (False,)])
    def test_render_no_language(self, fail_on_unsupported_language: bool):
        renderer = PygmentsRenderer(fail_on_unsupported_language=fail_on_unsupported_language)
        token = Document(['```\n', 'no language\n', '```\n'])
        output = renderer.render(token)
        expected = (
            '<div class="highlight"><pre><span></span>'
            'no language\n</pre></div>\n\n'
        )
        self.assertEqual(output, expected)

    def test_render_known_language(self):
        renderer = PygmentsRenderer()
        token = Document(['```python\n', '# python language\n', '```\n'])
        output = renderer.render(token)
        expected = (
            '<div class="highlight"><pre><span></span>'
            '<span class="c1"># python language</span>\n'
            '</pre></div>\n\n'
        )
        self.assertEqual(output, expected)

    def test_render_unknown_language(self):
        renderer = PygmentsRenderer()
        token = Document(['```foobar\n', 'unknown language\n', '```\n'])
        output = renderer.render(token)
        expected = (
            '<div class="highlight"><pre><span></span>'
            'unknown language\n</pre></div>\n\n'
        )
        self.assertEqual(output, expected)

    def test_render_fail_on_unsupported_language(self):
        renderer = PygmentsRenderer(fail_on_unsupported_language=True)
        token = Document(['```foobar\n', 'unknown language\n', '```\n'])
        with self.assertRaises(ClassNotFound):
            renderer.render(token)

mistletoe-1.3.0/test/test_contrib/test_toc_renderer.py

from unittest import TestCase

from mistletoe import block_token
from mistletoe.block_token import Document, Heading
from mistletoe.contrib.toc_renderer import TocRenderer


class TestTocRenderer(TestCase):
    def test_parse_rendered_heading(self):
        rendered_heading = '<h3>some <em>text</em></h3>'
        content = TocRenderer.parse_rendered_heading(rendered_heading)
        self.assertEqual(content, 'some text')

    def test_render_heading(self):
        renderer = TocRenderer()
        Heading.start('### some *text*\n')
        token = Heading(Heading.read(iter(['foo'])))
        renderer.render_heading(token)
        self.assertEqual(renderer._headings[0], (3, 'some text'))

    def test_depth(self):
        renderer = TocRenderer(depth=3)
        token = Document(['# title\n', '## heading\n', '#### heading\n'])
        renderer.render(token)
        self.assertEqual(renderer._headings, [(2, 'heading')])

    def test_omit_title(self):
        renderer = TocRenderer(omit_title=True)
        token = Document(['# title\n', '\n', '## heading\n'])
        renderer.render(token)
        self.assertEqual(renderer._headings, [(2, 'heading')])

    def test_filter_conditions(self):
        import re
        filter_conds = [lambda x: re.match(r'heading', x),
                        lambda x: re.match(r'foo', x)]
        renderer = TocRenderer(filter_conds=filter_conds)
        token = Document(['# title\n',
                          '\n',
                          '## heading\n',
                          '\n',
                          '#### not heading\n'])
        renderer.render(token)
        self.assertEqual(renderer._headings, [(4, 'not heading')])

    def test_get_toc(self):
        headings = [(1, 'heading 1'),
                    (2, 'subheading 1'),
                    (2, 'subheading 2'),
                    (3, 'subsubheading 1'),
                    (2, 'subheading 3'),
                    (1, 'heading 2')]
        renderer = TocRenderer(omit_title=False)
        renderer._headings = headings
        toc = renderer.toc
        self.assertIsInstance(toc, block_token.List)
        # for now, we check at least the most nested heading
        # (hierarchy: `List -> ListItem -> {Paragraph -> RawText.content | List -> ...}`):
        heading_item = toc.children[0].children[1].children[1].children[1].children[0]
        self.assertIsInstance(heading_item, block_token.ListItem)
        self.assertEqual(heading_item.children[0].children[0].content, 'subsubheading 1')

mistletoe-1.3.0/test/test_contrib/test_xwiki20_renderer.py

from test.base_test import BaseRendererTest
from mistletoe.span_token import tokenize_inner
from mistletoe.contrib.xwiki20_renderer import XWiki20Renderer
import random
import string

filesBasedTest = BaseRendererTest.filesBasedTest


class TestXWiki20Renderer(BaseRendererTest):
    def setUp(self):
        super().setUp()
        self.renderer = XWiki20Renderer()
        self.renderer.__enter__()
        self.addCleanup(self.renderer.__exit__, None, None, None)
        self.sampleOutputExtension = 'xwiki20'

    def genRandomString(self, n, hasWhitespace=False):
        source = string.ascii_letters + string.digits
        if hasWhitespace:
            source = source + ' \t'
        result = ''.join(random.SystemRandom().choice(source) for _ in range(n))
        return result

    def textFormatTest(self, inputTemplate, outputTemplate):
        input = self.genRandomString(80, False)
        token = next(iter(tokenize_inner(inputTemplate.format(input))))
        output = self.renderer.render(token)
        expected = outputTemplate.format(input)
        self.assertEqual(output, expected)

    def test_escaping(self):
        self.textFormatTest(
            '**code: `a = 1;// comment`, plain text URL: http://example.com**',
            '**code: {{{{code}}}}a = 1;// comment{{{{/code}}}}, plain text URL: http:~//example.com**')

    def test_render_strong(self):
        self.textFormatTest('**a{}**', '**a{}**')

    def test_render_emphasis(self):
        self.textFormatTest('*a{}*', '//a{}//')

    def test_render_inline_code(self):
        self.textFormatTest('`a{}b`', '{{{{code}}}}a{}b{{{{/code}}}}')

    def test_render_strikethrough(self):
        self.textFormatTest('~~{}~~', '--{}--')

    def test_render_image(self):
        token = next(iter(tokenize_inner('![image](foo.jpg)')))
        output = self.renderer.render(token)
        expected = '[[image:foo.jpg]]'
        self.assertEqual(output, expected)

    def test_render_link(self):
        url = 'http://{0}.{1}.{2}'.format(self.genRandomString(5), self.genRandomString(5), self.genRandomString(3))
        body = self.genRandomString(80, True)
        token = next(iter(tokenize_inner('[{body}]({url})'.format(url=url, body=body))))
        output = self.renderer.render(token)
        expected = '[[{body}>>{url}]]'.format(url=url, body=body)
        self.assertEqual(output, expected)

    def test_render_auto_link(self):
        url = 'http://{0}.{1}.{2}'.format(self.genRandomString(5), self.genRandomString(5), self.genRandomString(3))
        token = next(iter(tokenize_inner('<{url}>'.format(url=url))))
        output = self.renderer.render(token)
        expected = '[[{url}]]'.format(url=url)
        self.assertEqual(output, expected)

    def test_render_html_span(self):
        markdown = 'text styles: <i>italic</i>, <b>bold</b>'
        # See fixme at the `render_html_span` method...
        # expected = 'text styles: {{html wiki="true"}}<i>italic</i>{{/html}}, {{html wiki="true"}}<b>bold</b>{{/html}}\n\n'
        expected = 'text styles: <i>italic</i>, <b>bold</b>\n\n'
        self.markdownResultTest(markdown, expected)

    def test_render_html_block(self):
        markdown = 'paragraph\n\n<pre>some cool code</pre>'
        expected = 'paragraph\n\n{{html wiki="true"}}\n<pre>some cool code</pre>\n{{/html}}\n\n'
        self.markdownResultTest(markdown, expected)

    def test_render_xwiki_macros_simple(self):
        markdown = """\
{{warning}}
Use this feature with *caution*. See {{Wikipedia article="SomeArticle"/}}.
{{test}}Another inline macro{{/test}}.
{{/warning}}
"""
        # Note: There is a trailing ' ' at the end of the second line.
        # It will be a bit complicated to get rid of it.
        expected = """\
{{warning}}
Use this feature with //caution//. See {{Wikipedia article="SomeArticle"/}}.
{{test}}Another inline macro{{/test}}. \n\
{{/warning}}
"""
        self.markdownResultTest(markdown, expected)

    def test_render_xwiki_macros_in_list(self):
        markdown = """\
* list item
  {{warning}}
  Use this feature with *caution*. See {{Wikipedia article="SomeArticle"/}}.
  {{test}}Another inline macro{{/test}}.
  {{/warning}}
"""
        # Note: There is a trailing ' ' at the end of the second line.
        # It will be a bit complicated to get rid of it.
        expected = """\
* list item(((
{{warning}}
Use this feature with //caution//. See {{Wikipedia article="SomeArticle"/}}.
{{test}}Another inline macro{{/test}}. \n\
{{/warning}}
)))
"""
        self.markdownResultTest(markdown, expected)

    @filesBasedTest
    def test_render__basic_blocks(self):
        pass

    @filesBasedTest
    def test_render__lists(self):
        pass

    @filesBasedTest
    def test_render__quotes(self):
        pass

mistletoe-1.3.0/test/test_core_tokens.py

from unittest import TestCase

from mistletoe.core_tokens import (MatchObj, Delimiter, follows,
                                   shift_whitespace, is_control_char,
                                   deactivate_delimiters, preceded_by,
                                   succeeded_by)


class TestCoreTokens(TestCase):
    def test_match_obj(self):
        match = MatchObj(0, 2, (0, 1, 'a'), (1, 2, 'b'))
        self.assertEqual(match.start(), 0)
        self.assertEqual(match.start(1), 0)
        self.assertEqual(match.start(2), 1)
        self.assertEqual(match.end(), 2)
        self.assertEqual(match.end(1), 1)
        self.assertEqual(match.end(2), 2)
        self.assertEqual(match.group(), 'ab')
        self.assertEqual(match.group(1), 'a')
        self.assertEqual(match.group(2), 'b')

    def test_delimiter(self):
        delimiter = Delimiter(4, 6, 'abcd**')
        self.assertEqual(delimiter.type, '**')
        self.assertEqual(delimiter.number, 2)
        self.assertEqual(delimiter.active, True)
        self.assertEqual(delimiter.start, 4)
        self.assertEqual(delimiter.end, 6)

    def test_delimiter_remove_left(self):
        delimiter = Delimiter(4, 6, 'abcd**')
        self.assertTrue(delimiter.remove(1, left=True))
        self.assertEqual(delimiter.number, 1)
        self.assertEqual(delimiter.start, 5)
        self.assertEqual(delimiter.end, 6)

    def test_delimiter_remove_right(self):
        delimiter = Delimiter(4, 6, 'abcd**')
        self.assertTrue(delimiter.remove(1, left=False))
        self.assertEqual(delimiter.number, 1)
        self.assertEqual(delimiter.start, 4)
        self.assertEqual(delimiter.end, 5)

    def test_delimiter_remove_empty(self):
        delimiter = Delimiter(4, 6, 'abcd**')
        self.assertFalse(delimiter.remove(2))

    def test_follows(self):
        string = '(foobar)'
        self.assertTrue(follows(string, 6, ')'))
        self.assertFalse(follows(string, 6, '('))
        self.assertFalse(follows(string, 7, ')'))

    def test_shift_whitespace(self):
        string = ' \n\t\rfoo'
        self.assertEqual(shift_whitespace(string, 0), 4)
        self.assertEqual(shift_whitespace('', 0), 0)

    def test_is_control_char(self):
        char = chr(0)
        self.assertTrue(is_control_char(char))
        self.assertFalse(is_control_char('a'))

    def test_deactivate_delimiters(self):
        s = 'abc'
        delimiters = [Delimiter(0, 1, s),
                      Delimiter(1, 2, s),
                      Delimiter(2, 3, s)]
        deactivate_delimiters(delimiters, 2, 'b')
        self.assertTrue(delimiters[0].active)
        self.assertFalse(delimiters[1].active)
        self.assertTrue(delimiters[2].active)

    def test_preceded_by(self):
        whitespace = ' \t\n\r'
        self.assertTrue(preceded_by(1, ' abc', whitespace))
        self.assertTrue(preceded_by(0, 'aabc', whitespace))
        self.assertFalse(preceded_by(1, 'aabc', whitespace))
        self.assertFalse(preceded_by(0, 'aabc', 'abc'))

    def test_succeeded_by(self):
        whitespace = ' \t\n\r'
        self.assertTrue(succeeded_by(3, 'abc ', whitespace))
        self.assertTrue(succeeded_by(4, 'abcc', whitespace))
        self.assertFalse(succeeded_by(3, 'abcc', whitespace))
        self.assertFalse(succeeded_by(4, 'abcc', 'abc'))

mistletoe-1.3.0/test/test_html_renderer.py

from unittest import TestCase, mock

from mistletoe import Document
from mistletoe.html_renderer import HtmlRenderer
from parameterized import parameterized


class TestRenderer(TestCase):
    def setUp(self):
        self.renderer = HtmlRenderer()
        self.renderer.render_inner = mock.Mock(return_value='inner')
        self.renderer.__enter__()
        self.addCleanup(self.renderer.__exit__, None, None, None)

    def _test_token(self, token_name, expected_output, children=True,
                    without_attrs=None, **kwargs):
        render_func = self.renderer.render_map[token_name]
        children = mock.MagicMock(spec=list) if children else None
        mock_token = mock.Mock(children=children, **kwargs)
        without_attrs = without_attrs or []
        for attr in without_attrs:
            delattr(mock_token, attr)
        self.assertEqual(render_func(mock_token), expected_output)


class TestHtmlRenderer(TestRenderer):
    def test_strong(self):
        self._test_token('Strong', '<strong>inner</strong>')

    def test_emphasis(self):
        self._test_token('Emphasis', '<em>inner</em>')

    def test_inline_code(self):
        from mistletoe.span_token import tokenize_inner
        output = self.renderer.render(tokenize_inner('`foo`')[0])
        self.assertEqual(output, '<code>foo</code>')
        output = self.renderer.render(tokenize_inner('`` \\[\\` ``')[0])
        self.assertEqual(output, '<code>\\[\\`</code>')

    def test_strikethrough(self):
        self._test_token('Strikethrough', '<del>inner</del>')

    def test_image(self):
        expected = '<img src="src" alt="" title="title" />'
        self._test_token('Image', expected, src='src', title='title')

    def test_link(self):
        expected = '<a href="target" title="title">inner</a>'
        self._test_token('Link', expected, target='target', title='title')

    def test_autolink(self):
        expected = '<a href="link">inner</a>'
        self._test_token('AutoLink', expected, target='link', mailto=False)

    def test_escape_sequence(self):
        self._test_token('EscapeSequence', 'inner')

    def test_raw_text(self):
        self._test_token('RawText', 'john &amp; jane',
                         children=False, content='john & jane')

    def test_html_span(self):
        self._test_token('HtmlSpan', '<span>text</span>',
                         children=False, content='<span>text</span>')

    def test_heading(self):
        expected = '<h3>inner</h3>'
        self._test_token('Heading', expected, level=3)

    def test_quote(self):
        expected = '<blockquote>\n</blockquote>'
        self._test_token('Quote', expected)

    def test_paragraph(self):
        self._test_token('Paragraph', '<p>inner</p>')

    def test_block_code(self):
        from mistletoe.block_token import tokenize
        output = self.renderer.render(tokenize(['```sh\n', 'foo\n', '```\n'])[0])
        expected = '<pre><code class="language-sh">foo\n</code></pre>'
        self.assertEqual(output, expected)

    def test_block_code_no_language(self):
        from mistletoe.block_token import tokenize
        output = self.renderer.render(tokenize(['```\n', 'foo\n', '```\n'])[0])
        expected = '<pre><code>foo\n</code></pre>'
        self.assertEqual(output, expected)

    def test_list(self):
        expected = '<ul>\n\n</ul>'
        self._test_token('List', expected, start=None)

    def test_list_item(self):
        expected = '<li></li>'
        self._test_token('ListItem', expected)

    def test_table_with_header(self):
        func_path = 'mistletoe.html_renderer.HtmlRenderer.render_table_row'
        with mock.patch(func_path, autospec=True) as mock_func:
            mock_func.return_value = 'row'
            expected = ('<table>\n'
                        '<thead>\nrow\n</thead>\n'
                        '<tbody>\ninner\n</tbody>\n'
                        '</table>')
            self._test_token('Table', expected)

    def test_table_without_header(self):
        func_path = 'mistletoe.html_renderer.HtmlRenderer.render_table_row'
        with mock.patch(func_path, autospec=True) as mock_func:
            mock_func.return_value = 'row'
            expected = '<table>\n<tbody>\ninner\n</tbody>\n</table>'
            self._test_token('Table', expected, without_attrs=['header', ])

    def test_table_row(self):
        self._test_token('TableRow', '<tr>\n</tr>\n')

    def test_table_cell(self):
        expected = '<td align="left">inner</td>\n'
        self._test_token('TableCell', expected, align=None)

    def test_table_cell0(self):
        expected = '<td align="center">inner</td>\n'
        self._test_token('TableCell', expected, align=0)

    def test_table_cell1(self):
        expected = '<td align="right">inner</td>\n'
        self._test_token('TableCell', expected, align=1)

    def test_thematic_break(self):
        self._test_token('ThematicBreak', '<hr />', children=False)

    def test_html_block(self):
        content = expected = '<h1>hello</h1>\n<p>this is\na paragraph</p>\n'
        self._test_token('HtmlBlock', expected,
                         children=False, content=content)

    def test_line_break(self):
        self._test_token('LineBreak', '<br />\n', children=False, soft=False)

    def test_document(self):
        self._test_token('Document', '', footnotes={})


class TestHtmlRendererEscaping(TestCase):
    @parameterized.expand([
        (False, False, '" and \''),
        (False, True, '" and &#x27;'),
        (True, False, '&quot; and \''),
        (True, True, '&quot; and &#x27;'),
    ])
    def test_escape_html_text(self, escape_double, escape_single, expected):
        with HtmlRenderer(html_escape_double_quotes=escape_double,
                          html_escape_single_quotes=escape_single) as renderer:
            self.assertEqual(renderer.escape_html_text('" and \''), expected)

    def test_unprocessed_html_tokens_escaped(self):
        with HtmlRenderer(process_html_tokens=False) as renderer:
            token = Document(['<div><br> as plain text</div>\n'])
            expected = '<p>&lt;div&gt;&lt;br&gt; as plain text&lt;/div&gt;</p>\n'
            self.assertEqual(renderer.render(token), expected)


class TestHtmlRendererFootnotes(TestCase):
    def setUp(self):
        self.renderer = HtmlRenderer()
        self.renderer.__enter__()
        self.addCleanup(self.renderer.__exit__, None, None, None)

    def test_footnote_image(self):
        token = Document(['![alt][foo]\n', '\n', '[foo]: bar "title"\n'])
        expected = '<p><img src="bar" alt="alt" title="title" /></p>\n'
        self.assertEqual(self.renderer.render(token), expected)

    def test_footnote_link(self):
        token = Document(['[name][foo]\n', '\n', '[foo]: target\n'])
        expected = '<p><a href="target">name</a></p>\n'
        self.assertEqual(self.renderer.render(token), expected)

mistletoe-1.3.0/test/test_latex_renderer.py

from unittest import TestCase, mock

from parameterized import parameterized

import mistletoe.latex_renderer
from mistletoe.latex_renderer import LaTeXRenderer
from mistletoe import markdown


class TestLaTeXRenderer(TestCase):
    def setUp(self):
        self.renderer = LaTeXRenderer()
        self.renderer.render_inner = mock.Mock(return_value='inner')
        self.renderer.__enter__()
        self.addCleanup(self.renderer.__exit__, None, None, None)

    def _test_token(self, token_name, expected_output, children=True,
                    without_attrs=None, **kwargs):
        render_func = self.renderer.render_map[token_name]
        children = mock.MagicMock(spec=list) if children else None
        mock_token = mock.Mock(children=children, **kwargs)
        without_attrs = without_attrs or []
        for attr in without_attrs:
            delattr(mock_token, attr)
        self.assertEqual(render_func(mock_token), expected_output)

    def test_strong(self):
        self._test_token('Strong', '\\textbf{inner}')

    def test_emphasis(self):
        self._test_token('Emphasis', '\\textit{inner}')

    def test_inline_code(self):
        func_path = 'mistletoe.latex_renderer.LaTeXRenderer.render_raw_text'
        for content, expected in {'inner': '\\verb|inner|',
                                  'a + b': '\\verb|a + b|',
                                  'a | b': '\\verb!a | b!',
                                  '|ab!|': '\\verb"|ab!|"',
                                  }.items():
            with mock.patch(func_path, return_value=content):
                self._test_token('InlineCode', expected, content=content)

        content = mistletoe.latex_renderer.verb_delimiters
        with self.assertRaises(RuntimeError):
            with mock.patch(func_path, return_value=content):
                self._test_token('InlineCode', None, content=content)

    def test_strikethrough(self):
        self._test_token('Strikethrough', '\\sout{inner}')

    def test_image(self):
        expected = '\n\\includegraphics{src}\n'
        self._test_token('Image', expected, src='src')

    @parameterized.expand([
        ('page', '\\href{page}{inner}'),
        ('page%3A+with%3A+escape', 
'\\href{page\\%3A+with\\%3A+escape}{inner}'), ('page#target', '\\href{page\\#target}{inner}') ]) def test_link(self, target, expected): self._test_token('Link', expected, target=target) @parameterized.expand([ ('page', '\\url{page}'), ('page%3A+with%3A+escape', '\\url{page\\%3A+with\\%3A+escape}'), ('page#target', '\\url{page\\#target}') ]) def test_autolink(self, target, expected): self._test_token('AutoLink', expected, target=target) def test_math(self): expected = '$ 1 + 2 = 3 $' self._test_token('Math', expected, children=False, content='$ 1 + 2 = 3 $') def test_raw_text(self): expected = '\\$\\&\\#\\{\\}' self._test_token('RawText', expected, children=False, content='$&#{}') def test_heading(self): expected = '\n\\section{inner}\n' self._test_token('Heading', expected, level=1) def test_quote(self): expected = '\\begin{displayquote}\ninner\\end{displayquote}\n' self._test_token('Quote', expected) def test_paragraph(self): expected = '\ninner\n' self._test_token('Paragraph', expected) def test_block_code(self): func_path = 'mistletoe.latex_renderer.LaTeXRenderer.render_raw_text' with mock.patch(func_path, return_value='inner'): expected = '\n\\begin{lstlisting}[language=sh]\ninner\\end{lstlisting}\n' self._test_token('BlockCode', expected, language='sh') def test_list(self): expected = '\\begin{itemize}\ninner\\end{itemize}\n' self._test_token('List', expected, start=None) def test_list_item(self): self._test_token('ListItem', '\\item inner\n') def test_table_with_header(self): func_path = 'mistletoe.latex_renderer.LaTeXRenderer.render_table_row' with mock.patch(func_path, autospec=True, return_value='row\n'): expected = '\\begin{tabular}{l c r}\nrow\n\\hline\ninner\\end{tabular}\n' self._test_token('Table', expected, column_align=[None, 0, 1]) def test_table_without_header(self): expected = ('\\begin{tabular}\ninner\\end{tabular}\n') self._test_token('Table', expected, without_attrs=['header'], column_align=[None]) def test_table_row(self): 
self._test_token('TableRow', ' \\\\\n') def test_table_cell(self): self._test_token('TableCell', 'inner') def test_thematic_break(self): self._test_token('ThematicBreak', '\\hrulefill\n') def test_line_break(self): self._test_token('LineBreak', '\\newline\n', soft=False) def test_document(self): expected = ('\\documentclass{article}\n' '\\begin{document}\n' 'inner' '\\end{document}\n') self._test_token('Document', expected, footnotes={}) class TestHtmlEntity(TestCase): def test_html_entity(self): self.assertIn('hello \\& goodbye', markdown('hello & goodbye', LaTeXRenderer)) def test_html_entity_in_link_target(self): self.assertIn('\\href{foo}{hello}', markdown('[hello](foo)', LaTeXRenderer)) class TestLaTeXFootnotes(TestCase): def setUp(self): self.renderer = LaTeXRenderer() self.renderer.__enter__() self.addCleanup(self.renderer.__exit__, None, None, None) def test_footnote_image(self): from mistletoe import Document raw = ['![alt][foo]\n', '\n', '[foo]: bar "title"\n'] expected = ('\\documentclass{article}\n' '\\usepackage{graphicx}\n' '\\begin{document}\n' '\n' '\n\\includegraphics{bar}\n' '\n' '\\end{document}\n') self.assertEqual(self.renderer.render(Document(raw)), expected) def test_footnote_link(self): from mistletoe import Document raw = ['[name][key]\n', '\n', '[key]: target\n'] expected = ('\\documentclass{article}\n' '\\usepackage{hyperref}\n' '\\begin{document}\n' '\n' '\\href{target}{name}' '\n' '\\end{document}\n') self.assertEqual(self.renderer.render(Document(raw)), expected) mistletoe-1.3.0/test/test_latex_token.py000066400000000000000000000010401455324047100204250ustar00rootroot00000000000000import unittest from mistletoe.span_token import tokenize_inner from mistletoe.latex_token import Math from mistletoe.latex_renderer import LaTeXRenderer class TestLaTeXToken(unittest.TestCase): def setUp(self): self.renderer = LaTeXRenderer() self.renderer.__enter__() self.addCleanup(self.renderer.__exit__, None, None, None) def test_span(self): token = 
next(iter(tokenize_inner('$ 1 + 2 = 3 $'))) self.assertIsInstance(token, Math) self.assertEqual(token.content, '$ 1 + 2 = 3 $') mistletoe-1.3.0/test/test_line_numbers.py000066400000000000000000000042311455324047100205770ustar00rootroot00000000000000import unittest import mistletoe.block_token as block_token import mistletoe.span_token as span_token from mistletoe.markdown_renderer import ( LinkReferenceDefinition, LinkReferenceDefinitionBlock, ) class TestLineNumbers(unittest.TestCase): def setUp(self) -> None: block_token.add_token(block_token.HTMLBlock) span_token.add_token(span_token.HTMLSpan) block_token.remove_token(block_token.Footnote) block_token.add_token(LinkReferenceDefinitionBlock) return super().setUp() def tearDown(self) -> None: span_token.reset_tokens() block_token.reset_tokens() return super().tearDown() def test_main(self): # see line_numbers.md for a description of how the test works. NUMBER_OF_LINE_NUMBERS_TO_BE_CHECKED = 13 with open("test/samples/line_numbers.md", "r") as fin: document = block_token.Document(fin) count = self.check_line_numbers(document) self.assertEqual(count, NUMBER_OF_LINE_NUMBERS_TO_BE_CHECKED) def check_line_numbers(self, token: block_token.BlockToken): """Check the line number on the given block token and its children, if possible.""" count = 0 line_number = self.get_expected_line_number(token) if line_number: self.assertEqual(token.line_number, line_number) count += 1 if isinstance(token, block_token.Table): count += self.check_line_numbers(token.header) for child in token.children: if isinstance(child, block_token.BlockToken): count += self.check_line_numbers(child) return count def get_expected_line_number(self, token: block_token.BlockToken): # the expected line number, if it exists, should be wrapped in an inline # code token and be an immediate child of the token. # or it could be the title of a link reference definition. 
for child in token.children: if isinstance(child, span_token.InlineCode): return int(child.children[0].content) if isinstance(child, LinkReferenceDefinition): return int(child.title) mistletoe-1.3.0/test/test_markdown_renderer.py000066400000000000000000000532011455324047100216260ustar00rootroot00000000000000import unittest from mistletoe import block_token, span_token from mistletoe.block_token import Document from mistletoe.markdown_renderer import MarkdownRenderer class TestMarkdownRenderer(unittest.TestCase): @staticmethod def roundtrip(input, **rendererArgs): """Parses the given markdown input and renders it back to markdown again.""" with MarkdownRenderer(**rendererArgs) as renderer: return renderer.render(Document(input)) def test_empty_document(self): input = [] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_paragraphs_and_blank_lines(self): input = [ "Paragraph 1. Single line. Followed by two white-space-only lines.\n", "\n", "\n", "Paragraph 2. Two\n", "lines, no final line break.", ] output = self.roundtrip(input) # note: a line break is always added at the end of a paragraph. self.assertEqual(output, "".join(input) + "\n") def test_line_breaks(self): input = [ "soft line break\n", "hard line break (backslash)\\\n", "another hard line break (double spaces) \n", "yet another hard line break \n", "that's all.\n", ] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_emphasized_and_strong(self): input = ["*emphasized* __strong__ _**emphasized and strong**_\n"] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_strikethrough(self): input = ["~~strikethrough~~\n"] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_escaped_chars(self): input = ["\\*escaped, not emphasized\\*\n"] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_html_span(self): input = ["so

<h1>hear ye</h1>
    \n"] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_code_span(self): input = [ "a) `code span` b) ``trailing space, double apostrophes `` c) ` leading and trailing space `\n" ] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_code_span_with_embedded_line_breaks(self): input = [ "a `multi-line\n", "code\n", "span`.\n" ] output = self.roundtrip(input) expected = [ "a `multi-line code span`.\n" ] self.assertEqual(output, "".join(expected)) def test_images_and_links(self): input = [ "[a link](#url (title))\n", "[another link]( '*emphasized\n", "title*')\n", '![an \\[*image*\\], escapes and emphasis](#url "title")\n', "\n", ] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_multiline_fragment(self): input = [ "[a link]( '*emphasized\n", "title\n", "spanning\n", "many\n", "lines*')\n", ] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_thematic_break(self): input = [ " ** * ** * ** * **\n", "followed by a paragraph of text\n", ] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_atx_headings(self): input = [ "## atx *heading* ##\n", "# another atx heading, without trailing hashes\n", "###\n", "^ empty atx heading\n", ] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_setext_headings(self): input = [ "*setext*\n", "heading!\n", "===============\n", ] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_numbered_list(self): input = [ " 22) *emphasized list item*\n", " 96) \n", " 128) here begins a nested list.\n", " + apples\n", " + bananas\n", ] output = self.roundtrip(input) expected = [ "22) *emphasized list item*\n", "96) \n", "128) here begins a nested list.\n", " + apples\n", " + bananas\n", ] self.assertEqual(output, "".join(expected)) def test_bulleted_list(self): input = [ "* **test case**:\n", " testing a link as the first item on a 
continuation line\n", " [links must be indented][properly].\n", "\n", "[properly]: uri\n", ] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) # we don't currently support keeping margin indentation: def test_list_item_margin_indentation_not_preserved(self): # 0 to 4 spaces of indentation from the margin input = [ "- 0 space: ok.\n", " subsequent line.\n", " - 1 space: ok.\n", " subsequent line.\n", " - 2 spaces: ok.\n", " subsequent line.\n", " - 3 spaces: ok.\n", " subsequent line.\n", " - 4 spaces: in the paragraph of the above list item.\n", " subsequent line.\n", ] output = self.roundtrip(input) expected = [ "- 0 space: ok.\n", " subsequent line.\n", "- 1 space: ok.\n", " subsequent line.\n", "- 2 spaces: ok.\n", " subsequent line.\n", "- 3 spaces: ok.\n", " subsequent line.\n", " - 4 spaces: in the paragraph of the above list item.\n", " subsequent line.\n", ] self.assertEqual(output, "".join(expected)) def test_list_item_indentation_after_leader_preserved(self): # leaders followed by 1 to 5 spaces input = [ "- 1 space: ok.\n", " subsequent line.\n", "- 2 spaces: ok.\n", " subsequent line.\n", "- 3 spaces: ok.\n", " subsequent line.\n", "- 4 spaces: ok.\n", " subsequent line.\n", "- 5 spaces: list item starting with indented code.\n", " subsequent line.\n", ] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_list_item_indentation_after_leader_normalized(self): # leaders followed by 1 to 5 spaces input = [ "- 1 space: ok.\n", " subsequent line.\n", "- 2 spaces: ok.\n", " subsequent line.\n", "- 3 spaces: ok.\n", " subsequent line.\n", "- 4 spaces: ok.\n", " subsequent line.\n", "- 5 spaces: list item starting with indented code.\n", " subsequent line.\n", ] output = self.roundtrip(input, normalize_whitespace=True) expected = [ "- 1 space: ok.\n", " subsequent line.\n", "- 2 spaces: ok.\n", " subsequent line.\n", "- 3 spaces: ok.\n", " subsequent line.\n", "- 4 spaces: ok.\n", " subsequent line.\n", "- 5 
spaces: list item starting with indented code.\n", " subsequent line.\n", ] self.assertEqual(output, "".join(expected)) def test_code_blocks(self): input = [ " this is an indented code block\n", " on two lines \n", " with some extra whitespace here and there, to be preserved \n", " just as it is.\n", "```\n", "now for a fenced code block \n", " where indentation is also preserved. as are the double spaces at the end of this line: \n", "```\n", " ~~~this is an info string: behold the fenced code block with tildes!\n", " *tildes are great*\n", " ~~~\n", "1. a list item with an embedded\n", "\n", " indented code block.\n", ] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_blank_lines_following_code_block(self): input = [ " code block\n", "\n", "paragraph.\n", ] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_html_block(self): input = [ "

<h1>some text</h1>\n",
            "<br>\n",
            "\n",
            "+ <h1>html block embedded in list</h1>\n",
            "<br>
    \n", ] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_block_quote(self): input = [ "> a block quote\n", "> > and a nested block quote\n", "> 1. > and finally, a list with a nested block quote\n", "> > which continues on a second line.\n", ] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_link_reference_definition(self): input = [ "[label]: https://domain.com\n", "\n", "paragraph [with a link][label-2], etc, etc.\n", "and [a *second* link][label] as well\n", "shortcut [label] & collapsed [label][]\n", "\n", "[label-2]: 'title\n", "with line break'\n", "[label-not-referred-to]: https://foo (title)\n", ] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_table(self): input = [ "| Emoji | Description |\n", "| :---: | ------------------------- |\n", "| 📚 | Update documentation. |\n", "| 🐎 | Performance improvements. |\n", "etc, etc\n", ] output = self.roundtrip(input) self.assertEqual(output, "".join(input)) def test_table_with_varying_column_counts(self): input = [ " | header | x | \n", " | --- | ---: | \n", " | . | Performance improvements. | an extra column | \n", "etc, etc\n", ] output = self.roundtrip(input) expected = [ "| header | x | |\n", "| ------ | ------------------------: | --------------- |\n", "| . | Performance improvements. | an extra column |\n", "etc, etc\n", ] self.assertEqual(output, "".join(expected)) def test_table_with_narrow_column(self): input = [ "| xyz | ? |\n", "| --- | - |\n", "| a | p |\n", "| b | q |\n", ] output = self.roundtrip(input) expected = [ "| xyz | ? 
|\n", "| --- | --- |\n", "| a | p |\n", "| b | q |\n", ] self.assertEqual(output, "".join(expected)) def test_direct_rendering_of_block_token(self): input = [ "Line 1\n", "Line 2\n", ] paragraph = block_token.Paragraph(input) with MarkdownRenderer() as renderer: lines = renderer.render(paragraph) assert lines == "".join(input) def test_direct_rendering_of_span_token(self): input = "some text" raw_text = span_token.RawText(input) with MarkdownRenderer() as renderer: lines = renderer.render(raw_text) assert lines == input + "\n" class TestMarkdownFormatting(unittest.TestCase): def test_wordwrap_plain_paragraph(self): with MarkdownRenderer() as renderer: # given a paragraph with only plain text and soft line breaks paragraph = block_token.Paragraph( [ "A \n", "short paragraph \n", " without any \n", "long words \n", "or hard line breaks.\n", ] ) # when reflowing with the max line length set medium long renderer.max_line_length = 30 lines = renderer.render(paragraph) # then the content is reflowed accordingly assert lines == ( "A short paragraph without any\n" "long words or hard line\n" "breaks.\n" ) # when reflowing with the max line length set lower than the longest word: "paragraph", 9 chars renderer.max_line_length = 8 lines = renderer.render(paragraph) # then the content is reflowed so that the max line length is only exceeded for long words assert lines == ( "A short\n" "paragraph\n" "without\n" "any long\n" "words or\n" "hard\n" "line\n" "breaks.\n" ) def test_wordwrap_paragraph_with_emphasized_words(self): with MarkdownRenderer() as renderer: # given a paragraph with emphasized words paragraph = block_token.Paragraph( ["*emphasized* _nested *emphasis* too_\n"] ) # when reflowing with the max line length set very short renderer.max_line_length = 1 lines = renderer.render(paragraph) # then the content is reflowed to make the lines as short as possible (but not shorter). 
assert lines == ( "*emphasized*\n" "_nested\n" "*emphasis*\n" "too_\n" ) def test_wordwrap_paragraph_with_inline_code(self): with MarkdownRenderer() as renderer: # given a paragraph with inline code paragraph = block_token.Paragraph( [ "`inline code` and\n", "`` inline with\n", "line break ``\n", ] ) # when reflowing with the max line length set very short renderer.max_line_length = 1 lines = renderer.render(paragraph) # then the content is reflowed to make the lines as short as possible (but not shorter). # line breaks within the inline code are NOT preserved. # however, padding at the end of the inline code may not be word wrapped. assert lines == ( "`inline\n" "code`\n" "and\n" "`` inline\n" "with\n" "line\n" "break ``\n" ) def test_wordwrap_paragraph_with_hard_line_breaks(self): with MarkdownRenderer() as renderer: # given a paragraph with hard line breaks paragraph = block_token.Paragraph( [ "A short paragraph \n", " without any\\\n", "very long\n", "words.\n", ] ) # when reflowing with the max line length set to normal renderer.max_line_length = 80 lines = renderer.render(paragraph) # then the content is reflowed with hard line breaks preserved assert lines == ( "A short paragraph \n" "without any\\\n" "very long words.\n" ) def test_wordwrap_paragraph_with_link(self): with MarkdownRenderer() as renderer: # given a paragraph with a link paragraph = block_token.Paragraph( [ "A paragraph\n", "containing [a link]( 'which\n", "has a rather long title\n", "spanning multiple lines.')\n", ] ) # when reflowing with the max line length set very short renderer.max_line_length = 1 lines = renderer.render(paragraph) # then the content is reflowed to make the lines as short as possible (but not shorter) assert lines == ( "A\n" "paragraph\n" "containing\n" "[a\n" "link](\n" "'which\n" "has\n" "a\n" "rather\n" "long\n" "title\n" "spanning\n" "multiple\n" "lines.')\n" ) def test_wordwrap_text_in_setext_heading(self): with MarkdownRenderer() as renderer: # given a paragraph 
with a setext heading document = block_token.Document( [ "A \n", "setext heading \n", " without any \n", "long words \n", "or hard line breaks.\n", "=====\n", ] ) # when reflowing with the max line length set medium long renderer.max_line_length = 30 lines = renderer.render(document) # then the content is reflowed accordingly assert lines == ( "A setext heading without any\n" "long words or hard line\n" "breaks.\n" "=====\n" ) def test_wordwrap_text_in_link_reference_definition(self): with MarkdownRenderer() as renderer: # given some markdown with link reference definitions document = block_token.Document( [ "[This is\n", " the *link label*.]: 'title (with parens). new\n", "lines allowed.'\n", "[*]:url 'Another link reference\tdefinition'\n", ] ) # when reflowing with the max line length set medium long renderer.max_line_length = 30 lines = renderer.render(document) # then the content is reflowed accordingly assert lines == ( "[This is the *link label*.]:\n" "\n" "'title (with parens). new\n" "lines allowed.'\n" "[*]: url 'Another link\n" "reference definition'\n" ) def test_wordwrap_paragraph_in_list(self): with MarkdownRenderer() as renderer: # given some markdown with a nested list document = block_token.Document( [ "1. List item\n", "2. A second list item including:\n", " * Nested list.\n", " This is a continuation line\n", ] ) # when reflowing with the max line length set medium long renderer.max_line_length = 25 lines = renderer.render(document) # then the content is reflowed accordingly assert lines == ( "1. List item\n" "2. A second list item\n" " including:\n" " * Nested list. 
This is\n" " a continuation line\n" ) def test_wordwrap_paragraph_in_block_quote(self): with MarkdownRenderer() as renderer: # given some markdown with nested block quotes document = block_token.Document( [ "> Devouring Time, blunt thou the lion's paws,\n", "> And make the earth devour her own sweet brood;\n", "> > When Dawn strides out to wake a dewy farm\n", "> > Across green fields and yellow hills of hay\n", ] ) # when reflowing with the max line length set medium long renderer.max_line_length = 30 lines = renderer.render(document) # then the content is reflowed accordingly assert lines == ( "> Devouring Time, blunt thou\n" "> the lion's paws, And make\n" "> the earth devour her own\n" "> sweet brood;\n" "> > When Dawn strides out to\n" "> > wake a dewy farm Across\n" "> > green fields and yellow\n" "> > hills of hay\n" ) def test_wordwrap_tables(self): with MarkdownRenderer(max_line_length=30) as renderer: # given a markdown table input = [ "| header | x | |\n", "| ------ | ------------------------: | --------------- |\n", "| . | Performance improvements. 
| an extra column |\n",
            ]
            document = block_token.Document(input)
            # when reflowing
            lines = renderer.render(document)
            # then the table is rendered without any word wrapping
            assert lines == "".join(input)

mistletoe-1.3.0/test/test_repr.py

import unittest

from mistletoe import Document
from mistletoe import block_token


class TestRepr(unittest.TestCase):
    def _check_repr_matches(self, token, expected_match):
        expected_match = "<mistletoe." + expected_match + " at 0x"
        self.assertTrue(repr(token).startswith(expected_match))

    def test_quote(self):
        doc = Document("> Foo")
        self._check_repr_matches(doc.children[0], "block_token.Quote with 1 child line_number=1")

    def test_paragraph(self):
        doc = Document("Foo")
        self._check_repr_matches(doc.children[0], "block_token.Paragraph with 1 child line_number=1")

    def test_blockcode(self):
        doc = Document("Foo\n\n\tBar\n\nBaz")
        self._check_repr_matches(doc.children[1], "block_token.BlockCode with 1 child line_number=3 language=''")

    def test_codefence(self):
        doc = Document("""```python\nprint("Hello, World!"\n```""")
        self._check_repr_matches(doc.children[0], "block_token.CodeFence with 1 child line_number=1 language='python'")

    def test_unordered_list(self):
        doc = Document("* Foo\n* Bar\n* Baz")
        self._check_repr_matches(doc.children[0], "block_token.List with 3 children line_number=1 loose=False start=None")
        self._check_repr_matches(doc.children[0].children[0], "block_token.ListItem with 1 child line_number=1 leader='*' indentation=0 prepend=2 loose=False")

    def test_ordered_list(self):
        doc = Document("1. Foo\n2. Bar\n3. Baz")
        self._check_repr_matches(doc.children[0], "block_token.List with 3 children line_number=1 loose=False start=1")
        self._check_repr_matches(doc.children[0].children[0], "block_token.ListItem with 1 child line_number=1 leader='1.'
indentation=0 prepend=3 loose=False") def test_table(self): doc = Document("| Foo | Bar | Baz |\n|:--- |:---:| ---:|\n| Foo | Bar | Baz |\n") self._check_repr_matches(doc.children[0], "block_token.Table with 1 child line_number=1 column_align=[None, 0, 1]") self._check_repr_matches(doc.children[0].children[0], "block_token.TableRow with 3 children line_number=3 row_align=[None, 0, 1]") self._check_repr_matches(doc.children[0].children[0].children[0], "block_token.TableCell with 1 child line_number=3 align=None") def test_thematicbreak(self): doc = Document("Foo\n\n---\n\nBar\n") self._check_repr_matches(doc.children[1], "block_token.ThematicBreak line_number=3") # No test for ``Footnote`` def test_htmlblock(self): try: block_token.add_token(block_token.HtmlBlock) doc = Document("
<pre>\nFoo\n</pre>\n")
        finally:
            block_token.reset_tokens()
        self._check_repr_matches(doc.children[0], "block_token.HtmlBlock with 1 child line_number=1")
        self._check_repr_matches(doc.children[0].children[0], "span_token.RawText content='<pre>\\nFoo\\n</pre>
'")

    # Span tokens

    def test_strong(self):
        doc = Document("**foo**\n")
        self._check_repr_matches(doc.children[0].children[0], "span_token.Strong with 1 child")

    def test_emphasis(self):
        doc = Document("*foo*\n")
        self._check_repr_matches(doc.children[0].children[0], "span_token.Emphasis with 1 child")

    def test_inlinecode(self):
        doc = Document("`foo`\n")
        self._check_repr_matches(doc.children[0].children[0], "span_token.InlineCode with 1 child")

    def test_strikethrough(self):
        doc = Document("~~foo~~\n")
        self._check_repr_matches(doc.children[0].children[0], "span_token.Strikethrough with 1 child")

    def test_image(self):
        doc = Document("""![Foo](http://www.example.org/ "bar")\n""")
        self._check_repr_matches(doc.children[0].children[0], "span_token.Image with 1 child src='http://www.example.org/' title='bar'")

    def test_link(self):
        doc = Document("[Foo](http://www.example.org/)\n")
        self._check_repr_matches(doc.children[0].children[0], "span_token.Link with 1 child target='http://www.example.org/' title=''")

    def test_autolink(self):
        doc = Document("Foo <http://www.example.org/>\n")
        self._check_repr_matches(doc.children[0].children[1], "span_token.AutoLink with 1 child target='http://www.example.org/' mailto=False")

    def test_escapesequence(self):
        doc = Document("\\*\n")
        self._check_repr_matches(doc.children[0].children[0], "span_token.EscapeSequence with 1 child")

    def test_soft_linebreak(self):
        doc = Document("Foo\nBar\n")
        self._check_repr_matches(doc.children[0].children[1], "span_token.LineBreak content='' soft=True")

    def test_hard_linebreak(self):
        doc = Document("Foo\\\nBar\n")
        self._check_repr_matches(doc.children[0].children[1], "span_token.LineBreak content='\\\\' soft=False")

    def test_rawtext(self):
        doc = Document("Foo\n")
        self._check_repr_matches(doc.children[0].children[0], "span_token.RawText content='Foo'")

mistletoe-1.3.0/test/test_span_token.py

import unittest
from unittest.mock import patch

from mistletoe
import span_token class TestBranchToken(unittest.TestCase): def setUp(self): self.addCleanup(lambda: span_token._token_types.__setitem__(-1, span_token.RawText)) patcher = patch('mistletoe.span_token.RawText') self.mock = patcher.start() span_token._token_types[-1] = self.mock self.addCleanup(patcher.stop) def _test_parse(self, token_cls, raw, arg, **kwargs): token = next(iter(span_token.tokenize_inner(raw))) self.assertIsInstance(token, token_cls) self._test_token(token, arg, **kwargs) return token def _test_token(self, token, arg, children=True, **kwargs): for attr, value in kwargs.items(): self.assertEqual(getattr(token, attr), value) if children: self.mock.assert_any_call(arg) class TestStrong(TestBranchToken): def test_parse(self): self._test_parse(span_token.Strong, '**some text**', 'some text') self._test_parse(span_token.Strong, '__some text__', 'some text') def test_strong_when_both_delimiter_run_lengths_are_multiples_of_3(self): tokens = iter(span_token.tokenize_inner('foo******bar*********baz')) self._test_token(next(tokens), 'foo', children=False) self._test_token(next(tokens), 'bar', children=True) self._test_token(next(tokens), '***baz', children=False) class TestEmphasis(TestBranchToken): def test_parse(self): self._test_parse(span_token.Emphasis, '*some text*', 'some text') self._test_parse(span_token.Emphasis, '_some text_', 'some text') def test_emphasis_with_straight_quote(self): tokens = iter(span_token.tokenize_inner('_Book Title_\'s author')) self._test_token(next(tokens), 'Book Title', children=True) self._test_token(next(tokens), '\'s author', children=False) def test_emphasis_with_smart_quote(self): tokens = iter(span_token.tokenize_inner('_Book Title_’s author')) self._test_token(next(tokens), 'Book Title', children=True) self._test_token(next(tokens), '’s author', children=False) def test_no_emphasis_for_underscore_without_punctuation(self): tokens = iter(span_token.tokenize_inner('_an example without_punctuation')) 
self._test_token(next(tokens), '_an example without_punctuation', children=True) def test_emphasis_for_asterisk_without_punctuation(self): tokens = iter(span_token.tokenize_inner('*an example without*punctuation')) self._test_token(next(tokens), 'an example without', children=True) self._test_token(next(tokens), 'punctuation', children=False) class TestInlineCode(TestBranchToken): def _test_parse_enclosed(self, encl_type, encl_delimiter): token = self._test_parse(encl_type, '{delim}`some text`{delim}'.format(delim=encl_delimiter), 'some text') self.assertEqual(len(token.children), 1) self.assertIsInstance(token.children[0], span_token.InlineCode) def test_parse(self): self._test_parse(span_token.InlineCode, '`some text`', 'some text') def test_parse_in_bold(self): self._test_parse_enclosed(span_token.Strong, '**') self._test_parse_enclosed(span_token.Strong, '__') def test_parse_in_emphasis(self): self._test_parse_enclosed(span_token.Emphasis, '*') self._test_parse_enclosed(span_token.Emphasis, '_') def test_parse_in_strikethrough(self): self._test_parse_enclosed(span_token.Strikethrough, '~~') def test_remove_space_if_present_on_both_sides(self): self._test_parse(span_token.InlineCode, '``` ```', ' ') self._test_parse(span_token.InlineCode, '` `` `', ' `` ') def test_preserve_escapes(self): self._test_parse(span_token.InlineCode, '`\\xa0b\\xa0`', '\\xa0b\\xa0') self._test_parse(span_token.InlineCode, '``\\`\\[``', '\\`\\[') class TestStrikethrough(TestBranchToken): def test_parse(self): self._test_parse(span_token.Strikethrough, '~~some text~~', 'some text') def test_parse_multiple(self): tokens = iter(span_token.tokenize_inner('~~one~~ ~~two~~')) self._test_token(next(tokens), 'one') self._test_token(next(tokens), 'two') class TestLink(TestBranchToken): def test_parse(self): self._test_parse(span_token.Link, '[name 1](target1)', 'name 1', target='target1', title='') def test_parse_multi_links(self): tokens = iter(span_token.tokenize_inner('[n1](t1) & [n2](t2)')) 
        self._test_token(next(tokens), 'n1', target='t1')
        self._test_token(next(tokens), ' & ', children=False)
        self._test_token(next(tokens), 'n2', target='t2')

    def test_parse_children(self):
        token = next(iter(span_token.tokenize_inner('[![alt](src)](target)')))
        child = next(iter(token.children))
        self._test_token(child, 'alt', src='src')

    def test_parse_angle_bracketed_inline_link_with_space(self):
        self._test_parse(span_token.Link, '[link](</my uri> \'a title\')', 'link', target='/my uri', title='a title')


class TestAutoLink(TestBranchToken):
    def test_parse(self):
        self._test_parse(span_token.AutoLink, '<ftp://foo.com>', 'ftp://foo.com', target='ftp://foo.com')


class TestImage(TestBranchToken):
    def test_parse(self):
        self._test_parse(span_token.Image, '![alt](link)', 'alt', src='link')
        self._test_parse(span_token.Image, '![alt](link "title")', 'alt', src='link', title='title')

    def test_no_alternative_text(self):
        self._test_parse(span_token.Image, '![](link)', '', children=False, src='link')


class TestEscapeSequence(TestBranchToken):
    def test_parse(self):
        self._test_parse(span_token.EscapeSequence, r'\*', '*')

    def test_parse_in_text(self):
        tokens = iter(span_token.tokenize_inner(r'some \*text*'))
        self._test_token(next(tokens), 'some ', children=False)
        self._test_token(next(tokens), '*')
        self._test_token(next(tokens), 'text*', children=False)


class TestRawText(unittest.TestCase):
    def test_attribute(self):
        token = span_token.RawText('some text')
        self.assertEqual(token.content, 'some text')

    def test_no_children(self):
        token = span_token.RawText('some text')
        with self.assertRaises(AttributeError):
            token.children

    def test_valid_html_entities(self):
        tokens = span_token.tokenize_inner('&nbsp; &#x5408;')
        self.assertEqual(tokens[0].content, '\xa0 \u5408')

    def test_invalid_html_entities(self):
        text = '&nbsp &x; &#; &#x; &#987654321; &#abcdef0; &ThisIsNotDefined; &hi?;'
        tokens = span_token.tokenize_inner(text)
        self.assertEqual(tokens[0].content, text)


class TestLineBreak(unittest.TestCase):
    def test_parse_soft_break(self):
        token, = span_token.tokenize_inner('\n')
        self.assertIsInstance(token, span_token.LineBreak)
        self.assertTrue(token.soft)

    def test_parse_hard_break_with_double_blanks(self):
        token, = span_token.tokenize_inner('  \n')
        self.assertIsInstance(token, span_token.LineBreak)
        self.assertFalse(token.soft)

    def test_parse_hard_break_with_backslash(self):
        _, token, = span_token.tokenize_inner(' \\\n')
        self.assertIsInstance(token, span_token.LineBreak)
        self.assertFalse(token.soft)


class TestContains(unittest.TestCase):
    def test_contains(self):
        token = next(iter(span_token.tokenize_inner('**with some *emphasis* text**')))
        self.assertTrue('text' in token)
        self.assertTrue('emphasis' in token)
        self.assertFalse('foo' in token)


class TestHtmlSpan(unittest.TestCase):
    def setUp(self):
        span_token.add_token(span_token.HtmlSpan)
        self.addCleanup(span_token.reset_tokens)

    def test_parse(self):
        tokens = span_token.tokenize_inner('<a>')
        self.assertIsInstance(tokens[0], span_token.HtmlSpan)
        self.assertEqual('<a>', tokens[0].content)

    def test_parse_with_illegal_whitespace(self):
        tokens = span_token.tokenize_inner('< a><\nfoo>\n')
        for t in tokens:
            self.assertNotIsInstance(t, span_token.HtmlSpan)

mistletoe-1.3.0/test/test_traverse.py

from textwrap import dedent
import unittest

from mistletoe import Document
from mistletoe.span_token import Strong
from mistletoe.utils import traverse


class TestTraverse(unittest.TestCase):
    def test(self):
        doc = Document("a **b** c **d**")
        filtered = [t.node.__class__.__name__ for t in traverse(doc)]
        self.assertEqual(
            filtered,
            [
                "Paragraph",
                "RawText",
                "Strong",
                "RawText",
                "Strong",
                "RawText",
                "RawText",
            ],
        )

    def test_with_included_source(self):
        doc = Document(
            dedent(
                """\
                a **b**

                c [*d*](link)
                """
            )
        )
        tree = [
            (
                t.node.__class__.__name__,
                t.parent.__class__.__name__ if t.parent else None,
                t.depth
            )
            for t in traverse(doc, include_source=True)
        ]
        self.assertEqual(
            tree,
            [
                ('Document', None, 0),
                ('Paragraph', 'Document', 1),
                ('Paragraph', 'Document', 1),
                ('RawText', 'Paragraph', 2),
                ('Strong', 'Paragraph', 2),
                ('RawText', 'Paragraph', 2),
                ('Link', 'Paragraph', 2),
                ('RawText', 'Strong', 3),
                ('Emphasis', 'Link', 3),
                ('RawText', 'Emphasis', 4),
            ]
        )

    def test_with_class_filter(self):
        doc = Document("a **b** c **d**")
        filtered = [t.node.__class__.__name__ for t in traverse(doc, klass=Strong)]
        self.assertEqual(filtered, ["Strong", "Strong"])

    def test_with_included_source_and_class_filter(self):
        doc = Document("a **b** c **d**")
        filtered = [
            t.node.__class__.__name__
            for t in traverse(doc, include_source=True, klass=Strong)
        ]
        self.assertEqual(filtered, ["Strong", "Strong"])

    def test_with_depth_limit(self):
        doc = Document("a **b** c **d**")

        # Zero depth with root not included yields no nodes.
        filtered = [t.node.__class__.__name__ for t in traverse(doc, depth=0)]
        self.assertEqual(filtered, [])

        # Zero depth with root included yields the root node.
        filtered = [
            t.node.__class__.__name__
            for t in traverse(doc, depth=0, include_source=True)
        ]
        self.assertEqual(filtered, ["Document"])

        # Depth=1 correctly returns the single node at that level.
        filtered = [t.node.__class__.__name__ for t in traverse(doc, depth=1)]
        self.assertEqual(filtered, ["Paragraph"])

        # Depth=2 returns the correct nodes.
        filtered = [t.node.__class__.__name__ for t in traverse(doc, depth=2)]
        self.assertEqual(
            filtered, ["Paragraph", "RawText", "Strong", "RawText", "Strong"]
        )

        # Depth=3 returns the correct nodes (all nodes in the tree).
        filtered = [t.node.__class__.__name__ for t in traverse(doc, depth=3)]
        self.assertEqual(
            filtered,
            [
                "Paragraph",
                "RawText",
                "Strong",
                "RawText",
                "Strong",
                "RawText",
                "RawText",
            ],
        )

        # Verify there are no additional nodes at depth=4.
        filtered = [t.node.__class__.__name__ for t in traverse(doc, depth=4)]
        self.assertEqual(
            filtered,
            [
                "Paragraph",
                "RawText",
                "Strong",
                "RawText",
                "Strong",
                "RawText",
                "RawText",
            ],
        )

mistletoe-1.3.0/tox.ini

[flake8]
# See https://www.flake8rules.com/ for the full list of error codes.
extend-ignore = E124,E126,E127,E128,E501
# For the case we activated E501. The GitHub editor is 127 chars wide.
max-line-length = 127
max-complexity = 10