pax_global_header 0000666 0000000 0000000 00000000064 14066642277 0014530 g ustar 00root root 0000000 0000000 52 comment=d3022d34a4ea5eb8d4edf9bcd231d51966248e36 turndown-7.1.1/ 0000775 0000000 0000000 00000000000 14066642277 0013416 5 ustar 00root root 0000000 0000000 turndown-7.1.1/.gitignore 0000664 0000000 0000000 00000000065 14066642277 0015407 0 ustar 00root root 0000000 0000000 dist lib node_modules npm-debug.log test/*browser.js turndown-7.1.1/.tm_properties 0000664 0000000 0000000 00000000071 14066642277 0016311 0 ustar 00root root 0000000 0000000 [test/index.html] scopeAttributes = attr.keep-whitespace turndown-7.1.1/.travis.yml 0000664 0000000 0000000 00000000057 14066642277 0015531 0 ustar 00root root 0000000 0000000 language: node_js node_js: - "node" - "10" turndown-7.1.1/LICENSE 0000664 0000000 0000000 00000002055 14066642277 0014425 0 ustar 00root root 0000000 0000000 MIT License Copyright (c) 2017 Dom Christie Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
turndown-7.1.1/README.md 0000664 0000000 0000000 00000022307 14066642277 0014701 0 ustar 00root root 0000000 0000000 # Turndown [](https://travis-ci.org/domchristie/turndown) Convert HTML into Markdown with JavaScript. ## Project Updates * `to-markdown` has been renamed to Turndown. See the [migration guide](https://github.com/domchristie/to-markdown/wiki/Migrating-from-to-markdown-to-Turndown) for details. * Turndown repository has changed its URL to https://github.com/mixmark-io/turndown. ## Installation npm: ``` npm install turndown ``` Browser: ```html ``` For usage with RequireJS, UMD versions are located in `lib/turndown.umd.js` (for Node.js) and `lib/turndown.browser.umd.js` for browser usage. These files are generated when the npm package is published. To generate them manually, clone this repo and run `npm run build`. ## Usage ```js // For Node.js var TurndownService = require('turndown') var turndownService = new TurndownService() var markdown = turndownService.turndown('
Hello worldWorld
Hello worldWorld
The rule for converting `<p>` elements is as follows:

```js
{
  filter: 'p',
  replacement: function (content) {
    return '\n\n' + content + '\n\n'
  }
}
```

The filter selects `<p>` elements, and the replacement function returns the `<p>` contents separated by two new lines.

### `filter` String|Array|Function

The filter property determines whether or not an element should be replaced with the rule's `replacement`. DOM nodes can be selected simply using a tag name or an array of tag names:

* `filter: 'p'` will select `<p>` elements
* `filter: ['em', 'i']` will select `<em>` or `<i>` elements

Alternatively, the filter can be a function that returns a boolean depending on whether a given node should be replaced. The function is passed a DOM node as well as the `TurndownService` options. For example, the following rule selects `<a>` elements (with an `href`) when the `linkStyle` option is `inlined`:
```js
filter: function (node, options) {
  return (
    options.linkStyle === 'inlined' &&
    node.nodeName === 'A' &&
    node.getAttribute('href')
  )
}
```
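
The three filter forms (string, array, function) can be reduced to one matching check. The following is a minimal sketch of that idea, not Turndown's actual source: `filterMatches` and `fakeNode` are hypothetical names, and the node objects are plain stand-ins that only mimic `nodeName` and `getAttribute`.

```js
// Hypothetical helper: decides whether a rule's filter selects a node.
function filterMatches (filter, node, options) {
  var tagName = node.nodeName.toLowerCase()
  if (typeof filter === 'string') return filter === tagName
  if (Array.isArray(filter)) return filter.indexOf(tagName) > -1
  if (typeof filter === 'function') return !!filter(node, options)
  throw new TypeError('`filter` needs to be a string, array, or function')
}

// Stand-in for a DOM node (assumption: only these two members are needed here).
function fakeNode (nodeName, attributes) {
  return {
    nodeName: nodeName,
    getAttribute: function (name) {
      return Object.prototype.hasOwnProperty.call(attributes, name)
        ? attributes[name]
        : null
    }
  }
}

// The function filter from the example above.
var linkFilter = function (node, options) {
  return (
    options.linkStyle === 'inlined' &&
    node.nodeName === 'A' &&
    node.getAttribute('href')
  )
}

var link = fakeNode('A', { href: 'http://example.com' })
var results = {
  string: filterMatches('p', fakeNode('P', {})),
  array: filterMatches(['em', 'i'], fakeNode('I', {})),
  fn: filterMatches(linkFilter, link, { linkStyle: 'inlined' }),
  fnReferenced: filterMatches(linkFilter, link, { linkStyle: 'referenced' })
}
```

Only the last check fails to match, because the function filter also inspects the `linkStyle` option.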
### `replacement` Function
The replacement function determines how an element should be converted. It should return the Markdown string for a given node. The function is passed the node's content, the node itself, and the `TurndownService` options.
The following rule shows how `<em>` elements are converted:
```js
rules.emphasis = {
  filter: ['em', 'i'],
  replacement: function (content, node, options) {
    return options.emDelimiter + content + options.emDelimiter
  }
}
```
### Special Rules
**Blank rule** determines how to handle blank elements. It overrides every rule (even those added via `addRule`). A node is blank if it only contains whitespace, and it's not an `<a>`, `<td>`, `<th>` or a void element. Its behaviour can be customised using the `blankReplacement` option.
**Keep rules** determine how to handle the elements that should not be converted, i.e. rendered as HTML in the Markdown output. By default, no elements are kept. Block-level elements will be separated from surrounding content by blank lines. Its behaviour can be customised using the `keepReplacement` option.
**Remove rules** determine which elements to remove altogether. By default, no elements are removed.
**Default rule** handles nodes which are not recognised by any other rule. By default, it outputs the node's text content (separated by blank lines if it is a block-level element). Its behaviour can be customised with the `defaultReplacement` option.
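
As a rough illustration of how the three options named above might be supplied, the sketch below defines one replacement function per special rule. It assumes each function receives the node's content and the node, as rule replacements do, and that nodes expose an `isBlock` flag and `outerHTML`. The `specialRuleOptions` name and the stand-in nodes are hypothetical.

```js
// Assumption: nodes carry an `isBlock` flag; check this against the
// Turndown version in use before relying on it.
var specialRuleOptions = {
  // Blank elements: keep paragraph spacing for blocks, drop inline ones.
  blankReplacement: function (content, node) {
    return node.isBlock ? '\n\n' : ''
  },
  // Kept elements: emit the original HTML, with blank lines around blocks.
  keepReplacement: function (content, node) {
    return node.isBlock ? '\n\n' + node.outerHTML + '\n\n' : node.outerHTML
  },
  // Unrecognised elements: emit the content, with blank lines around blocks.
  defaultReplacement: function (content, node) {
    return node.isBlock ? '\n\n' + content + '\n\n' : content
  }
}

// Stand-in nodes for demonstration only (not real DOM nodes).
var inlineNode = { isBlock: false, outerHTML: '<del>old</del>' }
var blockNode = { isBlock: true, outerHTML: '<div>old</div>' }

var keptInline = specialRuleOptions.keepReplacement('old', inlineNode)
var keptBlock = specialRuleOptions.keepReplacement('old', blockNode)
var defaultBlock = specialRuleOptions.defaultReplacement('old', blockNode)
var blankInline = specialRuleOptions.blankReplacement('', inlineNode)
```

An object like this would be passed to the `TurndownService` constructor together with any other options.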
### Rule Precedence
Turndown iterates over the set of rules, and picks the first one that matches the `filter`. The following list describes the order of precedence:
1. Blank rule
2. Added rules (optional)
3. Commonmark rules
4. Keep rules
5. Remove rules
6. Default rule
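
The order of precedence above amounts to a first-match search over rule sets. The sketch below models that search in plain JavaScript. It is a simplified illustration under stated assumptions, not the library's implementation: `findRuleSet`, `tagFilter`, the `isBlank` flag and the sample rule sets are all hypothetical.

```js
// Returns the name of the first rule set containing a rule whose
// filter matches the node. Rule sets are listed in order of precedence.
function findRuleSet (ruleSets, node) {
  for (var i = 0; i < ruleSets.length; i++) {
    var rules = ruleSets[i].rules
    for (var j = 0; j < rules.length; j++) {
      if (rules[j].filter(node)) return ruleSets[i].name
    }
  }
  return null
}

// Builds a filter that matches a single upper-case node name.
function tagFilter (nodeName) {
  return function (node) { return node.nodeName === nodeName }
}

var ruleSets = [
  { name: 'blank', rules: [{ filter: function (node) { return node.isBlank } }] },
  { name: 'added', rules: [] },
  { name: 'commonmark', rules: [{ filter: tagFilter('P') }, { filter: tagFilter('EM') }] },
  { name: 'keep', rules: [{ filter: tagFilter('P') }, { filter: tagFilter('DEL') }] },
  { name: 'remove', rules: [{ filter: tagFilter('INS') }] },
  { name: 'default', rules: [{ filter: function () { return true } }] }
]

var picked = {
  blankParagraph: findRuleSet(ruleSets, { nodeName: 'P', isBlank: true }),
  paragraph: findRuleSet(ruleSets, { nodeName: 'P' }),
  del: findRuleSet(ruleSets, { nodeName: 'DEL' }),
  ins: findRuleSet(ruleSets, { nodeName: 'INS' }),
  span: findRuleSet(ruleSets, { nodeName: 'SPAN' })
}
```

In this model a kept `p` is still handled by the Commonmark rule set, because that set is consulted before the keep rules.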
## Plugins
The plugin API provides a convenient way for developers to apply multiple extensions. A plugin is just a function that is called with the `TurndownService` instance.
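
A plugin written to that description could look like the sketch below. The `strikethroughPlugin` name and its rule are illustrative only, and `standInService` is a hypothetical stand-in that merely records calls to `addRule`, so the example runs without the library installed.

```js
// A plugin is a function that receives the service instance
// and extends it, here by adding a single rule.
function strikethroughPlugin (turndownService) {
  turndownService.addRule('strikethrough', {
    filter: ['del', 's', 'strike'],
    replacement: function (content) {
      return '~~' + content + '~~'
    }
  })
}

// Hypothetical stand-in for a TurndownService instance:
// it only records the rules that plugins add.
var recordedRules = {}
var standInService = {
  addRule: function (key, rule) {
    recordedRules[key] = rule
    return this
  }
}

strikethroughPlugin(standInService)
var converted = recordedRules.strikethrough.replacement('gone')
```

With the real library, the same function would be handed to a `TurndownService` instance through its `use` method rather than called directly.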
## Escaping Markdown Characters
Turndown uses backslashes (`\`) to escape Markdown characters in the HTML input. This ensures that these characters are not interpreted as Markdown when the output is compiled back to HTML. For example, the contents of `<h1>1. Hello world</h1>` needs to be escaped to `1\. Hello world`, otherwise it will be interpreted as a list item rather than a heading.
To avoid the complexity and the performance implications of parsing the content of every HTML element as Markdown, Turndown uses a group of regular expressions to escape potential Markdown syntax. As a result, the escaping rules can be quite aggressive.
### Overriding `TurndownService.prototype.escape`
If you are confident in doing so, you may want to customise the escaping behaviour to suit your needs. This can be done by overriding `TurndownService.prototype.escape`. `escape` takes the text of each HTML element and should return a version with the Markdown characters escaped.
Note: text in code elements is never passed to `escape`.
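
A less aggressive `escape` could be built from a short list of pattern and replacement pairs. The following is a minimal sketch under stated assumptions: `lightEscapes` and `lightEscape` are hypothetical names, and the chosen patterns are one possible reduced set, not the library's defaults.

```js
// Hypothetical, deliberately small set of escaping rules.
// Each entry is [pattern, replacement]; backslashes come first so that
// backslashes added by later entries are not escaped a second time.
var lightEscapes = [
  [/\\/g, '\\\\'],
  [/\*/g, '\\*'],
  [/_/g, '\\_'],
  [/^(\d+)\. /g, '$1\\. ']
]

// Takes the text of an element and returns it with the
// selected Markdown characters escaped.
function lightEscape (string) {
  return lightEscapes.reduce(function (accumulator, escape) {
    return accumulator.replace(escape[0], escape[1])
  }, string)
}

var escapedHeading = lightEscape('1. Hello world')
var escapedEmphasis = lightEscape('*this is emphasis*')
var escapedBackslash = lightEscape('backslash \\')
```

To apply it, assign the function to `TurndownService.prototype.escape` as described above.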
## License
turndown is copyright © 2017+ Dom Christie and released under the MIT license.
turndown-7.1.1/SECURITY.md 0000664 0000000 0000000 00000003110 14066642277 0015202 0 ustar 00root root 0000000 0000000 # Security Policy
## Supported Versions
| Version | Supported | Remark |
| ------- | ------------------ | -------|
| 7.0.x | :white_check_mark: | |
| < 7.0 | :x: | jsdom |
## DOM Parser Notice
Turndown input is
* either a string that is passed to a DOM parser
* or an `HTMLElement` referring to an already built DOM tree
When a string input is passed, the DOM parser is picked as follows.
* For web browser usage, the corresponding native web parser is used, which is typically `DOMImplementation`.
* For standalone usage, [domino](https://github.com/fgnass/domino) parser is used.
Please note that a malicious string input can cause undesired effects within the DOM parser
even before Turndown code starts processing the document itself.
These effects especially include script execution and downloading external resources.
For critical applications with untrusted inputs, you should consider either cleaning up
the input with a dedicated HTML sanitizer library or using an alternate DOM parser that
better suits your security needs.
In particular, Turndown version 6 and below used [jsdom](https://github.com/jsdom/jsdom) as the
standalone DOM parser. As `jsdom` is a fully featured DOM parser with script execution support,
it imposes an inherent security risk. We recommend upgrading to version 7, which uses
[domino](https://github.com/fgnass/domino), a parser that supports neither executing scripts nor
downloading external resources.
## Reporting a Vulnerability
If you've found a vulnerability, please report it to disclosure@orchitech.cz and we'll get back to you.
turndown-7.1.1/config/ 0000775 0000000 0000000 00000000000 14066642277 0014663 5 ustar 00root root 0000000 0000000 turndown-7.1.1/config/rollup.config.browser.cjs.js 0000664 0000000 0000000 00000000262 14066642277 0022242 0 ustar 00root root 0000000 0000000 import config from './rollup.config'
export default config({
output: {
file: 'lib/turndown.browser.cjs.js',
format: 'cjs',
exports: 'auto'
},
browser: true
})
turndown-7.1.1/config/rollup.config.browser.es.js 0000664 0000000 0000000 00000000233 14066642277 0022070 0 ustar 00root root 0000000 0000000 import config from './rollup.config'
export default config({
output: {
file: 'lib/turndown.browser.es.js',
format: 'es'
},
browser: true
})
turndown-7.1.1/config/rollup.config.browser.umd.js 0000664 0000000 0000000 00000000272 14066642277 0022251 0 ustar 00root root 0000000 0000000 import config from './rollup.config'
export default config({
output: {
file: 'lib/turndown.browser.umd.js',
format: 'umd',
name: 'TurndownService'
},
browser: true
})
turndown-7.1.1/config/rollup.config.cjs.js 0000664 0000000 0000000 00000000253 14066642277 0020560 0 ustar 00root root 0000000 0000000 import config from './rollup.config'
export default config({
output: {
file: 'lib/turndown.cjs.js',
format: 'cjs',
exports: 'auto'
},
browser: false
})
turndown-7.1.1/config/rollup.config.es.js 0000664 0000000 0000000 00000000224 14066642277 0020406 0 ustar 00root root 0000000 0000000 import config from './rollup.config'
export default config({
output: {
file: 'lib/turndown.es.js',
format: 'es'
},
browser: false
})
turndown-7.1.1/config/rollup.config.iife.js 0000664 0000000 0000000 00000000260 14066642277 0020713 0 ustar 00root root 0000000 0000000 import config from './rollup.config'
export default config({
output: {
file: 'dist/turndown.js',
format: 'iife',
name: 'TurndownService'
},
browser: true
})
turndown-7.1.1/config/rollup.config.js 0000664 0000000 0000000 00000000656 14066642277 0020011 0 ustar 00root root 0000000 0000000 import commonjs from '@rollup/plugin-commonjs'
import replace from '@rollup/plugin-replace'
import resolve from '@rollup/plugin-node-resolve'
export default function (config) {
return {
input: 'src/turndown.js',
output: config.output,
external: ['domino'],
plugins: [
commonjs(),
replace({ 'process.browser': JSON.stringify(!!config.browser), preventAssignment: true }),
resolve()
]
}
}
turndown-7.1.1/config/rollup.config.umd.js 0000664 0000000 0000000 00000000241 14066642277 0020563 0 ustar 00root root 0000000 0000000 import config from './rollup.config'
export default config({
output: {
file: 'lib/turndown.umd.js',
format: 'umd',
name: 'TurndownService'
}
})
turndown-7.1.1/index.html 0000664 0000000 0000000 00000014502 14066642277 0015415 0 ustar 00root root 0000000 0000000
turndown
Source on GitHub
Lorem ipsum
Lorem
ipsum
sit
_em element_
_i element_
**strong element**
**b element**
code element
`code element`
There is a literal backtick (`) here
``There is a literal backtick (`) here``
here are three ``` here are four ```` that's it
`here are three ``` here are four ```` that's it`
here are three ``` here are four ```` here is one ` that's it
``here are three ``` here are four ```` here is one ` that's it``
`starting with a backtick
`` `starting with a backtick ``
_emphasis_
`_emphasis_`
_emphasis_
`_emphasis_`
Level One Heading
Level One Heading
=================
\===
A sentence containing =
Level One Heading with ATX
# Level One Heading with ATX
Level Two Heading
Level Two Heading
-----------------
Level Two Heading with ATX
## Level Two Heading with ATX
Level Three Heading
### Level Three Heading
Level Four Heading with
child
#### Level Four Heading with `child`
Level Seven Heading?
* * *
* * *
- - -
after the breakMore
after the break
after the breakMore\
after the break






[An anchor](http://example.com)
[An anchor](http://example.com "Title for link")
[An anchor](http://example.com "Title for
link")
Anchor without a title
[Some `code`](http://example.com/code)
[Reference link][1]
[1]: http://example.com
[Reference link with collapsed style][]
[Reference link with collapsed style]: http://example.com
[Reference link with shortcut style]
[Reference link with shortcut style]: http://example.com
def code_block
# 42 < 9001
"Hello world!"
end
def code_block
# 42 < 9001
"Hello world!"
end
def first_code_block
# 42 < 9001
"Hello world!"
end
def second_code_block
# 42 < 9001
"Hello world!"
end
def first_code_block
# 42 < 9001
"Hello world!"
end
next:
def second_code_block
# 42 < 9001
"Hello world!"
end
Multiple new lines
should not be
removed
Multiple new lines
should not be
removed
def a_fenced_code block; end
```
def a_fenced_code block; end
```
def a_fenced_code block; end
~~~
def a_fenced_code block; end
~~~
~~~ foo
\~~~ foo
A sentence containing ~~~
def a_fenced_code block; end
```ruby
def a_fenced_code block; end
```
1. Ordered list item 1
2. Ordered list item 2
3. Ordered list item 3
42. Ordered list item 42
43. Ordered list item 43
44. Ordered list item 44
A paragraph.
1. Ordered list item 1
2. Ordered list item 2
3. Ordered list item 3
Another paragraph.
* Unordered list item 1
* Unordered list item 2
* Unordered list item 3
* Unordered list item 1
* Unordered list item 2
* Unordered list item 3
- Unordered list item 1
- Unordered list item 2
- Unordered list item 3
* List item with paragraph
* List item without paragraph
1. This is a paragraph in a list item.
This is a paragraph in the same list item as above.
2. A paragraph in a second list item.
* This is a list item at root level
* This is another item at root level
* * This is a nested list item
* This is another nested list item
* * This is a deeply nested list item
* This is another deeply nested list item
* This is a third deeply nested list item
* This is a third item at root level
* This is a list item at root level
* This is another item at root level
* 1. This is a nested list item
2. This is another nested list item
3. * This is a deeply nested list item
* This is another deeply nested list item
* This is a third deeply nested list item
* This is a third item at root level
* A list item with a blockquote:
> This is a blockquote inside a list item.
> This is a paragraph within a blockquote.
>
> This is another paragraph within a blockquote.
> This is the first level of quoting.
>
> > This is a paragraph in a nested blockquote.
>
> Back to the first level.
This is a header.
return 1 < 2 ? shell_exec('echo $input | $markdown_script') : 0;
> This is a header.
> -----------------
>
> 1. This is the first list item.
> 2. This is the second list item.
>
> A code block:
>
> return 1 < 2 ? shell_exec('echo $input | $markdown_script') : 0;
A div
Another div
A div
Another div
Hello world
Hello world
h3 with leading whitespace
### h3 with leading whitespace
1. Chapter One
1. Section One
2. Section Two with trailing whitespace
3. Section Three with trailing whitespace
2. Chapter Two
3. Chapter Three with trailing whitespace
* Indented li with leading/trailing newlines
* **Strong with trailing space inside li with leading/trailing whitespace**
* li without whitespace
* Leading space, text, lots of whitespace … text
Text with no space after the period. _Text in em with leading/trailing spaces_ **text in strong with trailing space**
Text at root **[link text with trailing space in strong](http://www.example.com)** more text at root
Text before blank em … text after blank em
Text before blank div …
text after blank div
Content in a nested div
Content in another div
backslash \\
\### This is not a heading
#This is not # a heading
To add emphasis, surround text with \*. For example: \*this is emphasis\*
To add emphasis, surround text with \_. For example: \_this is emphasis\_
def this_is_a_method; end;
def this_is_a_method; end;
To add strong emphasis, surround text with \*\*. For example: \*\*this is strong\*\*
To add strong emphasis, surround text with \_\_. For example: \_\_this is strong\_\_
\* \* \*
\- - -
\_ \_ \_
\*\*\*
\* \* \* \* \*
1984\. by George Orwell
1984.George Orwell wrote 1984.
\* An unordered list item
\- An unordered list item
\+ An unordered list item
Hello-world, 45 - 3 is 42
+1 and another +
You can use \* for multiplication
**\*\*test**
_test\_italics_
\> Blockquote in markdown
\>Blockquote in markdown
42 > 1
\`not code\`
\[This\] is a sentence with brackets
[c\[iao](http://www.example.com)
fasdf \*883 asdf wer qweasd fsd asdf asdfaqwe rqwefrsdf
\* \* \*\* It aims to be\*
\_Really\_? Is that what it \_is\_? A \*\*2000\*\* year-old computer?
Foo
Bar
Baz
Foo Bar
Hello world


Foo Bar
Foo Bar
~~~
Code
~~~
~~~~
~~~
Code
~~~
~~~~
```
Code
```
````
```
Code
```
````
````
Code
````
`````
````
Code
````
`````
Code
```
Code
```
Foo Bar
2. Second 1. First
2. Second
_foo_ bar
_foo_ bar
_foo_ bar
_foo_ bar
foo _bar_
foo _bar_
foo _bar_
foo _bar_
make an indented code block in Markdown
Four spaces ` make an indented code block in Markdown`
A line break
note the spaces`A line break ` **note the spaces**
code
wrap**tight**`code`**wrap**
code
wrap**not so tight** `code` **wrap**
nasty
code
` nasty code `
worldWorldworldWorldworldWorld'
)
})
test('#keep returns the TurndownService instance for chaining', function (t) {
t.plan(1)
var turndownService = new TurndownService()
t.equal(turndownService.keep(['del', 'ins']), turndownService)
})
test('keep rules are overridden by the standard rules', function (t) {
t.plan(1)
var turndownService = new TurndownService()
turndownService.keep('p')
t.equal(turndownService.turndown('worldWorldworld\n\nWorld'
)
})
test('#remove removes elements', function (t) {
t.plan(2)
var turndownService = new TurndownService()
var input = '<del>Please redact me</del>'
// Without `.remove('del')`
t.equal(turndownService.turndown(input), 'Please redact me')
// With `.remove('del')`
turndownService.remove('del')
t.equal(turndownService.turndown(input), '')
})
test('#remove returns the TurndownService instance for chaining', function (t) {
t.plan(1)
var turndownService = new TurndownService()
t.equal(turndownService.remove(['del', 'ins']), turndownService)
})
test('remove elements are overridden by rules', function (t) {
t.plan(1)
var turndownService = new TurndownService()
turndownService.remove('p')
t.equal(turndownService.turndown('worldWorldworldWorld'
)
})