'
```
See the [corresponding blog post](http://perfectionkills.com/experimenting-with-html-minifier) for all the gory details of [how it works](http://perfectionkills.com/experimenting-with-html-minifier#how_it_works), a [description of each option](http://perfectionkills.com/experimenting-with-html-minifier#options), [testing results](http://perfectionkills.com/experimenting-with-html-minifier#field_testing) and [conclusions](http://perfectionkills.com/experimenting-with-html-minifier#cost_and_benefits).
Also see the corresponding [Ruby wrapper](https://github.com/stereobooster/html_minifier) and, for Node.js, the [Grunt plugin](https://github.com/gruntjs/grunt-contrib-htmlmin), [Gulp plugin](https://github.com/pioug/gulp-html-minifier-terser), [Koa middleware wrapper](https://github.com/koajs/html-minifier) and [Express middleware wrapper](https://github.com/melonmanchan/express-minify-html).
For lint-like capabilities, take a look at [HTMLLint](https://github.com/kangax/html-lint).
## Minification comparison
How does HTMLMinifier compare to other solutions, such as [HTML Minifier from Will Peavy](http://www.willpeavy.com/minifier/) (1st result in [Google search for "html minifier"](https://www.google.com/#q=html+minifier)), [htmlcompressor.com](http://htmlcompressor.com) and [minimize](https://github.com/Swaagie/minimize)?
| Site | Original size *(KB)* | HTMLMinifier | minimize | Will Peavy | htmlcompressor.com |
| ---------------------------------------------------------------------------- |:--------------------:| ------------:| --------:| ----------:| ------------------:|
| [Google](https://www.google.com/) | 52 | **48** | 52 | 54 | n/a |
| [Stack Overflow](https://stackoverflow.com/) | 177 | **143** | 154 | 154 | n/a |
| [HTMLMinifier](https://github.com/kangax/html-minifier) | 252 | **171** | 230 | 250 | n/a |
| [Bootstrap CSS](https://getbootstrap.com/docs/3.3/css/) | 271 | **260** | 269 | 229 | n/a |
| [BBC](https://www.bbc.co.uk/) | 355 | **324** | 353 | 344 | n/a |
| [Amazon](https://www.amazon.co.uk/) | 466 | **430** | 456 | 474 | n/a |
| [Twitter](https://twitter.com/) | 469 | **394** | 462 | 513 | n/a |
| [Wikipedia](https://en.wikipedia.org/wiki/President_of_the_United_States) | 703 | **569** | 682 | 708 | n/a |
| [Eloquent Javascript](https://eloquentjavascript.net/1st_edition/print.html) | 870 | **815** | 840 | 864 | n/a |
| [NBC](https://www.nbc.com/) | 1701 | **1566** | 1689 | 1705 | n/a |
| [New York Times](https://www.nytimes.com/) | 1731 | **1583** | 1726 | 1680 | n/a |
| [ES draft](https://tc39.github.io/ecma262/) | 6296 | **5538** | 5733 | n/a | n/a |
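To read the table as relative savings rather than absolute sizes, the percentage saved can be computed from any row. The sketch below uses three rows copied from the table; `savingsPercent` is a hypothetical helper name chosen for this example, and the formula is the usual `(1 - new / old) * 100`.

```javascript
// Sizes in KB, copied from three rows of the comparison table above.
const rows = [
  { site: 'Google', original: 52, minified: 48 },
  { site: 'HTMLMinifier', original: 252, minified: 171 },
  { site: 'Wikipedia', original: 703, minified: 569 }
];

// Hypothetical helper: percentage saved relative to the original size.
function savingsPercent(original, minified) {
  return (1 - minified / original) * 100;
}

const report = rows.map(({ site, original, minified }) => ({
  site,
  savedKb: original - minified,
  savedPercent: savingsPercent(original, minified).toFixed(2)
}));

console.table(report);
```

For the HTMLMinifier row this gives 81 KB saved, or about 32% of the original size.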
## Options Quick Reference
Most of the options are disabled by default.
| Option | Description | Default |
|--------------------------------|-----------------|---------|
| `caseSensitive` | Treat attributes in a case-sensitive manner (useful for custom HTML tags) | `false` |
| `collapseBooleanAttributes` | [Omit attribute values from boolean attributes](http://perfectionkills.com/experimenting-with-html-minifier#collapse_boolean_attributes) | `false` |
| `collapseInlineTagWhitespace` | Don't leave any spaces between `display:inline;` elements when collapsing. Must be used in conjunction with `collapseWhitespace=true` | `false` |
| `collapseWhitespace` | [Collapse white space that contributes to text nodes in a document tree](http://perfectionkills.com/experimenting-with-html-minifier#collapse_whitespace) | `false` |
| `conservativeCollapse` | Always collapse to 1 space (never remove it entirely). Must be used in conjunction with `collapseWhitespace=true` | `false` |
| `continueOnParseError` | [Handle parse errors](https://html.spec.whatwg.org/multipage/parsing.html#parse-errors) instead of aborting. | `false` |
| `customAttrAssign` | Array of regexes that enable support for custom attribute assignment expressions | `[ ]` |
| `customAttrCollapse` | Regex that specifies custom attributes to strip newlines from (e.g. `/ng-class/`) | |
| `customAttrSurround` | Array of regexes that enable support for custom attribute surround expressions | `[ ]` |
| `customEventAttributes` | Array of regexes that enable support for custom event attributes for `minifyJS` (e.g. `ng-click`) | `[ /^on[a-z]{3,}$/ ]` |
| `decodeEntities` | Use direct Unicode characters whenever possible | `false` |
| `html5` | Parse input according to HTML5 specifications | `true` |
| `ignoreCustomComments` | Array of regexes that allow certain comments to be ignored when matched | `[ /^!/, /^\s*#/ ]` |
| `ignoreCustomFragments` | Array of regexes that allow certain fragments to be ignored when matched (e.g. `<?php ... ?>`, `{{ ... }}`, etc.) | `[ /<%[\s\S]*?%>/, /<\?[\s\S]*?\?>/ ]` |
| `includeAutoGeneratedTags` | Insert tags generated by HTML parser | `true` |
| `keepClosingSlash` | Keep the trailing slash on singleton elements | `false` |
| `maxLineLength` | Specify a maximum line length. Compressed output will be split by newlines at valid HTML split-points | |
| `minifyCSS` | Minify CSS in style elements and style attributes (uses [clean-css](https://github.com/jakubpawlowicz/clean-css)) | `false` (could be `true`, `Object`, `Function(text, type)`) |
| `minifyJS` | Minify JavaScript in script elements and event attributes (uses [Terser](https://github.com/terser/terser)) | `false` (could be `true`, `Object`, `Function(text, inline)`) |
| `minifyURLs` | Minify URLs in various attributes (uses [relateurl](https://github.com/stevenvachon/relateurl)) | `false` (could be `String`, `Object`, `Function(text)`) |
| `noNewlinesBeforeTagClose` | Never add a newline before a tag that closes an element | `false` |
| `preserveLineBreaks` | Always collapse to 1 line break (never remove it entirely) when whitespace between tags includes a line break. Must be used in conjunction with `collapseWhitespace=true` | `false` |
| `preventAttributesEscaping` | Prevents the escaping of the values of attributes | `false` |
| `processConditionalComments` | Process contents of conditional comments through minifier | `false` |
| `processScripts` | Array of strings corresponding to types of script elements to process through minifier (e.g. `text/ng-template`, `text/x-handlebars-template`, etc.) | `[ ]` |
| `quoteCharacter` | Type of quote to use for attribute values (' or ") | |
| `removeAttributeQuotes` | [Remove quotes around attributes when possible](http://perfectionkills.com/experimenting-with-html-minifier#remove_attribute_quotes) | `false` |
| `removeComments` | [Strip HTML comments](http://perfectionkills.com/experimenting-with-html-minifier#remove_comments) | `false` |
| `removeEmptyAttributes` | [Remove all attributes with whitespace-only values](http://perfectionkills.com/experimenting-with-html-minifier#remove_empty_or_blank_attributes) | `false` (could be `true`, `Function(attrName, tag)`) |
| `removeEmptyElements` | [Remove all elements with empty contents](http://perfectionkills.com/experimenting-with-html-minifier#remove_empty_elements) | `false` |
| `removeOptionalTags` | [Remove optional tags](http://perfectionkills.com/experimenting-with-html-minifier#remove_optional_tags) | `false` |
| `removeRedundantAttributes` | [Remove attributes when value matches default.](http://perfectionkills.com/experimenting-with-html-minifier#remove_redundant_attributes) | `false` |
| `removeScriptTypeAttributes` | Remove `type="text/javascript"` from `script` tags. Other `type` attribute values are left intact | `false` |
| `removeStyleLinkTypeAttributes`| Remove `type="text/css"` from `style` and `link` tags. Other `type` attribute values are left intact | `false` |
| `removeTagWhitespace` | Remove space between attributes whenever possible. **Note that this will result in invalid HTML!** | `false` |
| `sortAttributes` | [Sort attributes by frequency](#sorting-attributes--style-classes) | `false` |
| `sortClassName` | [Sort style classes by frequency](#sorting-attributes--style-classes) | `false` |
| `trimCustomFragments` | Trim white space around `ignoreCustomFragments`. | `false` |
| `useShortDoctype` | [Replaces the `doctype` with the short (HTML5) doctype](http://perfectionkills.com/experimenting-with-html-minifier#use_short_doctype) | `false` |
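Several rows in the table note that an option must be used in conjunction with `collapseWhitespace=true`. As a minimal sketch, an options object can be checked for that before it is handed to the minifier. `findUnmetDependencies` is a hypothetical helper written for this example and is not part of HTMLMinifier.

```javascript
// Options that, per the table above, must be combined with collapseWhitespace.
const REQUIRES_COLLAPSE_WHITESPACE = [
  'collapseInlineTagWhitespace',
  'conservativeCollapse',
  'preserveLineBreaks'
];

// Hypothetical helper (not part of HTMLMinifier): list the enabled options
// whose prerequisite, collapseWhitespace, is not enabled.
function findUnmetDependencies(options) {
  if (options.collapseWhitespace) {
    return [];
  }
  return REQUIRES_COLLAPSE_WHITESPACE.filter((name) => options[name]);
}

const options = {
  removeComments: true,
  conservativeCollapse: true,
  preserveLineBreaks: true
};

// Without collapseWhitespace, both whitespace-related options are flagged.
console.log(findUnmetDependencies(options));
// With collapseWhitespace enabled, nothing is flagged.
console.log(findUnmetDependencies({ ...options, collapseWhitespace: true }));
```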
### Sorting attributes / style classes
Minifier options like `sortAttributes` and `sortClassName` won't affect the plain-text size of the output. However, they form long repetitive chains of characters that should improve the compression ratio of gzip used in HTTP compression.
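The effect can be observed with Node's built-in `zlib` module. The sketch below is an illustration under stated assumptions, not HTMLMinifier's algorithm: it sorts attributes alphabetically as a stand-in for sorting by frequency, and the markup, the attribute list and the helper names are invented for the demo.

```javascript
const zlib = require('zlib');

// Invented attribute set for the demo.
const attributes = ['class="btn"', 'type="button"', 'disabled', 'data-role="action"'];

// Small deterministic pseudo-random generator so the demo is repeatable.
let seed = 42;
function nextRandom() {
  seed = (seed * 16807) % 2147483647;
  return seed / 2147483647;
}

// Fisher-Yates shuffle on a copy of the list.
function shuffled(list) {
  const copy = list.slice();
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(nextRandom() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy;
}

// Build 500 elements, ordering each element's attributes with the given function.
function buildMarkup(orderAttributes) {
  const lines = [];
  for (let i = 0; i < 500; i++) {
    lines.push('<button ' + orderAttributes(attributes).join(' ') + '>OK</button>');
  }
  return lines.join('\n');
}

const unsorted = buildMarkup(shuffled);
// Alphabetical order stands in for frequency order here.
const sorted = buildMarkup((list) => list.slice().sort());

console.log('plain:', unsorted.length, 'vs', sorted.length);
console.log('gzip: ', zlib.gzipSync(unsorted).length, 'vs', zlib.gzipSync(sorted).length);
```

The plain-text sizes are identical because only the attribute order differs, while the consistently ordered markup gives gzip longer repeated runs to work with. Real pages are far less uniform, so the gain there will be smaller.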
## Special cases
### Ignoring chunks of markup
If you have chunks of markup you would like preserved, you can wrap them in `<!-- htmlmin:ignore -->`.
### Minifying JSON-LD
You can minify script tags with JSON-LD by setting the option `{ processScripts: ['application/ld+json'] }`. Note that this minification is very rudimentary; it is mainly useful for removing newlines and excessive whitespace.
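To get a feel for how much whitespace removal saves on a JSON-LD payload, the snippet below re-serializes one with `JSON.parse` and `JSON.stringify`. This is a swapped-in technique used only to illustrate the effect, not how HTMLMinifier processes the script, and the payload is an invented example.

```javascript
// Invented JSON-LD payload, as it might appear inside
// <script type="application/ld+json">.
const jsonLd = `
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Corp",
  "url": "https://example.com"
}
`;

// Parsing and re-serializing drops all insignificant whitespace.
const compact = JSON.stringify(JSON.parse(jsonLd));

console.log(compact);
console.log(jsonLd.length + ' -> ' + compact.length + ' characters');
```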
### Preserving SVG tags
SVG tags are automatically recognized, and when they are minified, both case-sensitivity and closing-slashes are preserved, regardless of the minification settings used for the rest of the file.
### Working with invalid markup
HTMLMinifier **can't work with invalid or partial chunks of markup**. This is because it parses markup into a tree structure, modifies that tree (removing anything that was specified for removal, ignoring anything that was specified to be ignored, etc.), and then creates markup out of the tree and returns it.
Input markup (e.g. `<p id="">foo`)
↓
Internal representation of markup in the form of a tree (e.g. `{ tag: "p", attr: "id", children: ["foo"] }`)
↓
Transformation of internal representation (e.g. removal of `id` attribute)
↓
Output of resulting markup (e.g. `<p>foo</p>`)
HTMLMinifier can't know that the original markup was only half of the tree; it does its best to parse it as a full tree, and in doing so it loses the information that the tree was malformed or partial to begin with. As a result, it can't create a partial or malformed tree at output time.
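The transform and output steps can be sketched in a few lines. This is a toy model under stated assumptions, not HTMLMinifier's parser or serializer: the tree stores attributes as a name/value map, and `removeEmptyAttributes` and `serialize` are names invented for this example.

```javascript
// Tree shaped like the internal representation described above, with
// attributes stored as a name/value map (an assumption for this sketch).
const tree = { tag: 'p', attrs: { id: '' }, children: ['foo'] };

// Transformation step: drop attributes whose value is empty or whitespace-only.
function removeEmptyAttributes(node) {
  if (typeof node === 'string') {
    return node;
  }
  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs)) {
    if (value.trim() !== '') {
      attrs[name] = value;
    }
  }
  return { tag: node.tag, attrs, children: node.children.map(removeEmptyAttributes) };
}

// Output step: every node is written with both its start and end tag,
// so a half-open input fragment comes out as a complete element.
function serialize(node) {
  if (typeof node === 'string') {
    return node;
  }
  const attrs = Object.entries(node.attrs)
    .map(([name, value]) => ' ' + name + '="' + value + '"')
    .join('');
  return '<' + node.tag + attrs + '>' + node.children.map(serialize).join('') + '</' + node.tag + '>';
}

console.log(serialize(removeEmptyAttributes(tree)));
```

Because the serializer only sees the tree, nothing records whether the input ever contained a closing tag, which is why a partial fragment cannot be reproduced as a partial fragment.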
## Running benchmarks
Benchmarks for minified HTML:
```shell
cd benchmarks
npm install
npm run benchmark
```
## Running local server
```shell
npm run serve
```
html-minifier-terser-7.2.0/benchmarks/backtest.cjs
#!/usr/bin/env node
'use strict';
const childProcess = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');
const Progress = require('progress');
const urls = require('./sites.json');
const fileNames = Object.keys(urls);

// Run `git` with the given arguments. The last argument is a callback
// that receives the exit code and the captured stdout.
function git() {
  const args = [].concat.apply([], [].slice.call(arguments, 0, -1));
  const callback = arguments[arguments.length - 1];
  const task = childProcess.spawn('git', args, { stdio: ['ignore', 'pipe', 'ignore'] });
  let output = '';
  task.stdout.setEncoding('utf8');
  task.stdout.on('data', function (data) {
    output += data;
  });
  task.on('exit', function (code) {
    callback(code, output);
  });
}

function readText(filePath, callback) {
  fs.readFile(filePath, { encoding: 'utf8' }, callback);
}

function writeText(filePath, data) {
  fs.writeFile(filePath, data, { encoding: 'utf8' }, function (err) {
    if (err) {
      throw err;
    }
  });
}

// Load the minifier from the checked-out sources.
function loadModule() {
  require('./src/htmlparser');
  return require('./src/htmlminifier').minify || global.minify;
}

// Merge the per-site `minifyURLs` setting with the options from the config file.
function getOptions(fileName, options) {
  const result = {
    minifyURLs: {
      site: urls[fileName]
    }
  };
  for (const key in options) {
    result[key] = options[key];
  }
  return result;
}

// Child-process entry point: minify every site and report each size to the parent.
function minify(hash, options) {
  const minify = loadModule();
  process.send('ready');
  let count = fileNames.length;
  fileNames.forEach(function (fileName) {
    readText(path.join('./', fileName + '.html'), function (err, data) {
      if (err) {
        throw err;
      } else {
        try {
          const minified = minify(data, getOptions(fileName, options));
          if (minified) {
            process.send({ name: fileName, size: minified.length });
          } else {
            throw new Error('unexpected result: ' + minified);
          }
        } catch (e) {
          console.error('[' + fileName + ']', e.stack || e);
        } finally {
          // Disconnect from the parent once the last file has been handled.
          if (!--count) {
            process.disconnect();
          }
        }
      }
    });
  });
}

// Write the collected sizes to backtest.csv and any errors to backtest.log.
function print(table) {
  const output = [];
  const errors = [];
  let row = fileNames.slice(0);
  row.unshift('hash', 'date');
  output.push(row.join(','));
  for (const hash in table) {
    const data = table[hash];
    row = [hash, '"' + data.date + '"'];
    fileNames.forEach(function (fileName) {
      row.push(data[fileName]);
    });
    output.push(row.join(','));
    if (data.error) {
      errors.push(hash + ' - ' + data.error);
    }
  }
  writeText('backtest.csv', output.join('\n'));
  writeText('backtest.log', errors.join('\n'));
}

// Parent mode: `backtest <count>` replays the last <count> commits.
if (process.argv.length > 2) {
  const count = +process.argv[2];
  if (count) {
    git('log', '--date=iso', '--pretty=format:%h %cd', '-' + count, function (code, data) {
      const table = {};
      const commits = data.split(/\s*?\n/).map(function (line) {
        const index = line.indexOf(' ');
        const hash = line.substr(0, index);
        table[hash] = {
          date: line.substr(index + 1).replace('+', '').replace(/ 0000$/, '')
        };
        return hash;
      });
      const nThreads = os.cpus().length;
      let running = 0;
      const progress = new Progress('[:bar] :etas', {
        width: 50,
        total: commits.length * 2
      });

      // Start one child process per commit, at most one per CPU at a time.
      function fork() {
        if (commits.length && running < nThreads) {
          const hash = commits.shift();
          const task = childProcess.fork('./backtest', { silent: true });
          let error = '';
          const id = setTimeout(function () {
            if (task.connected) {
              error += 'task timed out\n';
              task.kill();
            }
          }, 60000);
          task.on('message', function (data) {
            if (data === 'ready') {
              progress.tick(1);
              fork();
            } else {
              table[hash][data.name] = data.size;
            }
          }).on('exit', function () {
            progress.tick(1);
            clearTimeout(id);
            if (error) {
              table[hash].error = error;
            }
            if (!--running && !commits.length) {
              print(table);
            } else {
              fork();
            }
          });
          task.stderr.setEncoding('utf8');
          task.stderr.on('data', function (data) {
            error += data;
          });
          task.stdout.resume();
          task.send(hash);
          running++;
        }
      }

      fork();
    });
  } else {
    console.error('Invalid input:', process.argv[2]);
  }
} else {
  // Child mode: wait for a commit hash, check out its sources and config, then minify.
  process.on('message', function (hash) {
    const paths = ['src', 'benchmark.conf', 'sample-cli-config-file.conf'];
    git('reset', 'HEAD', '--', paths, function () {
      let conf = 'sample-cli-config-file.conf';

      function checkout() {
        const path = paths.shift();
        git('checkout', hash, '--', path, function (code) {
          if (code === 0 && path === 'benchmark.conf') {
            conf = path;
          }
          if (paths.length) {
            checkout();
          } else {
            readText(conf, function (err, data) {
              if (err) {
                throw err;
              } else {
                minify(hash, JSON.parse(data));
              }
            });
          }
        });
      }

      checkout();
    });
  });
}
html-minifier-terser-7.2.0/benchmarks/benchmark.cjs
#!/usr/bin/env node
'use strict';
const fs = require('fs');
const zlib = require('zlib');
const https = require('https');
const path = require('path');
const { fork } = require('child_process');
const chalk = require('chalk');
const lzma = require('lzma');
const Minimize = require('minimize');
const Progress = require('progress');
const Table = require('cli-table3');
const urls = require('./sites.json');
const fileNames = Object.keys(urls);
const minimize = new Minimize();
const progress = new Progress('[:bar] :etas :fileName', {
  width: 50,
  total: fileNames.length
});
const table = new Table({
  head: ['File', 'Before', 'After', 'Minimize', 'Will Peavy', 'htmlcompressor.com', 'Savings', 'Time'],
  colWidths: [fileNames.reduce(function (length, fileName) {
    return Math.max(length, fileName.length);
  }, 0) + 2, 25, 25, 25, 25, 25, 20, 10]
});

// Convert a byte count to kilobytes, formatted with the given number of decimals.
function toKb(size, precision) {
  return (size / 1024).toFixed(precision || 0);
}

function redSize(size) {
  return chalk.red.bold(size) + chalk.white(' (' + toKb(size, 2) + ' KB)');
}

function greenSize(size) {
  return chalk.green.bold(size) + chalk.white(' (' + toKb(size, 2) + ' KB)');
}

// Format the savings as a percentage of the old size plus the absolute difference.
function blueSavings(oldSize, newSize) {
  const savingsPercent = (1 - newSize / oldSize) * 100;
  const savings = oldSize - newSize;
  return chalk.cyan.bold(savingsPercent.toFixed(2)) + chalk.white('% (' + toKb(savings, 2) + ' KB)');
}

function blueTime(time) {
  return chalk.cyan.bold(time) + chalk.white(' ms');
}

function readBuffer(filePath, callback) {
  fs.readFile(filePath, function (err, data) {
    if (err) {
      throw new Error('There was an error reading ' + filePath);
    }
    callback(data);
  });
}

function readText(filePath, callback) {
  fs.readFile(filePath, { encoding: 'utf8' }, function (err, data) {
    if (err) {
      throw new Error('There was an error reading ' + filePath);
    }
    callback(data);
  });
}

function writeBuffer(filePath, data, callback) {
  fs.writeFile(filePath, data, function (err) {
    if (err) {
      throw new Error('There was an error writing ' + filePath);
    }
    callback();
  });
}

function writeText(filePath, data, callback) {
  fs.writeFile(filePath, data, { encoding: 'utf8' }, function (err) {
    if (err) {
      throw new Error('There was an error writing ' + filePath);
    }
    if (callback) {
      callback();
    }
  });
}

function readSize(filePath, callback) {
  fs.stat(filePath, function (err, stats) {
    if (err) {
      throw new Error('There was an error reading ' + filePath);
    }
    callback(stats.size);
  });
}

function gzip(inPath, outPath, callback) {
  fs.createReadStream(inPath).pipe(zlib.createGzip({
    level: zlib.constants.Z_BEST_COMPRESSION
  })).pipe(fs.createWriteStream(outPath)).on('finish', callback);
}

function brotli(inPath, outPath, callback) {
  fs.createReadStream(inPath).pipe(zlib.createBrotliCompress())
    .pipe(fs.createWriteStream(outPath)).on('finish', callback);
}

// Run callback-style tasks one after another, then call `done`.
function run(tasks, done) {
  let i = 0;

  function callback() {
    if (i < tasks.length) {
      tasks[i++](callback);
    } else {
      done();
    }
  }

  callback();
}

const rows = {};

// Build the Markdown comparison table used in the README from the collected rows.
function generateMarkdownTable() {
  const headers = [
    'Site',
    'Original size *(KB)*',
    'HTMLMinifier',
    'minimize',
    'Will Peavy',
    'htmlcompressor.com'
  ];
  // Highlight the HTMLMinifier column in bold.
  fileNames.forEach(function (fileName) {
    const row = rows[fileName].report;
    row[2] = '**' + row[2] + '**';
  });
  // Each column is as wide as its longest cell.
  const widths = headers.map(function (header, index) {
    let width = header.length;
    fileNames.forEach(function (fileName) {
      width = Math.max(width, rows[fileName].report[index].length);
    });
    return width;
  });
  let content = '';

  function output(row) {
    widths.forEach(function (width, index) {
      const text = row[index];
      content += '| ' + text + new Array(width - text.length + 2).join(' ');
    });
    content += '|\n';
  }

  output(headers);
  // Alignment row: first column left, second centered, the rest right-aligned.
  widths.forEach(function (width, index) {
    content += '|';
    content += index === 1 ? ':' : ' ';
    content += new Array(width + 1).join('-');
    content += index === 0 ? ' ' : ':';
  });
  content += '|\n';
  // Sort by original size, then by name.
  fileNames.sort(function (a, b) {
    const r = +rows[a].report[1];
    const s = +rows[b].report[1];
    return r < s ? -1 : r > s ? 1 : a < b ? -1 : a > b ? 1 : 0;
  }).forEach(function (fileName) {
    output(rows[fileName].report);
  });
  return content;
}

function displayTable() {
  fileNames.forEach(function (fileName) {
    table.push(rows[fileName].display);
  });
  console.log();
  console.log(table.toString());
}

run(fileNames.map(function (fileName) {
  const filePath = path.join('./sources', fileName + '.html');

  function processFile(site, done) {
    const original = {
      filePath: filePath,
      gzFilePath: path.join('./generated/', fileName + '.html.gz'),
      lzFilePath: path.join('./generated/', fileName + '.html.lz'),
      brFilePath: path.join('./generated/', fileName + '.html.br')
    };
    const infos = {};
    ['minifier', 'minimize', 'willpeavy', 'compressor'].forEach(function (name) {
      infos[name] = {
        filePath: path.join('./generated/', fileName + '.' + name + '.html'),
        gzFilePath: path.join('./generated/', fileName + '.' + name + '.html.gz'),
        lzFilePath: path.join('./generated/', fileName + '.' + name + '.html.lz'),
        brFilePath: path.join('./generated/', fileName + '.' + name + '.html.br')
      };
    });

    function readSizes(info, done) {
      info.endTime = Date.now();
      run([
        // Apply Gzip on minified output
        function (done) {
          gzip(info.filePath, info.gzFilePath, function () {
            info.gzTime = Date.now();
            // Open and read the size of the minified+gzip output
            readSize(info.gzFilePath, function (size) {
              info.gzSize = size;
              done();
            });
          });
        },
        // Apply LZMA on minified output
        function (done) {
          readBuffer(info.filePath, function (data) {
            lzma.compress(data, 1, function (result, error) {
              if (error) {
                throw error;
              }
              writeBuffer(info.lzFilePath, Buffer.from(result), function () {
                info.lzTime = Date.now();
                // Open and read the size of the minified+lzma output
                readSize(info.lzFilePath, function (size) {
                  info.lzSize = size;
                  done();
                });
              });
            });
          });
        },
        // Apply Brotli on minified output
        function (done) {
          brotli(info.filePath, info.brFilePath, function () {
            info.brTime = Date.now();
            // Open and read the size of the minified+brotli output
            readSize(info.brFilePath, function (size) {
              info.brSize = size;
              done();
            });
          });
        },
        // Open and read the size of the minified output
        function (done) {
          readSize(info.filePath, function (size) {
            info.size = size;
            done();
          });
        }
      ], done);
    }

    function testHTMLMinifier(done) {
      const info = infos.minifier;
      info.startTime = Date.now();
      const args = [filePath, '-c', 'sample-cli-config-file.conf', '--minify-urls', site, '-o', info.filePath];
      fork('../cli', args).on('exit', function () {
        readSizes(info, done);
      });
    }

    function testMinimize(done) {
      readBuffer(filePath, function (data) {
        minimize.parse(data, function (_, data) {
          const info = infos.minimize;
          writeBuffer(info.filePath, data, function () {
            readSizes(info, done);
          });
        });
      });
    }

    function testWillPeavy(done) {
      readText(filePath, function (data) {
        const url = new URL('https://www.willpeavy.com/tools/minifier/');
        const options = {
          method: 'POST',
          headers: {
            'Content-Type': 'application/x-www-form-urlencoded'
          }
        };
        https.request(url, options, function (res) {
          res.setEncoding('utf8');
          let response = '';
          res.on('data', function (chunk) {
            response += chunk;
          }).on('end', function () {
            const info = infos.willpeavy;
            if (res.statusCode === 200) {
              // Extract result from
const start = response.indexOf('>', response.indexOf('
'],
['', ' '],
// closing slash and optional attribute quotes removed by minifier, but not by PHPTAL
// attribute ordering differences between minifier and PHPTAL
['', ''],
['', ''],
['
\n\n\ntest
', '
\n\n\ntest
'],
/* single line-break preceding
is redundant, assuming
is block element
['
test
', '
\ntest
'], */
// closing slash and optional attribute quotes removed by minifier, but not by PHPTAL
// attribute ordering differences between minifier and PHPTAL
// redundant inter-attribute spacing removed by minifier, but not by PHPTAL
['', ''],
/* minifier does not optimise in HTML5 mode
['', ''], */
/* minifier does not optimise in HTML5 mode
[
'',
''
], */
// minifier removes more javascript type attributes than PHPTAL
['', '']
/* trim "title" attribute value in
[
'Foo
';
expect(await minify(input, {
collapseWhitespace: true,
canTrimWhitespace: canCollapseAndTrimWhitespace,
canCollapseWhitespace: canCollapseAndTrimWhitespace
})).toBe(output);
// Regression test: Previously the first would clear the internal
// stackNo{Collapse,Trim}Whitespace, so that ' foo bar' turned into ' foo bar'
input = '
foo bar
';
output = '
foo bar
';
expect(await minify(input, {
collapseWhitespace: true,
canTrimWhitespace: canCollapseAndTrimWhitespace,
canCollapseWhitespace: canCollapseAndTrimWhitespace
})).toBe(output);
// Make sure that the stack does get reset when leaving the element for which
// the hooks returned false:
input = '