pax_global_header00006660000000000000000000000064144706757240014532gustar00rootroot0000000000000052 comment=7b3692c79555e981e3cdc6da1fdfabf5e7b0a4ea icdiff-release-2.0.7/000077500000000000000000000000001447067572400144025ustar00rootroot00000000000000icdiff-release-2.0.7/.github/000077500000000000000000000000001447067572400157425ustar00rootroot00000000000000icdiff-release-2.0.7/.github/workflows/000077500000000000000000000000001447067572400177775ustar00rootroot00000000000000icdiff-release-2.0.7/.github/workflows/release.yml000066400000000000000000000013721447067572400221450ustar00rootroot00000000000000name: Release Package on: push: tags: ["release-*"] permissions: contents: read jobs: build_package: name: Build Package runs-on: ubuntu-latest steps: - uses: actions/checkout@v3 - name: Build sdist and wheel run: pipx run build --sdist --wheel - uses: actions/upload-artifact@v3 with: path: dist pypi-publish: needs: [build_package] name: Upload release to PyPI runs-on: ubuntu-latest environment: name: pypi permissions: id-token: write steps: - uses: actions/download-artifact@v3 with: name: artifact path: dist - name: Publish package distributions to PyPI uses: pypa/gh-action-pypi-publish@release/v1 icdiff-release-2.0.7/.gitignore000066400000000000000000000000071447067572400163670ustar00rootroot00000000000000/venv/ icdiff-release-2.0.7/.travis.yml000066400000000000000000000003271447067572400165150ustar00rootroot00000000000000language: python sudo: required dist: xenial python: - "3.4" - "3.5" - "3.6" - "3.7" - "3.8" - "3.9" - "nightly" script: - pip install -r requirements-dev.txt - ./test.sh python git: depth: falseicdiff-release-2.0.7/ChangeLog000066400000000000000000000063511447067572400161610ustar00rootroot000000000000002.0.7 Add --show-no-spaces #173 2.0.6 Fix process exit code for non-file inputs #205 Make fullwidth characters take two columns #202 Add -x/--exclude optin to to exclude files matching patterns #199 2.0.5 Set process exit code to indicate differences #195 Support -P/--permissions option #197 2.0.4 Include LICENSE in package 2.0.3 Attempts to publisher 2.0.1 and 2.0.2 to pypi gave broken packages 2.0.1 Add -t/--truncate option #184 2.0.0 Drop support for Python 2 Clearer error on unknown encodings #145 Stop printing an error on closed pipes #156 Fix displayed filed names with git #159 Fix testcase that assumed a git repo #166 Implement --report-identical-files #168 Improved handling of very long lines #180 1.9.4 Allow {path} and {basename} in --label #139 Properly implement git difftool protocol #140 1.9.3 Properly set the version number 1.9.2 Add --exclude-lines (-E) which can exclude comments Add --color-map so you can choose which colors to use for what #121 #117 Allow highlighted characters to be bold #122 Support configuring git-icdiff with gitconfig Don't choke on bad terminal sizes #113 Print proper error messages instead of raising exceptions Allow the line numbers to be colorized Add a LICENSE file 1.9.1 Handle files with CR characters better and add --strip-trailing-cr 1.9.0 Fix setup.py by symlinking icdiff to icdiff.py 1.8.2 Add short flags for --highlight (-H), --line-numbers (-N), and --whole-file (-W). Fix use with bash process substitution and other special files 1.8.1 Updated remaining copy of unicode test file (b/1) 1.8.0 Updated unicode test file (input-3) Allow testing installed version Allow importing as a module Minor deduplication tweak to git-icdiff Add pip instructions to readme Allow using --tabsize Allow non-recursive directory diffing 1.7.6 Fixed copyright. 1.7.3 Fix git-icdiff to handle filenames with spaces as arguments. 1.7.2 Don't stop diffing recursively when encountering a binary file. 1.7.1 Don't treat files with identical (mode, size, mtime) as equal. 1.7.0 Add tests 1.6.4 Unbreak --recursive again 1.6.3 Stop setting LESS_IS_MORE with git-icdiff, fixing #33. 1.6.2 Add support for setting the output encoding and default to utf8 1.6.1 Unbreak --recursive 1.6.0 Add support for setting the encoding, and handle fullwidth chars in python2 1.5.3 Support use as an svn difftool. Support -U and -L, and allow but ignore -u. 1.5.2 Various pager improvements in git-icdiff. 1.5.1 Make --highlight and --no-bold play nice. 1.5.0 Pass arguments through to icdiff when using git-icdiff. 1.4.0 Use less with "git icdiff" by default. 1.3.2 Fix linewrapping with unicode. 1.3.1 1.3.0 was completely borked. 1.3.0 Use setup.py to support standard python installation. 1.2.2 Start printing output as soon as its ready instead of waiting for the whole file to complete. 1.2.1 Space fullwidth characters properly when treating input as unicode. 1.2.0 Add --recursive to support diffing directory trees. 1.1.2 Flush stdout when done. 1.1.1 Don't print stack traces on Ctrl+C or when piping into something that quits. 1.1.0 Add --no-bold option useful with the solarized colorscheme and for people who just don't like bold. 1.0.0 First Release icdiff-release-2.0.7/LICENSE000066400000000000000000000307331447067572400154150ustar00rootroot00000000000000A. HISTORY OF THE SOFTWARE ========================== Python was created in the early 1990s by Guido van Rossum at Stichting Mathematisch Centrum (CWI, see http://www.cwi.nl) in the Netherlands as a successor of a language called ABC. Guido remains Python's principal author, although it includes many contributions from others. In 1995, Guido continued his work on Python at the Corporation for National Research Initiatives (CNRI, see http://www.cnri.reston.va.us) in Reston, Virginia where he released several versions of the software. In May 2000, Guido and the Python core development team moved to BeOpen.com to form the BeOpen PythonLabs team. In October of the same year, the PythonLabs team moved to Digital Creations, which became Zope Corporation. In 2001, the Python Software Foundation (PSF, see https://www.python.org/psf/) was formed, a non-profit organization created specifically to own Python-related Intellectual Property. Zope Corporation was a sponsoring member of the PSF. All Python releases are Open Source (see http://www.opensource.org for the Open Source Definition). Historically, most, but not all, Python releases have also been GPL-compatible; the table below summarizes the various releases. Release Derived Year Owner GPL- from compatible? (1) 0.9.0 thru 1.2 1991-1995 CWI yes 1.3 thru 1.5.2 1.2 1995-1999 CNRI yes 1.6 1.5.2 2000 CNRI no 2.0 1.6 2000 BeOpen.com no 1.6.1 1.6 2001 CNRI yes (2) 2.1 2.0+1.6.1 2001 PSF no 2.0.1 2.0+1.6.1 2001 PSF yes 2.1.1 2.1+2.0.1 2001 PSF yes 2.1.2 2.1.1 2002 PSF yes 2.1.3 2.1.2 2002 PSF yes 2.2 and above 2.1.1 2001-now PSF yes Footnotes: (1) GPL-compatible doesn't mean that we're distributing Python under the GPL. All Python licenses, unlike the GPL, let you distribute a modified version without making your changes open source. The GPL-compatible licenses make it possible to combine Python with other software that is released under the GPL; the others don't. (2) According to Richard Stallman, 1.6.1 is not GPL-compatible, because its license has a choice of law clause. According to CNRI, however, Stallman's lawyer has told CNRI's lawyer that 1.6.1 is "not incompatible" with the GPL. Thanks to the many outside volunteers who have worked under Guido's direction to make these releases possible. B. TERMS AND CONDITIONS FOR ACCESSING OR OTHERWISE USING PYTHON =============================================================== PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2 -------------------------------------------- 1. This LICENSE AGREEMENT is between the Python Software Foundation ("PSF"), and the Individual or Organization ("Licensee") accessing and otherwise using this software ("Python") in source or binary form and its associated documentation. 2. Subject to the terms and conditions of this License Agreement, PSF hereby grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, analyze, test, perform and/or display publicly, prepare derivative works, distribute, and otherwise use Python alone or in any derivative version, provided, however, that PSF's License Agreement and PSF's notice of copyright, i.e., "Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018 Python Software Foundation; All Rights Reserved" are retained in Python alone or in any derivative version prepared by Licensee. 3. In the event Licensee prepares a derivative work that is based on or incorporates Python or any part thereof, and wants to make the derivative work available to others as provided herein, then Licensee hereby agrees to include in any such work a brief summary of the changes made to Python. 4. PSF is making Python available to Licensee on an "AS IS" basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT INFRINGE ANY THIRD PARTY RIGHTS. 5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON, OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. 6. This License Agreement will automatically terminate upon a material breach of its terms and conditions. 7. Nothing in this License Agreement shall be deemed to create any relationship of agency, partnership, or joint venture between PSF and Licensee. This License Agreement does not grant permission to use PSF trademarks or trade name in a trademark sense to endorse or promote products or services of Licensee, or any third party. 8. By copying, installing or otherwise using Python, Licensee agrees to be bound by the terms and conditions of this License Agreement. BEOPEN.COM LICENSE AGREEMENT FOR PYTHON 2.0 ------------------------------------------- BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1 1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the Individual or Organization ("Licensee") accessing and otherwise using this software in source or binary form and its associated documentation ("the Software"). 2. Subject to the terms and conditions of this BeOpen Python License Agreement, BeOpen hereby grants Licensee a non-exclusive, royalty-free, world-wide license to reproduce, analyze, test, perform and/or display publicly, prepare derivative works, distribute, and otherwise use the Software alone or in any derivative version, provided, however, that the BeOpen Python License is retained in the Software, alone or in any derivative version prepared by Licensee. 3. BeOpen is making the Software available to Licensee on an "AS IS" basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT INFRINGE ANY THIRD PARTY RIGHTS. 4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. 5. This License Agreement will automatically terminate upon a material breach of its terms and conditions. 6. This License Agreement shall be governed by and interpreted in all respects by the law of the State of California, excluding conflict of law provisions. Nothing in this License Agreement shall be deemed to create any relationship of agency, partnership, or joint venture between BeOpen and Licensee. This License Agreement does not grant permission to use BeOpen trademarks or trade names in a trademark sense to endorse or promote products or services of Licensee, or any third party. As an exception, the "BeOpen Python" logos available at http://www.pythonlabs.com/logos.html may be used according to the permissions granted on that web page. 7. By copying, installing or otherwise using the software, Licensee agrees to be bound by the terms and conditions of this License Agreement. CNRI LICENSE AGREEMENT FOR PYTHON 1.6.1 --------------------------------------- 1. This LICENSE AGREEMENT is between the Corporation for National Research Initiatives, having an office at 1895 Preston White Drive, Reston, VA 20191 ("CNRI"), and the Individual or Organization ("Licensee") accessing and otherwise using Python 1.6.1 software in source or binary form and its associated documentation. 2. Subject to the terms and conditions of this License Agreement, CNRI hereby grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, analyze, test, perform and/or display publicly, prepare derivative works, distribute, and otherwise use Python 1.6.1 alone or in any derivative version, provided, however, that CNRI's License Agreement and CNRI's notice of copyright, i.e., "Copyright (c) 1995-2001 Corporation for National Research Initiatives; All Rights Reserved" are retained in Python 1.6.1 alone or in any derivative version prepared by Licensee. Alternately, in lieu of CNRI's License Agreement, Licensee may substitute the following text (omitting the quotes): "Python 1.6.1 is made available subject to the terms and conditions in CNRI's License Agreement. This Agreement together with Python 1.6.1 may be located on the Internet using the following unique, persistent identifier (known as a handle): 1895.22/1013. This Agreement may also be obtained from a proxy server on the Internet using the following URL: http://hdl.handle.net/1895.22/1013". 3. In the event Licensee prepares a derivative work that is based on or incorporates Python 1.6.1 or any part thereof, and wants to make the derivative work available to others as provided herein, then Licensee hereby agrees to include in any such work a brief summary of the changes made to Python 1.6.1. 4. CNRI is making Python 1.6.1 available to Licensee on an "AS IS" basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT INFRINGE ANY THIRD PARTY RIGHTS. 5. CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON 1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1, OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. 6. This License Agreement will automatically terminate upon a material breach of its terms and conditions. 7. This License Agreement shall be governed by the federal intellectual property law of the United States, including without limitation the federal copyright law, and, to the extent such U.S. federal law does not apply, by the law of the Commonwealth of Virginia, excluding Virginia's conflict of law provisions. Notwithstanding the foregoing, with regard to derivative works based on Python 1.6.1 that incorporate non-separable material that was previously distributed under the GNU General Public License (GPL), the law of the Commonwealth of Virginia shall govern this License Agreement only as to issues arising under or with respect to Paragraphs 4, 5, and 7 of this License Agreement. Nothing in this License Agreement shall be deemed to create any relationship of agency, partnership, or joint venture between CNRI and Licensee. This License Agreement does not grant permission to use CNRI trademarks or trade name in a trademark sense to endorse or promote products or services of Licensee, or any third party. 8. By clicking on the "ACCEPT" button where indicated, or by copying, installing or otherwise using Python 1.6.1, Licensee agrees to be bound by the terms and conditions of this License Agreement. ACCEPT CWI LICENSE AGREEMENT FOR PYTHON 0.9.0 THROUGH 1.2 -------------------------------------------------- Copyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam, The Netherlands. All rights reserved. Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation, and that the name of Stichting Mathematisch Centrum or CWI not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission. STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. icdiff-release-2.0.7/MANIFEST.in000066400000000000000000000000421447067572400161340ustar00rootroot00000000000000include README.md include LICENSE icdiff-release-2.0.7/README.md000066400000000000000000000112231447067572400156600ustar00rootroot00000000000000# Icdiff Improved colored diff ![screenshot](https://www.jefftk.com/icdiff-css-demo.png) ## Installation Download the [latest](https://github.com/jeffkaufman/icdiff/tags) `icdiff` and put it on your PATH. Alternatively, install with packaging tools: ``` # pip pip install icdiff # apt sudo apt install icdiff # homebrew brew install icdiff # aur yay -S icdiff # nix nix-env -i icdiff ``` ## Usage ```sh icdiff [options] left_file right_file ``` Show differences between files in a two column view. ### Options ``` --version show program's version number and exit -h, --help show this help message and exit --cols=COLS specify the width of the screen. Autodetection is Unix only --encoding=ENCODING specify the file encoding; defaults to utf8 -E MATCHER, --exclude-lines=MATCHER Do not diff lines that match this regex. Not compatible with the 'line-numbers' option --head=HEAD consider only the first N lines of each file -H, --highlight color by changing the background color instead of the foreground color. Very fast, ugly, displays all changes -L LABELS, --label=LABELS override file labels with arbitrary tags. Use twice, one for each file -N, --line-numbers generate output with line numbers. Not compatible with the 'exclude-lines' option. --no-bold use non-bold colors; recommended for solarized --no-headers don't label the left and right sides with their file names --output-encoding=OUTPUT_ENCODING specify the output encoding; defaults to utf8 -r, --recursive recursively compare subdirectories -s, --report-identical-files report when two files are the same --show-all-spaces color all non-matching whitespace including that which is not needed for drawing the eye to changes. Slow, ugly, displays all changes --tabsize=TABSIZE tab stop spacing -t, --truncate truncate long lines instead of wrapping them -u, --patch generate patch. This is always true, and only exists for compatibility -U NUM, --unified=NUM, --numlines=NUM how many lines of context to print; can't be combined with --whole-file -W, --whole-file show the whole file instead of just changed lines and context --strip-trailing-cr strip any trailing carriage return at the end of an input line --color-map=COLOR_MAP choose which colors are used for which items. Default is --color-map='add:green_bold,change:yellow_bold,desc ription:blue,meta:magenta,separator:blue,subtract:red_ bold'. You don't have to override all of them: '--color-map=separator:white,description:cyan ``` ## Using with Git To see what it looks like, try: ```sh git difftool --extcmd icdiff ``` To install this as a tool you can use with Git, copy `git-icdiff` into your PATH and run: ```sh git icdiff ``` You can configure `git-icdiff` in Git's config: ``` git config --global icdiff.options '--highlight --line-numbers' ``` ## Using with subversion To try it out, run: ```sh svn diff --diff-cmd icdiff ``` ## Using with Mercurial Add the following to your `~/.hgrc`: ```sh [extensions] extdiff= [extdiff] cmd.icdiff=icdiff opts.icdiff=--recursive --line-numbers ``` Or check more [in-depth setup instructions](https://ianobermiller.com/blog/2016/07/14/side-by-side-diffs-for-mercurial-hg-icdiff-revisited/). ## Setting up a dev environment Create a virtualenv and install the dev dependencies. This is not needed for normal usage. ```sh virtualenv venv source venv/bin/activate pip install -r requirements-dev.txt ``` ## Running tests ```sh ./test.sh python3 ``` ## Making a release - Update ChangeLog with all the changes since the last release - Update `__version__` in `icdiff` - Run tests, make sure they pass - `git commit -a -m "release ${version}"` - `git push` - `git tag release-${version}` - `git push origin release-${version}` - A GitHub Action should be triggered due to the release tag being pushed, and will upload to PyPI. ## License This file is derived from `difflib.HtmlDiff` which is under [license](https://www.python.org/download/releases/2.6.2/license/). I release my changes here under the same license. This is GPL compatible. icdiff-release-2.0.7/git-icdiff000077500000000000000000000007221447067572400163360ustar00rootroot00000000000000#!/bin/sh ICDIFF_OPTIONS=$(git config --get icdiff.options) ICDIFF_OPTIONS="${ICDIFF_OPTIONS} --is-git-diff" GITPAGER=$(git config --get icdiff.pager) if [ -z "$GITPAGER" ]; then GITPAGER=$(git config --get core.pager) fi if [ -z "$GITPAGER" ]; then GITPAGER="${PAGER:-less}" fi if [ "${GITPAGER%% *}" = "more" ] || [ "${GITPAGER%% *}" = "less" ]; then GITPAGER="$GITPAGER -R" fi git difftool --no-prompt --extcmd="icdiff $ICDIFF_OPTIONS" "$@" | $GITPAGER icdiff-release-2.0.7/icdiff000077500000000000000000000764461447067572400155750ustar00rootroot00000000000000#!/usr/bin/env python3 """ icdiff.py Author: Jeff Kaufman, derived from difflib.HtmlDiff License: This code is usable under the same open terms as the rest of python. See: http://www.python.org/psf/license/ Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006 Python Software Foundation; All Rights Reserved Based on Python's difflib.HtmlDiff, with changes to provide console output instead of html output. """ import os import stat import sys import errno import difflib from optparse import Option, OptionParser import re import filecmp import unicodedata import codecs import fnmatch __version__ = "2.0.7" # Exit code constants EXIT_CODE_SUCCESS = 0 EXIT_CODE_DIFF = 1 EXIT_CODE_ERROR = 2 color_codes = { "black": "\033[0;30m", "red": "\033[0;31m", "green": "\033[0;32m", "yellow": "\033[0;33m", "blue": "\033[0;34m", "magenta": "\033[0;35m", "cyan": "\033[0;36m", "white": "\033[0;37m", "none": "\033[m", "black_bold": "\033[1;30m", "red_bold": "\033[1;31m", "green_bold": "\033[1;32m", "yellow_bold": "\033[1;33m", "blue_bold": "\033[1;34m", "magenta_bold": "\033[1;35m", "cyan_bold": "\033[1;36m", "white_bold": "\033[1;37m", } color_mapping = { "add": "green_bold", "subtract": "red_bold", "change": "yellow_bold", "separator": "blue", "description": "blue", "permissions": "yellow", "meta": "magenta", "line-numbers": "white", } class ConsoleDiff(object): """Console colored side by side comparison with change highlights. Based on difflib.HtmlDiff This class can be used to create a text-mode table showing a side by side, line by line comparison of text with inter-line and intra-line change highlights in ansi color escape sequences as intra-line change highlights in ansi color escape sequences as read by xterm. The table can be generated in either full or contextual difference mode. To generate the table, call make_table. Usage is the almost the same as HtmlDiff except only make_table is implemented and the file can be invoked on the command line. Run:: python icdiff.py --help for command line usage information. """ def __init__( self, tabsize=8, wrapcolumn=None, linejunk=None, charjunk=difflib.IS_CHARACTER_JUNK, cols=80, line_numbers=False, show_all_spaces=False, show_no_spaces=False, highlight=False, truncate=False, strip_trailing_cr=False, ): """ConsoleDiff instance initializer Arguments: tabsize -- tab stop spacing, defaults to 8. wrapcolumn -- column number where lines are broken and wrapped, defaults to None where lines are not wrapped. linejunk, charjunk -- keyword arguments passed into ndiff() (used by ConsoleDiff() to generate the side by side differences). See ndiff() documentation for argument default values and descriptions. """ self._tabsize = tabsize self.line_numbers = line_numbers self.cols = cols self.show_all_spaces = show_all_spaces self.show_no_spaces = show_no_spaces self.highlight = highlight self.strip_trailing_cr = strip_trailing_cr self.truncate = truncate if wrapcolumn is None: if not line_numbers: wrapcolumn = self.cols // 2 - 2 else: wrapcolumn = self.cols // 2 - 10 self._wrapcolumn = wrapcolumn self._linejunk = linejunk self._charjunk = charjunk def _tab_newline_replace(self, fromlines, tolines): """Returns from/to line lists with tabs expanded and newlines removed. Instead of tab characters being replaced by the number of spaces needed to fill in to the next tab stop, this function will fill the space with tab characters. This is done so that the difference algorithms can identify changes in a file when tabs are replaced by spaces and vice versa. At the end of the table generation, the tab characters will be replaced with a space. """ def expand_tabs(line): # hide real spaces line = line.replace(" ", "\0") # expand tabs into spaces line = line.expandtabs(self._tabsize) # replace spaces from expanded tabs back into tab characters # (we'll replace them with markup after we do differencing) line = line.replace(" ", "\t") return line.replace("\0", " ").rstrip("\n") fromlines = [expand_tabs(line) for line in fromlines] tolines = [expand_tabs(line) for line in tolines] return fromlines, tolines def _strip_trailing_cr(self, lines): """Remove windows return carriage""" lines = [line.rstrip("\r") for line in lines] return lines def _all_cr_nl(self, lines): """Whether a file is entirely \r\n line endings""" return all(line.endswith("\r") for line in lines) def _display_len(self, s): # Handle wide characters like Chinese. def width(c): if isinstance(c, type("")) and unicodedata.east_asian_width(c) in [ "F", "W", ]: return 2 elif c == "\r": return 2 return 1 return sum(width(c) for c in s) def _split_line(self, data_list, line_num, text): """Builds list of text lines by splitting text lines at wrap point This function will determine if the input text line needs to be wrapped (split) into separate lines. If so, the first wrap point will be determined and the first line appended to the output text line list. This function is used recursively to handle the second part of the split line to further split it. """ while True: # if blank line or context separator, just add it to the output # list if not line_num: data_list.append((line_num, text)) return # if line text doesn't need wrapping, just add it to the output # list if ( self._display_len(text) - (text.count("\0") * 3) <= self._wrapcolumn ): data_list.append((line_num, text)) return # scan text looking for the wrap point, keeping track if the wrap # point is inside markers i = 0 n = 0 mark = "" while n < self._wrapcolumn and i < len(text): if text[i] == "\0": i += 1 mark = text[i] i += 1 elif text[i] == "\1": i += 1 mark = "" else: n += self._display_len(text[i]) i += 1 # wrap point is inside text, break it up into separate lines line1 = text[:i] line2 = text[i:] # if wrap point is inside markers, place end marker at end of first # line and start marker at beginning of second line because each # line will have its own table tag markup around it. if mark: line1 = line1 + "\1" line2 = "\0" + mark + line2 # tack on first line onto the output list data_list.append((line_num, line1)) # use this routine again to wrap the remaining text # unless truncate is set if self.truncate: return line_num = ">" text = line2 def _line_wrapper(self, diffs): """Returns iterator that splits (wraps) mdiff text lines""" # pull from/to data and flags from mdiff iterator for fromdata, todata, flag in diffs: # check for context separators and pass them through if flag is None: yield fromdata, todata, flag continue (fromline, fromtext), (toline, totext) = fromdata, todata # for each from/to line split it at the wrap column to form # list of text lines. fromlist, tolist = [], [] self._split_line(fromlist, fromline, fromtext) self._split_line(tolist, toline, totext) # yield from/to line in pairs inserting blank lines as # necessary when one side has more wrapped lines while fromlist or tolist: if fromlist: fromdata = fromlist.pop(0) else: fromdata = ("", " ") if tolist: todata = tolist.pop(0) else: todata = ("", " ") yield fromdata, todata, flag def _collect_lines(self, diffs): """Collects mdiff output into separate lists Before storing the mdiff from/to data into a list, it is converted into a single line of text with console markup. """ # pull from/to data and flags from mdiff style iterator for fromdata, todata, flag in diffs: if (fromdata, todata, flag) == (None, None, None): yield None else: yield ( self._format_line(*fromdata), self._format_line(*todata), ) def _format_line(self, linenum, text): text = text.rstrip() if not self.line_numbers: return text return self._add_line_numbers(linenum, text) def _add_line_numbers(self, linenum, text): try: lid = "%d" % linenum except TypeError: # handle blank lines where linenum is '>' or '' lid = "" return text return "%s %s" % ( self._rpad(simple_colorize(str(lid), "line-numbers"), 8), text, ) def _real_len(self, s): s_len = 0 in_esc = False prev = " " for c in replace_all( {"\0+": "", "\0-": "", "\0^": "", "\1": "", "\t": " "}, s ): if in_esc: if c == "m": in_esc = False else: if c == "[" and prev == "\033": in_esc = True s_len -= 1 # we counted prev when we shouldn't have else: s_len += self._display_len(c) prev = c return s_len def _rpad(self, s, field_width): return self._pad(s, field_width) + s def _pad(self, s, field_width): return " " * (field_width - self._real_len(s)) def _lpad(self, s, field_width): return s + self._pad(s, field_width) def make_table( self, fromlines, tolines, fromdesc="", todesc="", fromperms=None, toperms=None, context=False, numlines=5, ): """Generates table of side by side comparison with change highlights Arguments: fromlines -- list of "from" lines tolines -- list of "to" lines fromdesc -- "from" file column header string todesc -- "to" file column header string fromperms -- "from" file permissions toperms -- "to" file permissions context -- set to True for contextual differences (defaults to False which shows full differences). numlines -- number of context lines. When context is set True, controls number of lines displayed before and after the change. When context is False, controls the number of lines to place the "next" link anchors before the next change (so click of "next" link jumps to just before the change). """ if context: context_lines = numlines else: context_lines = None # change tabs to spaces before it gets more difficult after we insert # markup fromlines, tolines = self._tab_newline_replace(fromlines, tolines) if self.strip_trailing_cr or ( self._all_cr_nl(fromlines) and self._all_cr_nl(tolines) ): fromlines = self._strip_trailing_cr(fromlines) tolines = self._strip_trailing_cr(tolines) # create diffs iterator which generates side by side from/to data diffs = difflib._mdiff( fromlines, tolines, context_lines, linejunk=self._linejunk, charjunk=self._charjunk, ) # set up iterator to wrap lines that exceed desired width if self._wrapcolumn: diffs = self._line_wrapper(diffs) diffs = self._collect_lines(diffs) for left, right in self._generate_table( fromdesc, todesc, fromperms, toperms, diffs ): yield self.colorize( "%s %s" % ( self._lpad(left, self.cols // 2 - 1), self._lpad(right, self.cols // 2 - 1), ) ) def _generate_table(self, fromdesc, todesc, fromperms, toperms, diffs): if fromdesc or todesc: yield ( simple_colorize(fromdesc, "description"), simple_colorize(todesc, "description"), ) if fromperms != toperms: yield ( simple_colorize( f"{stat.filemode(fromperms)} ({fromperms:o})", "permissions", ), simple_colorize( f"{stat.filemode(toperms)} ({toperms:o})", "permissions" ), ) for i, line in enumerate(diffs): if line is None: # mdiff yields None on separator lines; skip the bogus ones # generated for the first line if i > 0: yield ( simple_colorize("---", "separator"), simple_colorize("---", "separator"), ) else: yield line def colorize(self, s): def background(color): return replace_all( {"\033[1;": "\033[7;1;", "\033[0;": "\033[7;"}, color ) C_ADD = color_codes[color_mapping["add"]] C_SUB = color_codes[color_mapping["subtract"]] C_CHG = color_codes[color_mapping["change"]] if self.highlight: C_ADD, C_SUB, C_CHG = ( background(C_ADD), background(C_SUB), background(C_CHG), ) C_NONE = color_codes["none"] colors = (C_ADD, C_SUB, C_CHG, C_NONE) s = replace_all( { "\0+": C_ADD, "\0-": C_SUB, "\0^": C_CHG, "\1": C_NONE, "\t": " ", "\r": "\\r", }, s, ) if self.highlight: return s if self.show_no_spaces: # Don't show whitespace even if it's a whitespace-only line. return re.sub( "\033\\[[01];3([01234567])m(\\s+)(\033\\[)", "\033[0;3\\1m\\2\\3", s, ) elif not self.show_all_spaces: # If there's a change consisting entirely of whitespace, # don't color it. return re.sub( "\033\\[[01];3([01234567])m(\\s+)(\033\\[)", "\033[7;3\\1m\\2\\3", s, ) def will_see_coloredspace(i, s): while i < len(s) and s[i].isspace(): i += 1 if i < len(s) and s[i] == "\033": return False return True n_s = [] in_color = False seen_coloredspace = False for i, c in enumerate(s): if len(n_s) > 6 and n_s[-1] == "m": ns_end = "".join(n_s[-7:]) for color in colors: if ns_end.endswith(color): if color != in_color: seen_coloredspace = False in_color = color if ns_end.endswith(C_NONE): in_color = False if ( c.isspace() and in_color and ( self.show_all_spaces or not (seen_coloredspace or will_see_coloredspace(i, s)) ) ): n_s.extend([C_NONE, background(in_color), c, C_NONE, in_color]) else: if in_color: seen_coloredspace = True n_s.append(c) joined = "".join(n_s) return joined def raw_colorize(s, color): return "%s%s%s" % (color_codes[color], s, color_codes["none"]) def simple_colorize(s, category): return raw_colorize(s, color_mapping[category]) def replace_all(replacements, string): for search, replace in replacements.items(): string = string.replace(search, replace) return string class MultipleOption(Option): ACTIONS = Option.ACTIONS + ("extend",) STORE_ACTIONS = Option.STORE_ACTIONS + ("extend",) TYPED_ACTIONS = Option.TYPED_ACTIONS + ("extend",) ALWAYS_TYPED_ACTIONS = Option.ALWAYS_TYPED_ACTIONS + ("extend",) def take_action(self, action, dest, opt, value, values, parser): if action == "extend": values.ensure_value(dest, []).append(value) else: Option.take_action(self, action, dest, opt, value, values, parser) def create_option_parser(): # If you change any of these, also update README. parser = OptionParser( usage="usage: %prog [options] left_file right_file", version="icdiff version %s" % __version__, description="Show differences between files in a " "two column view.", option_class=MultipleOption, ) parser.add_option( "--cols", default=None, help="specify the width of the screen. Autodetection is " "Unix only", ) parser.add_option( "--encoding", default="utf-8", help="specify the file encoding; defaults to utf8", ) parser.add_option( "-E", "--exclude-lines", action="store", type="string", dest="matcher", help="Do not diff lines that match this regex. Not " "compatible with the 'line-numbers' option", ) parser.add_option( "--head", default=0, help="consider only the first N lines of each file", ) parser.add_option( "-H", "--highlight", default=False, action="store_true", help="color by changing the background color instead of " "the foreground color. Very fast, ugly, displays all " "changes", ) parser.add_option( "-L", "--label", action="extend", type="string", dest="labels", help="override file labels with arbitrary tags. " "Use twice, one for each file. You may include the " "formatting strings '{path}' and '{basename}'", ) parser.add_option( "-N", "--line-numbers", default=False, action="store_true", help="generate output with line numbers. Not compatible " "with the 'exclude-lines' option.", ) parser.add_option( "--no-bold", default=False, action="store_true", help="use non-bold colors; recommended for solarized", ) parser.add_option( "--no-headers", default=False, action="store_true", help="don't label the left and right sides " "with their file names", ) parser.add_option( "--output-encoding", default="utf-8", help="specify the output encoding; defaults to utf8", ) parser.add_option( "-r", "--recursive", default=False, action="store_true", help="recursively compare subdirectories", ) parser.add_option( "-s", "--report-identical-files", default=False, action="store_true", help="report when two files are the same", ) parser.add_option( "--show-all-spaces", default=False, action="store_true", help="color all non-matching whitespace including " "that which is not needed for drawing the eye to " "changes. Slow, ugly, displays all changes", ) parser.add_option( "--show-no-spaces", default=False, action="store_true", help="don't color whitespace-only changes", ) parser.add_option("--tabsize", default=8, help="tab stop spacing") parser.add_option( "-t", "--truncate", default=False, action="store_true", help="truncate long lines instead of wrapping them", ) parser.add_option( "-u", "--patch", default=True, action="store_true", help="generate patch. This is always true, " "and only exists for compatibility", ) parser.add_option( "-U", "--unified", "--numlines", default=5, metavar="NUM", help="how many lines of context to print; " "can't be combined with --whole-file", ) parser.add_option( "-W", "--whole-file", default=False, action="store_true", help="show the whole file instead of just changed " "lines and context", ) parser.add_option( "-P", "--permissions", default=False, action="store_true", help="compare the file permissions as well as the " "content of the file", ) parser.add_option( "--strip-trailing-cr", default=False, action="store_true", help="strip any trailing carriage return at the end of " "an input line", ) parser.add_option( "--color-map", default=None, help="choose which colors are used for which items. " "Default is --color-map='" + ",".join("%s:%s" % x for x in sorted(color_mapping.items())) + "'" ". You don't have to override all of them: " "'--color-map=separator:white,description:cyan", ) parser.add_option( "--is-git-diff", default=False, action="store_true", help="Show the real file name when displaying " "git-diff result", ) parser.add_option( "-x", "--exclude", metavar="PAT", action="append", default=[], help="exclude files that match PAT", ) return parser def set_cols_option(options): if os.name == "nt": try: import struct from ctypes import windll, create_string_buffer fh = windll.kernel32.GetStdHandle(-12) # stderr is -12 csbi = create_string_buffer(22) windll.kernel32.GetConsoleScreenBufferInfo(fh, csbi) res = struct.unpack("hhhhHhhhhhh", csbi.raw) options.cols = res[7] - res[5] + 1 # right - left + 1 return except Exception: pass else: def ioctl_GWINSZ(fd): try: import fcntl import termios import struct cr = struct.unpack( "hh", fcntl.ioctl(fd, termios.TIOCGWINSZ, "1234") ) except Exception: return None return cr cr = ioctl_GWINSZ(0) or ioctl_GWINSZ(1) or ioctl_GWINSZ(2) if cr and cr[1] > 0: options.cols = cr[1] return options.cols = 80 def validate_has_two_arguments(parser, args): if len(args) != 2: parser.print_help() sys.exit(EXIT_CODE_DIFF) def start(): diffs_found = False parser = create_option_parser() options, args = parser.parse_args() validate_has_two_arguments(parser, args) if not options.cols: set_cols_option(options) try: diffs_found = diff(options, *args) except KeyboardInterrupt: pass except IOError as e: if e.errno == errno.EPIPE: pass else: raise # Close stderr to prevent printing errors when icdiff is piped to # something that closes before icdiff is done writing # # See: https://stackoverflow.com/questions/26692284/... # ...how-to-prevent-brokenpipeerror-when-doing-a-flush-in-python sys.stderr.close() sys.exit(EXIT_CODE_DIFF if diffs_found else EXIT_CODE_SUCCESS) def codec_print(s, options): s = "%s\n" % s if hasattr(sys.stdout, "buffer"): sys.stdout.buffer.write(s.encode(options.output_encoding)) else: sys.stdout.write(s.encode(options.output_encoding)) def cmp_perms(options, a, b): return not options.permissions or ( os.lstat(a).st_mode == os.lstat(b).st_mode ) def should_be_excluded(name, pats): return any(fnmatch.fnmatchcase(name, pat) for pat in pats) def diff(options, a, b): def print_meta(s): codec_print(simple_colorize(s, "meta"), options) # We start out and assume that no diffs have been found (so far) diffs_found = False # Don't use os.path.isfile; it returns False for file-like entities like # bash's process substitution (/dev/fd/N). is_a_file = not os.path.isdir(a) is_b_file = not os.path.isdir(b) if is_a_file and is_b_file: try: if not ( filecmp.cmp(a, b, shallow=False) and cmp_perms(options, a, b) ): diffs_found = diffs_found | diff_files(options, a, b) elif options.report_identical_files: print("Files %s and %s are identical." % (a, b)) except OSError as e: if e.errno == errno.ENOENT: print_meta("error: file '%s' was not found" % e.filename) sys.exit(EXIT_CODE_ERROR) else: raise (e) elif not is_a_file and not is_b_file: a_contents = set(os.listdir(a)) b_contents = set(os.listdir(b)) for child in sorted(a_contents.union(b_contents)): if should_be_excluded(child, options.exclude): continue if child not in b_contents: print_meta("Only in %s: %s" % (a, child)) elif child not in a_contents: print_meta("Only in %s: %s" % (b, child)) elif options.recursive: diffs_found = diffs_found | diff( options, os.path.join(a, child), os.path.join(b, child) ) elif not is_a_file and is_b_file: print_meta("File %s is a directory while %s is a file" % (a, b)) diffs_found = True elif is_a_file and not is_b_file: print_meta("File %s is a file while %s is a directory" % (a, b)) diffs_found = True return diffs_found def read_file(fname, options): try: with codecs.open(fname, encoding=options.encoding, mode="rb") as inf: return inf.readlines() except UnicodeDecodeError as e: codec_print( "error: file '%s' not valid with encoding '%s': <%s> at %s-%s." % (fname, options.encoding, e.reason, e.start, e.end), options, ) raise except LookupError: codec_print( "error: encoding '%s' was not found." % (options.encoding), options ) sys.exit(EXIT_CODE_ERROR) def format_label(path, label="{path}"): """Format a label using a file's path and basename. Example: For file `/foo/bar.py` and label "Yours: {basename}" - The output is "Yours: bar.py" """ return label.format(path=path, basename=os.path.basename(path)) def diff_files(options, a, b): diff_found = False if options.is_git_diff is True: # Use $BASE as label when displaying git-diff result base = os.getenv("BASE") headers = [format_label(a, base), format_label(b, base)] else: if options.labels: if len(options.labels) == 2: headers = [ format_label(a, options.labels[0]), format_label(b, options.labels[1]), ] else: codec_print( "error: to use arbitrary file labels, " "specify -L twice.", options, ) sys.exit(EXIT_CODE_ERROR) else: headers = a, b if options.no_headers: headers = None, None head = int(options.head) assert not os.path.isdir(a) assert not os.path.isdir(b) try: lines_a = read_file(a, options) lines_b = read_file(b, options) except UnicodeDecodeError: return diff_found if head != 0: lines_a = lines_a[:head] lines_b = lines_b[:head] if options.matcher: lines_a = [ line_a for line_a in lines_a if not re.search(options.matcher, line_a) ] lines_b = [ line_b for line_b in lines_b if not re.search(options.matcher, line_b) ] # Determine if a difference has been detected diff_found = lines_a != lines_b or not cmp_perms(options, a, b) if options.no_bold: for key in color_mapping: color_mapping[key] = color_mapping[key].replace("_bold", "") if options.color_map: command_for_errors = '--color-map="%s"' % (options.color_map) for mapping in options.color_map.split(","): category, color = mapping.split(":", 1) if category not in color_mapping: print( "Invalid category '%s' in '%s'. Valid categories are: %s." % ( category, command_for_errors, ", ".join(sorted(color_mapping.keys())), ) ) sys.exit(EXIT_CODE_ERROR) if color not in color_codes: print( "Invalid color '%s' in '%s'. Valid colors are: %s." % ( color, command_for_errors, ", ".join( [ raw_colorize(x, x) for x in sorted(color_codes.keys()) ] ), ) ) sys.exit(EXIT_CODE_ERROR) color_mapping[category] = color if options.permissions: mode_a = os.lstat(a).st_mode mode_b = os.lstat(b).st_mode else: mode_a = None mode_b = None cd = ConsoleDiff( cols=int(options.cols), show_all_spaces=options.show_all_spaces, show_no_spaces=options.show_no_spaces, highlight=options.highlight, line_numbers=options.line_numbers, tabsize=int(options.tabsize), truncate=options.truncate, strip_trailing_cr=options.strip_trailing_cr, ) for line in cd.make_table( lines_a, lines_b, headers[0], headers[1], mode_a, mode_b, context=(not options.whole_file), numlines=int(options.unified), ): codec_print(line, options) sys.stdout.flush() return diff_found if __name__ == "__main__": start() icdiff-release-2.0.7/icdiff.py000077700000000000000000000000001447067572400173422icdiffustar00rootroot00000000000000icdiff-release-2.0.7/requirements-dev.txt000066400000000000000000000001041447067572400204350ustar00rootroot00000000000000flake8==2.4.0 mccabe==0.3 pep8==1.5.7 pyflakes==0.8.1 black==22.3.0 icdiff-release-2.0.7/setup.py000066400000000000000000000012671447067572400161220ustar00rootroot00000000000000from setuptools import setup from icdiff import __version__ setup( name="icdiff", version=__version__, url="https://www.jefftk.com/icdiff", project_urls={ "Source": "https://github.com/jeffkaufman/icdiff", }, classifiers=[ "License :: OSI Approved :: Python Software Foundation License" ], author="Jeff Kaufman", author_email="jeff@jefftk.com", description="improved colored diff", long_description=open('README.md').read(), long_description_content_type='text/markdown', scripts=['git-icdiff'], py_modules=['icdiff'], entry_points={ 'console_scripts': [ 'icdiff=icdiff:start', ], }, ) icdiff-release-2.0.7/test.sh000077500000000000000000000202251447067572400157210ustar00rootroot00000000000000#!/bin/bash # Usage: ./test.sh [--regold] [test-name] python[3] # Example: # Run all tests: # ./test.sh python3 # Regold all tests: # ./test.sh --regold python3 # Run one test: # ./test.sh tests/gold-45-sas-h-nb.txt python3 # Regold one test: # ./test.sh --regold tests/gold-45-sas-h-nb.txt python3 if [ "$#" -gt 1 -a "$1" = "--regold" ]; then REGOLD=true shift else REGOLD=false fi TEST_NAME=all if [ "$#" -gt 1 ]; then TEST_NAME=$1 shift fi if [ "$#" != 1 ]; then echo "Usage: '$0 [--regold] [test-name] python[3]'" exit 1 fi PYTHON="$1" ICDIFF="icdiff" if [ ! -z "$INSTALLED" ]; then INVOCATION="$ICDIFF" else INVOCATION="$PYTHON $ICDIFF" fi function fail() { echo "FAIL" exit 1 } function check_gold() { local error_code local expect=$1 local gold=tests/$2 shift shift if [ $TEST_NAME != "all" -a $TEST_NAME != $gold ]; then return fi echo " check_gold $gold matches $@" local tmp=/tmp/icdiff.output $INVOCATION "$@" &> $tmp error_code=$? if $REGOLD; then if [ -e $gold ] && diff $tmp $gold > /dev/null; then echo "Did not need to regold $gold" else cat $tmp read -p "Is this correct? y/n > " -n 1 -r echo if [[ $REPLY =~ ^[Yy]$ ]]; then mv $tmp $gold echo "Regolded $gold." else echo "Did not regold $gold." fi fi return fi if ! diff $gold $tmp; then echo "Got: ($tmp)" cat $tmp echo "Expected: ($gold)" cat $gold fail fi if [[ $error_code != $expect ]]; then echo "Got error code: $error_code" echo "Expected error code: $expect" fail fi } FIRST_TIME_CHECK_GIT_DIFF=true function check_git_diff() { local gitdiff=tests/$1 shift echo " check_gitdiff $gitdiff matches git icdiff $@" # Check when using icdiff in git if $FIRST_TIME_CHECK_GIT_DIFF; then FIRST_TIME_CHECK_GIT_DIFF=false # Set default args when first time check git diff yes | git difftool --extcmd icdiff > /dev/null git config --global icdiff.options '--cols=80' export PATH="$(pwd)":$PATH fi local tmp=/tmp/git-icdiff.output git icdiff $1 $2 &> $tmp if ! diff $tmp $gitdiff; then echo "Got: ($tmp)" cat $tmp echo "Expected: ($gitdiff)" fail fi } check_gold 1 gold-recursive.txt --recursive tests/{a,b} --cols=80 check_gold 1 gold-exclude.txt --exclude-lines '^#| pad' tests/input-4-cr.txt tests/input-4-partial-cr.txt --cols=80 check_gold 0 gold-dir.txt tests/{a,b} --cols=80 check_gold 1 gold-12.txt tests/input-{1,2}.txt --cols=80 check_gold 1 gold-12-t.txt tests/input-{1,2}.txt --cols=80 --truncate check_gold 0 gold-3.txt tests/input-{3,3}.txt check_gold 1 gold-45.txt tests/input-{4,5}.txt --cols=80 check_gold 1 gold-45-95.txt tests/input-{4,5}.txt --cols=95 check_gold 1 gold-45-sas.txt tests/input-{4,5}.txt --cols=80 --show-all-spaces check_gold 1 gold-45-h.txt tests/input-{4,5}.txt --cols=80 --highlight check_gold 1 gold-45-nb.txt tests/input-{4,5}.txt --cols=80 --no-bold check_gold 1 gold-45-sas-h.txt tests/input-{4,5}.txt --cols=80 --show-all-spaces --highlight check_gold 1 gold-45-sas-h-nb.txt tests/input-{4,5}.txt --cols=80 --show-all-spaces --highlight --no-bold check_gold 1 gold-sas.txt tests/input-{10,11}.txt --cols=80 --show-all-spaces check_gold 1 gold-sns.txt tests/input-{10,11}.txt --cols=80 --show-no-spaces check_gold 1 gold-show-spaces.txt tests/input-{10,11}.txt --cols=80 check_gold 1 gold-45-h-nb.txt tests/input-{4,5}.txt --cols=80 --highlight --no-bold check_gold 1 gold-45-ln.txt tests/input-{4,5}.txt --cols=80 --line-numbers check_gold 1 gold-45-ln-color.txt tests/input-{4,5}.txt --cols=80 --line-numbers --color-map='line-numbers:cyan' check_gold 1 gold-45-nh.txt tests/input-{4,5}.txt --cols=80 --no-headers check_gold 1 gold-45-h3.txt tests/input-{4,5}.txt --cols=80 --head=3 check_gold 2 gold-45-l.txt tests/input-{4,5}.txt --cols=80 -L left check_gold 1 gold-45-lr.txt tests/input-{4,5}.txt --cols=80 -L left -L right check_gold 1 gold-45-lbrb.txt tests/input-{4,5}.txt --cols=80 -L "L {basename}" -L "R {basename}" check_gold 1 gold-45-pipe.txt tests/input-4.txt <(cat tests/input-5.txt) --cols=80 --no-headers check_gold 1 gold-4dn.txt tests/input-4.txt /dev/null --cols=80 -L left -L right check_gold 1 gold-dn5.txt /dev/null tests/input-5.txt --cols=80 -L left -L right check_gold 1 gold-67.txt tests/input-{6,7}.txt --cols=80 check_gold 1 gold-67-wf.txt tests/input-{6,7}.txt --cols=80 --whole-file check_gold 1 gold-67-ln.txt tests/input-{6,7}.txt --cols=80 --line-numbers check_gold 1 gold-67-u3.txt tests/input-{6,7}.txt --cols=80 -U 3 check_gold 1 gold-tabs-default.txt tests/input-{8,9}.txt --cols=80 check_gold 1 gold-tabs-4.txt tests/input-{8,9}.txt --cols=80 --tabsize=4 check_gold 2 gold-file-not-found.txt tests/input-4.txt nonexistent_file check_gold 1 gold-strip-cr-off.txt tests/input-4.txt tests/input-4-cr.txt --cols=80 check_gold 1 gold-strip-cr-on.txt tests/input-4.txt tests/input-4-cr.txt --cols=80 --strip-trailing-cr check_gold 1 gold-no-cr-indent tests/input-4-cr.txt tests/input-4-partial-cr.txt --cols=80 check_gold 1 gold-hide-cr-if-dos tests/input-4-cr.txt tests/input-5-cr.txt --cols=80 check_gold 1 gold-12-subcolors.txt tests/input-{1,2}.txt --cols=80 --color-map='change:magenta,description:cyan_bold' check_gold 2 gold-subcolors-bad-color tests/input-{1,2}.txt --cols=80 --color-map='change:mageta,description:cyan_bold' check_gold 2 gold-subcolors-bad-cat tests/input-{1,2}.txt --cols=80 --color-map='chnge:magenta,description:cyan_bold' check_gold 2 gold-subcolors-bad-fmt tests/input-{1,2}.txt --cols=80 --color-map='change:magenta:gold,description:cyan_bold' check_gold 0 gold-identical-on.txt tests/input-{1,1}.txt -s check_gold 2 gold-bad-encoding.txt tests/input-{1,2}.txt --encoding=nonexistend_encoding check_gold 0 gold-recursive-with-exclude.txt --recursive -x c tests/{a,b} --cols=80 check_gold 1 gold-recursive-with-exclude2.txt --recursive -x 'excl*' tests/test-with-exclude/{a,b} --cols=80 check_gold 0 gold-exit-process-sub tests/input-1.txt <(cat tests/input-1.txt) --no-headers --cols=80 rm -f tests/permissions-{a,b} touch tests/permissions-{a,b} check_gold 0 gold-permissions-same.txt tests/permissions-{a,b} -P --cols=80 chmod 666 tests/permissions-a chmod 665 tests/permissions-b check_gold 1 gold-permissions-diff.txt tests/permissions-{a,b} -P --cols=80 echo "some text" >> tests/permissions-a check_gold 1 gold-permissions-diff-text.txt tests/permissions-{a,b} -P --cols=80 echo -e "\04" >> tests/permissions-b check_gold 1 gold-permissions-diff-binary.txt tests/permissions-{a,b} -P --cols=80 rm -f tests/permissions-{a,b} if git show 4e86205629 &> /dev/null; then # We're in the repo, so test git. check_git_diff gitdiff-only-newlines.txt 4e86205629~1 4e86205629 else echo "Not in icdiff repo; skipping git test" fi # Testing pipe behavior doesn't fit well with the check_gold system $INVOCATION tests/input-{4,5}.txt 2>/tmp/icdiff-pipe-error-output | head -n 1 if [ -s /tmp/icdiff-pipe-error-output ]; then echo 'emitting errors on early pipe closure' fail fi VERSION=$($INVOCATION --version | awk '{print $NF}') if [ "$VERSION" != $(head -n 1 ChangeLog) ]; then echo "Version mismatch between ChangeLog and icdiff source." fail fi function ensure_installed() { if ! command -v "$1" >/dev/null 2>&1; then echo "Could not find $1." echo 'Ensure it is installed and on your $PATH.' if [ -z "$VIRTUAL_ENV" ]; then echo 'It appears you have have forgotten to activate your virtualenv.' fi echo 'See README.md for details on setting up your environment.' fail fi } ensure_installed "black" echo 'Running black formatter...' if ! black icdiff --quiet --line-length 79 --check; then echo "" echo 'Consider running `black icdiff --line-length 79`' fail fi ensure_installed "flake8" echo 'Running flake8 linter...' if ! flake8 icdiff; then fail fi if ! $REGOLD; then echo PASS fi icdiff-release-2.0.7/tests/000077500000000000000000000000001447067572400155445ustar00rootroot00000000000000icdiff-release-2.0.7/tests/a/000077500000000000000000000000001447067572400157645ustar00rootroot00000000000000icdiff-release-2.0.7/tests/a/1000066400000000000000000000000001447067572400160350ustar00rootroot00000000000000icdiff-release-2.0.7/tests/a/c/000077500000000000000000000000001447067572400162065ustar00rootroot00000000000000icdiff-release-2.0.7/tests/a/c/e000066400000000000000000000000021447067572400163450ustar00rootroot000000000000001 icdiff-release-2.0.7/tests/a/c/f000066400000000000000000000000021447067572400163460ustar00rootroot000000000000002 icdiff-release-2.0.7/tests/a/j000066400000000000000000000000021447067572400161300ustar00rootroot000000000000007 icdiff-release-2.0.7/tests/b/000077500000000000000000000000001447067572400157655ustar00rootroot00000000000000icdiff-release-2.0.7/tests/b/1000066400000000000000000000543751447067572400160660ustar00rootroot00000000000000UTF-8 decoder capability and stress test ---------------------------------------- Markus Kuhn - 2015-08-28 - CC BY 4.0 This test file can help you examine, how your UTF-8 decoder handles various types of correct, malformed, or otherwise interesting UTF-8 sequences. This file is not meant to be a conformance test. It does not prescribe any particular outcome. Therefore, there is no way to "pass" or "fail" this test file, even though the text does suggest a preferable decoder behaviour at some places. Its aim is, instead, to help you think about, and test, the behaviour of your UTF-8 decoder on a systematic collection of unusual inputs. Experience so far suggests that most first-time authors of UTF-8 decoders find at least one serious problem in their decoder using this file. The test lines below cover boundary conditions, malformed UTF-8 sequences, as well as correctly encoded UTF-8 sequences of Unicode code points that should never occur in a correct UTF-8 file. According to ISO 10646-1:2000, sections D.7 and 2.3c, a device receiving UTF-8 shall interpret a "malformed sequence in the same way that it interprets a character that is outside the adopted subset" and "characters that are not within the adopted subset shall be indicated to the user" by a receiving device. One commonly used approach in UTF-8 decoders is to replace any malformed UTF-8 sequence by a replacement character (U+FFFD), which looks a bit like an inverted question mark, or a similar symbol. It might be a good idea to visually distinguish a malformed UTF-8 sequence from a correctly encoded Unicode character that is just not available in the current font but otherwise fully legal, even though ISO 10646-1 doesn't mandate this. In any case, just ignoring malformed sequences or unavailable characters does not conform to ISO 10646, will make debugging more difficult, and can lead to user confusion. Please check, whether a malformed UTF-8 sequence is (1) represented at all, (2) represented by exactly one single replacement character (or equivalent signal), and (3) the following quotation mark after an illegal UTF-8 sequence is correctly displayed, i.e. proper resynchronization takes place immediately after any malformed sequence. This file says "THE END" in the last line, so if you don't see that, your decoder crashed somehow before, which should always be cause for concern. All lines in this file are exactly 79 characters long (plus the line feed). In addition, all lines end with "|", except for the two test lines 2.1.1 and 2.2.1, which contain non-printable ASCII controls U+0000 and U+007F. If you display this file with a fixed-width font, these "|" characters should all line up in column 79 (right margin). This allows you to test quickly, whether your UTF-8 decoder finds the correct number of characters in every line, that is whether each malformed sequences is replaced by a single replacement character. Note that, as an alternative to the notion of malformed sequence used here, it is also a perfectly acceptable (and in some situations even preferable) solution to represent each individual byte of a malformed sequence with a replacement character. If you follow this strategy in your decoder, then please ignore the "|" column. Here come the tests: | | 1 Some correct UTF-8 text | | You should see the Greek word 'kosme': "κόσμε" | | 2 Boundary condition test cases | | 2.1 First possible sequence of a certain length | | 2.1.1 1 byte (U-00000000): "" 2.1.2 2 bytes (U-00000080): "€" | 2.1.3 3 bytes (U-00000800): "ࠀ" | 2.1.4 4 bytes (U-00010000): "𐀀" | 2.1.5 5 bytes (U-00200000): "" | 2.1.6 6 bytes (U-04000000): "" | | 2.2 Last possible sequence of a certain length | | 2.2.1 1 byte (U-0000007F): "" 2.2.2 2 bytes (U-000007FF): "߿" | 2.2.3 3 bytes (U-0000FFFF): "￿" | 2.2.4 4 bytes (U-001FFFFF): "" | 2.2.5 5 bytes (U-03FFFFFF): "" | 2.2.6 6 bytes (U-7FFFFFFF): "" | | 2.3 Other boundary conditions | | 2.3.1 U-0000D7FF = ed 9f bf = "퟿" | 2.3.2 U-0000E000 = ee 80 80 = "" | 2.3.3 U-0000FFFD = ef bf bd = "�" | 2.3.4 U-0010FFFF = f4 8f bf bf = "􏿿" | 2.3.5 U-00110000 = f4 90 80 80 = "" | | 3 Malformed sequences | | 3.1 Unexpected continuation bytes | | Each unexpected continuation byte should be separately signalled as a | malformed sequence of its own. | | 3.1.1 First continuation byte 0x80: "" | 3.1.2 Last continuation byte 0xbf: "" | | 3.1.3 2 continuation bytes: "" | 3.1.4 3 continuation bytes: "" | 3.1.5 4 continuation bytes: "" | 3.1.6 5 continuation bytes: "" | 3.1.7 6 continuation bytes: "" | 3.1.8 7 continuation bytes: "" | | 3.1.9 Sequence of all 64 possible continuation bytes (0x80-0xbf): | | " | | | " | | 3.2 Lonely start characters | | 3.2.1 All 32 first bytes of 2-byte sequences (0xc0-0xdf), | each followed by a space character: | | " | " | | 3.2.2 All 16 first bytes of 3-byte sequences (0xe0-0xef), | each followed by a space character: | | " " | | 3.2.3 All 8 first bytes of 4-byte sequences (0xf0-0xf7), | each followed by a space character: | | " " | | 3.2.4 All 4 first bytes of 5-byte sequences (0xf8-0xfb), | each followed by a space character: | | " " | | 3.2.5 All 2 first bytes of 6-byte sequences (0xfc-0xfd), | each followed by a space character: | | " " | | 3.3 Sequences with last continuation byte missing | | All bytes of an incomplete sequence should be signalled as a single | malformed sequence, i.e., you should see only a single replacement | character in each of the next 10 tests. (Characters as in section 2) | | 3.3.1 2-byte sequence with last byte missing (U+0000): "" | 3.3.2 3-byte sequence with last byte missing (U+0000): "" | 3.3.3 4-byte sequence with last byte missing (U+0000): "" | 3.3.4 5-byte sequence with last byte missing (U+0000): "" | 3.3.5 6-byte sequence with last byte missing (U+0000): "" | 3.3.6 2-byte sequence with last byte missing (U-000007FF): "" | 3.3.7 3-byte sequence with last byte missing (U-0000FFFF): "" | 3.3.8 4-byte sequence with last byte missing (U-001FFFFF): "" | 3.3.9 5-byte sequence with last byte missing (U-03FFFFFF): "" | 3.3.10 6-byte sequence with last byte missing (U-7FFFFFFF): "" | | 3.4 Concatenation of incomplete sequences | | All the 10 sequences of 3.3 concatenated, you should see 10 malformed | sequences being signalled: | | "" | | 3.5 Impossible bytes | | The following two bytes cannot appear in a correct UTF-8 string | | 3.5.1 fe = "" | 3.5.2 ff = "" | 3.5.3 fe fe ff ff = "" | | 4 Overlong sequences | | The following sequences are not malformed according to the letter of | the Unicode 2.0 standard. However, they are longer then necessary and | a correct UTF-8 encoder is not allowed to produce them. A "safe UTF-8 | decoder" should reject them just like malformed sequences for two | reasons: (1) It helps to debug applications if overlong sequences are | not treated as valid representations of characters, because this helps | to spot problems more quickly. (2) Overlong sequences provide | alternative representations of characters, that could maliciously be | used to bypass filters that check only for ASCII characters. For | instance, a 2-byte encoded line feed (LF) would not be caught by a | line counter that counts only 0x0a bytes, but it would still be | processed as a line feed by an unsafe UTF-8 decoder later in the | pipeline. From a security point of view, ASCII compatibility of UTF-8 | sequences means also, that ASCII characters are *only* allowed to be | represented by ASCII bytes in the range 0x00-0x7f. To ensure this | aspect of ASCII compatibility, use only "safe UTF-8 decoders" that | reject overlong UTF-8 sequences for which a shorter encoding exists. | | 4.1 Examples of an overlong ASCII character | | With a safe UTF-8 decoder, all of the following five overlong | representations of the ASCII character slash ("/") should be rejected | like a malformed UTF-8 sequence, for instance by substituting it with | a replacement character. If you see a slash below, you do not have a | safe UTF-8 decoder! | | 4.1.1 U+002F = c0 af = "" | 4.1.2 U+002F = e0 80 af = "" | 4.1.3 U+002F = f0 80 80 af = "" | 4.1.4 U+002F = f8 80 80 80 af = "" | 4.1.5 U+002F = fc 80 80 80 80 af = "" | | 4.2 Maximum overlong sequences | | Below you see the highest Unicode value that is still resulting in an | overlong sequence if represented with the given number of bytes. This | is a boundary test for safe UTF-8 decoders. All five characters should | be rejected like malformed UTF-8 sequences. | | 4.2.1 U-0000007F = c1 bf = "" | 4.2.2 U-000007FF = e0 9f bf = "" | 4.2.3 U-0000FFFF = f0 8f bf bf = "" | 4.2.4 U-001FFFFF = f8 87 bf bf bf = "" | 4.2.5 U-03FFFFFF = fc 83 bf bf bf bf = "" | | 4.3 Overlong representation of the NUL character | | The following five sequences should also be rejected like malformed | UTF-8 sequences and should not be treated like the ASCII NUL | character. | | 4.3.1 U+0000 = c0 80 = "" | 4.3.2 U+0000 = e0 80 80 = "" | 4.3.3 U+0000 = f0 80 80 80 = "" | 4.3.4 U+0000 = f8 80 80 80 80 = "" | 4.3.5 U+0000 = fc 80 80 80 80 80 = "" | | 5 Illegal code positions | | The following UTF-8 sequences should be rejected like malformed | sequences, because they never represent valid ISO 10646 characters and | a UTF-8 decoder that accepts them might introduce security problems | comparable to overlong UTF-8 sequences. | | 5.1 Single UTF-16 surrogates | | 5.1.1 U+D800 = ed a0 80 = "" | 5.1.2 U+DB7F = ed ad bf = "" | 5.1.3 U+DB80 = ed ae 80 = "" | 5.1.4 U+DBFF = ed af bf = "" | 5.1.5 U+DC00 = ed b0 80 = "" | 5.1.6 U+DF80 = ed be 80 = "" | 5.1.7 U+DFFF = ed bf bf = "" | | 5.2 Paired UTF-16 surrogates | | 5.2.1 U+D800 U+DC00 = ed a0 80 ed b0 80 = "" | 5.2.2 U+D800 U+DFFF = ed a0 80 ed bf bf = "" | 5.2.3 U+DB7F U+DC00 = ed ad bf ed b0 80 = "" | 5.2.4 U+DB7F U+DFFF = ed ad bf ed bf bf = "" | 5.2.5 U+DB80 U+DC00 = ed ae 80 ed b0 80 = "" | 5.2.6 U+DB80 U+DFFF = ed ae 80 ed bf bf = "" | 5.2.7 U+DBFF U+DC00 = ed af bf ed b0 80 = "" | 5.2.8 U+DBFF U+DFFF = ed af bf ed bf bf = "" | | 5.3 Noncharacter code positions | | The following "noncharacters" are "reserved for internal use" by | applications, and according to older versions of the Unicode Standard | "should never be interchanged". Unicode Corrigendum #9 dropped the | latter restriction. Nevertheless, their presence in incoming UTF-8 data | can remain a potential security risk, depending on what use is made of | these codes subsequently. Examples of such internal use: | | - Some file APIs with 16-bit characters may use the integer value -1 | = U+FFFF to signal an end-of-file (EOF) or error condition. | | - In some UTF-16 receivers, code point U+FFFE might trigger a | byte-swap operation (to convert between UTF-16LE and UTF-16BE). | | With such internal use of noncharacters, it may be desirable and safer | to block those code points in UTF-8 decoders, as they should never | occur legitimately in incoming UTF-8 data, and could trigger unsafe | behaviour in subsequent processing. | | Particularly problematic noncharacters in 16-bit applications: | | 5.3.1 U+FFFE = ef bf be = "￾" | 5.3.2 U+FFFF = ef bf bf = "￿" | | Other noncharacters: | | 5.3.3 U+FDD0 .. U+FDEF = "﷐﷑﷒﷓﷔﷕﷖﷗﷘﷙﷚﷛﷜﷝﷞﷟﷠﷡﷢﷣﷤﷥﷦﷧﷨﷩﷪﷫﷬﷭﷮﷯"| | 5.3.4 U+nFFFE U+nFFFF (for n = 1..10) | | "🿾🿿𯿾𯿿𿿾𿿿񏿾񏿿񟿾񟿿񯿾񯿿񿿾񿿿򏿾򏿿 | 򟿾򟿿򯿾򯿿򿿾򿿿󏿾󏿿󟿾󟿿󯿾󯿿󿿾󿿿􏿾􏿿" | | THE END | icdiff-release-2.0.7/tests/b/c/000077500000000000000000000000001447067572400162075ustar00rootroot00000000000000icdiff-release-2.0.7/tests/b/c/e000066400000000000000000000000021447067572400163460ustar00rootroot000000000000001 icdiff-release-2.0.7/tests/b/c/f000066400000000000000000000000021447067572400163470ustar00rootroot000000000000003 icdiff-release-2.0.7/tests/b/c/g000066400000000000000000000000021447067572400163500ustar00rootroot000000000000004 icdiff-release-2.0.7/tests/b/c/h000066400000000000000000000000021447067572400163510ustar00rootroot000000000000005 icdiff-release-2.0.7/tests/b/d/000077500000000000000000000000001447067572400162105ustar00rootroot00000000000000icdiff-release-2.0.7/tests/b/d/q000066400000000000000000000000021447067572400163630ustar00rootroot000000000000009 icdiff-release-2.0.7/tests/b/i000066400000000000000000000000021447067572400161300ustar00rootroot000000000000006 icdiff-release-2.0.7/tests/gitdiff-only-newlines.txt000066400000000000000000000035541447067572400225310ustar00rootroot00000000000000README README --show-all-spaces color all non-m --show-all-spaces color all non-m atching whitespace including atching whitespace including that which is n that which is n ot needed for drawing the eye to ot needed for drawing the eye to changes. Slow, changes. Slow, ugly, displays all changes ugly, displays all changes --print-headers label the left --print-headers label the left and right sides with their file and right sides with their file names names   License:    This file is derived from difflib.Ht mlDiff which is under the license:    http://www.python.org/download/rel eases/2.6.2/license/    I release my changes here under the  same license. This is GPL compatible.   icdiff-release-2.0.7/tests/gold-12-subcolors.txt000066400000000000000000000012641447067572400214660ustar00rootroot00000000000000tests/input-1.txt tests/input-2.txt 测试行abc测试测试行abc,测试测试行abc测 测试行adc测试测试行adc,测试测试行adc测 试测试行abc测试测试行abc测试测试行abc测 试测试行adc测试测试行adc测试测试行adc测 试测试行abc测试测试行abc测试测试行abc测 试测试行adc测试测试行adc测试测试行adc测 试 试 icdiff-release-2.0.7/tests/gold-12-t.txt000066400000000000000000000004161447067572400177140ustar00rootroot00000000000000tests/input-1.txt tests/input-2.txt 测试行abc测试测试行abc,测试测试行abc测 测试行adc测试测试行adc,测试测试行adc测 icdiff-release-2.0.7/tests/gold-12.txt000066400000000000000000000012641447067572400174550ustar00rootroot00000000000000tests/input-1.txt tests/input-2.txt 测试行abc测试测试行abc,测试测试行abc测 测试行adc测试测试行adc,测试测试行adc测 试测试行abc测试测试行abc测试测试行abc测 试测试行adc测试测试行adc测试测试行adc测 试测试行abc测试测试行abc测试测试行abc测 试测试行adc测试测试行adc测试测试行adc测 试 试 icdiff-release-2.0.7/tests/gold-3.txt000066400000000000000000000000001447067572400173600ustar00rootroot00000000000000icdiff-release-2.0.7/tests/gold-45-95.txt000066400000000000000000000017741447067572400177240ustar00rootroot00000000000000tests/input-4.txt tests/input-5.txt #input, #button { #input, #button { width: 350px; width: 400px; height: 40px; height: 40px;  font-size: 30px; margin: 0px; margin: 0; padding: 0px; padding: 0;  margin-bottom: 15px; text-align: center; text-align: center; } } icdiff-release-2.0.7/tests/gold-45-h-nb.txt000066400000000000000000000015601447067572400203040ustar00rootroot00000000000000tests/input-4.txt tests/input-5.txt #input, #button { #input, #button { width: 350px; width: 400px; height: 40px; height: 40px;  font-size: 30px; margin: 0px; margin: 0; padding: 0px; padding: 0;  margin-bottom: 15px; text-align: center; text-align: center; } } icdiff-release-2.0.7/tests/gold-45-h.txt000066400000000000000000000015741447067572400177140ustar00rootroot00000000000000tests/input-4.txt tests/input-5.txt #input, #button { #input, #button { width: 350px; width: 400px; height: 40px; height: 40px;  font-size: 30px; margin: 0px; margin: 0; padding: 0px; padding: 0;  margin-bottom: 15px; text-align: center; text-align: center; } } icdiff-release-2.0.7/tests/gold-45-h3.txt000066400000000000000000000005501447067572400177700ustar00rootroot00000000000000tests/input-4.txt tests/input-5.txt #input, #button { #input, #button { width: 350px; width: 400px; height: 40px; height: 40px; icdiff-release-2.0.7/tests/gold-45-l.txt000066400000000000000000000000671447067572400177140ustar00rootroot00000000000000error: to use arbitrary file labels, specify -L twice. icdiff-release-2.0.7/tests/gold-45-lbrb.txt000066400000000000000000000015601447067572400204010ustar00rootroot00000000000000L input-4.txt R input-5.txt #input, #button { #input, #button { width: 350px; width: 400px; height: 40px; height: 40px;  font-size: 30px; margin: 0px; margin: 0; padding: 0px; padding: 0;  margin-bottom: 15px; text-align: center; text-align: center; } } icdiff-release-2.0.7/tests/gold-45-ln-color.txt000066400000000000000000000020201447067572400211750ustar00rootroot00000000000000tests/input-4.txt tests/input-5.txt 1 #input, #button { 1 #input, #button { 2 width: 350px; 2 width: 400px; 3 height: 40px; 3 height: 40px; 4  font-size: 30px; 4 margin: 0px; 5 margin: 0; 5 padding: 0px; 6 padding: 0; 6  margin-bottom: 15px; 7 text-align: center; 7 text-align: center; 8 } 8 } icdiff-release-2.0.7/tests/gold-45-ln.txt000066400000000000000000000020201447067572400200610ustar00rootroot00000000000000tests/input-4.txt tests/input-5.txt 1 #input, #button { 1 #input, #button { 2 width: 350px; 2 width: 400px; 3 height: 40px; 3 height: 40px; 4  font-size: 30px; 4 margin: 0px; 5 margin: 0; 5 padding: 0px; 6 padding: 0; 6  margin-bottom: 15px; 7 text-align: center; 7 text-align: center; 8 } 8 } icdiff-release-2.0.7/tests/gold-45-lr.txt000066400000000000000000000015601447067572400200750ustar00rootroot00000000000000left right #input, #button { #input, #button { width: 350px; width: 400px; height: 40px; height: 40px;  font-size: 30px; margin: 0px; margin: 0; padding: 0px; padding: 0;  margin-bottom: 15px; text-align: center; text-align: center; } } icdiff-release-2.0.7/tests/gold-45-nb.txt000066400000000000000000000015601447067572400200570ustar00rootroot00000000000000tests/input-4.txt tests/input-5.txt #input, #button { #input, #button { width: 350px; width: 400px; height: 40px; height: 40px;  font-size: 30px; margin: 0px; margin: 0; padding: 0px; padding: 0;  margin-bottom: 15px; text-align: center; text-align: center; } } icdiff-release-2.0.7/tests/gold-45-nh.txt000066400000000000000000000014141447067572400200630ustar00rootroot00000000000000#input, #button { #input, #button { width: 350px; width: 400px; height: 40px; height: 40px;  font-size: 30px; margin: 0px; margin: 0; padding: 0px; padding: 0;  margin-bottom: 15px; text-align: center; text-align: center; } } icdiff-release-2.0.7/tests/gold-45-pipe.txt000066400000000000000000000014141447067572400204130ustar00rootroot00000000000000#input, #button { #input, #button { width: 350px; width: 400px; height: 40px; height: 40px;  font-size: 30px; margin: 0px; margin: 0; padding: 0px; padding: 0;  margin-bottom: 15px; text-align: center; text-align: center; } } icdiff-release-2.0.7/tests/gold-45-sas-h-nb.txt000066400000000000000000000015601447067572400210700ustar00rootroot00000000000000tests/input-4.txt tests/input-5.txt #input, #button { #input, #button { width: 350px; width: 400px; height: 40px; height: 40px;  font-size: 30px; margin: 0px; margin: 0; padding: 0px; padding: 0;  margin-bottom: 15px; text-align: center; text-align: center; } } icdiff-release-2.0.7/tests/gold-45-sas-h.txt000066400000000000000000000015741447067572400205000ustar00rootroot00000000000000tests/input-4.txt tests/input-5.txt #input, #button { #input, #button { width: 350px; width: 400px; height: 40px; height: 40px;  font-size: 30px; margin: 0px; margin: 0; padding: 0px; padding: 0;  margin-bottom: 15px; text-align: center; text-align: center; } } icdiff-release-2.0.7/tests/gold-45-sas.txt000066400000000000000000000017641447067572400202540ustar00rootroot00000000000000tests/input-4.txt tests/input-5.txt #input, #button { #input, #button { width: 350px; width: 400px; height: 40px; height: 40px;   font-size: 30px; margin: 0px; margin: 0; padding: 0px; padding: 0;   margin-bottom: 15px; text-align: center; text-align: center; } } icdiff-release-2.0.7/tests/gold-45.txt000066400000000000000000000015601447067572400174620ustar00rootroot00000000000000tests/input-4.txt tests/input-5.txt #input, #button { #input, #button { width: 350px; width: 400px; height: 40px; height: 40px;  font-size: 30px; margin: 0px; margin: 0; padding: 0px; padding: 0;  margin-bottom: 15px; text-align: center; text-align: center; } } icdiff-release-2.0.7/tests/gold-4dn.txt000066400000000000000000000014641447067572400177220ustar00rootroot00000000000000left right #input, #button {  width: 350px;  height: 40px;  margin: 0px;  padding: 0px;  margin-bottom: 15px;  text-align: center; } icdiff-release-2.0.7/tests/gold-67-ln.txt000066400000000000000000000024241447067572400200750ustar00rootroot00000000000000tests/input-6.txt tests/input-7.txt 8 h 8 h 9 i 9 i 10 j 10 j 11 k 11 k 12 l 12 l 13 m 14 m 13 m 15 m 14 n 16 n 15 o 17 o 16 p 18 p 17 q 19 q icdiff-release-2.0.7/tests/gold-67-u3.txt000066400000000000000000000013701447067572400200120ustar00rootroot00000000000000tests/input-6.txt tests/input-7.txt j j k k l l m m m m n n o o icdiff-release-2.0.7/tests/gold-67-wf.txt000066400000000000000000000044701447067572400201030ustar00rootroot00000000000000tests/input-6.txt tests/input-7.txt a a b b c c d d e e f f g g h h i i j j k k l l m m m m n n o o p p q q r r s s t t u u z z w w x x y y z z icdiff-release-2.0.7/tests/gold-67.txt000066400000000000000000000020701447067572400174630ustar00rootroot00000000000000tests/input-6.txt tests/input-7.txt h h i i j j k k l l m m m m n n o o p p q q icdiff-release-2.0.7/tests/gold-bad-encoding.txt000066400000000000000000000000661447067572400215440ustar00rootroot00000000000000error: encoding 'nonexistend_encoding' was not found. icdiff-release-2.0.7/tests/gold-dir.txt000066400000000000000000000001271447067572400200060ustar00rootroot00000000000000Only in tests/b: d Only in tests/b: i Only in tests/a: j icdiff-release-2.0.7/tests/gold-dn5.txt000066400000000000000000000014641447067572400177230ustar00rootroot00000000000000left right #input, #button {  width: 400px;  height: 40px;  font-size: 30px;  margin: 0;  padding: 0;  text-align: center; } icdiff-release-2.0.7/tests/gold-exclude.txt000066400000000000000000000011161447067572400206600ustar00rootroot00000000000000tests/input-4-cr.txt tests/input-4-partial-cr.txt width: 350px; width: 350px; height: 40px; height: 40px; margin: 0px; margin: 0px; margin-bottom: 15px; margin-bottom: 15px; text-align: center;\r text-align: center; } } icdiff-release-2.0.7/tests/gold-exit-process-sub000066400000000000000000000000001447067572400216140ustar00rootroot00000000000000icdiff-release-2.0.7/tests/gold-file-not-found.txt000066400000000000000000000000671447067572400220610ustar00rootroot00000000000000error: file 'nonexistent_file' was not found icdiff-release-2.0.7/tests/gold-hide-cr-if-dos000066400000000000000000000015601447067572400211060ustar00rootroot00000000000000tests/input-4-cr.txt tests/input-5-cr.txt #input, #button { #input, #button { width: 350px; width: 400px; height: 40px; height: 40px;  font-size: 30px; margin: 0px; margin: 0; padding: 0px; padding: 0;  margin-bottom: 15px; text-align: center; text-align: center; } } icdiff-release-2.0.7/tests/gold-identical-on.txt000066400000000000000000000000751447067572400216000ustar00rootroot00000000000000Files tests/input-1.txt and tests/input-1.txt are identical. icdiff-release-2.0.7/tests/gold-no-cr-indent000066400000000000000000000012361447067572400207110ustar00rootroot00000000000000tests/input-4-cr.txt tests/input-4-partial-cr.txt width: 350px; width: 350px; height: 40px; height: 40px; margin: 0px; margin: 0px; padding: 0px; padding: 0px; margin-bottom: 15px; margin-bottom: 15px; text-align: center;\r text-align: center; } } icdiff-release-2.0.7/tests/gold-permissions-diff-binary.txt000066400000000000000000000004541447067572400237760ustar00rootroot00000000000000tests/permissions-a tests/permissions-b -rw-rw-rw- (100666) -rw-rw-r-x (100665) some text  icdiff-release-2.0.7/tests/gold-permissions-diff-text.txt000066400000000000000000000004421447067572400234730ustar00rootroot00000000000000tests/permissions-a tests/permissions-b -rw-rw-rw- (100666) -rw-rw-r-x (100665) some text icdiff-release-2.0.7/tests/gold-permissions-diff.txt000066400000000000000000000003101447067572400225030ustar00rootroot00000000000000tests/permissions-a tests/permissions-b -rw-rw-rw- (100666) -rw-rw-r-x (100665) icdiff-release-2.0.7/tests/gold-permissions-same.txt000066400000000000000000000000001447067572400225140ustar00rootroot00000000000000icdiff-release-2.0.7/tests/gold-recursive-with-exclude.txt000066400000000000000000000002631447067572400236400ustar00rootroot00000000000000error: file 'tests/b/1' not valid with encoding 'utf-8': at 4461-4462. Only in tests/b: d Only in tests/b: i Only in tests/a: j icdiff-release-2.0.7/tests/gold-recursive-with-exclude2.txt000066400000000000000000000010451447067572400237210ustar00rootroot00000000000000error: file 'tests/test-with-exclude/b/1' not valid with encoding 'utf-8': at 4461-4462. tests/test-with-exclude/a/c/f tests/test-with-exclude/b/c/f 2 3 Only in tests/test-with-exclude/b/c: g Only in tests/test-with-exclude/b/c: h Only in tests/test-with-exclude/b: d Only in tests/test-with-exclude/b: i Only in tests/test-with-exclude/a: j icdiff-release-2.0.7/tests/gold-recursive.txt000066400000000000000000000006711447067572400212430ustar00rootroot00000000000000error: file 'tests/b/1' not valid with encoding 'utf-8': at 4461-4462. tests/a/c/f tests/b/c/f 2 3 Only in tests/b/c: g Only in tests/b/c: h Only in tests/b: d Only in tests/b: i Only in tests/a: j icdiff-release-2.0.7/tests/gold-sas.txt000066400000000000000000000020161447067572400200150ustar00rootroot00000000000000tests/input-10.txt tests/input-11.txt void main () void main () { { int x; int x; int y; int y; } }   void foo () {   } icdiff-release-2.0.7/tests/gold-show-spaces.txt000066400000000000000000000016661447067572400214750ustar00rootroot00000000000000tests/input-10.txt tests/input-11.txt void main () void main () { { int x; int x; int y; int y; } }   void foo () {   } icdiff-release-2.0.7/tests/gold-sns.txt000066400000000000000000000016661447067572400200440ustar00rootroot00000000000000tests/input-10.txt tests/input-11.txt void main () void main () { { int x; int x; int y; int y; } }   void foo () {   } icdiff-release-2.0.7/tests/gold-strip-cr-off.txt000066400000000000000000000014761447067572400215530ustar00rootroot00000000000000tests/input-4.txt tests/input-4-cr.txt #input, #button { #input, #button {\r width: 350px; width: 350px;\r height: 40px; height: 40px;\r margin: 0px; margin: 0px;\r padding: 0px; padding: 0px;\r margin-bottom: 15px; margin-bottom: 15px;\r text-align: center; text-align: center;\r } }\r icdiff-release-2.0.7/tests/gold-strip-cr-on.txt000066400000000000000000000001441447067572400214040ustar00rootroot00000000000000tests/input-4.txt tests/input-4-cr.txt icdiff-release-2.0.7/tests/gold-subcolors-bad-cat000066400000000000000000000002701447067572400217150ustar00rootroot00000000000000Invalid category 'chnge' in '--color-map="chnge:magenta,description:cyan_bold"'. Valid categories are: add, change, description, line-numbers, meta, permissions, separator, subtract. icdiff-release-2.0.7/tests/gold-subcolors-bad-color000066400000000000000000000006441447067572400222710ustar00rootroot00000000000000Invalid color 'mageta' in '--color-map="change:mageta,description:cyan_bold"'. Valid colors are: black, black_bold, blue, blue_bold, cyan, cyan_bold, green, green_bold, magenta, magenta_bold, none, red, red_bold, white, white_bold, yellow, yellow_bold. icdiff-release-2.0.7/tests/gold-subcolors-bad-fmt000066400000000000000000000006601447067572400217370ustar00rootroot00000000000000Invalid color 'magenta:gold' in '--color-map="change:magenta:gold,description:cyan_bold"'. Valid colors are: black, black_bold, blue, blue_bold, cyan, cyan_bold, green, green_bold, magenta, magenta_bold, none, red, red_bold, white, white_bold, yellow, yellow_bold. icdiff-release-2.0.7/tests/gold-tabs-4.txt000066400000000000000000000003341447067572400203220ustar00rootroot00000000000000tests/input-8.txt tests/input-9.txt abc def aQc dQf icdiff-release-2.0.7/tests/gold-tabs-default.txt000066400000000000000000000003341447067572400216030ustar00rootroot00000000000000tests/input-8.txt tests/input-9.txt abc def aQc dQf icdiff-release-2.0.7/tests/input-1.txt000066400000000000000000000002461447067572400176040ustar00rootroot00000000000000测试行abc测试测试行abc,测试测试行abc测试测试行abc测试测试行abc测试测试行abc测试测试行abc测试测试行abc测试测试行abc测试 icdiff-release-2.0.7/tests/input-10.txt000066400000000000000000000000431447067572400176570ustar00rootroot00000000000000void main () { int x; int y; } icdiff-release-2.0.7/tests/input-11.txt000066400000000000000000000000651447067572400176640ustar00rootroot00000000000000void main () { int x; int y; } void foo () { } icdiff-release-2.0.7/tests/input-2.txt000066400000000000000000000002461447067572400176050ustar00rootroot00000000000000测试行adc测试测试行adc,测试测试行adc测试测试行adc测试测试行adc测试测试行adc测试测试行adc测试测试行adc测试测试行adc测试 icdiff-release-2.0.7/tests/input-3.txt000066400000000000000000000543751447067572400176220ustar00rootroot00000000000000UTF-8 decoder capability and stress test ---------------------------------------- Markus Kuhn - 2015-08-28 - CC BY 4.0 This test file can help you examine, how your UTF-8 decoder handles various types of correct, malformed, or otherwise interesting UTF-8 sequences. This file is not meant to be a conformance test. It does not prescribe any particular outcome. Therefore, there is no way to "pass" or "fail" this test file, even though the text does suggest a preferable decoder behaviour at some places. Its aim is, instead, to help you think about, and test, the behaviour of your UTF-8 decoder on a systematic collection of unusual inputs. Experience so far suggests that most first-time authors of UTF-8 decoders find at least one serious problem in their decoder using this file. The test lines below cover boundary conditions, malformed UTF-8 sequences, as well as correctly encoded UTF-8 sequences of Unicode code points that should never occur in a correct UTF-8 file. According to ISO 10646-1:2000, sections D.7 and 2.3c, a device receiving UTF-8 shall interpret a "malformed sequence in the same way that it interprets a character that is outside the adopted subset" and "characters that are not within the adopted subset shall be indicated to the user" by a receiving device. One commonly used approach in UTF-8 decoders is to replace any malformed UTF-8 sequence by a replacement character (U+FFFD), which looks a bit like an inverted question mark, or a similar symbol. It might be a good idea to visually distinguish a malformed UTF-8 sequence from a correctly encoded Unicode character that is just not available in the current font but otherwise fully legal, even though ISO 10646-1 doesn't mandate this. In any case, just ignoring malformed sequences or unavailable characters does not conform to ISO 10646, will make debugging more difficult, and can lead to user confusion. Please check, whether a malformed UTF-8 sequence is (1) represented at all, (2) represented by exactly one single replacement character (or equivalent signal), and (3) the following quotation mark after an illegal UTF-8 sequence is correctly displayed, i.e. proper resynchronization takes place immediately after any malformed sequence. This file says "THE END" in the last line, so if you don't see that, your decoder crashed somehow before, which should always be cause for concern. All lines in this file are exactly 79 characters long (plus the line feed). In addition, all lines end with "|", except for the two test lines 2.1.1 and 2.2.1, which contain non-printable ASCII controls U+0000 and U+007F. If you display this file with a fixed-width font, these "|" characters should all line up in column 79 (right margin). This allows you to test quickly, whether your UTF-8 decoder finds the correct number of characters in every line, that is whether each malformed sequences is replaced by a single replacement character. Note that, as an alternative to the notion of malformed sequence used here, it is also a perfectly acceptable (and in some situations even preferable) solution to represent each individual byte of a malformed sequence with a replacement character. If you follow this strategy in your decoder, then please ignore the "|" column. Here come the tests: | | 1 Some correct UTF-8 text | | You should see the Greek word 'kosme': "κόσμε" | | 2 Boundary condition test cases | | 2.1 First possible sequence of a certain length | | 2.1.1 1 byte (U-00000000): "" 2.1.2 2 bytes (U-00000080): "€" | 2.1.3 3 bytes (U-00000800): "ࠀ" | 2.1.4 4 bytes (U-00010000): "𐀀" | 2.1.5 5 bytes (U-00200000): "" | 2.1.6 6 bytes (U-04000000): "" | | 2.2 Last possible sequence of a certain length | | 2.2.1 1 byte (U-0000007F): "" 2.2.2 2 bytes (U-000007FF): "߿" | 2.2.3 3 bytes (U-0000FFFF): "￿" | 2.2.4 4 bytes (U-001FFFFF): "" | 2.2.5 5 bytes (U-03FFFFFF): "" | 2.2.6 6 bytes (U-7FFFFFFF): "" | | 2.3 Other boundary conditions | | 2.3.1 U-0000D7FF = ed 9f bf = "퟿" | 2.3.2 U-0000E000 = ee 80 80 = "" | 2.3.3 U-0000FFFD = ef bf bd = "�" | 2.3.4 U-0010FFFF = f4 8f bf bf = "􏿿" | 2.3.5 U-00110000 = f4 90 80 80 = "" | | 3 Malformed sequences | | 3.1 Unexpected continuation bytes | | Each unexpected continuation byte should be separately signalled as a | malformed sequence of its own. | | 3.1.1 First continuation byte 0x80: "" | 3.1.2 Last continuation byte 0xbf: "" | | 3.1.3 2 continuation bytes: "" | 3.1.4 3 continuation bytes: "" | 3.1.5 4 continuation bytes: "" | 3.1.6 5 continuation bytes: "" | 3.1.7 6 continuation bytes: "" | 3.1.8 7 continuation bytes: "" | | 3.1.9 Sequence of all 64 possible continuation bytes (0x80-0xbf): | | " | | | " | | 3.2 Lonely start characters | | 3.2.1 All 32 first bytes of 2-byte sequences (0xc0-0xdf), | each followed by a space character: | | " | " | | 3.2.2 All 16 first bytes of 3-byte sequences (0xe0-0xef), | each followed by a space character: | | " " | | 3.2.3 All 8 first bytes of 4-byte sequences (0xf0-0xf7), | each followed by a space character: | | " " | | 3.2.4 All 4 first bytes of 5-byte sequences (0xf8-0xfb), | each followed by a space character: | | " " | | 3.2.5 All 2 first bytes of 6-byte sequences (0xfc-0xfd), | each followed by a space character: | | " " | | 3.3 Sequences with last continuation byte missing | | All bytes of an incomplete sequence should be signalled as a single | malformed sequence, i.e., you should see only a single replacement | character in each of the next 10 tests. (Characters as in section 2) | | 3.3.1 2-byte sequence with last byte missing (U+0000): "" | 3.3.2 3-byte sequence with last byte missing (U+0000): "" | 3.3.3 4-byte sequence with last byte missing (U+0000): "" | 3.3.4 5-byte sequence with last byte missing (U+0000): "" | 3.3.5 6-byte sequence with last byte missing (U+0000): "" | 3.3.6 2-byte sequence with last byte missing (U-000007FF): "" | 3.3.7 3-byte sequence with last byte missing (U-0000FFFF): "" | 3.3.8 4-byte sequence with last byte missing (U-001FFFFF): "" | 3.3.9 5-byte sequence with last byte missing (U-03FFFFFF): "" | 3.3.10 6-byte sequence with last byte missing (U-7FFFFFFF): "" | | 3.4 Concatenation of incomplete sequences | | All the 10 sequences of 3.3 concatenated, you should see 10 malformed | sequences being signalled: | | "" | | 3.5 Impossible bytes | | The following two bytes cannot appear in a correct UTF-8 string | | 3.5.1 fe = "" | 3.5.2 ff = "" | 3.5.3 fe fe ff ff = "" | | 4 Overlong sequences | | The following sequences are not malformed according to the letter of | the Unicode 2.0 standard. However, they are longer then necessary and | a correct UTF-8 encoder is not allowed to produce them. A "safe UTF-8 | decoder" should reject them just like malformed sequences for two | reasons: (1) It helps to debug applications if overlong sequences are | not treated as valid representations of characters, because this helps | to spot problems more quickly. (2) Overlong sequences provide | alternative representations of characters, that could maliciously be | used to bypass filters that check only for ASCII characters. For | instance, a 2-byte encoded line feed (LF) would not be caught by a | line counter that counts only 0x0a bytes, but it would still be | processed as a line feed by an unsafe UTF-8 decoder later in the | pipeline. From a security point of view, ASCII compatibility of UTF-8 | sequences means also, that ASCII characters are *only* allowed to be | represented by ASCII bytes in the range 0x00-0x7f. To ensure this | aspect of ASCII compatibility, use only "safe UTF-8 decoders" that | reject overlong UTF-8 sequences for which a shorter encoding exists. | | 4.1 Examples of an overlong ASCII character | | With a safe UTF-8 decoder, all of the following five overlong | representations of the ASCII character slash ("/") should be rejected | like a malformed UTF-8 sequence, for instance by substituting it with | a replacement character. If you see a slash below, you do not have a | safe UTF-8 decoder! | | 4.1.1 U+002F = c0 af = "" | 4.1.2 U+002F = e0 80 af = "" | 4.1.3 U+002F = f0 80 80 af = "" | 4.1.4 U+002F = f8 80 80 80 af = "" | 4.1.5 U+002F = fc 80 80 80 80 af = "" | | 4.2 Maximum overlong sequences | | Below you see the highest Unicode value that is still resulting in an | overlong sequence if represented with the given number of bytes. This | is a boundary test for safe UTF-8 decoders. All five characters should | be rejected like malformed UTF-8 sequences. | | 4.2.1 U-0000007F = c1 bf = "" | 4.2.2 U-000007FF = e0 9f bf = "" | 4.2.3 U-0000FFFF = f0 8f bf bf = "" | 4.2.4 U-001FFFFF = f8 87 bf bf bf = "" | 4.2.5 U-03FFFFFF = fc 83 bf bf bf bf = "" | | 4.3 Overlong representation of the NUL character | | The following five sequences should also be rejected like malformed | UTF-8 sequences and should not be treated like the ASCII NUL | character. | | 4.3.1 U+0000 = c0 80 = "" | 4.3.2 U+0000 = e0 80 80 = "" | 4.3.3 U+0000 = f0 80 80 80 = "" | 4.3.4 U+0000 = f8 80 80 80 80 = "" | 4.3.5 U+0000 = fc 80 80 80 80 80 = "" | | 5 Illegal code positions | | The following UTF-8 sequences should be rejected like malformed | sequences, because they never represent valid ISO 10646 characters and | a UTF-8 decoder that accepts them might introduce security problems | comparable to overlong UTF-8 sequences. | | 5.1 Single UTF-16 surrogates | | 5.1.1 U+D800 = ed a0 80 = "" | 5.1.2 U+DB7F = ed ad bf = "" | 5.1.3 U+DB80 = ed ae 80 = "" | 5.1.4 U+DBFF = ed af bf = "" | 5.1.5 U+DC00 = ed b0 80 = "" | 5.1.6 U+DF80 = ed be 80 = "" | 5.1.7 U+DFFF = ed bf bf = "" | | 5.2 Paired UTF-16 surrogates | | 5.2.1 U+D800 U+DC00 = ed a0 80 ed b0 80 = "" | 5.2.2 U+D800 U+DFFF = ed a0 80 ed bf bf = "" | 5.2.3 U+DB7F U+DC00 = ed ad bf ed b0 80 = "" | 5.2.4 U+DB7F U+DFFF = ed ad bf ed bf bf = "" | 5.2.5 U+DB80 U+DC00 = ed ae 80 ed b0 80 = "" | 5.2.6 U+DB80 U+DFFF = ed ae 80 ed bf bf = "" | 5.2.7 U+DBFF U+DC00 = ed af bf ed b0 80 = "" | 5.2.8 U+DBFF U+DFFF = ed af bf ed bf bf = "" | | 5.3 Noncharacter code positions | | The following "noncharacters" are "reserved for internal use" by | applications, and according to older versions of the Unicode Standard | "should never be interchanged". Unicode Corrigendum #9 dropped the | latter restriction. Nevertheless, their presence in incoming UTF-8 data | can remain a potential security risk, depending on what use is made of | these codes subsequently. Examples of such internal use: | | - Some file APIs with 16-bit characters may use the integer value -1 | = U+FFFF to signal an end-of-file (EOF) or error condition. | | - In some UTF-16 receivers, code point U+FFFE might trigger a | byte-swap operation (to convert between UTF-16LE and UTF-16BE). | | With such internal use of noncharacters, it may be desirable and safer | to block those code points in UTF-8 decoders, as they should never | occur legitimately in incoming UTF-8 data, and could trigger unsafe | behaviour in subsequent processing. | | Particularly problematic noncharacters in 16-bit applications: | | 5.3.1 U+FFFE = ef bf be = "￾" | 5.3.2 U+FFFF = ef bf bf = "￿" | | Other noncharacters: | | 5.3.3 U+FDD0 .. U+FDEF = "﷐﷑﷒﷓﷔﷕﷖﷗﷘﷙﷚﷛﷜﷝﷞﷟﷠﷡﷢﷣﷤﷥﷦﷧﷨﷩﷪﷫﷬﷭﷮﷯"| | 5.3.4 U+nFFFE U+nFFFF (for n = 1..10) | | "🿾🿿𯿾𯿿𿿾𿿿񏿾񏿿񟿾񟿿񯿾񯿿񿿾񿿿򏿾򏿿 | 򟿾򟿿򯿾򯿿򿿾򿿿󏿾󏿿󟿾󟿿󯿾󯿿󿿾󿿿􏿾􏿿" | | THE END | icdiff-release-2.0.7/tests/input-4-cr.txt000066400000000000000000000002101447067572400202000ustar00rootroot00000000000000#input, #button { width: 350px; height: 40px; margin: 0px; padding: 0px; margin-bottom: 15px; text-align: center; } icdiff-release-2.0.7/tests/input-4-partial-cr.txt000066400000000000000000000002071447067572400216400ustar00rootroot00000000000000#input, #button { width: 350px; height: 40px; margin: 0px; padding: 0px; margin-bottom: 15px; text-align: center; } icdiff-release-2.0.7/tests/input-4.txt000066400000000000000000000002001447067572400175750ustar00rootroot00000000000000#input, #button { width: 350px; height: 40px; margin: 0px; padding: 0px; margin-bottom: 15px; text-align: center; } icdiff-release-2.0.7/tests/input-5-cr.txt000066400000000000000000000002001447067572400202000ustar00rootroot00000000000000#input, #button { width: 400px; height: 40px; font-size: 30px; margin: 0; padding: 0; text-align: center; } icdiff-release-2.0.7/tests/input-5.txt000066400000000000000000000001701447067572400176040ustar00rootroot00000000000000#input, #button { width: 400px; height: 40px; font-size: 30px; margin: 0; padding: 0; text-align: center; } icdiff-release-2.0.7/tests/input-6.txt000066400000000000000000000000641447067572400176070ustar00rootroot00000000000000a b c d e f g h i j k l m n o p q r s t u z w x y z icdiff-release-2.0.7/tests/input-7.txt000066400000000000000000000000701447067572400176050ustar00rootroot00000000000000a b c d e f g h i j k l m m m n o p q r s t u z w x y z icdiff-release-2.0.7/tests/input-8.txt000066400000000000000000000000111447067572400176010ustar00rootroot00000000000000 abc def icdiff-release-2.0.7/tests/input-9.txt000066400000000000000000000000111447067572400176020ustar00rootroot00000000000000 aQc dQf icdiff-release-2.0.7/tests/test-with-exclude/000077500000000000000000000000001447067572400211235ustar00rootroot00000000000000icdiff-release-2.0.7/tests/test-with-exclude/a/000077500000000000000000000000001447067572400213435ustar00rootroot00000000000000icdiff-release-2.0.7/tests/test-with-exclude/a/1000066400000000000000000000000001447067572400214140ustar00rootroot00000000000000icdiff-release-2.0.7/tests/test-with-exclude/a/c/000077500000000000000000000000001447067572400215655ustar00rootroot00000000000000icdiff-release-2.0.7/tests/test-with-exclude/a/c/e000066400000000000000000000000021447067572400217240ustar00rootroot000000000000001 icdiff-release-2.0.7/tests/test-with-exclude/a/c/f000066400000000000000000000000021447067572400217250ustar00rootroot000000000000002 icdiff-release-2.0.7/tests/test-with-exclude/a/exclude/000077500000000000000000000000001447067572400227745ustar00rootroot00000000000000icdiff-release-2.0.7/tests/test-with-exclude/a/exclude/text.txt000066400000000000000000000000131447067572400245130ustar00rootroot00000000000000excluded a icdiff-release-2.0.7/tests/test-with-exclude/a/j000066400000000000000000000000021447067572400215070ustar00rootroot000000000000007 icdiff-release-2.0.7/tests/test-with-exclude/b/000077500000000000000000000000001447067572400213445ustar00rootroot00000000000000icdiff-release-2.0.7/tests/test-with-exclude/b/1000066400000000000000000000543751447067572400214450ustar00rootroot00000000000000UTF-8 decoder capability and stress test ---------------------------------------- Markus Kuhn - 2015-08-28 - CC BY 4.0 This test file can help you examine, how your UTF-8 decoder handles various types of correct, malformed, or otherwise interesting UTF-8 sequences. This file is not meant to be a conformance test. It does not prescribe any particular outcome. Therefore, there is no way to "pass" or "fail" this test file, even though the text does suggest a preferable decoder behaviour at some places. Its aim is, instead, to help you think about, and test, the behaviour of your UTF-8 decoder on a systematic collection of unusual inputs. Experience so far suggests that most first-time authors of UTF-8 decoders find at least one serious problem in their decoder using this file. The test lines below cover boundary conditions, malformed UTF-8 sequences, as well as correctly encoded UTF-8 sequences of Unicode code points that should never occur in a correct UTF-8 file. According to ISO 10646-1:2000, sections D.7 and 2.3c, a device receiving UTF-8 shall interpret a "malformed sequence in the same way that it interprets a character that is outside the adopted subset" and "characters that are not within the adopted subset shall be indicated to the user" by a receiving device. One commonly used approach in UTF-8 decoders is to replace any malformed UTF-8 sequence by a replacement character (U+FFFD), which looks a bit like an inverted question mark, or a similar symbol. It might be a good idea to visually distinguish a malformed UTF-8 sequence from a correctly encoded Unicode character that is just not available in the current font but otherwise fully legal, even though ISO 10646-1 doesn't mandate this. In any case, just ignoring malformed sequences or unavailable characters does not conform to ISO 10646, will make debugging more difficult, and can lead to user confusion. Please check, whether a malformed UTF-8 sequence is (1) represented at all, (2) represented by exactly one single replacement character (or equivalent signal), and (3) the following quotation mark after an illegal UTF-8 sequence is correctly displayed, i.e. proper resynchronization takes place immediately after any malformed sequence. This file says "THE END" in the last line, so if you don't see that, your decoder crashed somehow before, which should always be cause for concern. All lines in this file are exactly 79 characters long (plus the line feed). In addition, all lines end with "|", except for the two test lines 2.1.1 and 2.2.1, which contain non-printable ASCII controls U+0000 and U+007F. If you display this file with a fixed-width font, these "|" characters should all line up in column 79 (right margin). This allows you to test quickly, whether your UTF-8 decoder finds the correct number of characters in every line, that is whether each malformed sequences is replaced by a single replacement character. Note that, as an alternative to the notion of malformed sequence used here, it is also a perfectly acceptable (and in some situations even preferable) solution to represent each individual byte of a malformed sequence with a replacement character. If you follow this strategy in your decoder, then please ignore the "|" column. Here come the tests: | | 1 Some correct UTF-8 text | | You should see the Greek word 'kosme': "κόσμε" | | 2 Boundary condition test cases | | 2.1 First possible sequence of a certain length | | 2.1.1 1 byte (U-00000000): "" 2.1.2 2 bytes (U-00000080): "€" | 2.1.3 3 bytes (U-00000800): "ࠀ" | 2.1.4 4 bytes (U-00010000): "𐀀" | 2.1.5 5 bytes (U-00200000): "" | 2.1.6 6 bytes (U-04000000): "" | | 2.2 Last possible sequence of a certain length | | 2.2.1 1 byte (U-0000007F): "" 2.2.2 2 bytes (U-000007FF): "߿" | 2.2.3 3 bytes (U-0000FFFF): "￿" | 2.2.4 4 bytes (U-001FFFFF): "" | 2.2.5 5 bytes (U-03FFFFFF): "" | 2.2.6 6 bytes (U-7FFFFFFF): "" | | 2.3 Other boundary conditions | | 2.3.1 U-0000D7FF = ed 9f bf = "퟿" | 2.3.2 U-0000E000 = ee 80 80 = "" | 2.3.3 U-0000FFFD = ef bf bd = "�" | 2.3.4 U-0010FFFF = f4 8f bf bf = "􏿿" | 2.3.5 U-00110000 = f4 90 80 80 = "" | | 3 Malformed sequences | | 3.1 Unexpected continuation bytes | | Each unexpected continuation byte should be separately signalled as a | malformed sequence of its own. | | 3.1.1 First continuation byte 0x80: "" | 3.1.2 Last continuation byte 0xbf: "" | | 3.1.3 2 continuation bytes: "" | 3.1.4 3 continuation bytes: "" | 3.1.5 4 continuation bytes: "" | 3.1.6 5 continuation bytes: "" | 3.1.7 6 continuation bytes: "" | 3.1.8 7 continuation bytes: "" | | 3.1.9 Sequence of all 64 possible continuation bytes (0x80-0xbf): | | " | | | " | | 3.2 Lonely start characters | | 3.2.1 All 32 first bytes of 2-byte sequences (0xc0-0xdf), | each followed by a space character: | | " | " | | 3.2.2 All 16 first bytes of 3-byte sequences (0xe0-0xef), | each followed by a space character: | | " " | | 3.2.3 All 8 first bytes of 4-byte sequences (0xf0-0xf7), | each followed by a space character: | | " " | | 3.2.4 All 4 first bytes of 5-byte sequences (0xf8-0xfb), | each followed by a space character: | | " " | | 3.2.5 All 2 first bytes of 6-byte sequences (0xfc-0xfd), | each followed by a space character: | | " " | | 3.3 Sequences with last continuation byte missing | | All bytes of an incomplete sequence should be signalled as a single | malformed sequence, i.e., you should see only a single replacement | character in each of the next 10 tests. (Characters as in section 2) | | 3.3.1 2-byte sequence with last byte missing (U+0000): "" | 3.3.2 3-byte sequence with last byte missing (U+0000): "" | 3.3.3 4-byte sequence with last byte missing (U+0000): "" | 3.3.4 5-byte sequence with last byte missing (U+0000): "" | 3.3.5 6-byte sequence with last byte missing (U+0000): "" | 3.3.6 2-byte sequence with last byte missing (U-000007FF): "" | 3.3.7 3-byte sequence with last byte missing (U-0000FFFF): "" | 3.3.8 4-byte sequence with last byte missing (U-001FFFFF): "" | 3.3.9 5-byte sequence with last byte missing (U-03FFFFFF): "" | 3.3.10 6-byte sequence with last byte missing (U-7FFFFFFF): "" | | 3.4 Concatenation of incomplete sequences | | All the 10 sequences of 3.3 concatenated, you should see 10 malformed | sequences being signalled: | | "" | | 3.5 Impossible bytes | | The following two bytes cannot appear in a correct UTF-8 string | | 3.5.1 fe = "" | 3.5.2 ff = "" | 3.5.3 fe fe ff ff = "" | | 4 Overlong sequences | | The following sequences are not malformed according to the letter of | the Unicode 2.0 standard. However, they are longer then necessary and | a correct UTF-8 encoder is not allowed to produce them. A "safe UTF-8 | decoder" should reject them just like malformed sequences for two | reasons: (1) It helps to debug applications if overlong sequences are | not treated as valid representations of characters, because this helps | to spot problems more quickly. (2) Overlong sequences provide | alternative representations of characters, that could maliciously be | used to bypass filters that check only for ASCII characters. For | instance, a 2-byte encoded line feed (LF) would not be caught by a | line counter that counts only 0x0a bytes, but it would still be | processed as a line feed by an unsafe UTF-8 decoder later in the | pipeline. From a security point of view, ASCII compatibility of UTF-8 | sequences means also, that ASCII characters are *only* allowed to be | represented by ASCII bytes in the range 0x00-0x7f. To ensure this | aspect of ASCII compatibility, use only "safe UTF-8 decoders" that | reject overlong UTF-8 sequences for which a shorter encoding exists. | | 4.1 Examples of an overlong ASCII character | | With a safe UTF-8 decoder, all of the following five overlong | representations of the ASCII character slash ("/") should be rejected | like a malformed UTF-8 sequence, for instance by substituting it with | a replacement character. If you see a slash below, you do not have a | safe UTF-8 decoder! | | 4.1.1 U+002F = c0 af = "" | 4.1.2 U+002F = e0 80 af = "" | 4.1.3 U+002F = f0 80 80 af = "" | 4.1.4 U+002F = f8 80 80 80 af = "" | 4.1.5 U+002F = fc 80 80 80 80 af = "" | | 4.2 Maximum overlong sequences | | Below you see the highest Unicode value that is still resulting in an | overlong sequence if represented with the given number of bytes. This | is a boundary test for safe UTF-8 decoders. All five characters should | be rejected like malformed UTF-8 sequences. | | 4.2.1 U-0000007F = c1 bf = "" | 4.2.2 U-000007FF = e0 9f bf = "" | 4.2.3 U-0000FFFF = f0 8f bf bf = "" | 4.2.4 U-001FFFFF = f8 87 bf bf bf = "" | 4.2.5 U-03FFFFFF = fc 83 bf bf bf bf = "" | | 4.3 Overlong representation of the NUL character | | The following five sequences should also be rejected like malformed | UTF-8 sequences and should not be treated like the ASCII NUL | character. | | 4.3.1 U+0000 = c0 80 = "" | 4.3.2 U+0000 = e0 80 80 = "" | 4.3.3 U+0000 = f0 80 80 80 = "" | 4.3.4 U+0000 = f8 80 80 80 80 = "" | 4.3.5 U+0000 = fc 80 80 80 80 80 = "" | | 5 Illegal code positions | | The following UTF-8 sequences should be rejected like malformed | sequences, because they never represent valid ISO 10646 characters and | a UTF-8 decoder that accepts them might introduce security problems | comparable to overlong UTF-8 sequences. | | 5.1 Single UTF-16 surrogates | | 5.1.1 U+D800 = ed a0 80 = "" | 5.1.2 U+DB7F = ed ad bf = "" | 5.1.3 U+DB80 = ed ae 80 = "" | 5.1.4 U+DBFF = ed af bf = "" | 5.1.5 U+DC00 = ed b0 80 = "" | 5.1.6 U+DF80 = ed be 80 = "" | 5.1.7 U+DFFF = ed bf bf = "" | | 5.2 Paired UTF-16 surrogates | | 5.2.1 U+D800 U+DC00 = ed a0 80 ed b0 80 = "" | 5.2.2 U+D800 U+DFFF = ed a0 80 ed bf bf = "" | 5.2.3 U+DB7F U+DC00 = ed ad bf ed b0 80 = "" | 5.2.4 U+DB7F U+DFFF = ed ad bf ed bf bf = "" | 5.2.5 U+DB80 U+DC00 = ed ae 80 ed b0 80 = "" | 5.2.6 U+DB80 U+DFFF = ed ae 80 ed bf bf = "" | 5.2.7 U+DBFF U+DC00 = ed af bf ed b0 80 = "" | 5.2.8 U+DBFF U+DFFF = ed af bf ed bf bf = "" | | 5.3 Noncharacter code positions | | The following "noncharacters" are "reserved for internal use" by | applications, and according to older versions of the Unicode Standard | "should never be interchanged". Unicode Corrigendum #9 dropped the | latter restriction. Nevertheless, their presence in incoming UTF-8 data | can remain a potential security risk, depending on what use is made of | these codes subsequently. Examples of such internal use: | | - Some file APIs with 16-bit characters may use the integer value -1 | = U+FFFF to signal an end-of-file (EOF) or error condition. | | - In some UTF-16 receivers, code point U+FFFE might trigger a | byte-swap operation (to convert between UTF-16LE and UTF-16BE). | | With such internal use of noncharacters, it may be desirable and safer | to block those code points in UTF-8 decoders, as they should never | occur legitimately in incoming UTF-8 data, and could trigger unsafe | behaviour in subsequent processing. | | Particularly problematic noncharacters in 16-bit applications: | | 5.3.1 U+FFFE = ef bf be = "￾" | 5.3.2 U+FFFF = ef bf bf = "￿" | | Other noncharacters: | | 5.3.3 U+FDD0 .. U+FDEF = "﷐﷑﷒﷓﷔﷕﷖﷗﷘﷙﷚﷛﷜﷝﷞﷟﷠﷡﷢﷣﷤﷥﷦﷧﷨﷩﷪﷫﷬﷭﷮﷯"| | 5.3.4 U+nFFFE U+nFFFF (for n = 1..10) | | "🿾🿿𯿾𯿿𿿾𿿿񏿾񏿿񟿾񟿿񯿾񯿿񿿾񿿿򏿾򏿿 | 򟿾򟿿򯿾򯿿򿿾򿿿󏿾󏿿󟿾󟿿󯿾󯿿󿿾󿿿􏿾􏿿" | | THE END | icdiff-release-2.0.7/tests/test-with-exclude/b/c/000077500000000000000000000000001447067572400215665ustar00rootroot00000000000000icdiff-release-2.0.7/tests/test-with-exclude/b/c/e000066400000000000000000000000021447067572400217250ustar00rootroot000000000000001 icdiff-release-2.0.7/tests/test-with-exclude/b/c/f000066400000000000000000000000021447067572400217260ustar00rootroot000000000000003 icdiff-release-2.0.7/tests/test-with-exclude/b/c/g000066400000000000000000000000021447067572400217270ustar00rootroot000000000000004 icdiff-release-2.0.7/tests/test-with-exclude/b/c/h000066400000000000000000000000021447067572400217300ustar00rootroot000000000000005 icdiff-release-2.0.7/tests/test-with-exclude/b/d/000077500000000000000000000000001447067572400215675ustar00rootroot00000000000000icdiff-release-2.0.7/tests/test-with-exclude/b/d/q000066400000000000000000000000021447067572400217420ustar00rootroot000000000000009 icdiff-release-2.0.7/tests/test-with-exclude/b/exclude/000077500000000000000000000000001447067572400227755ustar00rootroot00000000000000icdiff-release-2.0.7/tests/test-with-exclude/b/exclude/text.txt000066400000000000000000000000131447067572400245140ustar00rootroot00000000000000excluded b icdiff-release-2.0.7/tests/test-with-exclude/b/i000066400000000000000000000000021447067572400215070ustar00rootroot000000000000006