utfcheck-1.2/ 0000755 0001750 0001750 00000000000 13343107235 011635 5 ustar paul paul
utfcheck-1.2/README 0000644 0001750 0001750 00000004054 13331463217 012522 0 ustar paul paul
This is the README file for the utfcheck package.
This package contains a utility, utfcheck, to examine an input file
and provide feedback on whether or not it consists of valid ASCII
or UTF-8 text (depending on the options selected).
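As a rough illustration of the ASCII half of that check, the POSIX shell
sketch below counts the bytes in a file that have the high bit set. It is
not how utfcheck itself is implemented; the function name count_non_ascii
is hypothetical, and only the standard tools tr and wc are assumed.

```sh
# count_non_ascii FILE
# Print the number of bytes in FILE with values 0x80 through 0xFF.
# A result of 0 means the file is pure ASCII.
# Hypothetical helper for illustration; not part of utfcheck.
count_non_ascii() {
    # In the C locale tr operates on single bytes: delete every byte
    # in the range 0x00-0x7F and count whatever remains.
    LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' '
}
```

Unlike utfcheck with the -a option, this sketch only counts offending
bytes; it does not report which byte was seen first.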
Notable code points (for example, Unicode "noncharacters") generate
diagnostic output. As long as the input is valid for the encoding
type though, utfcheck continues to scan the input stream and will
end with exit status EXIT_SUCCESS.
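The noncharacter ranges are spelled out in the man page: U+FDD0 through
U+FDEF, plus the last two code points of every plane from Plane 0 through
Plane 16. A minimal shell sketch of that range test follows. The function
name is_noncharacter is hypothetical, and the code only illustrates the
documented ranges; it is not taken from the utfcheck source.

```sh
# is_noncharacter HEX
# Succeed (exit status 0) if the code point given as hexadecimal digits,
# without the "U+" prefix, is a Unicode noncharacter.
# Hypothetical helper for illustration; not part of utfcheck.
is_noncharacter() {
    cp=$((0x$1))
    # Contiguous block U+FDD0 through U+FDEF in Plane 0.
    if [ "$cp" -ge "$((0xFDD0))" ] && [ "$cp" -le "$((0xFDEF))" ]; then
        return 0
    fi
    # Last two code points of each plane: the low 16 bits are FFFE or FFFF,
    # and the code point does not exceed U+10FFFF (end of Plane 16).
    [ "$((cp & 0xFFFE))" -eq "$((0xFFFE))" ] && [ "$cp" -le "$((0x10FFFF))" ]
}
```

For example, "is_noncharacter 1FFFE" succeeds, while "is_noncharacter FEFF"
(the Byte Order Mark) fails.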
If the input stream begins with a UTF-16 Byte Order Mark, or if
it contains the UTF-8 encoding of a code point in the Unicode
Surrogate Pair range, an error message is printed and the program
terminates with exit status EXIT_FAILURE.
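Byte Order Mark detection of this kind only needs the first few bytes of
the input. The sketch below shows one way to do it in POSIX shell with od.
It is an assumption-laden illustration, not the utfcheck implementation:
the function name detect_bom and the strings it prints are hypothetical.

```sh
# detect_bom FILE
# Print which Byte Order Mark, if any, FILE begins with.
# Hypothetical helper for illustration; not part of utfcheck.
detect_bom() {
    # Dump at most the first three bytes as hex and keep only the digits,
    # e.g. "efbbbf" for a UTF-8 Byte Order Mark.
    head_hex=$(od -An -tx1 -N3 "$1" | tr -cd '0-9a-f')
    case "$head_hex" in
        efbbbf*) echo "UTF-8-BOM" ;;   # U+FEFF encoded as UTF-8
        feff*)   echo "UTF-16-BE" ;;   # big-endian UTF-16
        fffe*)   echo "UTF-16-LE" ;;   # little-endian UTF-16
        *)       echo "none" ;;
    esac
}
```

This sketch only classifies the start of the file; it does not look for a
Byte Order Mark later in the stream.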
Information about the latest version is in the NEWS file.
If you downloaded this source package, instructions for
building and installation can be found in the INSTALL file
and license information is in the COPYING file.
If you are a downstream maintainer porting this package
to a new architecture, you can remove all files that
Autotools added with the command
    autoreconf -f -i && ./configure && make orig
In all other cases, typing the following command will
usually build the software on your system:
    ./configure && make
Then consult the INSTALL file for installation instructions.
LICENSE
-------
The license is contained in the COPYING file. A summary of this license
appears below.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see .
utfcheck-1.2/Makefile.am 0000644 0001750 0001750 00000000635 13331413520 013667 0 ustar paul paul
## Process this file with automake to produce Makefile.in
SUBDIRS = man src test
#
# Add "orig" target to remove all Autotools-added files left over from
#
# autoreconf && ./configure && make && make distclean
#
orig: distclean
	\rm -rf aclocal.m4 autom4te.cache build-aux configure *~ */*~ \
	   INSTALL Makefile.in man/Makefile.in test/Makefile.in \
	   src/Makefile.in src/config.h.in src/utfcheck.c
utfcheck-1.2/man/ 0000755 0001750 0001750 00000000000 13343107235 012410 5 ustar paul paul
utfcheck-1.2/man/utfcheck.1 0000644 0001750 0001750 00000017533 13343074675 014312 0 ustar paul paul
.TH UTFCHECK 1 "2018 Sep 01" UTFCHECK "User Commands"
.SH NAME
utfcheck \- Check a file to verify that it is valid UTF-8 or ASCII
.SH SYNOPSIS
.br
.B utfcheck
[\-a] [\-q] [\-\-expurgated] [\-i \fIinput_file.beta\fP] [\-o \fIoutput_file.utf8\fP]
.SH DESCRIPTION
\fButfcheck\fP(1)
reads an input file and prints messages about contents that might
be unexpected (even if legal Unicode) in a UTF-8 or ASCII file,
such as embedded control characters or Unicode "noncharacters".
No diagnostic messages are printed for the control characters
horizontal tab, vertical tab, line feed, or form feed. A final
summary will indicate if null, carriage return, or escape
characters were read.
.PP
.B utfcheck
will detect a UTF-16 big-endian or little-endian Byte Order Mark
at the beginning of a file and quit if it sees one. There is no
support for parsing UTF-16 files beyond initial detection of the
Byte Order Mark.
.SH OPTIONS
.TP 6
\-a
Test for a pure ASCII file. ASCII control characters are allowed,
but \fButfcheck\fP will fail if it encounters a byte with value
greater than hexadecimal 7F (the delete control character).
.TP
\-i
Specify the input file. The default is STDIN.
.TP
\-o
Specify the output file. The default is STDOUT.
.TP
\-q
Quiet mode. Do not print any output unless an illegal byte sequence
is detected.
.TP
\-\-expurgated
Check a UTF-8 file against the "expurgated" version of the Unicode Standard,
the one without the Byte Order Mark, after Monty Python's "Bookshop"
skit with the "expurgated" version of \fIOlsen's Standard Book of
British Birds,\fP the one without the gannet\(embecause the customer
didn't like them. (But they've all got the Byte Order Mark. It's a
standard part of the Unicode Standard, the Byte Order Mark. It's in
all the books.)\| This option is not abbreviated, to keep the user mindful
of the questionable nature of testing for the lack of something even though
it is a legitimate part of the Unicode Standard. \fButfcheck\fP will fail
if this option is selected and the UTF-8 Byte Order Mark (officially the
zero width no-break space) is detected anywhere in the input file.
.PP
Sample usage:
.PP
.RS
utfcheck \-i \fImy_input_file.txt\fP \-o \fImy_output_file.log\fP
.RE
.SH MESSAGES
.SS "IMMEDIATE MESSAGES"
Some uncommon characters are noted immediately as they are encountered.
Some are fatal errors and some are not, as noted below.
The messages associated with them follow.
.TP 5
.B ASCII-CONTROL: U+\fInnnn\fP
The file contains ASCII control characters in the range U+0001 through
U+001F, inclusive, except for Horizontal Tab, Line Feed, Vertical Tab,
Form Feed, Carriage Return, and Escape; or the file contains the
Delete character (U+007F).
.TP
.B ASCII-NULL
The file contains an ASCII NULL character (U+0000).
.TP
.B BINARY-DATA: 0x\fInn\fP
The file contains a byte value that is not part of a well-formed UTF-8
character. This is considered a fatal error and the program will terminate
with exit status EXIT_FAILURE.
.TP
.B NON-ASCII-DATA: 0x\fInn\fP
The \fB\-a\fP (ASCII only) option was selected and the file contains non-ASCII
data (i.e., a byte with the high bit set). This is considered a fatal error
and the program will terminate with exit status EXIT_FAILURE.
.TP
.B SURROGATE-PAIR-CODE-POINT: 0x\fInn\|.\|.\|.\fP (U+\fInnnn\fP)
The file contains a Unicode surrogate pair code point encoded as UTF-8
(U+D800 through U+DFFF, inclusive). Surrogate code points are used
with UTF-16 files, so they should never appear in UTF-8 files.
The byte values are printed first, and then the UTF-8 converted Unicode
code point is printed in parentheses.
This is considered a fatal error and the program will terminate with
exit status EXIT_FAILURE.
.TP
.B UTF-16-BE: Unsupported
The file begins with a big-endian UTF-16 Byte Order Mark.
Because \fButfcheck\fP does not support UTF-16, this is considered
a fatal error and the program will terminate with exit status EXIT_FAILURE.
.TP
.B UTF-16-LE: Unsupported
The file begins with a little-endian UTF-16 Byte Order Mark.
Because \fButfcheck\fP does not support UTF-16, this is considered
a fatal error and the program will terminate with exit status EXIT_FAILURE.
.TP
.B UTF-8-BOM-BEGIN
The file begins with a Byte Order Mark (U+FEFF) in UTF-8 form.
If the \fB\-\-expurgated\fP option is selected and this condition
is detected, this is considered a fatal error and the program will
terminate with exit status EXIT_FAILURE; otherwise, the program continues.
.TP
.B UTF-8-BOM-EMBEDDED
The file contains a Byte Order Mark (U+FEFF) after the start of the file.
If the \fB\-\-expurgated\fP option is selected and this condition
is detected, this is considered a fatal error and the program will
terminate with exit status EXIT_FAILURE; otherwise, the program continues.
.TP
.B UTF-8-CONTROL: 0x\fInn\|.\|.\|.\fP (U+\fInnnn\fP)
The file contains a UTF-8 control character (U+0080 through U+009F, inclusive).
The byte values are printed first, and then the UTF-8 converted Unicode
code point is printed in parentheses.
.TP
.B UTF-8-NONCHARACTER: 0x\fInn\|.\|.\|.\fP (U+\fInnnn\fP)
The file contains a Unicode "noncharacter". This can be a code point
in the range U+FDD0 through U+FDEF, inclusive, or the last two code points
of any Unicode plane, from Plane 0 through Plane 16, inclusive.
The byte values are printed first, and then the UTF-8 converted Unicode
code point is printed in parentheses.
Note that a noncharacter is allowable in well-formed Unicode files,
so this condition is not considered an error.
.SS "END OF FILE SUMMARY"
If the \fB\-q\fP option is not selected and the program has not encountered
a fatal error before reaching the end of the input stream, \fButfcheck\fP
prints a summary of the file contents after the input stream has reached
its end. This will begin with the line "FILE-SUMMARY:". This is followed by
a line beginning with "Character-Set: " followed by one of "ASCII", "UTF-8",
"UTF-16-BE" (UTF-16 Big Endian), "UTF-16-LE" (UTF-16 Little Endian),
or "BINARY". (Note that UTF-16 parsing is not currently implemented,
so the UTF-16-BE and UTF-16-LE types will not appear in this final summary
at present.) The following messages can appear in this end of file summary
if the program encountered the corresponding types of Unicode code points.
.TP 5
.B BOM-AT-START
The file begins with a UTF-8 Byte Order Mark (U+FEFF).
.TP
.B BOM-AFTER-START
The file contains a UTF-8 Byte Order Mark (U+FEFF) after the start of the file.
.TP
.B CONTAINS-NULLS
The file contains null characters (U+0000).
.TP
.B CONTAINS-CARRIAGE_RETURN
The file contains carriage returns (U+000D).
.TP
.B CONTAINS-CONTROL_CHARACTERS
The file contains ASCII control characters in the range U+0001 through
U+001F, inclusive, except for Horizontal Tab, Line Feed, Vertical Tab,
Form Feed, New Line, or Carriage Return; or contains the Delete character
(U+007F) or control characters in the range U+0080 through U+009F, inclusive.
.TP
.B CONTAINS-ESCAPE_SEQUENCES
The file contains at least one ASCII escape character (U+001B), which
is interpreted to be part of an escape sequence (for example, a VT-100 or
ANSI terminal control sequence).
.TP
.B Plane-0-PUA: \fIn\fP characters
Number of Plane 0 Private Use Area characters in file.
.TP
.B Plane-15-PUA: \fIn\fP characters
Number of Plane 15 Private Use Area characters in file.
.TP
.B Plane-16-PUA: \fIn\fP characters
Number of Plane 16 Private Use Area characters in file.
.SH "EXIT STATUS"
.B utfcheck
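will exit with a status of EXIT_SUCCESS if the input is valid for the
selected encoding, or with a status of EXIT_FAILURE if it encounters a
fatal error: an invalid byte sequence, a UTF-16 Byte Order Mark, a UTF-8
encoded surrogate code point, non-ASCII data with \fB\-a\fP, or a
Byte Order Mark with \fB\-\-expurgated\fP.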
.SH FILES
ASCII or UTF-8 text files.
.SH AUTHOR
.B utfcheck
was written by Paul Hardy.
.SH LICENSE
.B utfcheck
is Copyright \(co 2018 Paul Hardy.
.PP
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
.SH BUGS
No known bugs exist.
utfcheck-1.2/man/Makefile.am 0000644 0001750 0001750 00000000153 13327125203 014440 0 ustar paul paul
## Process this file with automake to produce Makefile.in
man_MANS = utfcheck.1
EXTRA_DIST = $(man_MANS)
utfcheck-1.2/configure.ac 0000644 0001750 0001750 00000000604 13342604530 014122 0 ustar paul paul
AC_INIT([utfcheck], [1.2], [unifoundry@unifoundry.com],
[utfcheck], [http://www.unifoundry.com/utfcheck/])
AC_PREREQ([2.68])
AC_CONFIG_SRCDIR([src/utfcheck.l])
AC_CONFIG_AUX_DIR([build-aux])
AM_INIT_AUTOMAKE([1.11 subdir-objects -Wall -Werror])
AC_CONFIG_HEADERS([src/config.h])
AC_CONFIG_FILES([Makefile man/Makefile src/Makefile test/Makefile])
AC_PROG_LEX
AC_PROG_CC
AC_OUTPUT
utfcheck-1.2/test/ 0000755 0001750 0001750 00000000000 13343107235 012614 5 ustar paul paul
utfcheck-1.2/test/test-utf8-expurgated 0000755 0001750 0001750 00000002560 13327135645 016566 0 ustar paul paul
#!/bin/sh
set -e
# The input file to convert
INFILE=./samples/sample-utf8.txt.gz
# The output file of the conversion
OUTFILE=test-utf8-expurgated.txt
# The reference file to compare the output against
CMPFILE=./expected/out-utf8-expurgated.txt
#
# Create temporary directory for test
# output if AUTOPKGTEST_TMP is undefined.
# Debian GNU/Linux defines AUTOPKGTEST_TMP.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
TEST_TMP=$(mktemp -d)
trap "\rm -rf ${TEST_TMP}" 0 INT QUIT ABRT PIPE TERM
else
TEST_TMP=${AUTOPKGTEST_TMP}
fi
#
# Point to the source directory for test.
#
if [ "x${srcdir}" = "x" ] ; then
srcdir=.
fi
#
# Point to utfcheck executable; utfcheck_bindir
# should be defined for "make installcheck".
# Otherwise, leave undefined for "make check".
#
if [ "x${utfcheck_bindir}" = "x" ] ; then
utfcheck_bindir=../src
fi
#
# Ignore error exit status so we can keep going and
# compare the utfcheck output with the expected output.
#
( gunzip < ${srcdir}/${INFILE} | \
${utfcheck_bindir}/utfcheck --expurgated > ${TEST_TMP}/${OUTFILE} ) || true
diff ${srcdir}/${CMPFILE} ${TEST_TMP}/${OUTFILE} || \
(echo "test-utf8-expurgated FAILED; output in ${TEST_TMP}/${OUTFILE}" ; exit 1)
#
# If AUTOPKGTEST_TMP was defined, don't remove it;
# a Debian calling process will take care of that.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
\rm -rf ${TEST_TMP}
fi
utfcheck-1.2/test/init-out 0000755 0001750 0001750 00000001251 13327154411 014311 0 ustar paul paul
#!/bin/sh
# This script should only be run by the package maintainer,
# to generate reference copies of test script output.
for i in ascii binary utf16-be utf16-le utf8-bom-begin utf8-bom-end utf8 ; do
gunzip < samples/sample-${i}.txt.gz | \
../src/utfcheck > expected/out-${i}.txt ;
done
gunzip < samples/sample-utf8.txt.gz | \
../src/utfcheck -a > expected/out-ascii-only.txt ;
gunzip < samples/sample-utf8-surrogate.txt.gz | \
../src/utfcheck > expected/out-utf8-surrogate.txt ;
gunzip < samples/sample-utf8.txt.gz | \
../src/utfcheck --expurgated > expected/out-utf8-expurgated.txt ;
utfcheck-1.2/test/test-utf8 0000755 0001750 0001750 00000002310 13327135604 014404 0 ustar paul paul
#!/bin/sh
set -e
# The input file to convert
INFILE=./samples/sample-utf8.txt.gz
# The output file of the conversion
OUTFILE=test-utf8.txt
# The reference file to compare the output against
CMPFILE=./expected/out-utf8.txt
#
# Create temporary directory for test
# output if AUTOPKGTEST_TMP is undefined.
# Debian GNU/Linux defines AUTOPKGTEST_TMP.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
TEST_TMP=$(mktemp -d)
trap "\rm -rf ${TEST_TMP}" 0 INT QUIT ABRT PIPE TERM
else
TEST_TMP=${AUTOPKGTEST_TMP}
fi
#
# Point to the source directory for test.
#
if [ "x${srcdir}" = "x" ] ; then
srcdir=.
fi
#
# Point to binary executable; utfcheck_bindir
# should be defined for "make installcheck".
# Otherwise, leave undefined for "make check".
#
if [ "x${utfcheck_bindir}" = "x" ] ; then
utfcheck_bindir=../src
fi
gunzip < ${srcdir}/${INFILE} | \
${utfcheck_bindir}/utfcheck > ${TEST_TMP}/${OUTFILE}
diff ${srcdir}/${CMPFILE} ${TEST_TMP}/${OUTFILE} || \
(echo "test-utf8 FAILED; output in ${TEST_TMP}/${OUTFILE}" ; exit 1)
#
# If AUTOPKGTEST_TMP was defined, don't remove it;
# a Debian calling process will take care of that.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
\rm -rf ${TEST_TMP}
fi
utfcheck-1.2/test/Makefile.am 0000644 0001750 0001750 00000001136 13327155040 014650 0 ustar paul paul
## Process this file with automake to produce Makefile.in
check_SCRIPTS=test-ascii \
test-utf8 test-utf8-bom-begin test-utf8-bom-end \
test-binary test-utf16-be test-utf16-le \
test-ascii-only test-utf8-expurgated test-utf8-surrogate
TESTS=$(check_SCRIPTS)
EXTRA_DIST=$(check_SCRIPTS) init-out test-all samples expected
dist-hook:
	rm -rf $(distdir)/samples/gen-utf
AM_TESTS_ENVIRONMENT = utfcheck_path='$(abs_top_builddir)/src' ; \
export utfcheck_path ;
installcheck-local:
	$(MAKE) utfcheck_bindir=${DESTDIR}${bindir} check
maintainer-clean-local:
	\rm -f *.log *.trs
	\rm -f samples/gen-utf
utfcheck-1.2/test/test-utf8-surrogate 0000755 0001750 0001750 00000002567 13327135676 016444 0 ustar paul paul
#!/bin/sh
set -e
# The input file to convert
INFILE=./samples/sample-utf8-surrogate.txt.gz
# The output file of the conversion
OUTFILE=test-utf8-surrogate.txt
# The reference file to compare the output against
CMPFILE=./expected/out-utf8-surrogate.txt
#
# Create temporary directory for test
# output if AUTOPKGTEST_TMP is undefined.
# Debian GNU/Linux defines AUTOPKGTEST_TMP.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
TEST_TMP=$(mktemp -d)
trap "\rm -rf ${TEST_TMP}" 0 INT QUIT ABRT PIPE TERM
else
TEST_TMP=${AUTOPKGTEST_TMP}
fi
#
# Point to the source directory for test.
#
if [ "x${srcdir}" = "x" ] ; then
srcdir=.
fi
#
# Point to utfcheck executable; utfcheck_bindir
# should be defined for "make installcheck".
# Otherwise, leave undefined for "make check".
#
if [ "x${utfcheck_bindir}" = "x" ] ; then
utfcheck_bindir=../src
fi
#
# Ignore error exit status so we can keep going and
# compare the utfcheck output with the expected output.
#
( gunzip < ${srcdir}/${INFILE} | \
${utfcheck_bindir}/utfcheck > ${TEST_TMP}/${OUTFILE} ) || true
diff ${srcdir}/${CMPFILE} ${TEST_TMP}/${OUTFILE} || \
(echo "test-utf8-surrogate FAILED; output in ${TEST_TMP}/${OUTFILE}" ; exit 1)
#
# If AUTOPKGTEST_TMP was defined, don't remove it;
# a Debian calling process will take care of that.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
\rm -rf ${TEST_TMP}
fi
utfcheck-1.2/test/test-utf8-bom-end 0000755 0001750 0001750 00000002350 13327135632 015730 0 ustar paul paul
#!/bin/sh
set -e
# The input file to convert
INFILE=./samples/sample-utf8-bom-end.txt.gz
# The output file of the conversion
OUTFILE=test-utf8-bom-end.txt
# The reference file to compare the output against
CMPFILE=./expected/out-utf8-bom-end.txt
#
# Create temporary directory for test
# output if AUTOPKGTEST_TMP is undefined.
# Debian GNU/Linux defines AUTOPKGTEST_TMP.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
TEST_TMP=$(mktemp -d)
trap "\rm -rf ${TEST_TMP}" 0 INT QUIT ABRT PIPE TERM
else
TEST_TMP=${AUTOPKGTEST_TMP}
fi
#
# Point to the source directory for test.
#
if [ "x${srcdir}" = "x" ] ; then
srcdir=.
fi
#
# Point to binary executable; utfcheck_bindir
# should be defined for "make installcheck".
# Otherwise, leave undefined for "make check".
#
if [ "x${utfcheck_bindir}" = "x" ] ; then
utfcheck_bindir=../src
fi
gunzip < ${srcdir}/${INFILE} | \
${utfcheck_bindir}/utfcheck > ${TEST_TMP}/${OUTFILE}
diff ${srcdir}/${CMPFILE} ${TEST_TMP}/${OUTFILE} || \
(echo "test-utf8-bom-end FAILED; output in ${TEST_TMP}/${OUTFILE}" ; exit 1)
#
# If AUTOPKGTEST_TMP was defined, don't remove it;
# a Debian calling process will take care of that.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
\rm -rf ${TEST_TMP}
fi
utfcheck-1.2/test/test-ascii 0000755 0001750 0001750 00000002314 13327135507 014614 0 ustar paul paul
#!/bin/sh
set -e
# The input file to convert
INFILE=./samples/sample-ascii.txt.gz
# The output file of the conversion
OUTFILE=test-ascii.txt
# The reference file to compare the output against
CMPFILE=./expected/out-ascii.txt
#
# Create temporary directory for test
# output if AUTOPKGTEST_TMP is undefined.
# Debian GNU/Linux defines AUTOPKGTEST_TMP.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
TEST_TMP=$(mktemp -d)
trap "\rm -rf ${TEST_TMP}" 0 INT QUIT ABRT PIPE TERM
else
TEST_TMP=${AUTOPKGTEST_TMP}
fi
#
# Point to the source directory for test.
#
if [ "x${srcdir}" = "x" ] ; then
srcdir=.
fi
#
# Point to binary executable; utfcheck_bindir
# should be defined for "make installcheck".
# Otherwise, leave undefined for "make check".
#
if [ "x${utfcheck_bindir}" = "x" ] ; then
utfcheck_bindir=../src
fi
gunzip < ${srcdir}/${INFILE} | \
${utfcheck_bindir}/utfcheck > ${TEST_TMP}/${OUTFILE}
diff ${srcdir}/${CMPFILE} ${TEST_TMP}/${OUTFILE} || \
(echo "test-ascii FAILED; output in ${TEST_TMP}/${OUTFILE}" ; exit 1)
#
# If AUTOPKGTEST_TMP was defined, don't remove it;
# a Debian calling process will take care of that.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
\rm -rf ${TEST_TMP}
fi
utfcheck-1.2/test/test-utf8-bom-begin 0000755 0001750 0001750 00000002360 13327135620 016244 0 ustar paul paul
#!/bin/sh
set -e
# The input file to convert
INFILE=./samples/sample-utf8-bom-begin.txt.gz
# The output file of the conversion
OUTFILE=test-utf8-bom-begin.txt
# The reference file to compare the output against
CMPFILE=./expected/out-utf8-bom-begin.txt
#
# Create temporary directory for test
# output if AUTOPKGTEST_TMP is undefined.
# Debian GNU/Linux defines AUTOPKGTEST_TMP.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
TEST_TMP=$(mktemp -d)
trap "\rm -rf ${TEST_TMP}" 0 INT QUIT ABRT PIPE TERM
else
TEST_TMP=${AUTOPKGTEST_TMP}
fi
#
# Point to the source directory for test.
#
if [ "x${srcdir}" = "x" ] ; then
srcdir=.
fi
#
# Point to binary executable; utfcheck_bindir
# should be defined for "make installcheck".
# Otherwise, leave undefined for "make check".
#
if [ "x${utfcheck_bindir}" = "x" ] ; then
utfcheck_bindir=../src
fi
gunzip < ${srcdir}/${INFILE} | \
${utfcheck_bindir}/utfcheck > ${TEST_TMP}/${OUTFILE}
diff ${srcdir}/${CMPFILE} ${TEST_TMP}/${OUTFILE} || \
(echo "test-utf8-bom-begin FAILED; output in ${TEST_TMP}/${OUTFILE}" ; exit 1)
#
# If AUTOPKGTEST_TMP was defined, don't remove it;
# a Debian calling process will take care of that.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
\rm -rf ${TEST_TMP}
fi
utfcheck-1.2/test/expected/ 0000755 0001750 0001750 00000000000 13327375600 014422 5 ustar paul paul
utfcheck-1.2/test/expected/out-ascii.txt 0000644 0001750 0001750 00000001264 13343057756 017072 0 ustar paul paul
ASCII-NULL
ASCII-CONTROL: U+0001
ASCII-CONTROL: U+0002
ASCII-CONTROL: U+0003
ASCII-CONTROL: U+0004
ASCII-CONTROL: U+0005
ASCII-CONTROL: U+0006
ASCII-CONTROL: U+0007
ASCII-CONTROL: U+0008
ASCII-CONTROL: U+000E
ASCII-CONTROL: U+000F
ASCII-CONTROL: U+0010
ASCII-CONTROL: U+0011
ASCII-CONTROL: U+0012
ASCII-CONTROL: U+0013
ASCII-CONTROL: U+0014
ASCII-CONTROL: U+0015
ASCII-CONTROL: U+0016
ASCII-CONTROL: U+0017
ASCII-CONTROL: U+0018
ASCII-CONTROL: U+0019
ASCII-CONTROL: U+001A
ASCII-CONTROL: U+001C
ASCII-CONTROL: U+001D
ASCII-CONTROL: U+001E
ASCII-CONTROL: U+001F
FILE-SUMMARY:
Character-Set: ASCII
CONTAINS-NULLS
CONTAINS-CARRIAGE_RETURN
CONTAINS-CONTROL_CHARACTERS
CONTAINS-ESCAPE_SEQUENCES
utfcheck-1.2/test/expected/out-utf16-be.txt 0000644 0001750 0001750 00000000027 13343057756 017327 0 ustar paul paul
UTF-16-BE: Unsupported
utfcheck-1.2/test/expected/out-binary.txt 0000644 0001750 0001750 00000001131 13343057756 017257 0 ustar paul paul
ASCII-NULL
ASCII-CONTROL: U+0001
ASCII-CONTROL: U+0002
ASCII-CONTROL: U+0003
ASCII-CONTROL: U+0004
ASCII-CONTROL: U+0005
ASCII-CONTROL: U+0006
ASCII-CONTROL: U+0007
ASCII-CONTROL: U+0008
ASCII-CONTROL: U+000E
ASCII-CONTROL: U+000F
ASCII-CONTROL: U+0010
ASCII-CONTROL: U+0011
ASCII-CONTROL: U+0012
ASCII-CONTROL: U+0013
ASCII-CONTROL: U+0014
ASCII-CONTROL: U+0015
ASCII-CONTROL: U+0016
ASCII-CONTROL: U+0017
ASCII-CONTROL: U+0018
ASCII-CONTROL: U+0019
ASCII-CONTROL: U+001A
ASCII-CONTROL: U+001C
ASCII-CONTROL: U+001D
ASCII-CONTROL: U+001E
ASCII-CONTROL: U+001F
ASCII-CONTROL: U+007F
BINARY-DATA: 0x80
utfcheck-1.2/test/expected/out-utf16-le.txt 0000644 0001750 0001750 00000000027 13343057756 017341 0 ustar paul paul
UTF-16-LE: Unsupported
utfcheck-1.2/test/expected/out-utf8.txt 0000644 0001750 0001750 00000011174 13343057756 016671 0 ustar paul paul
ASCII-NULL
ASCII-CONTROL: U+0001
ASCII-CONTROL: U+0002
ASCII-CONTROL: U+0003
ASCII-CONTROL: U+0004
ASCII-CONTROL: U+0005
ASCII-CONTROL: U+0006
ASCII-CONTROL: U+0007
ASCII-CONTROL: U+0008
ASCII-CONTROL: U+000E
ASCII-CONTROL: U+000F
ASCII-CONTROL: U+0010
ASCII-CONTROL: U+0011
ASCII-CONTROL: U+0012
ASCII-CONTROL: U+0013
ASCII-CONTROL: U+0014
ASCII-CONTROL: U+0015
ASCII-CONTROL: U+0016
ASCII-CONTROL: U+0017
ASCII-CONTROL: U+0018
ASCII-CONTROL: U+0019
ASCII-CONTROL: U+001A
ASCII-CONTROL: U+001C
ASCII-CONTROL: U+001D
ASCII-CONTROL: U+001E
ASCII-CONTROL: U+001F
ASCII-CONTROL: U+007F
UTF-8-CONTROL: 0xC2 80 (U+0080)
UTF-8-CONTROL: 0xC2 81 (U+0081)
UTF-8-CONTROL: 0xC2 82 (U+0082)
UTF-8-CONTROL: 0xC2 83 (U+0083)
UTF-8-CONTROL: 0xC2 84 (U+0084)
UTF-8-CONTROL: 0xC2 85 (U+0085)
UTF-8-CONTROL: 0xC2 86 (U+0086)
UTF-8-CONTROL: 0xC2 87 (U+0087)
UTF-8-CONTROL: 0xC2 88 (U+0088)
UTF-8-CONTROL: 0xC2 89 (U+0089)
UTF-8-CONTROL: 0xC2 8A (U+008A)
UTF-8-CONTROL: 0xC2 8B (U+008B)
UTF-8-CONTROL: 0xC2 8C (U+008C)
UTF-8-CONTROL: 0xC2 8D (U+008D)
UTF-8-CONTROL: 0xC2 8E (U+008E)
UTF-8-CONTROL: 0xC2 8F (U+008F)
UTF-8-CONTROL: 0xC2 90 (U+0090)
UTF-8-CONTROL: 0xC2 91 (U+0091)
UTF-8-CONTROL: 0xC2 92 (U+0092)
UTF-8-CONTROL: 0xC2 93 (U+0093)
UTF-8-CONTROL: 0xC2 94 (U+0094)
UTF-8-CONTROL: 0xC2 95 (U+0095)
UTF-8-CONTROL: 0xC2 96 (U+0096)
UTF-8-CONTROL: 0xC2 97 (U+0097)
UTF-8-CONTROL: 0xC2 98 (U+0098)
UTF-8-CONTROL: 0xC2 99 (U+0099)
UTF-8-CONTROL: 0xC2 9A (U+009A)
UTF-8-CONTROL: 0xC2 9B (U+009B)
UTF-8-CONTROL: 0xC2 9C (U+009C)
UTF-8-CONTROL: 0xC2 9D (U+009D)
UTF-8-CONTROL: 0xC2 9E (U+009E)
UTF-8-CONTROL: 0xC2 9F (U+009F)
UTF-8-NONCHARACTER: 0xEF B7 90 (U+FDD0)
UTF-8-NONCHARACTER: 0xEF B7 91 (U+FDD1)
UTF-8-NONCHARACTER: 0xEF B7 92 (U+FDD2)
UTF-8-NONCHARACTER: 0xEF B7 93 (U+FDD3)
UTF-8-NONCHARACTER: 0xEF B7 94 (U+FDD4)
UTF-8-NONCHARACTER: 0xEF B7 95 (U+FDD5)
UTF-8-NONCHARACTER: 0xEF B7 96 (U+FDD6)
UTF-8-NONCHARACTER: 0xEF B7 97 (U+FDD7)
UTF-8-NONCHARACTER: 0xEF B7 98 (U+FDD8)
UTF-8-NONCHARACTER: 0xEF B7 99 (U+FDD9)
UTF-8-NONCHARACTER: 0xEF B7 9A (U+FDDA)
UTF-8-NONCHARACTER: 0xEF B7 9B (U+FDDB)
UTF-8-NONCHARACTER: 0xEF B7 9C (U+FDDC)
UTF-8-NONCHARACTER: 0xEF B7 9D (U+FDDD)
UTF-8-NONCHARACTER: 0xEF B7 9E (U+FDDE)
UTF-8-NONCHARACTER: 0xEF B7 9F (U+FDDF)
UTF-8-NONCHARACTER: 0xEF B7 A0 (U+FDE0)
UTF-8-NONCHARACTER: 0xEF B7 A1 (U+FDE1)
UTF-8-NONCHARACTER: 0xEF B7 A2 (U+FDE2)
UTF-8-NONCHARACTER: 0xEF B7 A3 (U+FDE3)
UTF-8-NONCHARACTER: 0xEF B7 A4 (U+FDE4)
UTF-8-NONCHARACTER: 0xEF B7 A5 (U+FDE5)
UTF-8-NONCHARACTER: 0xEF B7 A6 (U+FDE6)
UTF-8-NONCHARACTER: 0xEF B7 A7 (U+FDE7)
UTF-8-NONCHARACTER: 0xEF B7 A8 (U+FDE8)
UTF-8-NONCHARACTER: 0xEF B7 A9 (U+FDE9)
UTF-8-NONCHARACTER: 0xEF B7 AA (U+FDEA)
UTF-8-NONCHARACTER: 0xEF B7 AB (U+FDEB)
UTF-8-NONCHARACTER: 0xEF B7 AC (U+FDEC)
UTF-8-NONCHARACTER: 0xEF B7 AD (U+FDED)
UTF-8-NONCHARACTER: 0xEF B7 AE (U+FDEE)
UTF-8-NONCHARACTER: 0xEF B7 AF (U+FDEF)
UTF-8-BOM-EMBEDDED
UTF-8-NONCHARACTER: 0xEF BF BE (U+FFFE)
UTF-8-NONCHARACTER: 0xEF BF BF (U+FFFF)
UTF-8-NONCHARACTER: 0xF0 9F BF BE (U+1FFFE)
UTF-8-NONCHARACTER: 0xF0 9F BF BF (U+1FFFF)
UTF-8-NONCHARACTER: 0xF0 AF BF BE (U+2FFFE)
UTF-8-NONCHARACTER: 0xF0 AF BF BF (U+2FFFF)
UTF-8-NONCHARACTER: 0xF0 BF BF BE (U+3FFFE)
UTF-8-NONCHARACTER: 0xF0 BF BF BF (U+3FFFF)
UTF-8-NONCHARACTER: 0xF1 8F BF BE (U+4FFFE)
UTF-8-NONCHARACTER: 0xF1 8F BF BF (U+4FFFF)
UTF-8-NONCHARACTER: 0xF1 9F BF BE (U+5FFFE)
UTF-8-NONCHARACTER: 0xF1 9F BF BF (U+5FFFF)
UTF-8-NONCHARACTER: 0xF1 AF BF BE (U+6FFFE)
UTF-8-NONCHARACTER: 0xF1 AF BF BF (U+6FFFF)
UTF-8-NONCHARACTER: 0xF1 BF BF BE (U+7FFFE)
UTF-8-NONCHARACTER: 0xF1 BF BF BF (U+7FFFF)
UTF-8-NONCHARACTER: 0xF2 8F BF BE (U+8FFFE)
UTF-8-NONCHARACTER: 0xF2 8F BF BF (U+8FFFF)
UTF-8-NONCHARACTER: 0xF2 9F BF BE (U+9FFFE)
UTF-8-NONCHARACTER: 0xF2 9F BF BF (U+9FFFF)
UTF-8-NONCHARACTER: 0xF2 AF BF BE (U+AFFFE)
UTF-8-NONCHARACTER: 0xF2 AF BF BF (U+AFFFF)
UTF-8-NONCHARACTER: 0xF2 BF BF BE (U+BFFFE)
UTF-8-NONCHARACTER: 0xF2 BF BF BF (U+BFFFF)
UTF-8-NONCHARACTER: 0xF3 8F BF BE (U+CFFFE)
UTF-8-NONCHARACTER: 0xF3 8F BF BF (U+CFFFF)
UTF-8-NONCHARACTER: 0xF3 9F BF BE (U+DFFFE)
UTF-8-NONCHARACTER: 0xF3 9F BF BF (U+DFFFF)
UTF-8-NONCHARACTER: 0xF3 AF BF BE (U+EFFFE)
UTF-8-NONCHARACTER: 0xF3 AF BF BF (U+EFFFF)
UTF-8-NONCHARACTER: 0xF3 BF BF BE (U+FFFFE)
UTF-8-NONCHARACTER: 0xF3 BF BF BF (U+FFFFF)
UTF-8-NONCHARACTER: 0xF4 8F BF BE (U+10FFFE)
UTF-8-NONCHARACTER: 0xF4 8F BF BF (U+10FFFF)
FILE-SUMMARY:
Character-Set: UTF-8
BOM-AFTER-START
CONTAINS-NULLS
CONTAINS-CARRIAGE_RETURN
CONTAINS-CONTROL_CHARACTERS
CONTAINS-ESCAPE_SEQUENCES
Plane-0-PUA: 6400 characters
Plane-15-PUA: 65534 characters
Plane-16-PUA: 65534 characters
utfcheck-1.2/test/expected/out-utf8-bom-begin.txt 0000644 0001750 0001750 00000001321 13343057756 020517 0 ustar paul paul
UTF-8-BOM-BEGIN
ASCII-NULL
ASCII-CONTROL: U+0001
ASCII-CONTROL: U+0002
ASCII-CONTROL: U+0003
ASCII-CONTROL: U+0004
ASCII-CONTROL: U+0005
ASCII-CONTROL: U+0006
ASCII-CONTROL: U+0007
ASCII-CONTROL: U+0008
ASCII-CONTROL: U+000E
ASCII-CONTROL: U+000F
ASCII-CONTROL: U+0010
ASCII-CONTROL: U+0011
ASCII-CONTROL: U+0012
ASCII-CONTROL: U+0013
ASCII-CONTROL: U+0014
ASCII-CONTROL: U+0015
ASCII-CONTROL: U+0016
ASCII-CONTROL: U+0017
ASCII-CONTROL: U+0018
ASCII-CONTROL: U+0019
ASCII-CONTROL: U+001A
ASCII-CONTROL: U+001C
ASCII-CONTROL: U+001D
ASCII-CONTROL: U+001E
ASCII-CONTROL: U+001F
FILE-SUMMARY:
Character-Set: UTF-8
BOM-AT-START
CONTAINS-NULLS
CONTAINS-CARRIAGE_RETURN
CONTAINS-CONTROL_CHARACTERS
CONTAINS-ESCAPE_SEQUENCES
utfcheck-1.2/test/expected/out-utf8-expurgated.txt 0000644 0001750 0001750 00000005632 13343057756 021041 0 ustar paul paul
ASCII-NULL
ASCII-CONTROL: U+0001
ASCII-CONTROL: U+0002
ASCII-CONTROL: U+0003
ASCII-CONTROL: U+0004
ASCII-CONTROL: U+0005
ASCII-CONTROL: U+0006
ASCII-CONTROL: U+0007
ASCII-CONTROL: U+0008
ASCII-CONTROL: U+000E
ASCII-CONTROL: U+000F
ASCII-CONTROL: U+0010
ASCII-CONTROL: U+0011
ASCII-CONTROL: U+0012
ASCII-CONTROL: U+0013
ASCII-CONTROL: U+0014
ASCII-CONTROL: U+0015
ASCII-CONTROL: U+0016
ASCII-CONTROL: U+0017
ASCII-CONTROL: U+0018
ASCII-CONTROL: U+0019
ASCII-CONTROL: U+001A
ASCII-CONTROL: U+001C
ASCII-CONTROL: U+001D
ASCII-CONTROL: U+001E
ASCII-CONTROL: U+001F
ASCII-CONTROL: U+007F
UTF-8-CONTROL: 0xC2 80 (U+0080)
UTF-8-CONTROL: 0xC2 81 (U+0081)
UTF-8-CONTROL: 0xC2 82 (U+0082)
UTF-8-CONTROL: 0xC2 83 (U+0083)
UTF-8-CONTROL: 0xC2 84 (U+0084)
UTF-8-CONTROL: 0xC2 85 (U+0085)
UTF-8-CONTROL: 0xC2 86 (U+0086)
UTF-8-CONTROL: 0xC2 87 (U+0087)
UTF-8-CONTROL: 0xC2 88 (U+0088)
UTF-8-CONTROL: 0xC2 89 (U+0089)
UTF-8-CONTROL: 0xC2 8A (U+008A)
UTF-8-CONTROL: 0xC2 8B (U+008B)
UTF-8-CONTROL: 0xC2 8C (U+008C)
UTF-8-CONTROL: 0xC2 8D (U+008D)
UTF-8-CONTROL: 0xC2 8E (U+008E)
UTF-8-CONTROL: 0xC2 8F (U+008F)
UTF-8-CONTROL: 0xC2 90 (U+0090)
UTF-8-CONTROL: 0xC2 91 (U+0091)
UTF-8-CONTROL: 0xC2 92 (U+0092)
UTF-8-CONTROL: 0xC2 93 (U+0093)
UTF-8-CONTROL: 0xC2 94 (U+0094)
UTF-8-CONTROL: 0xC2 95 (U+0095)
UTF-8-CONTROL: 0xC2 96 (U+0096)
UTF-8-CONTROL: 0xC2 97 (U+0097)
UTF-8-CONTROL: 0xC2 98 (U+0098)
UTF-8-CONTROL: 0xC2 99 (U+0099)
UTF-8-CONTROL: 0xC2 9A (U+009A)
UTF-8-CONTROL: 0xC2 9B (U+009B)
UTF-8-CONTROL: 0xC2 9C (U+009C)
UTF-8-CONTROL: 0xC2 9D (U+009D)
UTF-8-CONTROL: 0xC2 9E (U+009E)
UTF-8-CONTROL: 0xC2 9F (U+009F)
UTF-8-NONCHARACTER: 0xEF B7 90 (U+FDD0)
UTF-8-NONCHARACTER: 0xEF B7 91 (U+FDD1)
UTF-8-NONCHARACTER: 0xEF B7 92 (U+FDD2)
UTF-8-NONCHARACTER: 0xEF B7 93 (U+FDD3)
UTF-8-NONCHARACTER: 0xEF B7 94 (U+FDD4)
UTF-8-NONCHARACTER: 0xEF B7 95 (U+FDD5)
UTF-8-NONCHARACTER: 0xEF B7 96 (U+FDD6)
UTF-8-NONCHARACTER: 0xEF B7 97 (U+FDD7)
UTF-8-NONCHARACTER: 0xEF B7 98 (U+FDD8)
UTF-8-NONCHARACTER: 0xEF B7 99 (U+FDD9)
UTF-8-NONCHARACTER: 0xEF B7 9A (U+FDDA)
UTF-8-NONCHARACTER: 0xEF B7 9B (U+FDDB)
UTF-8-NONCHARACTER: 0xEF B7 9C (U+FDDC)
UTF-8-NONCHARACTER: 0xEF B7 9D (U+FDDD)
UTF-8-NONCHARACTER: 0xEF B7 9E (U+FDDE)
UTF-8-NONCHARACTER: 0xEF B7 9F (U+FDDF)
UTF-8-NONCHARACTER: 0xEF B7 A0 (U+FDE0)
UTF-8-NONCHARACTER: 0xEF B7 A1 (U+FDE1)
UTF-8-NONCHARACTER: 0xEF B7 A2 (U+FDE2)
UTF-8-NONCHARACTER: 0xEF B7 A3 (U+FDE3)
UTF-8-NONCHARACTER: 0xEF B7 A4 (U+FDE4)
UTF-8-NONCHARACTER: 0xEF B7 A5 (U+FDE5)
UTF-8-NONCHARACTER: 0xEF B7 A6 (U+FDE6)
UTF-8-NONCHARACTER: 0xEF B7 A7 (U+FDE7)
UTF-8-NONCHARACTER: 0xEF B7 A8 (U+FDE8)
UTF-8-NONCHARACTER: 0xEF B7 A9 (U+FDE9)
UTF-8-NONCHARACTER: 0xEF B7 AA (U+FDEA)
UTF-8-NONCHARACTER: 0xEF B7 AB (U+FDEB)
UTF-8-NONCHARACTER: 0xEF B7 AC (U+FDEC)
UTF-8-NONCHARACTER: 0xEF B7 AD (U+FDED)
UTF-8-NONCHARACTER: 0xEF B7 AE (U+FDEE)
UTF-8-NONCHARACTER: 0xEF B7 AF (U+FDEF)
UTF-8-BOM-EMBEDDED
utfcheck-1.2/test/expected/out-utf8-bom-end.txt 0000644 0001750 0001750 00000001327 13343057756 020207 0 ustar paul paul
ASCII-NULL
ASCII-CONTROL: U+0001
ASCII-CONTROL: U+0002
ASCII-CONTROL: U+0003
ASCII-CONTROL: U+0004
ASCII-CONTROL: U+0005
ASCII-CONTROL: U+0006
ASCII-CONTROL: U+0007
ASCII-CONTROL: U+0008
ASCII-CONTROL: U+000E
ASCII-CONTROL: U+000F
ASCII-CONTROL: U+0010
ASCII-CONTROL: U+0011
ASCII-CONTROL: U+0012
ASCII-CONTROL: U+0013
ASCII-CONTROL: U+0014
ASCII-CONTROL: U+0015
ASCII-CONTROL: U+0016
ASCII-CONTROL: U+0017
ASCII-CONTROL: U+0018
ASCII-CONTROL: U+0019
ASCII-CONTROL: U+001A
ASCII-CONTROL: U+001C
ASCII-CONTROL: U+001D
ASCII-CONTROL: U+001E
ASCII-CONTROL: U+001F
UTF-8-BOM-EMBEDDED
FILE-SUMMARY:
Character-Set: UTF-8
BOM-AFTER-START
CONTAINS-NULLS
CONTAINS-CARRIAGE_RETURN
CONTAINS-CONTROL_CHARACTERS
CONTAINS-ESCAPE_SEQUENCES
utfcheck-1.2/test/expected/out-utf8-surrogate.txt 0000644 0001750 0001750 00000000060 13343057756 020672 0 ustar paul paul SURROGATE-PAIR-CODE-POINT: 0xED A0 80 (U+D800)
utfcheck-1.2/test/expected/out-ascii-only.txt 0000644 0001750 0001750 00000001175 13343057756 020052 0 ustar paul paul ASCII-NULL
ASCII-CONTROL: U+0001
ASCII-CONTROL: U+0002
ASCII-CONTROL: U+0003
ASCII-CONTROL: U+0004
ASCII-CONTROL: U+0005
ASCII-CONTROL: U+0006
ASCII-CONTROL: U+0007
ASCII-CONTROL: U+0008
ASCII-CONTROL: U+000E
ASCII-CONTROL: U+000F
ASCII-CONTROL: U+0010
ASCII-CONTROL: U+0011
ASCII-CONTROL: U+0012
ASCII-CONTROL: U+0013
ASCII-CONTROL: U+0014
ASCII-CONTROL: U+0015
ASCII-CONTROL: U+0016
ASCII-CONTROL: U+0017
ASCII-CONTROL: U+0018
ASCII-CONTROL: U+0019
ASCII-CONTROL: U+001A
ASCII-CONTROL: U+001C
ASCII-CONTROL: U+001D
ASCII-CONTROL: U+001E
ASCII-CONTROL: U+001F
ASCII-CONTROL: U+007F
UTF-8-CONTROL: 0xC2 80 (U+0080)
NON-ASCII-DATA: 0xC2
utfcheck-1.2/test/test-ascii-only 0000755 0001750 0001750 00000002527 13327135531 015576 0 ustar paul paul #!/bin/sh
set -e
# The input file to convert
INFILE=./samples/sample-utf8.txt.gz
# The output file of the conversion
OUTFILE=test-ascii-only.txt
# The reference file to compare the output against
CMPFILE=./expected/out-ascii-only.txt
#
# Create temporary directory for test
# output if AUTOPKGTEST_TMP is undefined.
# Debian GNU/Linux defines AUTOPKGTEST_TMP.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
TEST_TMP=$(mktemp -d)
trap "\rm -rf ${TEST_TMP}" 0 INT QUIT ABRT PIPE TERM
else
TEST_TMP=${AUTOPKGTEST_TMP}
fi
#
# Point to the source directory for test.
#
if [ "x${srcdir}" = "x" ] ; then
srcdir=.
fi
#
# Point to utfcheck executable; utfcheck_bindir
# should be defined for "make installcheck".
# Otherwise, leave undefined for "make check".
#
if [ "x${utfcheck_bindir}" = "x" ] ; then
utfcheck_bindir=../src
fi
#
# Ignore error exit status so we can keep going and
# compare the utfcheck output with the expected output.
#
( gunzip < ${srcdir}/${INFILE} | \
${utfcheck_bindir}/utfcheck -a > ${TEST_TMP}/${OUTFILE} ) || true
diff ${srcdir}/${CMPFILE} ${TEST_TMP}/${OUTFILE} || \
(echo "test-ascii-only FAILED; output in ${TEST_TMP}/${OUTFILE}" ; exit 1)
#
# If AUTOPKGTEST_TMP was defined, don't remove it;
# a Debian calling process will take care of that.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
\rm -rf ${TEST_TMP}
fi
utfcheck-1.2/test/test-binary 0000755 0001750 0001750 00000002514 13327135544 015013 0 ustar paul paul #!/bin/sh
set -e
# The input file to convert
INFILE=./samples/sample-binary.txt.gz
# The output file of the conversion
OUTFILE=test-binary.txt
# The reference file to compare the output against
CMPFILE=./expected/out-binary.txt
#
# Create temporary directory for test
# output if AUTOPKGTEST_TMP is undefined.
# Debian GNU/Linux defines AUTOPKGTEST_TMP.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
TEST_TMP=$(mktemp -d)
trap "\rm -rf ${TEST_TMP}" 0 INT QUIT ABRT PIPE TERM
else
TEST_TMP=${AUTOPKGTEST_TMP}
fi
#
# Point to the source directory for test.
#
if [ "x${srcdir}" = "x" ] ; then
srcdir=.
fi
#
# Point to utfcheck executable; utfcheck_bindir
# should be defined for "make installcheck".
# Otherwise, leave undefined for "make check".
#
if [ "x${utfcheck_bindir}" = "x" ] ; then
utfcheck_bindir=../src
fi
#
# Ignore error exit status so we can keep going and
# compare the utfcheck output with the expected output.
#
( gunzip < ${srcdir}/${INFILE} | \
${utfcheck_bindir}/utfcheck > ${TEST_TMP}/${OUTFILE} ) || true
diff ${srcdir}/${CMPFILE} ${TEST_TMP}/${OUTFILE} || \
(echo "test-binary FAILED; output in ${TEST_TMP}/${OUTFILE}" ; exit 1)
#
# If AUTOPKGTEST_TMP was defined, don't remove it;
# a Debian calling process will take care of that.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
\rm -rf ${TEST_TMP}
fi
utfcheck-1.2/test/test-all 0000755 0001750 0001750 00000001310 13327135465 014272 0 ustar paul paul #!/bin/sh
echo "*** Running Tests..."
./test-ascii || exit 1
echo "test-ascii PASSED"
./test-utf8 || exit 1
echo "test-utf8 PASSED"
./test-utf8-bom-begin || exit 1
echo "test-utf8-bom-begin PASSED"
./test-utf8-bom-end || exit 1
echo "test-utf8-bom-end PASSED"
./test-binary || exit 1
echo "test-binary PASSED"
./test-ascii-only || exit 1
echo "test-ascii-only PASSED"
./test-utf8-expurgated || exit 1
echo "test-utf8-expurgated PASSED"
./test-utf8-surrogate || exit 1
echo "test-utf8-surrogate PASSED"
./test-utf16-be || exit 1
echo "test-utf16-be PASSED"
./test-utf16-le || exit 1
echo "test-utf16-le PASSED"
echo "*** Finished Tests"
utfcheck-1.2/test/test-utf16-be 0000755 0001750 0001750 00000002526 13327135562 015063 0 ustar paul paul #!/bin/sh
set -e
# The input file to convert
INFILE=./samples/sample-utf16-be.txt.gz
# The output file of the conversion
OUTFILE=test-utf16-be.txt
# The reference file to compare the output against
CMPFILE=./expected/out-utf16-be.txt
#
# Create temporary directory for test
# output if AUTOPKGTEST_TMP is undefined.
# Debian GNU/Linux defines AUTOPKGTEST_TMP.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
TEST_TMP=$(mktemp -d)
trap "\rm -rf ${TEST_TMP}" 0 INT QUIT ABRT PIPE TERM
else
TEST_TMP=${AUTOPKGTEST_TMP}
fi
#
# Point to the source directory for test.
#
if [ "x${srcdir}" = "x" ] ; then
srcdir=.
fi
#
# Point to utfcheck executable; utfcheck_bindir
# should be defined for "make installcheck".
# Otherwise, leave undefined for "make check".
#
if [ "x${utfcheck_bindir}" = "x" ] ; then
utfcheck_bindir=../src
fi
#
# Ignore error exit status so we can keep going and
# compare the utfcheck output with the expected output.
#
( gunzip < ${srcdir}/${INFILE} | \
${utfcheck_bindir}/utfcheck > ${TEST_TMP}/${OUTFILE} ) || true
diff ${srcdir}/${CMPFILE} ${TEST_TMP}/${OUTFILE} || \
(echo "test-utf16-be FAILED; output in ${TEST_TMP}/${OUTFILE}" ; exit 1)
#
# If AUTOPKGTEST_TMP was defined, don't remove it;
# a Debian calling process will take care of that.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
\rm -rf ${TEST_TMP}
fi
utfcheck-1.2/test/test-utf16-le 0000755 0001750 0001750 00000002526 13327135574 015100 0 ustar paul paul #!/bin/sh
set -e
# The input file to convert
INFILE=./samples/sample-utf16-le.txt.gz
# The output file of the conversion
OUTFILE=test-utf16-le.txt
# The reference file to compare the output against
CMPFILE=./expected/out-utf16-le.txt
#
# Create temporary directory for test
# output if AUTOPKGTEST_TMP is undefined.
# Debian GNU/Linux defines AUTOPKGTEST_TMP.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
TEST_TMP=$(mktemp -d)
trap "\rm -rf ${TEST_TMP}" 0 INT QUIT ABRT PIPE TERM
else
TEST_TMP=${AUTOPKGTEST_TMP}
fi
#
# Point to the source directory for test.
#
if [ "x${srcdir}" = "x" ] ; then
srcdir=.
fi
#
# Point to utfcheck executable; utfcheck_bindir
# should be defined for "make installcheck".
# Otherwise, leave undefined for "make check".
#
if [ "x${utfcheck_bindir}" = "x" ] ; then
utfcheck_bindir=../src
fi
#
# Ignore error exit status so we can keep going and
# compare the utfcheck output with the expected output.
#
( gunzip < ${srcdir}/${INFILE} | \
${utfcheck_bindir}/utfcheck > ${TEST_TMP}/${OUTFILE} ) || true
diff ${srcdir}/${CMPFILE} ${TEST_TMP}/${OUTFILE} || \
(echo "test-utf16-le FAILED; output in ${TEST_TMP}/${OUTFILE}" ; exit 1)
#
# If AUTOPKGTEST_TMP was defined, don't remove it;
# a Debian calling process will take care of that.
#
if [ "x${AUTOPKGTEST_TMP}" = "x" ] ; then
\rm -rf ${TEST_TMP}
fi
utfcheck-1.2/test/samples/ 0000755 0001750 0001750 00000000000 13343057524 014265 5 ustar paul paul
utfcheck-1.2/test/samples/sample-ascii.txt.gz 0000644 0001750 0001750 00000000244 13343057524 020014 0 ustar paul paul
utfcheck-1.2/test/samples/Makefile 0000644 0001750 0001750 00000000527 13327134610 015723 0 ustar paul paul
#
# Note: this is run by hand by the package maintainer
# to compile gen-utf.c and run it to create the input
# files for testing. It is not expected to be run by
# an end user.
#
all: gen-utf
	./gen-utf
	\rm -f *.txt.gz
	for i in *.txt ; do \
	   gzip -9 $$i ; \
	done
clean:
	\rm -f gen-utf
distclean: clean
.PHONY: all clean distclean
utfcheck-1.2/test/samples/sample-utf16-be.txt.gz 0000644 0001750 0001750 00000000604 13343057524 020255 0 ustar paul paul
utfcheck-1.2/test/samples/gen-utf 0000755 0001750 0001750 00000031160 13343057524 015561 0 ustar paul paul
utfcheck-1.2/test/samples/gen-utf.c 0000644 0001750 0001750 00000013320 13327135122 015766 0 ustar paul paul /*
gen-utf.c - generate ASCII, UTF-8, and UTF-16 test files.
Paul Hardy, 2018
*/
#include <stdio.h>
int
main ()
{
int i; /* Loop variable */
int codept; /* Unicode code point to output */
int nbytes; /* Number of bytes in current UTF-8 character */
unsigned utf_bytes[5]; /* For UTF-8 encoded bytes */
int cvt2utf8 (unsigned, unsigned *); /* convert binary code point to UTF-8 */
FILE *utffp;
/*
ASCII
*/
utffp = fopen ("sample-ascii.txt", "w");
for (codept = 0x00; codept < 0x7F; codept++)
fputc (codept, utffp);
fclose (utffp);
/*
Shortened UTF-8 file with Byte Order Mark at beginning.
*/
utffp = fopen ("sample-utf8-bom-begin.txt", "w");
/* UTF-8 Byte Order Mark */
fputc ('\357', utffp); fputc ('\273', utffp); fputc ('\277', utffp);
for (codept = 0x00; codept < 0x7F; codept++)
fputc (codept, utffp);
fclose (utffp);
/*
Shortened UTF-8 file with Byte Order Mark at end, to check embedded BOM.
*/
utffp = fopen ("sample-utf8-bom-end.txt", "w");
for (codept = 0x00; codept < 0x7F; codept++)
fputc (codept, utffp);
/* UTF-8 Byte Order Mark */
fputc ('\357', utffp); fputc ('\273', utffp); fputc ('\277', utffp);
fclose (utffp);
/*
UTF-8
*/
utffp = fopen ("sample-utf8.txt", "w");
for (codept = 0x00; codept < 0xD800; codept++) {
nbytes = cvt2utf8 (codept, utf_bytes);
for (i = 0; i < nbytes; i++)
fputc (utf_bytes [i], utffp);
}
/* Skip over Unicode Surrogate Pair range; not valid UTF-8 */
for (codept = 0xE000; codept <= 0x10FFFF; codept++) {
nbytes = cvt2utf8 (codept, utf_bytes);
for (i = 0; i < nbytes; i++)
fputc (utf_bytes [i], utffp);
}
fclose (utffp);
/*
Big-endian UTF-16
*/
utffp = fopen ("sample-utf16-be.txt", "w");
/* Big-endian UTF-16 Byte Order Mark */
fputc (0xFE, utffp);
fputc (0xFF, utffp);
for (codept = 0x0000; codept <= 0x0100; codept++) {
fputc ((codept >> 8) & 0xFF, utffp);
fputc ( codept & 0xFF, utffp);
}
fclose (utffp);
/*
Little-endian UTF-16
*/
utffp = fopen ("sample-utf16-le.txt", "w");
/* Little-endian UTF-16 Byte Order Mark */
fputc (0xFF, utffp);
fputc (0xFE, utffp);
for (codept = 0x0000; codept <= 0x0100; codept++) {
fputc ( codept & 0xFF, utffp);
fputc ((codept >> 8) & 0xFF, utffp);
}
fclose (utffp);
/*
Binary
*/
utffp = fopen ("sample-binary.txt", "w");
for (codept = 0x00; codept <= 0xFF; codept++) {
fputc ( codept & 0xFF, utffp);
}
fclose (utffp);
/*
UTF-8 with embedded Surrogate Pairs -- not valid UTF-8
*/
utffp = fopen ("sample-utf8-surrogate.txt", "w");
nbytes = cvt2utf8 (0xD800, utf_bytes);
for (i = 0; i < nbytes; i++)
fputc (utf_bytes [i], utffp);
nbytes = cvt2utf8 (0xDC00, utf_bytes);
for (i = 0; i < nbytes; i++)
fputc (utf_bytes [i], utffp);
fclose (utffp);
return 0;
}
/*
Convert a Unicode code point to a UTF-8 string.
The allowable Unicode range is U+0000..U+10FFFF.
codept - the Unicode code point to encode
utf8_bytes - an array of 5 elements to hold the UTF-8 encoded string;
the string will consist of up to 4 UTF-8-encoded bytes,
with null bytes filling the remaining elements through
the end of the array, utf8_bytes[4].
*/
int
cvt2utf8 (unsigned codept, unsigned *utf8_bytes)
{
int bin_length; /* number of binary digits, for forming UTF-8 */
int byte_length; /* number of bytes of UTF-8 */
int bin_digits (unsigned);
/*
If codept is within the valid Unicode range of
0x0 through 0x10FFFF inclusive, convert it to UTF-8.
*/
if (codept <= 0x10FFFF) {
byte_length = 0;
bin_length = bin_digits (codept);
if (bin_length < 8) { /* U+0000..U+007F */
byte_length = 1;
utf8_bytes [0] = codept;
utf8_bytes [1] =
utf8_bytes [2] =
utf8_bytes [3] =
utf8_bytes [4] = 0;
}
else if (bin_length < 12) { /* U+0080..U+07FF */
byte_length = 2;
utf8_bytes [0] = 0xC0 | ((codept >> 6) & 0x1F);
utf8_bytes [1] = 0x80 | ( codept & 0x3F);
utf8_bytes [2] =
utf8_bytes [3] =
utf8_bytes [4] = 0;
}
else if (bin_length < 17) { /* U+0800..U+FFFF */
byte_length = 3;
utf8_bytes [0] = 0xE0 | ((codept >> 12) & 0x0F);
utf8_bytes [1] = 0x80 | ((codept >> 6) & 0x3F);
utf8_bytes [2] = 0x80 | ( codept & 0x3F);
utf8_bytes [3] =
utf8_bytes [4] = 0;
}
else if (bin_length < 22) { /* U+010000..U+10FFFF */
byte_length = 4;
utf8_bytes [0] = 0xF0 | ((codept >> 18) & 0x07);
utf8_bytes [1] = 0x80 | ((codept >> 12) & 0x3F);
utf8_bytes [2] = 0x80 | ((codept >> 6) & 0x3F);
utf8_bytes [3] = 0x80 | ( codept & 0x3F);
utf8_bytes [4] = 0;
}
} /* encoded output for valid Unicode code point */
else { /* flag out of range Unicode code point */
/*
0xFF is never a valid byte in UTF-8, so testing
for it will be an easy check of a valid return value.
*/
byte_length = -1;
utf8_bytes [0] = 0xFF;
utf8_bytes [1] = 0xFF;
utf8_bytes [2] = 0xFF;
utf8_bytes [3] = 0xFF;
utf8_bytes [4] = 0;
}
return byte_length;
}
/*
Return the number of significant binary digits in an unsigned number.
*/
int
bin_digits (unsigned itest)
{
unsigned i;
int result;
i = 0x80000000; /* mask highest unsigned bit */
result = 32;
while ( (i != 0) && ((itest & i) == 0) ) {
i >>= 1;
result--;
}
return result;
}
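The encoding performed by cvt2utf8 above can be cross-checked against byte sequences that appear in the expected test output (for example "0xC2 92 (U+0092)", "0xEF B7 90 (U+FDD0)", and "0xED A0 80 (U+D800)"). The block below is a minimal stand-alone sketch, not the package's own routine: the name encode_utf8 and its signature are hypothetical, and it selects the sequence length by comparing ranges rather than by counting binary digits. Like cvt2utf8, it does not reject code points in the surrogate range.

```c
#include <assert.h>
#include <stddef.h>

/*
   Hypothetical stand-alone encoder (an assumption for illustration,
   not part of utfcheck).  Writes the UTF-8 form of codept into out[]
   and returns the number of bytes written, or -1 if codept is
   above U+10FFFF.
*/
int
encode_utf8 (unsigned long codept, unsigned char out[4])
{
   assert (out != NULL);

   if (codept < 0x80) {               /* U+0000..U+007F */
      out[0] = (unsigned char) codept;
      return 1;
   }
   if (codept < 0x800) {              /* U+0080..U+07FF */
      out[0] = (unsigned char) (0xC0 | (codept >> 6));
      out[1] = (unsigned char) (0x80 | (codept & 0x3F));
      return 2;
   }
   if (codept < 0x10000) {            /* U+0800..U+FFFF */
      out[0] = (unsigned char) (0xE0 | (codept >> 12));
      out[1] = (unsigned char) (0x80 | ((codept >> 6) & 0x3F));
      out[2] = (unsigned char) (0x80 | (codept & 0x3F));
      return 3;
   }
   if (codept <= 0x10FFFF) {          /* U+010000..U+10FFFF */
      out[0] = (unsigned char) (0xF0 | (codept >> 18));
      out[1] = (unsigned char) (0x80 | ((codept >> 12) & 0x3F));
      out[2] = (unsigned char) (0x80 | ((codept >> 6) & 0x3F));
      out[3] = (unsigned char) (0x80 | (codept & 0x3F));
      return 4;
   }
   return -1;                         /* outside the Unicode range */
}
```

Calling encode_utf8 (0xFDD0, buf) should fill buf with EF B7 90 and return 3, which agrees with the first UTF-8-NONCHARACTER line of the expected output.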
utfcheck-1.2/test/samples/README.samples 0000644 0001750 0001750 00000002224 13327401220 016574 0 ustar paul paul The test/samples directory contains pre-built ASCII, UTF-8, UTF-16,
and binary test input files with names of the form
sample-*.txt.gz
They are created by the gen-utf.c program in this directory (which
is not installed with the main utfcheck program), and are compressed
with "gzip -9". These test files are distributed pre-built with the
source package to simplify the configuration of systems with continuous
integration testing.
These sample files are not installed, so unless you save this source
package they will not take up room on your system following installation.
The largest file, sample-utf8.txt.gz, contains every legal UTF-8 code
point (not including the surrogate pairs range), and is approximately
2 Megabytes. This file allows testing utfcheck exhaustively across
the entire UTF-8 range.
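As a rough cross-check of the coverage described above, the following sketch counts the code points that end up in sample-utf8.txt and the size of that file before compression. It is not part of the package: the function names utf8_len and count_sample_utf8 are hypothetical, and the sketch assumes the standard UTF-8 sequence lengths of 1 to 4 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes UTF-8 needs for one code point. */
int
utf8_len (unsigned long codept)
{
   if (codept < 0x80)    return 1;   /* U+0000..U+007F     */
   if (codept < 0x800)   return 2;   /* U+0080..U+07FF     */
   if (codept < 0x10000) return 3;   /* U+0800..U+FFFF     */
   return 4;                         /* U+010000..U+10FFFF */
}

/*
   Hypothetical helper: count every code point from U+0000 through
   U+10FFFF except the surrogate range U+D800..U+DFFF, and total
   the bytes their UTF-8 encodings occupy.
*/
void
count_sample_utf8 (unsigned long *count, unsigned long *bytes)
{
   unsigned long codept;

   assert (count != NULL && bytes != NULL);
   *count = 0;
   *bytes = 0;
   for (codept = 0; codept <= 0x10FFFF; codept++) {
      if (codept >= 0xD800 && codept <= 0xDFFF)
         continue;                   /* skip the surrogate range */
      (*count)++;
      *bytes += (unsigned long) utf8_len (codept);
   }
}
```

Under these assumptions count_sample_utf8 reports 1112064 code points and 4382592 bytes, so the 2 Megabyte figure above refers to the gzip-compressed file, not to the uncompressed text.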
If you want to keep the source package on your system and do not want
these files taking up space, you can remove them. Just type
rm sample-*.txt.gz
To recreate them, you should just be able to type
make
in this directory. To be safe, try the "make" command after moving
the existing "sample-*" files to a safe location.
--Paul Hardy, 2018
utfcheck-1.2/test/samples/sample-utf8-surrogate.txt.gz 0000644 0001750 0001750 00000000064 13343057524 021623 0 ustar paul paul
utfcheck-1.2/test/samples/sample-utf8.txt.gz 0000644 0001750 0001750 00007757550 13343057524 017644 0 ustar paul paul