MIME-EncWords-1.014.2/000075500000000000000000000000001220667516400150005ustar00viewvcviewvc00000010000000MIME-EncWords-1.014.2/ARTISTIC000064400000000000000000000144631220667516400161550ustar00viewvcviewvc00000010000000The "Artistic License" Preamble The intent of this document is to state the conditions under which a Package may be copied, such that the Copyright Holder maintains some semblance of artistic control over the development of the package, while giving the users of the package the right to use and distribute the Package in a more-or-less customary fashion, plus the right to make reasonable modifications. Definitions "Package" refers to the collection of files distributed by the Copyright Holder, and derivatives of that collection of files created through textual modification. "Standard Version" refers to such a Package if it has not been modified, or has been modified in accordance with the wishes of the Copyright Holder as specified below. "Copyright Holder" is whoever is named in the copyright or copyrights for the package. "You" is you, if you're thinking about copying or distributing this Package. "Reasonable copying fee" is whatever you can justify on the basis of media cost, duplication charges, time of people involved, and so on. (You will not be required to justify it to the Copyright Holder, but only to the computing community at large as a market that must bear the fee.) "Freely Available" means that no fee is charged for the item itself, though there may be fees involved in handling the item. It also means that recipients of the item may redistribute it under the same conditions they received it. 1. You may make and give away verbatim copies of the source form of the Standard Version of this Package without restriction, provided that you duplicate all of the original copyright notices and associated disclaimers. 2. You may apply bug fixes, portability fixes and other modifications derived from the Public Domain or from the Copyright Holder. 
A Package modified in such a way shall still be considered the Standard Version. 3. You may otherwise modify your copy of this Package in any way, provided that you insert a prominent notice in each changed file stating how and when you changed that file, and provided that you do at least ONE of the following: a. place your modifications in the Public Domain or otherwise make them Freely Available, such as by posting said modifications to Usenet or an equivalent medium, or placing the modifications on a major archive site such as uunet.uu.net, or by allowing the Copyright Holder to include your modifications in the Standard Version of the Package. b. use the modified Package only within your corporation or organization. c. rename any non-standard executables so the names do not conflict with standard executables, which must also be provided, and provide a separate manual page for each non-standard executable that clearly documents how it differs from the Standard Version. d. make other distribution arrangements with the Copyright Holder. You may distribute the programs of this Package in object code or executable form, provided that you do at least ONE of the following: a. distribute a Standard Version of the executables and library files, together with instructions (in the manual page or equivalent) on where to get the Standard Version. b. accompany the distribution with the machine-readable source of the Package with your modifications. c. give non-standard executables non-standard names, and clearly document the differences in manual pages (or equivalent), together with instructions on where to get the Standard Version. d. make other distribution arrangements with the Copyright Holder. You may charge a reasonable copying fee for any distribution of this Package. You may charge any fee you choose for support of this Package. You may not charge a fee for this Package itself. 
However, you may distribute this Package in aggregate with other (possibly commercial) programs as part of a larger (possibly commercial) software distribution provided that you do not advertise this Package as a product of your own. You may embed this Package's interpreter within an executable of yours (by linking); this shall be construed as a mere form of aggregation, provided that the complete Standard Version of the interpreter is so embedded. The scripts and library files supplied as input to or produced as output from the programs of this Package do not automatically fall under the copyright of this Package, but belong to whomever generated them, and may be sold commercially, and may be aggregated with this Package. If such scripts or library files are aggregated with this Package via the so-called "undump" or "unexec" methods of producing a binary executable image, then distribution of such an image shall neither be construed as a distribution of this Package nor shall it fall under the restrictions of Paragraphs 3 and 4, provided that you do not represent such an executable image as a Standard Version of this Package. C subroutines (or comparably compiled subroutines in other languages) supplied by you and linked into this Package in order to emulate subroutines and variables of the language defined by this Package shall not be considered part of this Package, but are the equivalent of input as in Paragraph 6, provided these subroutines do not change the language in any way that would cause it to fail the regression tests for the language. Aggregation of this Package with a commercial distribution is always permitted provided that the use of this Package is embedded; that is, when no overt attempt is made to make this Package's interfaces visible to the end user of the commercial distribution. Such use shall not be construed as a distribution of this Package. 
The name of the Copyright Holder may not be used to endorse or promote products derived from this software without specific prior written permission. THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE. The End
MIME-EncWords-1.014.2/Changes000064400000000000000000000177571220667516400163140ustar00viewvcviewvc00000010000000
Revision history for Perl module MIME::EncWords
1.014.2 2013-08-26 Hatuka*nezumi - IKEDA Soji
    * No new features.
    * Reformat Changes file: CPAN RT#88096.
1.014.1 2013-08-25 Hatuka*nezumi - IKEDA Soji
    * No new features.
    * Move Japanese documentations under POD2::JA.
1.014 2013-04-27 Hatuka*nezumi - IKEDA Soji
    * No changes.
1.013_02 2013-04-14 Hatuka*nezumi - IKEDA Soji
    * Fix: CPAN RT#84295: MaxLineLen fixes to the value set at the first time.
    * Imp: encode_mimewords() supports UTF-16, UTF-32 and their flavors. They will be encoded as UTF-8.
    * Requires MIME-Charset >= 1.010.
1.012.6 2012-10-01 Hatuka*nezumi - IKEDA Soji
    * No changes.
1.012_5 2012-09-05 Hatuka*nezumi - IKEDA Soji
    * Minor Fix: After ASCII words extending over multiple lines, line length was estimated shorter. CPAN RT #79399.
    * Doc: corrected typo.
1.012.4 2011-10-26 Hatuka*nezumi - IKEDA Soji
    * Chg: encode_mimewords(): 'B' was advantageous over 'Q' by 4/3 byte on average...
    * Updated address of FSF.
1.012.3 2011-06-05 Hatuka*nezumi - IKEDA Soji
    * Fix: encode_mimewords(): improper handling of Encoding => 'S'.
    * Imp: decode_mimewords(): Broken "Q" encoding also warned: "=" not leading two hexdigits (raw " " and "\t" are allowed).
    * Imp: encode_mimewords(): negative MaxLineLen allows unlimited length of line.
    * Encode::MIME::EncWords: Rewritten. 0.03.
        - Any newlines not forming folding white space are preserved. cf. CPAN RT #68582 for standard encodings.
        - Error handling.
    * Doc: typos etc.
1.012.2 2011-06-01 Hatuka*nezumi - IKEDA Soji
    * Chg: encode_mimewords(): By 'A' or 'S' encodings, 'Q' will be used more often: When number of bytes to be encoded exceeds 6th of entire bytes, words may be encoded by 'B'. In other words, ``S encoding'' is to choose shorter one of 'B' or 'Q' according to length of maximally-encoded result.
    * Fix: encode_mimewords(): Pure ASCII words containing unsafe sequence ignored Encoding option; encoded by header_encoding() of its charset.
    * Updated Encode::MIME::EncWords.
    * Added test #03 & #04. Added UTF-8 cases to #02.
1.012.1 2011-05-29 Hatuka*nezumi - IKEDA Soji
    * Unicode/multibyte support on Perl 5.7.3 (experimental).
    * New: Encode::MIME::EncWords [alpha release] - Encode module for "MIME-EncWords", "MIME-EncWords-B", "MIME-EncWords-Q" and "MIME-EncWords-ISO_2022_JP".
    * Requires MIME::Charset >= 1.008.2.
1.012 2010-06-17 Hatuka*nezumi - IKEDA Soji
    * encode_mimewords(): New option Minimal => 'DISPNAME' to help encoding RFC5322 name-addr.
1.011.1 2009-06-16 Hatuka*nezumi - IKEDA Soji
    * no new features.
    * Fix: Perl <= 5.6.x - skip tests with older POD::Simple.
    * MIME::Charset >= 1.007.1 is required.
1.011 2009-05-17 Hatuka*nezumi - IKEDA Soji
    * not really released.
1.011_01 2009-05-11 Hatuka*nezumi - IKEDA Soji
    * no new features.
    * Supports Perl 5.8.0.
    * MIME::Charset >= 1.007 is required.
1.010.101 2008-04-19 Hatuka*nezumi - IKEDA Soji
    * tidinesses only; no new features.
    * CPAN RT #34909, #35070 (depends on #35120): Perl >= 5.8.1 requires MIME::Charset >= 1.006.2.
    * Perl 5.11.0: Suppress ``Use of uninitialized value within @_ in lc'' warnings.
    * Perl <= 5.6.2: Suppress ``Useless use of a constant in void context'' warnings.
    * Correct META.yml & MANIFEST.
1.010 2008-04-12 Hatuka*nezumi - IKEDA Soji
    * encode_mimeword(): Restrict characters in encoded-words according to RFC 2047 section 5 (3). Note: length(encode_mimeword()) may not be equal to encoded_header_len() of MIME::Charset 1.004 or earlier.
    * Bug Fix: Texts with ``US-ASCII transformation'' charsets, HZ-GB-2312 (RFC 1842) and UTF-7 (RFC 2152), were treated as US-ASCII.
    * Fix: encoded-words exceeding line length can be generated.
    * encode_mimewords(): Improved encoding of unsafe ASCII sequences (words exceeding line length or including ``=?'').
    * encode_mimeword(): can take charset object argument. In this case RAW can be Unicode string.
    * Modified / added tests for multibyte / singlebyte / unsafe ASCII.
1.009 2008-03-30 Hatuka*nezumi - IKEDA Soji
    * Bug Fix: Perl <=5.6.x: encode_mimewords(): ASCII words are encoded.
    * Bug Fix: Perl <=5.005: our is ``deprecated''.
1.007 2008-03-21 Hatuka*nezumi - IKEDA Soji
    * encode_mimewords(): New option 'Folding' defaults to be "\n" which may break conformance to RFC 2822 / MIME.
    * Improve handling of linear-white-spaces: preserve multiple whitespace sequences.
    * Fix: decode_mimewords(): excessive spaces are inserted on pre-Encode environments (e.g. 5.6.x).
    * Fix: decode_mimewords(): no 'Charset' option must be no conversion to keep compatible with MIME::Words.
    * Remove multibyte tests on pre-Encode environments where it cannot be supported exactly.
    * Restructured processing of option parameters.
    * Added tests for decoding multibyte and encoding singlebyte.
1.005 2008-03-16 Hatuka*nezumi - IKEDA Soji
    * Fix: Injected bug on _UNICODE_ conversion.
    * Fix: decode_mimewords(): line folding of encoded text is preserved in the result.
1.004 2008-03-16 Hatuka*nezumi - IKEDA Soji
    * withdrawn.
    * By this release we require OO interface of MIME::Charset 1.001 or later.
    * Fix: encode_mimewords(): Newlines were encoded when original text includes them.
    * New feature: MIME/EncWords/Defaults.pm: If it exists, built-in defaults for option parameters of methods can be overridden.
    * encode_mimewords(): Built-in default for "Encoding" option has been changed from "Q" to "A".
    * encode_mimewords(): New option "MaxLineLen" which defaults to be 76, and "Mapping" which defaults to be "EXTENDED".
    * decode_mimewords(): New option "Mapping" which defaults to be "EXTENDED".
    * Added tests for multibyte.
    * Clean-up PODs and codes.
1.003 2008-03-14 Hatuka*nezumi - IKEDA Soji
    * encode_mimewords(): Fix: Minimal option won't affect when Encoding option is not "A".
    * decode_mimewords(): Support RFC 2231 section 5 extension.
1.000 2008-03-08 Hatuka*nezumi - IKEDA Soji
    * decode_mimewords(): New option 'Detect7bit', enabled by default.
    * encode_mimewords(): New option 'Replacement'.
0.040 2006-11-16 Hatuka*nezumi - IKEDA Soji
    * encode_mimewords(): New option 'Minimal' to control minimal encoding behavior. NOTE: Default behavior was changed from "NO" to "YES".
0.032 2006-10-22 Hatuka*nezumi - IKEDA Soji
    * More documentation changes.
0.03.1 2006-10-20 Hatuka*nezumi - IKEDA Soji
    * not really released
    * Documentation changes only: Note on modifications, clarifications about compatibility with MIME::Words.
0.03 2006-10-17 Hatuka*nezumi - IKEDA Soji
    * decode_mimewords: allow Unicode input.
    * decode_mimewords: don't collapse spaces between '?='...'=?'.
    * Bug fix: cannot encode null string.
    * Handle wide characters exactly.
    * Change die to croak.
0.02 2006-10-13 Hatuka*nezumi - IKEDA Soji
    * decode_mimewords: Fix bug about default charset.
    * Supports Perl 5.005 or later. Unicode/multibyte handling will be enabled on Perl 5.8.1 or later.
    * Added test cases for encode_mimewords (only for singlebyte).
0.01 2006-10-11 Hatuka*nezumi - IKEDA Soji
    * Initial CPAN upload.
MIME-EncWords-1.014.2/GPL000064400000000000000000000432541220667516400153550ustar00viewvcviewvc00000010000000
GNU GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1989, 1991 Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Lesser General Public License instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. 
If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 1. 
You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. 
If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, c) Accompany it with the information you received as to the offer to distribute corresponding source code. 
(This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. 
Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. 
Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 9. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. 
Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. 
Copyright (C) This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. Also add information on how to contact you by electronic and paper mail. If the program is interactive, make it output a short notice like this when it starts in an interactive mode: Gnomovision version 69, Copyright (C) year name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here is a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker. , 1 April 1989 Ty Coon, President of Vice This General Public License does not permit incorporating your program into proprietary programs. 
If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License.
MIME-EncWords-1.014.2/MANIFEST000064400000000000000000000010741220667516400161330ustar00viewvcviewvc00000010000000
ARTISTIC
Changes
GPL
lib/Encode/MIME/EncWords.pm
lib/MIME/EncWords.pm
lib/MIME/EncWords/Defaults.pm.sample
lib/POD2/JA/Encode/MIME/EncWords.pod
lib/POD2/JA/MIME/EncWords.pod
Makefile.PL
MANIFEST This list of files
META.yml Module meta-data (added by author)
README
t/01decode.t
t/02encode.t
t/03Encode-MIME-EncWords.t
t/04Encode-MIME-EncWords-ISO_2022_JP.t
t/05encode_utf.t
t/pod.t
testin/decode-ascii.txt
testin/decode-multibyte.txt
testin/decode-singlebyte.txt
testin/encode-ascii.txt
testin/encode-multibyte.txt
testin/encode-singlebyte.txt
testin/encode-utf-8.txt
MIME-EncWords-1.014.2/META.yml000064400000000000000000000016011220667516400162470ustar00viewvcviewvc00000010000000
--- #YAML:1.0
name: MIME-EncWords
abstract: deal with RFC 2047 encoded words (improved)
version: 1.014.2
author:
    - Hatuka*nezumi - IKEDA Soji
license: perl
distribution_type: module
requires:
    Encode: 1.98
    MIME::Base64: 2.13
    MIME::Charset: 1.010.1
    perl: 5.005
build_requires:
    Test::More: 0
provides:
    MIME::EncWords:
        file: lib/MIME/EncWords.pm
        version: 1.014.2
    Encode::MIME::EncWords:
        file: lib/Encode/MIME/EncWords.pm
        version: 0.03
resources:
    repository: http://hatuka.nezumi.nu/repos/MIME-EncWords/
meta-spec:
    version: 1.3
    url: http://module-build.sourceforge.net/META-spec-v1.3.html
generated_by: author
MIME-EncWords-1.014.2/Makefile.PL000064400000000000000000000006661220667516400167620ustar00viewvcviewvc00000010000000
use ExtUtils::MakeMaker;

WriteMakefile(
    'NAME' => 'MIME::EncWords',
    'ABSTRACT_FROM' => 'lib/MIME/EncWords.pm',
    'VERSION_FROM' => 'lib/MIME/EncWords.pm',
    'PREREQ_PM' => ($] >= 5.007003)?
    {
        'Encode'          => 1.98,
        'MIME::Charset'   => '1.010.1',
        'MIME::Base64'    => 2.13,
        'Test::More'      => 0,
    }:
    {
        'MIME::Charset'   => '1.006',
        'MIME::Base64'    => 2.13,
        'Test::More'      => 0,
        'Unicode::String' => 2.09,
    },
);
MIME-EncWords-1.014.2/README000064400000000000000000000013651220667516400156650ustar00viewvcviewvc00000010000000
MIME-EncWords Package.

Copyright (C) 2006-2013 by Hatuka*nezumi - IKEDA Soji.

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

***

This package contains two modules and some supporting program files. For more details, read the following POD documentation:

  MIME::EncWords - deal with RFC 2047 encoded words (improved)
  Encode::MIME::EncWords - MIME 'B' and 'Q' header encoding (alternative)

For Japanese speakers, POD in Japanese is also included:

  POD2::JA::MIME::EncWords - RFC 2047 encoded-word 関連 (改良版)
  POD2::JA::Encode::MIME::EncWords - MIME の「B」・「Q」ヘッダエンコーディング (代替案)

$$
MIME-EncWords-1.014.2/lib/000075500000000000000000000000001220667516400155465ustar00viewvcviewvc00000010000000
MIME-EncWords-1.014.2/lib/Encode/000075500000000000000000000000001220667516400167435ustar00viewvcviewvc00000010000000
MIME-EncWords-1.014.2/lib/Encode/MIME/000075500000000000000000000000001220667516400174725ustar00viewvcviewvc00000010000000
MIME-EncWords-1.014.2/lib/Encode/MIME/EncWords.pm000064400000000000000000000204471220667516400215630ustar00viewvcviewvc00000010000000
# -*- perl -*-

package Encode::MIME::EncWords;
require 5.007003;

use strict;
use warnings;
use Carp qw(croak carp);
use MIME::EncWords;

our $VERSION = '0.03';

# Default of options
my $Config = {
    Charset => 'UTF-8',
    # Encoding => specified by each subclass.
    # Folding => fixes to "\n".
    # Replacement => given by encode()/decode().
# others => derived from MIME::EncWords: map { ($_ => $MIME::EncWords::Config->{$_}) } qw(Detect7bit Field Mapping MaxLineLen Minimal) }; $Encode::Encoding{'MIME-EncWords'} = bless { Encoding => 'A', Name => 'MIME-EncWords', } => __PACKAGE__; $Encode::Encoding{'MIME-EncWords-B'} = bless { Encoding => 'B', Name => 'MIME-EncWords-B', } => __PACKAGE__; $Encode::Encoding{'MIME-EncWords-Q'} = bless { Encoding => 'Q', Name => 'MIME-EncWords-Q', } => __PACKAGE__; $Encode::Encoding{'MIME-EncWords-ISO_2022_JP'} = bless { Charset => 'ISO-2022-JP', Encoding => 'B', Name => 'MIME-EncWords-ISO_2022_JP', } => __PACKAGE__; use base qw(Encode::Encoding); sub needs_lines { 1 } sub perlio_ok { 0 } sub decode($$;$) { my ($obj, $str, $chk) = @_; my %opts = map { ($_ => ($obj->{$_} || $Config->{$_})) } qw(Detect7bit Mapping); $chk = 0 if ref $chk; # coderef not supported. my $repl = ($chk & 4) ? ($chk & ~4 | 1) : $chk; local $@; my $skip = 0; # for RETURN_ON_ERR my $ret = undef; pos($str) = 0; foreach my $line ( $str =~ m{ \G (.*?) (?:\r\n|[\r\n]) (?![ \t]) }cgsx, substr($str, pos($str)) ) { if (defined $ret) { $ret .= "\n" unless $skip; } else { $ret = ''; } if ($skip) { $_[1] .= "\n"; $_[1] .= $line; next; } next unless length $line; my @words = MIME::EncWords::decode_mimewords($line, %opts); if ($@) { # broken MIME encoding. croak $@ if $chk & 1; # DIE_ON_ERR carp $@ if $chk & 2; # WARN_ON_ERR if ($chk & 4) { # RETURN_ON_ERR $_[1] = $line; $skip = 1; next; } } for (my $i = 0; $i <= $#words; $i++) { my $word = $words[$i]; my $cset = MIME::Charset->new(($word->[1] || 'US-ASCII'), Mapping => $opts{Mapping}); if (! $cset->decoder) { # unknown charset or ``8BIT''. $@ = 'Unknown charset "'.$cset->as_string.'"'; croak $@ if $chk & 1; carp $@ if $chk & 2; if ($chk & 4) { # already decoded... 
re-encoding $_[1] = MIME::EncWords::encode_mimewords([splice @words, $i], Encoding => 'B', Folding => '', MaxLineLen => -1); $skip = 1; last; } $ret .= Encode::decode("ISO-8859-1", $word->[0], 0); #FIXME next; } eval { $ret .= $cset->decode($word->[0], $repl); }; if ($@) { $@ =~ s/ at .+? line \d+[.\n]*$//; croak $@ if $chk & 1; carp $@ if $chk & 2; if ($chk & 4) { # already decoded... re-encoding $_[1] = MIME::EncWords::encode_mimewords([splice @words, $i], Encoding => 'B', Folding => '', MaxLineLen => -1); $skip = 1; last; } } } } if ($chk & 4) { # RETURN_ON_ERR $_[1] = '' unless $skip; } elsif ($chk) { # ! LEAVE_SRC $_[1] = $ret unless $chk & 8; } return $ret; } sub encode($$;$) { my ($obj, $str, $chk) = @_; my %opts = map { ($_ => ($obj->{$_} || $Config->{$_})) } qw(Charset Detect7bit Encoding Field Mapping MaxLineLen Minimal); $opts{Charset} ||= 'UTF-8'; $opts{Folding} = "\n"; $chk = 0 if ref $chk; # coderef not supported. my $repl = ($chk & 4) ? ($chk & ~4 | 1) : $chk; $str = Encode::decode('ISO-8859-1', $str) if ! Encode::is_utf8($str) and $str =~ /[^\x00-\x7F]/; local $@; my $skip = 0; # for RETURN_ON_ERR my $ret = undef; pos($str) = 0; foreach my $line ( $str =~ m{ \G (.*?) (?:\r\n|[\r\n]) (?![ \t]) }cgsx, substr($str, pos($str)) ) { if (defined $ret) { $ret .= "\n" unless $skip; } else { $ret = ''; } if ($skip) { $_[1] .= "\n"; $_[1] .= $line; next; } next unless length $line; eval { $ret .= MIME::EncWords::encode_mimewords($line, %opts, Replacement => $repl); }; if ($@) { $@ =~ s/ at .+? line \d+[.\n]*$//; croak $@ if $chk & 1; # DIE_ON_ERR carp $@ if $chk & 2; # WARN_ON_ERR if ($chk & 4) { # RETURN_ON_ERR $_[1] = $line; $skip = 1; next; } } } if ($chk & 4) { # RETURN_ON_ERR $_[1] = '' unless $skip; } elsif ($chk) { # ! LEAVE_SRC $_[1] = '' unless $chk & 8; # FIXME:spec? 
    }

    return $ret;
}

sub config {
    my $klass = shift if scalar @_ % 2;
    my %opts = @_;
    foreach my $key (keys %opts) {
        croak "Unknown config option: $key"
            unless exists $Config->{$key};
        $Config->{$key} = $opts{$key};
    }
}

1;
__END__

=head1 NAME

Encode::MIME::EncWords -- MIME 'B' and 'Q' header encoding (alternative)

=head1 SYNOPSIS

    use Encode::MIME::EncWords;
    use Encode qw/encode decode/;

    # decode header:
    $utf8   = decode('MIME-EncWords', $header);

    # encode header with default charset, UTF-8:
    $header = encode('MIME-EncWords', $utf8);

    # encode header with another charset:
    Encode::MIME::EncWords->config(Charset => 'GB2312');
    $header = encode('MIME-EncWords', $utf8);

=head1 ABSTRACT

This module implements the MIME header encoding described in RFC 2047.
There are three variant encoding names and one shorthand specific to a charset:

  Encoding name              Result of encode()     Comment
  -------------------------------------------------------------------
  MIME-EncWords              (auto-detect B or Q)
  MIME-EncWords-B            =?XXXX?B?...?=         Default is UTF-8.
  MIME-EncWords-Q            =?XXXX?Q?...?=                  ,,
  MIME-EncWords-ISO_2022_JP  =?ISO-2022-JP?B?...?=

All encodings generate the same result by decode().

=head1 DESCRIPTION

This module is intended to be an alternative to the C<MIME-*> encodings provided by the L<Encode> core module.
To find out how to use this module in detail, see L.

=head2 Module specific feature

=over 4

=item config(KEY => VALUE, ...);

I<Class method.>
Set options by KEY => VALUE pairs.
The following options are available.

=over 4

=item Charset

[encode] Name of the character set to which data elements will be converted.
Default is C<"UTF-8">.
On C<MIME-EncWords-ISO_2022_JP> it is fixed to C<"ISO-2022-JP">.

=item Detect7bit

[decode/encode] Try to detect 7-bit charset on unencoded portions.
Default is C<"YES">.

=item Field

[encode] Name of the header field whose length will be taken into account on the first line of the encoded result.
Default is C<undef>.

=item Mapping

[decode/encode] Specify mappings actually used for charset names.
Default is C<"EXTENDED">.
=item MaxLineLen

[encode] Maximum line length excluding newline.
Default is C<76>.

=item Minimal

[encode] Whether to do minimal encoding or not.
Default is C<"YES">.

=back

For more details about options see L.

=back

=head1 CAVEAT

=over 4

=item *

The encoding modules for MIME header encoding are not the magic porridge pot to cook complex header fields properly.

To decode address header fields (From:, To:, ...), first parse the mailbox-list, then decode each element with the encoding module.
To encode them, first encode each element with the encoding module, then construct the mailbox-list from the encoded elements.
To construct or parse a mailbox-list, some modules such as L may be used.

=item *

Lines are delimited with LF (C<"\n">).
RFC5322 states that lines in Internet messages are delimited with CRLF (C<"\r\n">).

=back

=head1 BUGS

Please report bugs or buggy behaviors to the developer.

CPAN Request Tracker: L.

=head1 VERSION

Consult C<$VERSION> variable.

B. Features might be changed in the near future.

Development versions of this package may be found at L.

=head1 SEE ALSO

L, L, L.

RFC 2047 I.

=head1 AUTHOR

Hatuka*nezumi - IKEDA Soji

=head1 COPYRIGHT

Copyright (C) 2011 Hatuka*nezumi - IKEDA Soji.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

=cut
MIME-EncWords-1.014.2/lib/Encode/MIME/EncWords/000075500000000000000000000000001220667516400212165ustar00viewvcviewvc00000010000000
MIME-EncWords-1.014.2/lib/MIME/000075500000000000000000000000001220667516400162755ustar00viewvcviewvc00000010000000
MIME-EncWords-1.014.2/lib/MIME/EncWords.pm000064400000000000000000001063461220667516400203710ustar00viewvcviewvc00000010000000
#-*- perl -*-

package MIME::EncWords;
require 5.005;

=head1 NAME

MIME::EncWords - deal with RFC 2047 encoded words (improved)

=head1 SYNOPSIS

I<L<MIME::EncWords> is aimed to be another implementation of L<MIME::Words> so that it will achieve more exact conformance with RFC 2047 (formerly RFC 1522) specifications.
Additionally, it contains some improvements.
Following synopsis and descriptions are inherited from its inspirer, then added descriptions on improvements (B<**>) or changes and clarifications (B<*>).>

Before reading further, you should see L to make sure that you understand where this module fits into the grand scheme of things.
Go on, do it now. I'll wait.

Ready? Ok...

    use MIME::EncWords qw(:all);

    ### Decode the string into another string, forgetting the charsets:
    $decoded = decode_mimewords(
          'To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= ',
          );

    ### Split string into array of decoded [DATA,CHARSET] pairs:
    @decoded = decode_mimewords(
          'To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= ',
          );

    ### Encode a single unsafe word:
    $encoded = encode_mimeword("\xABFran\xE7ois\xBB");

    ### Encode a string, trying to find the unsafe words inside it:
    $encoded = encode_mimewords("Me and \xABFran\xE7ois\xBB in town");

=head1 DESCRIPTION

Fellow Americans, you probably won't know what the hell this module is for.
Europeans, Russians, et al, you probably do. C<:-)>.

For example, here's a valid MIME header you might get:

      From: =?US-ASCII?Q?Keith_Moore?=
      To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?=
      CC: =?ISO-8859-1?Q?Andr=E9_?= Pirard
      Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
       =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=
       =?US-ASCII?Q?.._cool!?=

The fields basically decode to (sorry, I can only approximate the Latin characters with 7 bit sequences /o and 'e):

      From: Keith Moore
      To: Keld J/orn Simonsen
      CC: Andr'e Pirard
      Subject: If you can read this you understand the example... cool!

B: Fellow Americans, Europeans, you probably won't know what the hell this module is for.
East Asians, et al, you probably do. C<(^_^)>.
For example, here's a valid MIME header you might get: Subject: =?EUC-KR?B?sNTAuLinKGxhemluZXNzKSwgwvzB9ri7seIoaW1w?= =?EUC-KR?B?YXRpZW5jZSksILGzuLgoaHVicmlzKQ==?= The fields basically decode to (sorry, I cannot approximate the non-Latin multibyte characters with any 7 bit sequences): Subject: ???(laziness), ????(impatience), ??(hubris) =head1 PUBLIC INTERFACE =over 4 =cut ### Pragmas: use strict; use vars qw($VERSION @EXPORT_OK %EXPORT_TAGS @ISA $Config); ### Exporting: use Exporter; %EXPORT_TAGS = (all => [qw(decode_mimewords encode_mimeword encode_mimewords)]); Exporter::export_ok_tags(qw(all)); ### Inheritance: @ISA = qw(Exporter); ### Other modules: use Carp qw(croak carp); use MIME::Base64; use MIME::Charset qw(:trans); my @ENCODE_SUBS = qw(FB_CROAK is_utf8 resolve_alias); if (MIME::Charset::USE_ENCODE) { eval "use ".MIME::Charset::USE_ENCODE." \@ENCODE_SUBS;"; if ($@) { # Perl 5.7.3 + Encode 0.40 eval "use ".MIME::Charset::USE_ENCODE." qw(is_utf8);"; require MIME::Charset::_Compat; for my $sub (@ENCODE_SUBS) { no strict "refs"; *{$sub} = \&{"MIME::Charset::_Compat::$sub"} unless $sub eq 'is_utf8'; } } } else { require Unicode::String; require MIME::Charset::_Compat; for my $sub (@ENCODE_SUBS) { no strict "refs"; *{$sub} = \&{"MIME::Charset::_Compat::$sub"}; } } #------------------------------ # # Globals... # #------------------------------ ### The package version, both in 1.23 style *and* usable by MakeMaker: $VERSION = '1.014.2'; ### Public Configuration Attributes $Config = { %{$MIME::Charset::Config}, # Detect7bit, Replacement, Mapping Charset => 'ISO-8859-1', Encoding => 'A', Field => undef, Folding => "\n", MaxLineLen => 76, Minimal => 'YES', }; eval { require MIME::EncWords::Defaults; }; ### Private Constants my $PRINTABLE = "\\x21-\\x7E"; #my $NONPRINT = "\\x00-\\x1F\\x7F-\\xFF"; my $NONPRINT = qr{[^$PRINTABLE]}; # Improvement: Unicode support. 
my $UNSAFE = qr{[^\x01-\x20$PRINTABLE]}; my $WIDECHAR = qr{[^\x00-\xFF]}; my $ASCIITRANS = qr{^(?:HZ-GB-2312|UTF-7)$}i; my $ASCIIINCOMPAT = qr{^UTF-(?:16|32)(?:BE|LE)?$}i; my $DISPNAMESPECIAL = "\\x22(),:;<>\\x40\\x5C"; # RFC5322 name-addr specials. #------------------------------ # _utf_to_unicode CSETOBJ, STR # Private: Convert UTF-16*/32* to Unicode or UTF-8. sub _utf_to_unicode { my $csetobj = shift; my $str = shift; return $str if is_utf8($str); return $csetobj->decode($str) if MIME::Charset::USE_ENCODE(); my $cset = $csetobj->as_string; my $unistr = Unicode::String->new(); if ($cset eq 'UTF-16' or $cset eq 'UTF-16BE') { $unistr->utf16($str); } elsif ($cset eq 'UTF-16LE') { $unistr->utf16le($str); } elsif ($cset eq 'UTF-32' or $cset eq 'UTF-32BE') { $unistr->utf32($str); } elsif ($cset eq 'UTF-32LE') { $unistr->utf32le($str); } else { croak "unknown transformation '$cset'"; } return $unistr->utf8; } #------------------------------ # _decode_B STRING # Private: used by _decode_header() to decode "B" encoding. # Improvement by this module: sanity check on encoded sequence. sub _decode_B { my $str = shift; unless ((length($str) % 4 == 0) and $str =~ m|^[A-Za-z0-9+/]+={0,2}$|) { return undef; } return decode_base64($str); } # _decode_Q STRING # Private: used by _decode_header() to decode "Q" encoding, which is # almost, but not exactly, quoted-printable. :-P # Improvement by this module: sanity check on encoded sequence (>=1.012.3). sub _decode_Q { my $str = shift; if ($str =~ /=(?![0-9a-fA-F][0-9a-fA-F])/) { #XXX:" " and "\t" are allowed return undef; } $str =~ s/_/\x20/g; # RFC 2047, Q rule 2 $str =~ s/=([0-9a-fA-F]{2})/pack("C", hex($1))/ge; # RFC 2047, Q rule 1 $str; } # _encode_B STRING # Private: used by encode_mimeword() to encode "B" encoding. sub _encode_B { my $str = shift; encode_base64($str, ''); } # _encode_Q STRING # Private: used by encode_mimeword() to encode "Q" encoding, which is # almost, but not exactly, quoted-printable. 
# :-P
# Improvement by this module: Spaces are escaped by ``_''.
sub _encode_Q {
    my $str = shift;
    # Restrict characters to those listed in RFC 2047 section 5 (3)
    $str =~ s{[^-!*+/0-9A-Za-z]}{
        $& eq "\x20"? "_": sprintf("=%02X", ord($&))
        }eog;
    $str;
}

#------------------------------

=item decode_mimewords ENCODED, [OPTS...]

I<Function.>
Go through the string looking for RFC 2047-style "Q" (quoted-printable, sort of) or "B" (base64) encoding, and decode them.

B<In an array context,> splits the ENCODED string into a list of decoded C<[DATA, CHARSET]> pairs, and returns that list.
Unencoded data are returned in a 1-element array C<[DATA]>, giving an effective CHARSET of C<undef>.

    $enc = '=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= ';
    foreach (decode_mimewords($enc)) {
        print "", ($_->[1] || 'US-ASCII'), ": ", $_->[0], "\n";
    }

B<**>
However, adjacent encoded-words with the same charset will be concatenated to handle multibyte sequences safely.

B<**>
Language information defined by RFC2231, section 5 will be an additional third element, if any.

B<*>
Whitespace surrounding unencoded data will not be stripped, so that compatibility with L<MIME::Words> will be ensured.

B<In a scalar context,> joins the "data" elements of the above list together, and returns that.
I<This is information-lossy,> and probably I<not> what you want, but if you know that all charsets in the ENCODED string are identical, it might be useful to you.
(Before you use this, please see L, which is probably what you want.)

B<**>
See also "Charset" option below.

In the event of a syntax error, $@ will be set to a description of the error, but parsing will continue as best as possible (so as to get I<something> back when decoding headers).
$@ will be false if no error was detected.

B<*>
Malformed encoded-words will be kept encoded.
In this case $@ will be set.

Any arguments past the ENCODED string are taken to define a hash of options.
B<**>
When Unicode/multibyte support is disabled (see L), these options will not have any effect.

=over 4

=item Charset
B<**>

Name of the character set to which data elements in scalar context will be converted.
The default is no conversion. If this option is specified as special value C<"_UNICODE_">, returned value will be Unicode string. B: This feature is still information-lossy, I when C<"_UNICODE_"> is specified. =item Detect7bit B<**> Try to detect 7-bit charset on unencoded portions. Default is C<"YES">. =cut #=item Field # #Name of the mail field this string came from. I =item Mapping B<**> In scalar context, specify mappings actually used for charset names. C<"EXTENDED"> uses extended mappings. C<"STANDARD"> uses standardized strict mappings. Default is C<"EXTENDED">. =back =cut sub decode_mimewords { my $encstr = shift; my %params = @_; my %Params = &_getparams(\%params, NoDefault => [qw(Charset)], # default is no conv. YesNo => [qw(Detect7bit)], Others => [qw(Mapping)], Obsoleted => [qw(Field)], ToUpper => [qw(Charset Mapping)], ); my $cset = MIME::Charset->new($Params{Charset}, Mapping => $Params{Mapping}); # unfolding: normalize linear-white-spaces and orphan newlines. $encstr =~ s/(?:[\r\n]+[\t ])*[\r\n]+([\t ]|\Z)/$1? " ": ""/eg; $encstr =~ s/[\r\n]+/ /g; my @tokens; $@ = ''; ### error-return ### Decode: my ($word, $charset, $language, $encoding, $enc, $dec); my $spc = ''; pos($encstr) = 0; while (1) { last if (pos($encstr) >= length($encstr)); my $pos = pos($encstr); ### save it ### Case 1: are we looking at "=?..?..?="? if ($encstr =~ m{\G # from where we left off.. =\?([^?]*) # "=?" + charset + \?([bq]) # "?" + encoding + \?([^?]+) # "?" 
+ data maybe with spcs + \?= # "?=" ([\r\n\t ]*) }xgi) { ($word, $charset, $encoding, $enc) = ($&, $1, lc($2), $3); my $tspc = $4; # RFC 2231 section 5 extension if ($charset =~ s/^([^\*]*)\*(.*)/$1/) { $language = $2 || undef; $charset ||= undef; } else { $language = undef; } if ($encoding eq 'q') { $dec = _decode_Q($enc); } else { $dec = _decode_B($enc); } unless (defined $dec) { $@ .= qq|Illegal sequence in "$word" (pos $pos)\n|; push @tokens, [$spc.$word]; $spc = ''; next; } { local $@; if (scalar(@tokens) and lc($charset || "") eq lc($tokens[-1]->[1] || "") and resolve_alias($charset) and (!${tokens[-1]}[2] and !$language or lc(${tokens[-1]}[2]) eq lc($language))) { # Concat words if possible. $tokens[-1]->[0] .= $dec; } elsif ($language) { push @tokens, [$dec, $charset, $language]; } elsif ($charset) { push @tokens, [$dec, $charset]; } else { push @tokens, [$dec]; } $spc = $tspc; } next; } ### Case 2: are we looking at a bad "=?..." prefix? ### We need this to detect problems for case 3, which stops at "=?": pos($encstr) = $pos; # reset the pointer. if ($encstr =~ m{\G=\?}xg) { $@ .= qq|unterminated "=?..?..?=" in "$encstr" (pos $pos)\n|; push @tokens, [$spc.'=?']; $spc = ''; next; } ### Case 3: are we looking at ordinary text? pos($encstr) = $pos; # reset the pointer. if ($encstr =~ m{\G # from where we left off... (.*? # shortest possible string, \n*) # followed by 0 or more NLs, (?=(\Z|=\?)) # terminated by "=?" or EOS }xgs) { length($1) or croak "MIME::EncWords: internal logic err: empty token\n"; push @tokens, [$spc.$1]; $spc = ''; next; } ### Case 4: bug! croak "MIME::EncWords: unexpected case:\n($encstr) pos $pos\n\t". 
"Please alert developer.\n"; } push @tokens, [$spc] if $spc; # Detect 7-bit charset if ($Params{Detect7bit} ne "NO") { local $@; foreach my $t (@tokens) { unless ($t->[0] =~ $UNSAFE or $t->[1]) { my $charset = MIME::Charset::_detect_7bit_charset($t->[0]); if ($charset and $charset ne &MIME::Charset::default()) { $t->[1] = $charset; } } } } if (wantarray) { @tokens; } else { join('', map { &_convert($_->[0], $_->[1], $cset, $Params{Mapping}) } @tokens); } } #------------------------------ # _convert RAW, FROMCHARSET, TOCHARSET, MAPPING # Private: used by decode_mimewords() to convert string by other charset # or to decode to Unicode. # When source charset is unknown and Unicode string is requested, at first # try well-formed UTF-8 then fallback to ISO-8859-1 so that almost all # non-ASCII bytes will be preserved. sub _convert($$$$) { my $s = shift; my $charset = shift; my $cset = shift; my $mapping = shift; return $s unless &MIME::Charset::USE_ENCODE; return $s unless $cset->as_string; croak "unsupported charset ``".$cset->as_string."''" unless $cset->decoder or $cset->as_string eq "_UNICODE_"; local($@); $charset = MIME::Charset->new($charset, Mapping => $mapping); if ($charset->as_string and $charset->as_string eq $cset->as_string) { return $s; } # build charset object to transform string from $charset to $cset. 
$charset->encoder($cset); my $converted = $s; if (is_utf8($s) or $s =~ $WIDECHAR) { if ($charset->output_charset ne "_UNICODE_") { $converted = $charset->encode($s); } } elsif ($charset->output_charset eq "_UNICODE_") { if (!$charset->decoder) { if ($s =~ $UNSAFE) { $@ = ''; eval { $charset = MIME::Charset->new("UTF-8", Mapping => 'STANDARD'); $converted = $charset->decode($converted, FB_CROAK()); }; if ($@) { $converted = $s; $charset = MIME::Charset->new("ISO-8859-1", Mapping => 'STANDARD'); $converted = $charset->decode($converted, 0); } } } else { $converted = $charset->decode($s); } } elsif ($charset->decoder) { $converted = $charset->encode($s); } return $converted; } #------------------------------ =item encode_mimeword RAW, [ENCODING], [CHARSET] I Encode a single RAW "word" that has unsafe characters. The "word" will be encoded in its entirety. ### Encode "<>": $encoded = encode_mimeword("\xABFran\xE7ois\xBB"); You may specify the ENCODING (C<"Q"> or C<"B">), which defaults to C<"Q">. B<**> You may also specify it as ``special'' value: C<"S"> to choose shorter one of either C<"Q"> or C<"B">. You may specify the CHARSET, which defaults to C. B<*> Spaces will be escaped with ``_'' by C<"Q"> encoding. =cut sub encode_mimeword { my $word = shift; my $encoding = uc(shift || 'Q'); # not overridden. my $charset = shift || 'ISO-8859-1'; # ditto. my $language = uc(shift || ""); # ditto. 
if (ref $charset) { if (is_utf8($word) or $word =~ /$WIDECHAR/) { $word = $charset->undecode($word, 0); } $charset = $charset->as_string; } else { $charset = uc($charset); } my $encstr; if ($encoding eq 'Q') { $encstr = &_encode_Q($word); } elsif ($encoding eq "S") { my ($B, $Q) = (&_encode_B($word), &_encode_Q($word)); if (length($B) < length($Q)) { $encoding = "B"; $encstr = $B; } else { $encoding = "Q"; $encstr = $Q; } } else { # "B" $encoding = "B"; $encstr = &_encode_B($word); } if ($language) { return "=?$charset*$language?$encoding?$encstr?="; } else { return "=?$charset?$encoding?$encstr?="; } } #------------------------------ =item encode_mimewords RAW, [OPTS] I Given a RAW string, try to find and encode all "unsafe" sequences of characters: ### Encode a string with some unsafe "words": $encoded = encode_mimewords("Me and \xABFran\xE7ois\xBB"); Returns the encoded string. B<**> RAW may be a Unicode string when Unicode/multibyte support is enabled (see L). Furthermore, RAW may be a reference to that returned by L on array context. In latter case "Charset" option (see below) will be overridden (see also a note below). B: B<*> When RAW is an arrayref, adjacent encoded-words (i.e. elements having non-ASCII charset element) are concatenated. Then they are split taking care of character boundaries of multibyte sequences when Unicode/multibyte support is enabled. Portions for unencoded data should include surrounding whitespace(s), or they will be merged into adjoining encoded-word(s). Any arguments past the RAW string are taken to define a hash of options: =over 4 =item Charset Encode all unsafe stuff with this charset. Default is 'ISO-8859-1', a.k.a. "Latin-1". =item Detect7bit B<**> When "Encoding" option (see below) is specified as C<"a"> and "Charset" option is unknown, try to detect 7-bit charset on given RAW string. Default is C<"YES">. When Unicode/multibyte support is disabled, this option will not have any effects (see L). 
=item Encoding The encoding to use, C<"q"> or C<"b">. B<**> You may also specify ``special'' values: C<"a"> will automatically choose recommended encoding to use (with charset conversion if alternative charset is recommended: see L); C<"s"> will choose shorter one of either C<"q"> or C<"b">. B: B<*> As of release 1.005, The default was changed from C<"q"> (the default on MIME::Words) to C<"a">. =item Field Name of the mail field this string will be used in. B<**> Length of mail field name will be considered in the first line of encoded header. =item Folding B<**> A Sequence to fold encoded lines. The default is C<"\n">. If empty string C<""> is specified, encoded-words exceeding line length (see L below) will be split by SPACE. B: B<*> Though RFC 5322 (formerly RFC 2822) states that the lines in Internet messages are delimited by CRLF (C<"\r\n">), this module chose LF (C<"\n">) as a default to keep backward compatibility. When you use the default, you might need converting newlines before encoded headers are thrown into session. =item Mapping B<**> Specify mappings actually used for charset names. C<"EXTENDED"> uses extended mappings. C<"STANDARD"> uses standardized strict mappings. The default is C<"EXTENDED">. When Unicode/multibyte support is disabled, this option will not have any effects (see L). =item MaxLineLen B<**> Maximum line length excluding newline. The default is 76. Negative value means unlimited line length (as of release 1.012.3). =item Minimal B<**> Takes care of natural word separators (i.e. whitespaces) in the text to be encoded. If C<"NO"> is specified, this module will encode whole text (if encoding needed) not regarding whitespaces; encoded-words exceeding line length will be split based only on their lengths. Default is C<"YES"> by which minimal portions of text are encoded. 
If C<"DISPNAME"> is specified, portions including special characters described in RFC5322 (formerly RFC2822, RFC822) address specification (section 3.4) are also encoded. This is useful for encoding display-name of address fields. B: As of release 0.040, default has been changed to C<"YES"> to ensure compatibility with MIME::Words. On earlier releases, this option was fixed to be C<"NO">. B: C<"DISPNAME"> option was introduced at release 1.012. =item Replacement B<**> See L. =back =cut sub encode_mimewords { my $words = shift; my %params = @_; my %Params = &_getparams(\%params, YesNo => [qw(Detect7bit)], Others => [qw(Charset Encoding Field Folding Mapping MaxLineLen Minimal Replacement)], ToUpper => [qw(Charset Encoding Mapping Minimal Replacement)], ); croak "unsupported encoding ``$Params{Encoding}''" unless $Params{Encoding} =~ /^[ABQS]$/; # newline and following WSP my ($fwsbrk, $fwsspc); if ($Params{Folding} =~ m/^([\r\n]*)([\t ]?)$/) { $fwsbrk = $1; $fwsspc = $2 || " "; } else { croak sprintf "illegal folding sequence ``\\x%*v02X''", '\\x', $Params{Folding}; } # charset objects my $charsetobj = MIME::Charset->new($Params{Charset}, Mapping => $Params{Mapping}); my $ascii = MIME::Charset->new("US-ASCII", Mapping => 'STANDARD'); $ascii->encoder($ascii); # lengths my $firstlinelen = $Params{MaxLineLen} - ($Params{Field}? length("$Params{Field}: "): 0); my $maxrestlen = $Params{MaxLineLen} - length($fwsspc); # minimal encoding flag if (!$Params{Minimal}) { $Params{Minimal} = 'NO'; } elsif ($Params{Minimal} !~ /^(NO|DISPNAME)$/) { $Params{Minimal} = 'YES'; } # unsafe ASCII sequences my $UNSAFEASCII = ($maxrestlen <= 1)? qr{(?: =\? )}ox: qr{(?: =\? | [$PRINTABLE]{$Params{MaxLineLen}} )}x; $UNSAFEASCII = qr{(?: [$DISPNAMESPECIAL] | $UNSAFEASCII )}x if $Params{Minimal} eq 'DISPNAME'; unless (ref($words) eq "ARRAY") { # workaround for UTF-16* & UTF-32*: force UTF-8. 
if ($charsetobj->as_string =~ /$ASCIIINCOMPAT/) { $words = _utf_to_unicode($charsetobj, $words); $charsetobj = MIME::Charset->new('UTF-8'); } my @words = (); # unfolding: normalize linear-white-spaces and orphan newlines. $words =~ s/(?:[\r\n]+[\t ])*[\r\n]+([\t ]|\Z)/$1? " ": ""/eg; $words =~ s/[\r\n]+/ /g; # split if required if ($Params{Minimal} =~ /YES|DISPNAME/) { my ($spc, $unsafe_last) = ('', 0); foreach my $w (split(/([\t ]+)/, $words)) { next unless scalar(@words) or length($w); # skip garbage if ($w =~ /[\t ]/) { $spc = $w; next; } # workaround for ``ASCII transformation'' charsets my $u = $w; if ($charsetobj->as_string =~ /$ASCIITRANS/) { if (MIME::Charset::USE_ENCODE) { if (is_utf8($w) or $w =~ /$WIDECHAR/) { $w = $charsetobj->undecode($u); } else { $u = $charsetobj->decode($w); } } elsif ($w =~ /[+~]/) { #FIXME: for pre-Encode environment $u = "x$w"; } } if (scalar(@words)) { if (($w =~ /$NONPRINT|$UNSAFEASCII/ or $u ne $w) xor $unsafe_last) { if ($unsafe_last) { push @words, $spc.$w; } else { $words[-1] .= $spc; push @words, $w; } $unsafe_last = not $unsafe_last; } else { $words[-1] .= $spc.$w; } } else { push @words, $spc.$w; $unsafe_last = ($w =~ /$NONPRINT|$UNSAFEASCII/ or $u ne $w); } $spc = ''; } if ($spc) { if (scalar(@words)) { $words[-1] .= $spc; } else { # only WSPs push @words, $spc; } } } else { @words = ($words); } $words = [map { [$_, $Params{Charset}] } @words]; } # Translate / concatenate words. 
my @triplets; foreach (@$words) { my ($s, $cset) = @$_; next unless length($s); my $csetobj = MIME::Charset->new($cset || "", Mapping => $Params{Mapping}); # workaround for UTF-16*/UTF-32*: force UTF-8 if ($csetobj->as_string and $csetobj->as_string =~ /$ASCIIINCOMPAT/) { $s = _utf_to_unicode($csetobj, $s); $csetobj = MIME::Charset->new('UTF-8'); } # determine charset and encoding # try defaults only if 7-bit charset detection is not required my $enc; my $obj = $csetobj; unless ($obj->as_string) { if ($Params{Encoding} ne "A" or $Params{Detect7bit} eq "NO" or $s =~ /$UNSAFE/) { $obj = $charsetobj; } } ($s, $cset, $enc) = $obj->header_encode($s, Detect7bit => $Params{Detect7bit}, Replacement => $Params{Replacement}, Encoding => $Params{Encoding}); # Resolve 'S' encoding based on global length. See (*). $enc = 'S' if defined $enc and ($Params{Encoding} eq 'S' or $Params{Encoding} eq 'A' and $obj->header_encoding eq 'S'); # pure ASCII if ($cset eq "US-ASCII" and !$enc and $s =~ /$UNSAFEASCII/) { # pure ASCII with unsafe sequences should be encoded $cset = $csetobj->output_charset || $charsetobj->output_charset || $ascii->output_charset; $csetobj = MIME::Charset->new($cset, Mapping => $Params{Mapping}); # Preserve original Encoding option unless it was 'A'. $enc = ($Params{Encoding} eq 'A') ? ($csetobj->header_encoding || 'Q') : $Params{Encoding}; } else { $csetobj = MIME::Charset->new($cset, Mapping => $Params{Mapping}); } # Now no charset translations are needed. $csetobj->encoder($csetobj); # Concatenate adjacent ``words'' so that multibyte sequences will # be handled safely. # Note: Encoded-word and unencoded text must not adjoin without # separating whitespace(s). 
if (scalar(@triplets)) { my ($last, $lastenc, $lastcsetobj) = @{$triplets[-1]}; if ($csetobj->decoder and ($lastcsetobj->as_string || "") eq $csetobj->as_string and ($lastenc || "") eq ($enc || "")) { $triplets[-1]->[0] .= $s; next; } elsif (!$lastenc and $enc and $last !~ /[\r\n\t ]$/) { if ($last =~ /^(.*)([\r\n\t ])([$PRINTABLE]+)$/s) { $triplets[-1]->[0] = $1.$2; $s = $3.$s; } elsif ($lastcsetobj->as_string eq "US-ASCII") { $triplets[-1]->[0] .= $s; $triplets[-1]->[1] = $enc; $triplets[-1]->[2] = $csetobj; next; } } elsif ($lastenc and !$enc and $s !~ /^[\r\n\t ]/) { if ($s =~ /^([$PRINTABLE]+)([\r\n\t ])(.*)$/s) { $triplets[-1]->[0] .= $1; $s = $2.$3; } elsif ($csetobj->as_string eq "US-ASCII") { $triplets[-1]->[0] .= $s; next; } } } push @triplets, [$s, $enc, $csetobj]; } # (*) Resolve 'S' encoding based on global length. my @s_enc = grep { $_->[1] and $_->[1] eq 'S' } @triplets; if (scalar @s_enc) { my $enc; my $b = scalar grep { $_->[1] and $_->[1] eq 'B' } @triplets; my $q = scalar grep { $_->[1] and $_->[1] eq 'Q' } @triplets; # 'A' chooses 'B' or 'Q' when all other encoded-words have same enc. if ($Params{Encoding} eq 'A' and $b and ! $q) { $enc = 'B'; } elsif ($Params{Encoding} eq 'A' and ! $b and $q) { $enc = 'Q'; # Otherwise, assuming 'Q', when characters to be encoded are more than # 6th of total (plus a little fraction), 'B' will win. # Note: This might give 'Q' so great advantage... } else { my @no_enc = grep { ! $_->[1] } @triplets; my $total = length join('', map { $_->[0] } (@no_enc, @s_enc)); my $q = scalar(() = join('', map { $_->[0] } @s_enc) =~ m{[^- !*+/0-9A-Za-z]}g); if ($total + 8 < $q * 6) { $enc = 'B'; } else { $enc = 'Q'; } } foreach (@triplets) { $_->[1] = $enc if $_->[1] and $_->[1] eq 'S'; } } # chop leading FWS while (scalar(@triplets) and $triplets[0]->[0] =~ s/^[\r\n\t ]+//) { shift @triplets unless length($triplets[0]->[0]); } # Split long ``words''. 
    my @splitwords;
    my $restlen;
    if ($Params{MaxLineLen} < 0) {
        @splitwords = @triplets;
    } else {
        $restlen = $firstlinelen;
        foreach (@triplets) {
            my ($s, $enc, $csetobj) = @$_;

            my @s = &_split($s, $enc, $csetobj, $restlen, $maxrestlen);
            push @splitwords, @s;
            my ($last, $lastenc, $lastcsetobj) = @{$s[-1]};
            my $lastlen;
            if ($lastenc) {
                $lastlen = $lastcsetobj->encoded_header_len($last, $lastenc);
            } else {
                $lastlen = length($last);
            }
            $restlen = $maxrestlen if scalar @s > 1; # has split; new line(s) fed
            $restlen -= $lastlen;
            $restlen = $maxrestlen if $restlen <= 1;
        }
    }

    # Do encoding.
    my @lines;
    $restlen = $firstlinelen;
    foreach (@splitwords) {
        my ($str, $encoding, $charsetobj) = @$_;
        next unless length($str);

        my $s;
        if (!$encoding) {
            $s = $str;
        } else {
            $s = encode_mimeword($str, $encoding, $charsetobj);
        }

        my $spc = (scalar(@lines) and $lines[-1] =~ /[\r\n\t ]$/ or
                   $s =~ /^[\r\n\t ]/)? '': ' ';
        if (!scalar(@lines)) {
            push @lines, $s;
        } elsif ($Params{MaxLineLen} < 0) {
            $lines[-1] .= $spc.$s;
        } elsif (length($lines[-1].$spc.$s) <= $restlen) {
            $lines[-1] .= $spc.$s;
        } else {
            if ($lines[-1] =~ s/([\r\n\t ]+)$//) {
                $s = $1.$s;
            }
            $s =~ s/^[\r\n]*[\t ]//; # strip only one WSP replaced by FWS
            push @lines, $s;
            $restlen = $maxrestlen;
        }
    }

    join($fwsbrk.$fwsspc, @lines);
}

#------------------------------

# _split RAW, ENCODING, CHARSET_OBJECT, ROOM_OF_FIRST_LINE, MAXRESTLEN
# Private: used by encode_mimewords() to split a string into
# (encoded or non-encoded) words.
# Returns an array of arrayrefs [SUBSTRING, ENCODING, CHARSET].
sub _split {
    my $str = shift;
    my $encoding = shift;
    my $charset = shift;
    my $restlen = shift;
    my $maxrestlen = shift;

    if (!$charset->as_string or $charset->as_string eq '8BIT') {# Undecodable.
        $str =~ s/[\r\n]+[\t ]*|\x00/ /g;       # Eliminate hostile characters.
        return ([$str, undef, $charset]);
    }
    if (!$encoding and $charset->as_string eq 'US-ASCII') { # Pure ASCII.
        return &_split_ascii($str, $restlen, $maxrestlen);
    }
    if (!$charset->decoder and MIME::Charset::USE_ENCODE) { # Unsupported.
        return ([$str, $encoding, $charset]);
    }

    my (@splitwords, $ustr, $first);
    while (length($str)) {
        if ($charset->encoded_header_len($str, $encoding) <= $restlen) {
            push @splitwords, [$str, $encoding, $charset];
            last;
        }
        $ustr = $str;
        if (!(is_utf8($ustr) or $ustr =~ /$WIDECHAR/) and
            MIME::Charset::USE_ENCODE) {
            $ustr = $charset->decode($ustr);
        }
        ($first, $str) = &_clip_unsafe($ustr, $encoding, $charset, $restlen);
        # retry splitting if failed
        if ($first and !$str and
            $maxrestlen < $charset->encoded_header_len($first, $encoding)) {
            ($first, $str) = &_clip_unsafe($ustr, $encoding, $charset,
                                           $maxrestlen);
        }
        push @splitwords, [$first, $encoding, $charset];
        $restlen = $maxrestlen;
    }
    return @splitwords;
}

# _split_ascii RAW, ROOM_OF_FIRST_LINE, MAXRESTLEN
# Private: used by encode_mimewords() to split an US-ASCII string into
# (encoded or non-encoded) words.
# Returns an array of arrayrefs [SUBSTRING, undef, "US-ASCII"].
sub _split_ascii {
    my $s = shift;
    my $restlen = shift;
    my $maxrestlen = shift;
    $restlen ||= $maxrestlen;

    my @splitwords;
    my $ascii = MIME::Charset->new("US-ASCII", Mapping => 'STANDARD');
    foreach my $line (split(/(?:[\t ]*[\r\n]+)+/, $s)) {
        my $spc = '';
        foreach my $word (split(/([\t ]+)/, $line)) {
            next unless scalar(@splitwords) or $word; # skip first garbage
            if ($word =~ /[\t ]/) {
                $spc = $word;
                next;
            }

            my $cont = $spc.$word;
            my $elen = length($cont);
            next unless $elen;
            if (scalar(@splitwords)) {
                # Concatenate adjacent words so that encoded-word and
                # unencoded text will adjoin with separating whitespace.
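                # Worked example (added illustration with hypothetical
                # numbers, not taken from the test suite): suppose
                # $restlen = 10 and $maxrestlen = 75.  A chunk
                # $cont = " world" has $elen = 6 <= 10, so it is appended
                # to the previous element and $restlen drops to 4.  The
                # next chunk " again!" has $elen = 7 > 4, so it starts a
                # new element and $restlen becomes 75 - 7 = 68.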
                if ($elen <= $restlen) {
                    $splitwords[-1]->[0] .= $cont;
                    $restlen -= $elen;
                } else {
                    push @splitwords, [$cont, undef, $ascii];
                    $restlen = $maxrestlen - $elen;
                }
            } else {
                push @splitwords, [$cont, undef, $ascii];
                $restlen -= $elen;
            }
            $spc = '';
        }
        if ($spc) {
            if (scalar(@splitwords)) {
                $splitwords[-1]->[0] .= $spc;
                $restlen -= length($spc);
            } else { # only WSPs
                push @splitwords, [$spc, undef, $ascii];
                $restlen = $maxrestlen - length($spc);
            }
        }
    }
    return @splitwords;
}

# _clip_unsafe UNICODE, ENCODING, CHARSET_OBJECT, ROOM_OF_FIRST_LINE
# Private: used by encode_mimewords() to bite off one encodable
# ``word'' from a Unicode string.
# Note: When Unicode/multibyte support is not enabled, character
# boundaries of multibyte string shall be broken!
sub _clip_unsafe {
    my $ustr = shift;
    my $encoding = shift;
    my $charset = shift;
    my $restlen = shift;
    return ("", "") unless length($ustr);

    # Seek maximal division point.
    my ($shorter, $longer) = (0, length($ustr));
    while ($shorter < $longer) {
        my $cur = ($shorter + $longer + 1) >> 1;
        my $enc = substr($ustr, 0, $cur);
        if (MIME::Charset::USE_ENCODE ne '') {
            $enc = $charset->undecode($enc);
        }
        my $elen = $charset->encoded_header_len($enc, $encoding);
        if ($elen <= $restlen) {
            $shorter = $cur;
        } else {
            $longer = $cur - 1;
        }
    }

    # Make sure that combined characters won't be divided.
    my ($fenc, $renc);
    my $max = length($ustr);
    while (1) {
        $@ = '';
        eval {
            ($fenc, $renc) =
                (substr($ustr, 0, $shorter), substr($ustr, $shorter));
            if (MIME::Charset::USE_ENCODE ne '') {
                # FIXME: croak if $renc =~ /^\p{M}/
                $fenc = $charset->undecode($fenc, FB_CROAK());
                $renc = $charset->undecode($renc, FB_CROAK());
            }
        };
        last unless ($@);

        $shorter++;
        unless ($shorter < $max) { # Unencodable character(s) may be included.
            return ($charset->undecode($ustr), "");
        }
    }

    if (length($fenc)) {
        return ($fenc, $renc);
    } else {
        return ($renc, "");
    }
}

#------------------------------

# _getparams HASHREF, OPTS
# Private: used to get option parameters.
sub _getparams {
    my $params = shift;
    my %params = @_;
    my %Params;
    my %GotParams;
    foreach my $k (qw(NoDefault YesNo Others Obsoleted ToUpper)) {
        $Params{$k} = $params{$k} || [];
    }
    foreach my $k (keys %$params) {
        my $supported = 0;
        foreach my $i (@{$Params{NoDefault}}, @{$Params{YesNo}},
                       @{$Params{Others}}, @{$Params{Obsoleted}}) {
            if (lc $i eq lc $k) {
                $GotParams{$i} = $params->{$k};
                $supported = 1;
                last;
            }
        }
        carp "unknown or deprecated option ``$k''" unless $supported;
    }
    # get defaults
    foreach my $i (@{$Params{YesNo}}, @{$Params{Others}}) {
        $GotParams{$i} = $Config->{$i} unless defined $GotParams{$i};
    }
    # yesno params
    foreach my $i (@{$Params{YesNo}}) {
        if (!$GotParams{$i} or uc $GotParams{$i} eq "NO") {
            $GotParams{$i} = "NO";
        } else {
            $GotParams{$i} = "YES";
        }
    }
    # normalize case
    foreach my $i (@{$Params{ToUpper}}) {
        $GotParams{$i} &&= uc $GotParams{$i};
    }

    return %GotParams;
}

#------------------------------

=back

=head2 Configuration Files

B<**>
Built-in defaults of option parameters for L
(except 'Charset' option) and L
can be overridden by configuration files:
F and F.
For more details read F.

=head1 VERSION

Consult the C<$VERSION> variable.

Development versions of this module may be found at
L.

=head1 SEE ALSO

L,
L

=head1 AUTHORS

The original version of the function decode_mimewords() is derived from
the L module, which was written by:
    Eryq (F), ZeeGee Software Inc (F).
    David F. Skoll (dfs@roaringpenguin.com) http://www.roaringpenguin.com

Other parts were rewritten or added by:
    Hatuka*nezumi - IKEDA Soji.

This program is free software; you can redistribute
it and/or modify it under the same terms as Perl itself.

=cut

1;
MIME-EncWords-1.014.2/lib/MIME/EncWords/Defaults.pm.sample
#-*- perl -*-

package MIME::EncWords;

=head1 NAME

MIME::EncWords::Defaults - Configuration for MIME::EncWords

=head1 SYNOPSIS

Edit this file and place it at MIME/EncWords/Defaults.pm to activate
custom settings.

=head1 DESCRIPTION

The following settings are derived from MIME/Charset/Defaults.pm, but you
may override them locally for this module.

=over 4

=item Detect7bit

=item Mapping

=item Replacement

=back

The following settings are specific to this module.

=over 4

=item Charset

=item Encoding

=item Field

=item Folding

=item MaxLineLen

=item Minimal

=back

=head1 SEE ALSO

L

=cut

#--------------------------------------------------------------------------#
# Add your own settings below.
#--------------------------------------------------------------------------#

## Default settings on MIME::Charset are:
# $Config->{Detect7bit} = 'YES';
# $Config->{Mapping} = 'EXTENDED';
# $Config->{Replacement} = 'DEFAULT';

## Default settings on current release are:
# $Config->{Charset} = 'ISO-8859-1';
# $Config->{Encoding} = 'A';
# $Config->{Field} = undef;
# $Config->{Folding} = "\n";
# $Config->{MaxLineLen} = 76;
# $Config->{Minimal} = 'YES';

1;
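## Example of local overrides (an added illustration; the values are
## hypothetical choices, not recommendations).  To try them, copy the
## wanted lines above the final "1;" and uncomment them:
# $Config->{Charset} = 'UTF-8';    # charset used for unsafe text
# $Config->{Folding} = "\r\n";     # fold with CRLF instead of LF
# $Config->{MaxLineLen} = -1;      # negative value: no line length limit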
MIME-EncWords-1.014.2/lib/POD2/JA/Encode/MIME/EncWords.pod
use utf8;

=encoding utf-8

=head1 NAME

Encode::MIME::EncWords~[ja] - MIME 'B' and 'Q' header encoding (alternative)

=head1 SYNOPSIS

    use Encode::MIME::EncWords;
    use Encode qw/encode decode/;

    # Decode a header.
    $utf8 = decode('MIME-EncWords', $header);

    # Encode a header with the default charset (UTF-8).
    $header = encode('MIME-EncWords', $utf8);

    # Encode a header with another charset.
    Encode::MIME::EncWords->config(Charset => 'GB2312');
    $header = encode('MIME-EncWords', $utf8);

=head1 ABSTRACT

This module implements the MIME header encoding described in RFC 2047.
There are three variants of the encoding name and one shorthand that is
specific to a charset.

  Encoding name              Result of encode()     Comment
  ------------------------------------------------------------------
  MIME-EncWords              (auto-detect B or Q)
  MIME-EncWords-B            =?XXXX?B?...?=         Default is UTF-8.
  MIME-EncWords-Q            =?XXXX?Q?...?=                ,,
  MIME-EncWords-ISO_2022_JP  =?ISO-2022-JP?B?...?=

The result of decode() is the same for every encoding.

=head1 DESCRIPTION

This module is intended to be an alternative to the C encoding provided
by the L core module.
To learn how to use this module in detail, see L.

=head2 Module specific feature

=over 4

=item config(KEY => VALUE, ...);

I<Class method.>
Set options by KEY => VALUE pairs.
The following options are available.

=over 4

=item Charset

[encode]
Name of the character set by which data elements are converted.
Default is C<"UTF-8">.
It is fixed to C<"ISO-2022-JP"> for C.

=item Detect7bit

[decode/encode]
Try to detect the 7-bit charset of unencoded portions.
Default is C<"YES">.

=item Field

[encode]
Name of the header field. Its length is taken into account on the first
line of the encoded result.
Default is C<undef>.

=item Mapping

[decode/encode]
Specify the mapping actually used for charset names.
Default is
C<"EXTENDED">.

=item MaxLineLen

[encode]
Maximum line length, excluding the newline.
Default is C<76>.

=item Minimal

[encode]
Whether to minimize the encoded portions or not.
Default is C<"YES">.

=back

For details of the options, see L.

=back

=head1 CAVEAT

=over 4

=item *

The encoding modules for MIME header encoding are not a magic wand that
easily builds complicated header fields or extracts the desired items
from them.

To decode address header fields (To:, From: and so on), first parse the
mailbox-list, then decode each element with the encoding module.
To encode them, encode each element with the encoding module, then
assemble the mailbox-list from the encoded elements.
Modules such as L may be useful for building and parsing a mailbox-list.

=item *

Lines are delimited with LF (C<"\n">).
RFC5322 states that lines in Internet messages are delimited with CRLF
(C<"\r\n">).

=back

=head1 BUGS

Please report bugs or buggy behaviors to the developer.

CPAN Request Tracker:
L.

=head1 VERSION

Consult the C<$VERSION> variable.

B<This is an experimental release>. Specifications may change in the near
future.

Development versions of this package may be found at
L.

=head1 SEE ALSO

L,
L,
L.

RFC 2047 I.

=head1 AUTHOR

Hatuka*nezumi - IKEDA Soji

=head1 COPYRIGHT

Copyright (C) 2011 Hatuka*nezumi - IKEDA Soji.

This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

=cut
MIME-EncWords-1.014.2/lib/POD2/JA/MIME/EncWords.pod
=encoding utf-8

=head1 NAME

MIME::EncWords~[ja] - deal with RFC 2047 encoded words (improved)

=head1 SYNOPSIS

I<L<MIME::EncWords> is another implementation of L<MIME::Words> that aims
at closer conformance with the RFC 2047 (formerly RFC 1522) specifications.
Additionally, it contains some improvements.
The following synopsis and descriptions are taken from the original
MIME::Words, with descriptions of improvements (B<**>) and of changes and
clarifications (B<*>) added.>

Before reading further, you should see L to make sure that you understand
where this module fits into what you are trying to accomplish.
Go on, do it now. I'll wait.

Ready? Ok...

    use MIME::EncWords qw(:all);

    ### Decode the string into another string, ignoring the charsets:
    $decoded = decode_mimewords(
          'To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= ',
          );

    ### Split the string into an array of decoded [DATA,CHARSET] pairs:
    @decoded = decode_mimewords(
          'To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= ',
          );

    ### Encode a single unsafe word:
    $encoded = encode_mimeword("\xABFran\xE7ois\xBB");

    ### Encode a string, trying to find the unsafe words inside it:
    $encoded = encode_mimewords("Me and \xABFran\xE7ois\xBB in town");

=head1 DESCRIPTION

Fellow Americans, you probably won't know what on earth this module
is for. Europeans, Russians, et al, you probably do. C<(:-)>.

For example, here's a valid MIME header:

      From: =?US-ASCII?Q?Keith_Moore?= 
      To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= 
      CC: =?ISO-8859-1?Q?Andr=E9_?= Pirard 
      Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
       =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=
       =?US-ASCII?Q?.._cool!?=

These fields decode roughly as follows:

      From: Keith Moore 
      To: Keld Jørn Simonsen 
      CC: André Pirard 
      Subject: If you can read this you understand the example... cool!

B<Supplement>: Fellow Americans, Europeans, you probably won't know
what on earth this module is for. East Asians, et al, you probably do.
C<(^_^)>.

For example, here's a valid MIME header:

      Subject: =?EUC-KR?B?sNTAuLinKGxhemluZXNzKSwgwvzB9ri7seIoaW1w?=
       =?EUC-KR?B?YXRpZW5jZSksILGzuLgoaHVicmlzKQ==?=

These fields decode roughly as follows:

      Subject: 게으름(laziness), 참지말기(impatience), 교만(hubris)

=head1 PUBLIC INTERFACE

=over 4

=cut

=item decode_mimewords ENCODED, [OPTS...]

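I<Function.>
Go through the string looking for RFC 2047-style "Q"
(quoted-printable, sort of) or "B" (base64) encoding, and decode them.

B<In an array context,> splits the ENCODED string into a list of decoded
C<[DATA, CHARSET]> pairs, and returns that list. Unencoded
data are returned in a 1-element array C<[DATA]>, giving an effective
CHARSET of C<undef>.

    $enc = '=?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= ';
    foreach (decode_mimewords($enc)) {
        print "", ($_->[1] || 'US-ASCII'), ": ", $_->[0], "\n";
    }

B<**>
However, adjacent encoded-words with the same charset are concatenated
so that multibyte sequences are handled safely.

B<**>
Language information defined by RFC2231, section 5, is added as a third
element, if any.

B<*>
Whitespace surrounding unencoded data is not stripped, so that
compatibility with L is ensured.

B<In a scalar context,> joins all the DATA elements of the above
list together, and returns that. I<Warning: this is information-lossy,>
and may I<not> give the result you want. However, if you know that all
charsets in the ENCODED string are identical, it might be useful to you.
(Before you use this, please see L,
which may be what you want.)
B<**>
See also the "Charset" option below.

In the event of a syntax error, $@ will be set to a description
of the error, but parsing will continue as best as possible (so as to
get I<something> back when decoding headers).
$@ will be false if no error was detected.

B<*>
Malformed encoded-words are returned still encoded.
In this case $@ will be set.

Any arguments past the ENCODED string are taken to define a hash of options.
B<**>
When Unicode/multibyte support is disabled (see L),
the following options have no effect.

=over 4

=item Charset

B<**>
Name of the character set by which DATA elements are converted in scalar
context.
The special value C<"_UNICODE_"> makes the returned value a Unicode string.

B<Note>:
This feature is still information-lossy, I<except> when C<"_UNICODE_"> is
specified.

=item Detect7bit

B<**>
Try to detect the 7-bit charset of unencoded portions.
Default is C<"YES">.

=cut

=item Mapping

B<**>
In scalar context, specify the mapping actually used for charset names.
C<"EXTENDED"> uses extended mappings.
C<"STANDARD"> uses standardized strict mappings.
Default is C<"EXTENDED">.

=back

=cut

=item encode_mimeword RAW, [ENCODING], [CHARSET]

I<Function.>
Encode a single RAW "word" that has unsafe characters.
The "word" will be encoded in its entirety.

    ### Encode "«François»":
    $encoded = encode_mimeword("\xABFran\xE7ois\xBB");

You may specify the ENCODING (C<"Q"> or C<"B">), which defaults to C<"Q">.
B<**>
You may also specify a "special" value: C<"S"> chooses the shorter
of C<"Q"> and C<"B">.

You may specify the CHARSET, which defaults to C.

B<*>
Spaces are escaped with ``_'' by C<"Q"> encoding.

=cut

=item encode_mimewords RAW, [OPTS]

I<Function.>
Given a RAW string, try to find and encode all "unsafe" sequences
of characters:

    ### Encode a string with some unsafe "words":
    $encoded = encode_mimewords("Me and \xABFran\xE7ois\xBB");
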
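Returns the encoded string.

B<**>
RAW may be a Unicode string when Unicode/multibyte support is enabled
(see L).
Furthermore, RAW may be a reference to the list that
L returns in array context. In the latter case the "Charset"
option (see below) is overridden as appropriate (see also the note below).

B<Note>:
B<*>
When RAW is an arrayref,
adjacent encoded-words (i.e. elements having a non-ASCII charset element)
are concatenated. Then they are split, taking care of the character
boundaries of multibyte sequences (only when Unicode/multibyte support is
enabled).
Portions of unencoded data need surrounding whitespace;
otherwise they are merged into the adjoining encoded-words.

Any arguments past the RAW string are taken to define a hash of options:

=over 4

=item Charset

Encode all unsafe stuff with this charset.
Default is 'ISO-8859-1' (a.k.a. "Latin-1").

=item Detect7bit

B<**>
When the "Encoding" option (see below) is specified as C<"a"> and the
"Charset" option is unknown, try to detect the 7-bit charset of the given
RAW string.
Default is C<"YES">.
When Unicode/multibyte support is disabled (see L),
this option has no effect.

=item Encoding

The encoding to use, C<"q"> or C<"b">.
B<**>
You may also specify "special" values: C<"a"> automatically chooses the
recommended encoding (with charset conversion if an alternative charset is
recommended: see L);
C<"s"> chooses the shorter of C<"q"> and C<"b">.
B<Note>:
B<*>
As of release 1.005, the default was changed from C<"q">
(the default on MIME::Words) to C<"a">.

=item Field

Name of the mail field this string will be used in.
B<**>
The length of the mail field name is taken into account on the first line
of the encoded header.

=item Folding

B<**>
A string used to fold encoded lines.
Default is C<"\n">.
If the empty string C<""> is specified, encoded-words exceeding the line
length (see L below) are only split by SPACE.

B<Note>:
B<*>
Though RFC 5322 (formerly RFC 2822) states that lines in Internet messages
are delimited by CRLF (C<"\r\n">),
this module has kept LF (C<"\n">) as the default to maintain backward
compatibility.
When you use the default, you may need to convert newlines
before the encoded headers are sent out into a session.

=item Mapping

B<**>
Specify the mapping actually used for charset names.
C<"EXTENDED"> uses extended mappings.
C<"STANDARD"> uses standardized strict mappings.
Default is C<"EXTENDED">.
When Unicode/multibyte support is disabled (see L),
this option has no effect.

=item MaxLineLen

B<**>
Maximum line length, excluding the newline.
Default is 76.
A negative value means unlimited line length (as of release 1.012.3).

=item Minimal

B<**>
Takes care of natural word separators (i.e. whitespace)
in the text to be encoded.
If C<"NO"> is specified, this module encodes the whole text
(if encoding is needed) without regard to whitespace;
encoded-words exceeding the line length are split based only on their
lengths.
Default is C<"YES">, by which only minimal portions of the text are encoded.
If C<"DISPNAME"> is specified, portions including the special characters
described in the address specification (section 3.4) of RFC5322
(formerly RFC2822, RFC822) are also encoded.
This is useful for encoding the display-name of address fields.

B<Note>:
As of release 0.040, the default was changed to C<"YES"> to ensure
compatibility with MIME::Words.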
On earlier releases, this option was fixed to C<"NO">.

B<Note>:
C<"DISPNAME"> was introduced in release 1.012.

=item Replacement

B<**>
See L.

=back

=cut

=back

=head2 Configuration Files

B<**>
Built-in defaults of option parameters for L
(except 'Charset' option) and L
can be overridden by configuration files:
F and F.
For more details read F.

=head1 VERSION

Consult the C<$VERSION> variable.

Development versions of this module may be found at
L.

=head1 SEE ALSO

L,
L

=head1 AUTHORS

The original version of the function decode_mimewords() is derived from
the L module, which was written by:
    Eryq (F), ZeeGee Software Inc (F).
    David F. Skoll (dfs@roaringpenguin.com) http://www.roaringpenguin.com

Other parts were rewritten or added by:
    Hatuka*nezumi - IKEDA Soji.

This program is free software; you can redistribute
it and/or modify it under the same terms as Perl itself.

=cut
MIME-EncWords-1.014.2/t/01decode.t
use strict;
use Test;

BEGIN { plan tests => ($] >= 5.007003)? 48: 16 }

use MIME::EncWords qw(decode_mimewords);
$MIME::EncWords::Config = {
    Detect7bit => 'YES',
    Mapping => 'EXTENDED',
    Replacement => 'DEFAULT',
    Charset => 'ISO-8859-1',
    Encoding => 'A',
    Field => undef,
    Folding => "\n",
    MaxLineLen => 76,
    Minimal => 'YES',
};

if (&MIME::Charset::USE_ENCODE && $] < 5.008) {
    require Encode::KR;
}

my @testins = qw(decode-singlebyte decode-multibyte decode-ascii);

{
  local($/) = '';
  foreach my $in (@testins) {
    open WORDS, ") {
        s{\A\s+|\s+\Z}{}g;    # trim

        my ($isgood, $expect, $enc) = split /\n/, $_, 3;
        my ($charset, $ucharset);
        $isgood = (uc($isgood) eq 'GOOD');
        ($expect, $charset, $ucharset) = eval $expect;

        # Convert to raw data...
        my $dec = decode_mimewords($enc);
        ok((($isgood && !$@) or (!$isgood && $@)) and
           ($isgood ? ($dec eq $expect) : 1));
        if (MIME::Charset::USE_ENCODE ne '') {
            my $u;
            # Convert to other charset (or no conversion)...
            $u = $expect;
            Encode::from_to($u, $charset, "utf-8") if $charset;
            $dec = decode_mimewords($enc, Charset => $charset?
"utf-8": ""); ok((($isgood && !$@) or (!$isgood && $@)) and ($isgood ? ($dec eq $u) : 1)); # Convert to Unicode... $u = Encode::decode($charset || $ucharset || "us-ascii", $expect); $dec = decode_mimewords($enc, Charset => "_UNICODE_"); ok((($isgood && !$@) or (!$isgood && $@)) and ($isgood ? ($dec eq $u) : 1)); } } close WORDS; } } 1; MIME-EncWords-1.014.2/t/02encode.t000064400000000000000000000020411220667516400170240ustar00viewvcviewvc00000010000000use strict; use Test; BEGIN { plan tests => ($] >= 5.007003)? 32: 12 } use MIME::EncWords qw(encode_mimewords); $MIME::EncWords::Config = { Detect7bit => 'YES', Mapping => 'EXTENDED', Replacement => 'DEFAULT', Charset => 'ISO-8859-1', Encoding => 'A', Field => undef, Folding => "\n", MaxLineLen => 76, Minimal => 'YES', }; if (&MIME::Charset::USE_ENCODE && $] < 5.008) { require Encode::JP; require Encode::CN; } my @testins = MIME::Charset::USE_ENCODE? qw(encode-singlebyte encode-multibyte encode-ascii encode-utf-8): qw(encode-singlebyte); { local($/) = ''; foreach my $in (@testins) { open WORDS, ") { s{\A\s+|\s+\Z}{}g; # trim my ($isgood, $dec, $expect) = split /\n/, $_, 3; $isgood = (uc($isgood) eq 'GOOD'); my @params = eval $dec; my $enc = encode_mimewords(@params); ok((($isgood && !$@) or (!$isgood && $@)) && ($isgood ? $enc : $expect), $expect, $@ || $enc); } close WORDS; } } 1; MIME-EncWords-1.014.2/t/03Encode-MIME-EncWords.t000064400000000000000000000173751220667516400212540ustar00viewvcviewvc00000010000000# -*- perl -*- # # Borrowed from mime-header.t in Encode module by DANKOGAI@CPAN. # Modified for Encode::MIME::EncWords by NEZUMI@CPAN. 
# no utf8; use strict; use Test::More; BEGIN { if (ord("A") == 193) { plan skip_all => 'No Encode::MIME::EncWords on EBCDIC Platforms'; } elsif ($] < 5.007003) { plan skip_all => 'Unicode/multibyte support is not available'; } else { plan tests => 18; } $| = 1; } use_ok("Encode::MIME::EncWords"); my $eheader =<<'EOS'; From: =?US-ASCII?Q?Keith_Moore?= To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= CC: =?ISO-8859-1?Q?Andr=E9?= Pirard Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?= =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?= EOS my $dheader=<<"EOS"; From: Keith Moore To: Keld J\xF8rn Simonsen CC: Andr\xE9 Pirard Subject: If you can read this you understand the example. EOS is(Encode::decode('MIME-EncWords', $eheader), $dheader, "decode ASCII (RFC2047)"); my $uheader =<<'EOS'; From: =?US-ASCII?Q?Keith_Moore?= To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= CC: =?ISO-8859-1?Q?Andr=E9?= Pirard Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?= =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?= EOS is(Encode::decode('MIME-EncWords', $uheader), $dheader, "decode UTF-8 (RFC2047)"); my $lheader =<<'EOS'; From: =?US-ASCII*en-US?Q?Keith_Moore?= To: =?ISO-8859-1*da-DK?Q?Keld_J=F8rn_Simonsen?= CC: =?ISO-8859-1*fr-BE?Q?Andr=E9?= Pirard Subject: =?ISO-8859-1*en?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?= =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?= EOS is(Encode::decode('MIME-EncWords', $lheader), $dheader, "decode language tag (RFC2231)"); $dheader=Encode::decode_utf8(<<"EOS"); From: \xe5\xb0\x8f\xe9\xa3\xbc \xe5\xbc\xbe To: dankogai\@dan.co.jp (\xe5\xb0\x8f\xe9\xa3\xbc=Kogai, \xe5\xbc\xbe=Dan) Subject: 
\xe6\xbc\xa2\xe5\xad\x97\xe3\x80\x81\xe3\x82\xab\xe3\x82\xbf\xe3\x82\xab\xe3\x83\x8a\xe3\x80\x81\xe3\x81\xb2\xe3\x82\x89\xe3\x81\x8c\xe3\x81\xaa\xe3\x82\x92\xe5\x90\xab\xe3\x82\x80\xe3\x80\x81\xe9\x9d\x9e\xe5\xb8\xb8\xe3\x81\xab\xe9\x95\xb7\xe3\x81\x84\xe3\x82\xbf\xe3\x82\xa4\xe3\x83\x88\xe3\x83\xab\xe8\xa1\x8c\xe3\x81\x8c\xe4\xb8\x80\xe4\xbd\x93\xe5\x85\xa8\xe4\xbd\x93\xe3\x81\xa9\xe3\x81\xae\xe3\x82\x88\xe3\x81\x86\xe3\x81\xab\xe3\x81\x97\xe3\x81\xa6Encode\xe3\x81\x95\xe3\x82\x8c\xe3\x82\x8b\xe3\x81\xae\xe3\x81\x8b\xef\xbc\x9f EOS #my $bheader =<<'EOS'; #From:=?UTF-8?B?IOWwj+mjvCDlvL4g?= #To: dankogai@dan.co.jp (=?UTF-8?B?5bCP6aO8?==Kogai,=?UTF-8?B?IOW8vg==?== # Dan) #Subject: # =?UTF-8?B?IOa8ouWtl+OAgeOCq+OCv+OCq+ODiuOAgeOBsuOCieOBjOOBquOCkuWQq+OCgA==?= # =?UTF-8?B?44CB6Z2e5bi444Gr6ZW344GE44K/44Kk44OI44Or6KGM44GM5LiA5L2T5YWo?= # =?UTF-8?B?5L2T44Gp44Gu44KI44GG44Gr44GX44GmRW5jb2Rl44GV44KM44KL44Gu44GL?= # =?UTF-8?B?77yf?= #EOS my $bheader =<<'EOS'; From: =?UTF-8?B?5bCP6aO8IOW8vg==?= To: dankogai@dan.co.jp =?UTF-8?B?KOWwj+mjvD1Lb2dhaSwg5by+PURhbik=?= Subject: =?UTF-8?B?5ryi5a2X44CB44Kr44K/44Kr44OK44CB44Gy44KJ44GM44Gq44KS?= =?UTF-8?B?5ZCr44KA44CB6Z2e5bi444Gr6ZW344GE44K/44Kk44OI44Or6KGM44GM5LiA?= =?UTF-8?B?5L2T5YWo5L2T44Gp44Gu44KI44GG44Gr44GX44GmRW5jb2Rl44GV44KM44KL?= =?UTF-8?B?44Gu44GL77yf?= EOS #my $qheader=<<'EOS'; #From:=?UTF-8?Q?=20=E5=B0=8F=E9=A3=BC=20=E5=BC=BE=20?= #To: dankogai@dan.co.jp (=?UTF-8?Q?=E5=B0=8F=E9=A3=BC?==Kogai, # =?UTF-8?Q?=20=E5=BC=BE?==Dan) #Subject: # =?UTF-8?Q?=20=E6=BC=A2=E5=AD=97=E3=80=81=E3=82=AB=E3=82=BF=E3=82=AB?= # =?UTF-8?Q?=E3=83=8A=E3=80=81=E3=81=B2=E3=82=89=E3=81=8C=E3=81=AA=E3=82=92?= # =?UTF-8?Q?=E5=90=AB=E3=82=80=E3=80=81=E9=9D=9E=E5=B8=B8=E3=81=AB=E9=95=B7?= # =?UTF-8?Q?=E3=81=84=E3=82=BF=E3=82=A4=E3=83=88=E3=83=AB=E8=A1=8C=E3=81=8C?= # =?UTF-8?Q?=E4=B8=80=E4=BD=93=E5=85=A8=E4=BD=93=E3=81=A9=E3=81=AE=E3=82=88?= # =?UTF-8?Q?=E3=81=86=E3=81=AB=E3=81=97=E3=81=A6Encode=E3=81=95?= # 
=?UTF-8?Q?=E3=82=8C=E3=82=8B=E3=81=AE=E3=81=8B=EF=BC=9F?= #EOS my $qheader=<<'EOS'; From: =?UTF-8?Q?=E5=B0=8F=E9=A3=BC_=E5=BC=BE?= To: dankogai@dan.co.jp =?UTF-8?Q?=28=E5=B0=8F=E9=A3=BC=3DKogai=2C_?= =?UTF-8?Q?=E5=BC=BE=3DDan=29?= Subject: =?UTF-8?Q?=E6=BC=A2=E5=AD=97=E3=80=81=E3=82=AB=E3=82=BF=E3=82=AB?= =?UTF-8?Q?=E3=83=8A=E3=80=81=E3=81=B2=E3=82=89=E3=81=8C=E3=81=AA=E3=82=92?= =?UTF-8?Q?=E5=90=AB=E3=82=80=E3=80=81=E9=9D=9E=E5=B8=B8=E3=81=AB=E9=95=B7?= =?UTF-8?Q?=E3=81=84=E3=82=BF=E3=82=A4=E3=83=88=E3=83=AB=E8=A1=8C=E3=81=8C?= =?UTF-8?Q?=E4=B8=80=E4=BD=93=E5=85=A8=E4=BD=93=E3=81=A9=E3=81=AE=E3=82=88?= =?UTF-8?Q?=E3=81=86=E3=81=AB=E3=81=97=E3=81=A6Encode=E3=81=95=E3=82=8C?= =?UTF-8?Q?=E3=82=8B=E3=81=AE=E3=81=8B=EF=BC=9F?= EOS is(Encode::decode('MIME-EncWords', $bheader), $dheader, "decode B"); is(Encode::decode('MIME-EncWords', $qheader), $dheader, "decode Q"); #is(Encode::encode('MIME-EncWords-B', $dheader)."\n", $bheader, "encode B"); is(Encode::encode('MIME-EncWords-B', $dheader), $bheader, "encode B"); #is(Encode::encode('MIME-EncWords-Q', $dheader)."\n", $qheader, "encode Q"); is(Encode::encode('MIME-EncWords-Q', $dheader), $qheader, "encode Q"); $dheader = "What is =?UTF-8?B?w4RwZmVs?= ?"; $bheader = "What is =?UTF-8?B?PT9VVEYtOD9CP3c0UndabVZzPz0=?= ?"; #$qheader = "What is =?UTF-8?Q?=3D=3FUTF=2D8=3FB=3Fw4RwZmVs=3F=3D?= ?"; $qheader = "What is =?UTF-8?Q?=3D=3FUTF-8=3FB=3Fw4RwZmVs=3F=3D?= ?"; is(Encode::encode('MIME-EncWords-B', $dheader), $bheader, "Double decode B"); is(Encode::encode('MIME-EncWords-Q', $dheader), $qheader, "Double decode Q"); { # From: Dave Evans # Subject: Bug in Encode::MIME::Header # Message-Id: <3F43440B.7060606@rudolf.org.uk> use charnames ":full"; my $pound_1024 = "\N{POUND SIGN}1024"; is(Encode::encode('MIME-EncWords-Q' => $pound_1024), '=?UTF-8?Q?=C2=A31024?=', 'pound 1024'); } is(Encode::encode('MIME-EncWords-Q', "\x{fc}"), '=?UTF-8?Q?=C3=BC?=', 'Encode latin1 characters'); # RT42627 #my $rt42627 = 
Encode::decode_utf8("\x{c2}\x{a3}xxxxxxxxxxxxxxxxxxx0"); #is(Encode::encode('MIME-EncWords-Q', $rt42627), # '=?UTF-8?Q?=C2=A3xxxxxxxxxxxxxxxxxxx?= =?UTF-8?Q?0?=', # 'MIME-EncWords-Q encoding does not truncate trailing zeros'); my $rt42627; Encode::MIME::EncWords->config(MaxLineLen => 37); $rt42627 = Encode::decode_utf8("\xc2\xa3xxxxxxxxxxxxxxxxxxx00"); is(Encode::encode('MIME-EncWords-Q', $rt42627), "=?UTF-8?Q?=C2=A3xxxxxxxxxxxxxxxxxxx?=\n =?UTF-8?Q?00?=", 'MIME-EncWords-Q encoding does not truncate trailing zeros'); $rt42627 = Encode::decode_utf8("\xc2\xa3xxxxxxxxxxxxxxxxxxx.0"); is(Encode::encode('MIME-EncWords-Q', $rt42627), "=?UTF-8?Q?=C2=A3xxxxxxxxxxxxxxxxxxx?=\n =?UTF-8?Q?=2E0?=", 'MIME-EncWords-Q encoding does not truncate trailing zeros'); $rt42627 = Encode::decode_utf8("\xc2\xa3xxxxxxxxxxxxxxxxxxx0."); is(Encode::encode('MIME-EncWords-Q', $rt42627), "=?UTF-8?Q?=C2=A3xxxxxxxxxxxxxxxxxxx?=\n =?UTF-8?Q?0=2E?=", 'MIME-EncWords-Q encoding does not truncate trailing zeros'); Encode::MIME::EncWords->config(MaxLineLen => 38); $rt42627 = Encode::decode_utf8("\xc2\xa3xxxxxxxxxxxxxxxxxxx00"); is(Encode::encode('MIME-EncWords-Q', $rt42627), "=?UTF-8?Q?=C2=A3xxxxxxxxxxxxxxxxxxx?=\n =?UTF-8?Q?00?=", 'MIME-EncWords-Q encoding does not truncate trailing zeros'); $rt42627 = Encode::decode_utf8("\xc2\xa3xxxxxxxxxxxxxxxxxxx.0"); is(Encode::encode('MIME-EncWords-Q', $rt42627), "=?UTF-8?Q?=C2=A3xxxxxxxxxxxxxxxxxxx?=\n =?UTF-8?Q?=2E0?=", 'MIME-EncWords-Q encoding does not truncate trailing zeros'); $rt42627 = Encode::decode_utf8("\xc2\xa3xxxxxxxxxxxxxxxxxxx0."); is(Encode::encode('MIME-EncWords-Q', $rt42627), "=?UTF-8?Q?=C2=A3xxxxxxxxxxxxxxxxxxx0?=\n =?UTF-8?Q?=2E?=", 'MIME-EncWords-Q encoding does not truncate trailing zeros'); __END__; MIME-EncWords-1.014.2/t/04Encode-MIME-EncWords-ISO_2022_JP.t000064400000000000000000000026201220667516400227260ustar00viewvcviewvc00000010000000# -*- perl -*- use strict; use Test::More; BEGIN { if( ord("A") == 193 ) { plan skip_all => 'No 
Encode::MIME::EncWords on EBCDIC Platforms'; } elsif ($] < 5.007003) { plan skip_all => 'Unicode/multibyte support is not available'; } else { plan tests => 14; } } BEGIN{ use_ok('Encode::MIME::EncWords'); } require_ok('Encode::MIME::EncWords'); # Codes below are derived from mime_header_iso2022jp.t in Encode, # originally from mime.t in Jcode. # Non-ASCII characters are escaped but code values are intact. my %mime = ( "\xb4\xc1\xbb\xfa\xa1\xa2\xa5\xab\xa5\xbf\xa5\xab\xa5\xca\xa1\xa2\xa4\xd2\xa4\xe9\xa4\xac\xa4\xca" => "=?ISO-2022-JP?B?GyRCNEE7eiEiJSslPyUrJUohIiRSJGkkLCRKGyhC?=", "foo bar" => "foo bar", "\xb4\xc1\xbb\xfa\xa1\xa2\xa5\xab\xa5\xbf\xa5\xab\xa5\xca\xa1\xa2\xa4\xd2\xa4\xe9\xa4\xac\xa4\xca\xa4\xce\xba\xae\xa4\xb8\xa4\xc3\xa4\xbfSubject Header." => "=?ISO-2022-JP?B?GyRCNEE7eiEiJSslPyUrJUohIiRSJGkkLCRKJE46LiQ4JEMkPxsoQlN1?=\n =?ISO-2022-JP?B?YmplY3Q=?= Header.", ); for my $k (keys %mime){ $mime{"$k\n"} = $mime{$k} . "\n"; } for my $decoded (sort keys %mime){ my $encoded = $mime{$decoded}; my $header = Encode::encode('MIME-EncWords-ISO_2022_JP', Encode::decode('euc-jp', $decoded)); my $utf8 = Encode::decode('MIME-EncWords', $header); is(Encode::encode('euc-jp', $utf8), $decoded); is($header, $encoded); } __END__ MIME-EncWords-1.014.2/t/05encode_utf.t000064400000000000000000000025621220667516400177150ustar00viewvcviewvc00000010000000use strict; use Test::More; BEGIN { if ($] < 5.007003) { plan skip_all => 'No Unicode/multibyte support'; } else { plan tests => 36; } } use MIME::EncWords qw(encode_mimewords); $MIME::EncWords::Config = { Detect7bit => 'YES', Mapping => 'EXTENDED', Replacement => 'DEFAULT', Charset => 'ISO-8859-1', Encoding => 'A', Field => undef, Folding => "\n", MaxLineLen => 76, Minimal => 'YES', }; dotest('UTF-16'); dotest('UTF-16BE'); dotest('UTF-16LE'); dotest('UTF-32'); dotest('UTF-32BE'); dotest('UTF-32LE'); sub dotest { my $charset = shift; local($/) = ''; open WORDS, ") { s{\A\s+|\s+\Z}{}g; # trim my ($isgood, $dec, $expect) = split 
/\n/, $_, 3; $isgood = (uc($isgood) eq 'GOOD'); my @params = eval $dec; if (ref $params[0]) { foreach my $p (@{$params[0]}) { if ($p->[1] and uc $p->[1] eq 'UTF-8') { Encode::from_to($p->[0], 'UTF-8', $charset); $p->[1] = $charset; } } } else { if ($params[1] and $params[1] eq 'Charset' and uc $params[2] eq 'UTF-8') { Encode::from_to($params[0], 'UTF-8', $charset); $params[2] = $charset; } } my $enc = encode_mimewords(@params); is((($isgood && !$@) or (!$isgood && $@)) && ($isgood ? $enc : $expect), $expect, $@ || $enc); } close WORDS; } 1; MIME-EncWords-1.014.2/t/pod.t000064400000000000000000000004351220667516400162140ustar00viewvcviewvc00000010000000use strict; use Test::More; if ($] < 5.007003 ) { plan skip_all => "Perl 5.7.3 or later required for testing utf-8 POD"; } else { eval "use Test::Pod 1.00"; if ($@) { plan skip_all => "Test::Pod 1.00 or later required for testing POD"; } } all_pod_files_ok(); MIME-EncWords-1.014.2/testin/000075500000000000000000000000001220667516400163065ustar00viewvcviewvc00000010000000MIME-EncWords-1.014.2/testin/decode-ascii.txt000064400000000000000000000005071220667516400213620ustar00viewvcviewvc00000010000000GOOD ("Subject: Notification d'+AOk-tat de remise (+AOk-chec)","utf-7","iso-8859-1") Subject: Notification =?utf-7?Q?d=27+AOk-tat?= de remise =?utf-7?Q?(+AOk-chec)?= GOOD ("The next sentence is in GB.~{<:Ky2;S{#,NpJ)l6HK!#~}Bye.", "hz") The next sentence is in =?HZ-GB-2312?B?R0Iufns8Okt5MjtTeyMsTnBKKWw2SEshI359QnllLg==?= MIME-EncWords-1.014.2/testin/decode-multibyte.txt000064400000000000000000000017471220667516400223170ustar00viewvcviewvc00000010000000GOOD ("Subject: \xB0\xD4\xC0\xB8\xB8\xA7(laziness), \xC2\xFC\xC1\xF6\xB8\xBB\xB1\xE2(impatience), \xB1\xB3\xB8\xB8(hubris)", "euc-kr") Subject: =?EUC-KR?B?sNTAuLinKGxhemluZXNzKSwgwvzB9ri7seIoaW1w?= =?EUC-KR?B?YXRpZW5jZSksILGzuLgoaHVicmlzKQ==?= GOOD ("Subject: \xB0\xD4\xC0\xB8\xB8\xA7(laziness), \xC2\xFC\xC1\xF6\xB8\xBB\xB1\xE2(impatience), \xB1\xB3\xB8\xB8(hubris)", "euc-kr") 
Subject: =?euc-kr?b?sNTAuLinKGxhemluZXNzKSwgwvzB?= =?euc-kr?b?9ri7seIoaW1wYXRpZW5jZSksILGzuLgoaHVicmlzKQ==?= GOOD ("Subject: \xB0\xD4\xC0\xB8\xB8\xA7(laziness), \xC2\xFC\xC1\xF6\xB8\xBB\xB1\xE2(impatience), \xB1\xB3\xB8\xB8(hubris)", "euc-kr") Subject: =?euc-kr?b?sNTAuLinKGxhemluZXNzKSwgwvzB9ri7seI=?= =?euc-kr?b?KGltcGF0aWVuY2UpLCCxs7i4KGh1YnJpcyk=?= GOOD ("Subject: \xB0\xD4\xC0\xB8\xB8\xA7(laziness), \xC2\xFC\xC1\xF6\xB8\xBB\xB1\xE2(impatience), \xB1\xB3\xB8\xB8(hubris)", undef, "euc-kr") Subject: =?euc-kr?b?sNTAuLinKGxhemluZXNzKSwgwvzB?= =?euc-kr?b?9ri7seIoaW1wYXRpZW5jZSksILGzuLgoaHVicmlzKQ==?= MIME-EncWords-1.014.2/testin/decode-singlebyte.txt000064400000000000000000000024271220667516400224420ustar00viewvcviewvc00000010000000GOOD ("Subject: Oc\xE9 3165 Network Copier down for maintenance", "iso-8859-1") Subject: =?ISO-8859-1?Q?Oc=E9_3165_Network_Copier_down_for_maintenance?= BAD ("") Subject: =?ISO-8859-1?Q?Oc=E9_3165_Network_Copier_down_for_maintenance? BAD ("") Subject: =?ISO-8859-1?Q?Oc=E9_3165_Network_Copier_down_for_maintenance GOOD ("Keith Moore ", "us-ascii") =?US-ASCII?Q?Keith_Moore?= GOOD ("Keld J\xF8rn Simonsen ", "iso-8859-1") =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= GOOD ("Andr\xE9 (<- one space) Pirard ", "iso-8859-1") =?ISO-8859-1?Q?Andr=E9_?=(<- one space) Pirard GOOD ("Andr\xE9 (<- two spaces) Pirard ", "iso-8859-1") =?ISO-8859-1?Q?Andr=E9_?= (<- two spaces) Pirard GOOD ("If you can read this you understand the example... cool!", "us-ascii") =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?==?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?==?US-ASCII?Q?.._cool!?= GOOD ("If you can read this you understand the example... 
cool!", "us-ascii") =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?= =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?= =?US-ASCII?Q?.._cool!?= GOOD ("_-_", "us-ascii") =?ISO-8859-1?Q?=5F-=5F?= MIME-EncWords-1.014.2/testin/encode-ascii.txt000064400000000000000000000040331220667516400213720ustar00viewvcviewvc00000010000000GOOD ([["Perl=?: "], ["\x1B\x24BIBE*\x40^Co "EUC-JP") =?ISO-2022-JP?B?UGVybD0/OiAbJEJJQkUqQF5DbzxnNUFFKkdRSio9UE5PNG8bKEI=?= (Pathologically Eclectic Rubbish Lister) GOOD ("Perl=?: \x1B\x24BIBE*\x40^Co "iso-2022-jp") =?ISO-2022-JP?B?UGVybD0/OiAbJEJJQkUqQF5DbzxnNUFFKkdRSio9UE5PNG8bKEI=?= (Pathologically Eclectic Rubbish Lister) GOOD ("Peeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeerl: \x1B\x24BIBE*\x40^Co "iso-2022-jp") =?ISO-2022-JP?B?UGVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVl?= =?ISO-2022-JP?B?ZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZWVlZXJsOiA=?= =?ISO-2022-JP?B?GyRCSUJFKkBeQ288ZzVBRSpHUUoqPVBOTzRvGyhC?= (Pathologically Eclectic Rubbish Lister) GOOD ("Perl: \x1B\x24BIBE*\x40^Co "iso-2022-jp") Perl: =?ISO-2022-JP?B?GyRCSUJFKkBeQ288ZzVBRSpHUUoqPVBOTzRvGyhCIChQYXRob2xv?= =?ISO-2022-JP?B?Z2ljYWxsbGxsbGxsbGx5LUVjbGVjbGVjbGVjbGVjbGVjbGVjdGljLVJ1?= =?ISO-2022-JP?B?YmJiYmJiYmJiYmlzaC1MaXN0ZWVlZWVlcik=?= GOOD ("The next sentence is in GB.~{<:Ky2;S{#,NpJ)l6HK!#~}Bye.", MaxLineLen => 42, Charset => "HZ-GB-2312") The next sentence is in =?HZ-GB-2312?B?R0Iufns8Okt5MjtTeyMsfn0=?= =?HZ-GB-2312?B?fntOcEopbDZISyEjfn1CeWUu?= GOOD ([["The next sentence is in GB. "], ["~{<:Ky2;S{#,NpJ)l6HK!#~}", "hzgb2312"], [" Bye."]], MaxLineLen => 42, Charset => "") The next sentence is in GB. =?HZ-GB-2312?B?fns8Okt5MjtTeyMsTnBKKX59?= =?HZ-GB-2312?B?fntsNkhLISN+fQ==?= Bye. 
MIME-EncWords-1.014.2/testin/encode-multibyte.txt000064400000000000000000000037241220667516400223260ustar00viewvcviewvc00000010000000GOOD ("Perl:\x1B\x24BIBE*\x40^Co "iso-2022-jp", Field => "Subject") =?ISO-2022-JP?B?UGVybDobJEJJQkUqQF5DbzxnNUFFKkdRSio9UE5PNG8bKEIo?= =?ISO-2022-JP?B?UGF0aG9sb2dpY2FsbHk=?= Eclectic Rubbish Lister) GOOD ("Perl:\xC9\xC2\xC5\xAA\xC0\xDE\xC3\xEF\xBC\xE7\xB5\xC1\xC5\xAA\xC7\xD1\xCA\xAA\xBD\xD0\xCE\xCF\xB4\xEF(Pathologically Eclectic Rubbish Lister)", Charset => "euc-jp", Field => "Subject") =?ISO-2022-JP?B?UGVybDobJEJJQkUqQF5DbzxnNUFFKkdRSio9UE5PNG8bKEIo?= =?ISO-2022-JP?B?UGF0aG9sb2dpY2FsbHk=?= Eclectic Rubbish Lister) GOOD ([["Perl: "], ["\x1B\x24BIBE*\x40^Co "Subject") Perl: =?ISO-2022-JP?B?GyRCSUJFKkBeQ288ZzVBRSpHUUoqPVBOTzRvGyhC?= (Pathologically Eclectic Rubbish Lister) GOOD ("Perl: \x1B\x24BIBE*\x40^Co "iso-2022-jp", Field => "Subject") Perl: =?ISO-2022-JP?B?GyRCSUJFKkBeQ288ZzVBRSpHUUoqPVBOTzRvGyhC?= (Pathologically Eclectic Rubbish Lister) GOOD ("Perl: \x1B\x24BIBE*\x40^Co "iso-2022-jp", Field => "Subject") Perl: =?ISO-2022-JP?B?GyRCSUJFKkBeQ288ZzVBRSpHUUoqPVBOTzRvGyhC?= (Pathologically Eclectic Rubbish Lister) GOOD ("Perl - Pathologically Eclectic Rubbish Lister \x1b\x24BIBE*\x40^Co "iso-2022-jp", Field => "") Perl - Pathologically Eclectic Rubbish Lister =?ISO-2022-JP?B?GyRCSUIbKEI=?= =?ISO-2022-JP?B?GyRCRSpAXkNvPGc1QUUqR1FKKj1QTk80byRyPEI4PSQ5JGs4QDhsGyhC?= GOOD ("Perl -- Pathologically Eclectic Rubbish Lister \x1b\x24BIBE*\x40^Co "iso-2022-jp", Field => "") Perl -- Pathologically Eclectic Rubbish Lister =?ISO-2022-JP?B?GyRCSUJFKkBeQ288ZzVBRSpHUUoqPVBOTzRvJHI8Qjg9JDkkazhAGyhC?= =?ISO-2022-JP?B?GyRCOGwbKEI=?= MIME-EncWords-1.014.2/testin/encode-singlebyte.txt000064400000000000000000000037141220667516400224540ustar00viewvcviewvc00000010000000GOOD ("Oc\xE9 3165 Network Copier down for maintenance", Charset => "iso-8859-1") =?ISO-8859-1?Q?Oc=E9?= 3165 Network Copier down for maintenance GOOD ("Keith Moore ", Charset => 
"iso-8859-1") Keith Moore GOOD ([["Keld J\xF8rn Simonsen"],[" "],[""]], Charset => "iso-8859-1") =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= GOOD ([["Andr\xE9 "],["(<- one space) Pirard"],[" "],[""]], Charset => "iso-8859-1") =?ISO-8859-1?Q?Andr=E9_=28=3C-?= one space) Pirard GOOD ([["Andr\xE9"],[" (<- one space) Pirard"],[" "],[""]], Charset => "iso-8859-1") =?ISO-8859-1?Q?Andr=E9?= (<- one space) Pirard GOOD ([["Andr\xE9 "],["(<- two spaces) Pirard"],[" "],[""]], Charset => "iso-8859-1") =?ISO-8859-1?Q?Andr=E9__=28=3C-?= two spaces) Pirard GOOD ([["Andr\xE9 "],[" (<- two spaces) Pirard"],[" "],[""]], Charset => "ISO-8859-1") =?ISO-8859-1?Q?Andr=E9_?= (<- two spaces) Pirard GOOD ([["Andr\xE9"],[" (<- two spaces) Pirard"],[" "],[""]], Charset => "iso-8859-1") =?ISO-8859-1?Q?Andr=E9?= (<- two spaces) Pirard GOOD ("Network Copier Oc\xE9 3165 down for maintenance", Charset => "iso-8859-1") Network Copier =?ISO-8859-1?Q?Oc=E9?= 3165 down for maintenance GOOD ("La r\xE9alisation du Syst\xE8me de R\xE9f\xE9rence C\xE9leste", Charset => "iso-8859-1") La =?ISO-8859-1?Q?r=E9alisation?= du =?ISO-8859-1?Q?Syst=E8me?= de =?ISO-8859-1?Q?R=E9f=E9rence_C=E9leste?= GOOD ("Th\xE8me tr\xE8s important\xA0: La r\xE9alisation du Syst\xE8me de R\xE9f\xE9rence C\xE9leste", Charset => "iso-8859-1") =?ISO-8859-1?Q?Th=E8me_tr=E8s_important=A0=3A?= La =?ISO-8859-1?Q?r=E9alis?= =?ISO-8859-1?Q?ation?= du =?ISO-8859-1?Q?Syst=E8me?= de =?ISO-8859-1?Q?R?= =?ISO-8859-1?Q?=E9f=E9rence_C=E9leste?= GOOD ("_-_") _-_ MIME-EncWords-1.014.2/testin/encode-utf-8.txt000064400000000000000000000035451220667516400212540ustar00viewvcviewvc00000010000000GOOD ("\xc3\xa1rv\xc3\xadzt\xc5\xb1r\xc5\x91 t\xc3\xbck\xc3\xb6rf\xc3\xbar\xc3\xb3g\xc3\xa9p", Charset=>'utf-8') =?UTF-8?B?w6FydsOtenTFsXLFkSB0w7xrw7ZyZsO6csOzZ8OpcA==?= GOOD ("\xc3\x81rv\xc3\xadzt\xc5\xb1r\xc5\x91 t\xc3\xbck\xc3\xb6rf\xc3\xbar\xc3\xb3g\xc3\xa9p, ETAOIN SHRDLU CMFWYP VBGKQJ XZ.", Charset=>'utf-8') 
=?UTF-8?B?w4FydsOtenTFsXLFkSB0w7xrw7ZyZsO6csOzZ8OpcCw=?= ETAOIN SHRDLU CMFWYP VBGKQJ XZ. GOOD ("\xc3\x81rv\xc3\xadzt\xc5\xb1r\xc5\x91 t\xc3\xbck\xc3\xb6rf\xc3\xbar\xc3\xb3g\xc3\xa9p---lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do!", Charset=>'utf-8') =?UTF-8?B?w4FydsOtenTFsXLFkSB0w7xrw7ZyZsO6csOzZ8OpcC0tLWxvcmVt?= ipsum dolor sit amet, consectetur adipisicing elit, sed do! GOOD ("\xc3\x81rv\xc3\xadzt\xc5\xb1r\xc5\x91 t\xc3\xbck\xc3\xb6rf\xc3\xbar\xc3\xb3g\xc3\xa9p---lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do!!", Charset=>'utf-8') =?UTF-8?Q?=C3=81rv=C3=ADzt=C5=B1r=C5=91_t=C3=BCk=C3=B6rf=C3=BAr=C3=B3g?= =?UTF-8?Q?=C3=A9p---lorem?= ipsum dolor sit amet, consectetur adipisicing elit, sed do!! GOOD ([["\xc3\x81rv\xc3\xadzt\xc5\xb1r\xc5\x91",'utf-8'], [" t\xc3\xbck\xc3\xb6rf\xc3\xbar\xc3\xb3g\xc3\xa9p",'utf-8'], ['---lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do!!']]) =?UTF-8?Q?=C3=81rv=C3=ADzt=C5=B1r=C5=91_t=C3=BCk=C3=B6rf=C3=BAr=C3=B3g?= =?UTF-8?Q?=C3=A9p---lorem?= ipsum dolor sit amet, consectetur adipisicing elit, sed do!! GOOD ([["\xc3\x81rv\xc3\xadzt\xc5\xb1r\xc5\x91",'utf-8'], ["t\xc3\xbck\xc3\xb6rf\xc3\xbar\xc3\xb3g\xc3\xa9p",'utf-8'], ['---lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore'], ["\xa1\xc4", 'euc-jp']]) =?UTF-8?B?w4FydsOtenTFsXLFkXTDvGvDtnJmw7pyw7Nnw6lwLS0tbG9yZW0=?= ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et =?ISO-2022-JP?B?ZG9sb3JlGyRCIUQbKEI=?=