HsYAML-0.1.2.0/0000755000000000000000000000000013467630442011104 5ustar0000000000000000HsYAML-0.1.2.0/LICENSE.GPLv20000644000000000000000000004317313467630442013012 0ustar0000000000000000 GNU GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1989, 1991 Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Lesser General Public License instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 9. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. Also add information on how to contact you by electronic and paper mail. If the program is interactive, make it output a short notice like this when it starts in an interactive mode: Gnomovision version 69, Copyright (C) year name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here is a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker. , 1 April 1989 Ty Coon, President of Vice This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. HsYAML-0.1.2.0/LICENSE.GPLv30000644000000000000000000010444213467630442013010 0ustar0000000000000000 GNU GENERAL PUBLIC LICENSE Version 3, 29 June 2007 Copyright (C) 2007 Free Software Foundation, Inc. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The GNU General Public License is a free, copyleft license for software and other kinds of works. The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users. We, the Free Software Foundation, use the GNU General Public License for most of our software; it applies also to any other work released this way by its authors. You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things. To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights. Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others. For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it. For the developers' and authors' protection, the GPL clearly explains that there is no warranty for this free software. For both users' and authors' sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions. Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users' freedom to change the software. The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users. Finally, every program is threatened constantly by software patents. States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary. To prevent this, the GPL assures that patents cannot be used to render the program non-free. The precise terms and conditions for copying, distribution and modification follow. TERMS AND CONDITIONS 0. Definitions. "This License" refers to version 3 of the GNU General Public License. "Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks. "The Program" refers to any copyrightable work licensed under this License. Each licensee is addressed as "you". "Licensees" and "recipients" may be individuals or organizations. To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work. A "covered work" means either the unmodified Program or a work based on the Program. To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well. To "convey" a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying. An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion. 1. Source Code. The "source code" for a work means the preferred form of the work for making modifications to it. "Object code" means any non-source form of a work. A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language. The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it. The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work. The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source. The Corresponding Source for a work in source code form is that same work. 2. Basic Permissions. All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law. You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you. Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary. 3. Protecting Users' Legal Rights From Anti-Circumvention Law. No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures. When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures. 4. Conveying Verbatim Copies. You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program. You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee. 5. Conveying Modified Source Versions. You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions: a) The work must carry prominent notices stating that you modified it, and giving a relevant date. b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices". c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it. d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so. A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate. 6. Conveying Non-Source Forms. You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways: a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange. b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge. c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b. d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements. e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d. A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work. A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product. "Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made. If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM). The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network. Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying. 7. Additional Terms. "Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions. When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission. Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms: a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or d) Limiting the use for publicity purposes of names of licensors or authors of the material; or e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors. All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying. If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms. Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way. 8. Termination. You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11). However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation. Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice. Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10. 9. Acceptance Not Required for Having Copies. You are not required to accept this License in order to receive or run a copy of the Program. Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so. 10. Automatic Licensing of Downstream Recipients. Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License. An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts. You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it. 11. Patents. A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. The work thus licensed is called the contributor's "contributor version". A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License. Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version. In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party. If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. "Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid. If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it. A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007. Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law. 12. No Surrender of Others' Freedom. If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program. 13. Use with the GNU Affero General Public License. Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such. 14. Revised Versions of this License. The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation. If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program. Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version. 15. Disclaimer of Warranty. THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 16. Limitation of Liability. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 17. Interpretation of Sections 15 and 16. If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see . Also add information on how to contact you by electronic and paper mail. If the program does terminal interaction, make it output a short notice like this when it starts in an interactive mode: Copyright (C) This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, your program's commands might be different; for a GUI interface, you would use an "about box". You should also get your employer (if you work as a programmer) or school, if any, to sign a "copyright disclaimer" for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see . The GNU General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. But first, please read . HsYAML-0.1.2.0/Setup.hs0000644000000000000000000000005613467630442012541 0ustar0000000000000000import Distribution.Simple main = defaultMain HsYAML-0.1.2.0/HsYAML.cabal0000644000000000000000000000741513467630442013134 0ustar0000000000000000cabal-version: 1.14 build-type: Simple name: HsYAML version: 0.1.2.0 synopsis: Pure Haskell YAML 1.2 parser homepage: https://github.com/hvr/HsYAML bug-reports: https://github.com/hvr/HsYAML/issues license: GPL-2 X-SPDX-License-Identifier: GPL-2.0-or-later license-files: LICENSE.GPLv2 LICENSE.GPLv3 author: Herbert Valerio Riedel maintainer: hvr@gnu.org copyright: 2015-2018 Herbert Valerio Riedel , 2007-2008 Oren Ben-Kiki category: Text tested-with: GHC==8.6.5, GHC==8.4.4, GHC==8.4.1, GHC==8.2.2, GHC==8.0.2, GHC==7.10.3, GHC==7.8.4, GHC==7.6.3, GHC==7.4.2 description: @HsYAML@ is a [YAML 1.2](http://yaml.org/spec/1.2/spec.html) parser implementation for Haskell. . Features of @HsYAML@ include: . * Pure Haskell implementation with small dependency footprint and emphasis on strict compliance with the [YAML 1.2 specification](http://yaml.org/spec/1.2/spec.html). * Direct decoding to native Haskell types via (@aeson@-inspired) typeclass-based API (see "Data.YAML"). * Support for constructing custom YAML node graph representation (including support for cyclic YAML data structures). * Support for the standard (untyped) /Failsafe/, (strict) /JSON/, and (flexible) /Core/ \"schemas\" providing implicit typing rules as defined in the YAML 1.2 specification (including support for user-defined custom schemas). * Event-based API resembling LibYAML's Event-based API (see "Data.YAML.Event"). * Low-level API access to lexical token-based scanner (see "Data.YAML.Token"). . extra-source-files: ChangeLog.md source-repository head type: git location: https://github.com/hvr/HsYAML.git flag exe description: Enable @exe:yaml-test@ component manual: True default: False library hs-source-dirs: src exposed-modules: Data.YAML , Data.YAML.Event , Data.YAML.Token other-modules: Data.YAML.Loader , Data.YAML.Schema , Data.YAML.Token.Encoding , Util , Data.DList default-language: Haskell2010 other-extensions: FlexibleContexts FlexibleInstances FunctionalDependencies MultiParamTypeClasses OverloadedStrings PostfixOperators RecordWildCards RecursiveDo Safe ScopedTypeVariables Trustworthy TypeSynonymInstances build-depends: base >=4.5 && <4.13 , bytestring >=0.9 && <0.11 , containers >=0.4.2 && <0.7 , text >=1.2.3 && <1.3 , mtl >=2.2.1 && <2.3 , parsec >=3.1.13.0 && < 3.2 if !impl(ghc >= 8.0) build-depends: fail >=4.9.0.0 && <4.10 if !impl(ghc >= 7.10) build-depends: nats >=1.1.2 && <1.2 ghc-options: -Wall executable yaml-test hs-source-dirs: src-test main-is: Main.hs other-modules: TML default-language: Haskell2010 if flag(exe) build-depends: HsYAML -- inherited constraints , bytestring >= 0.10.8.0 , base , text , containers , mtl -- non-inherited , megaparsec >= 6.5.0 && < 6.6 , microaeson == 0.1.* , filepath == 1.4.* , directory >= 1.2 && < 1.4 else buildable: False ghc-options: -rtsopts HsYAML-0.1.2.0/ChangeLog.md0000644000000000000000000000305513467630442013260 0ustar0000000000000000See also http://pvp.haskell.org/faq ### 0.1.2.0 * Add convenience functions `decode1` and `decode1Strict` expecting exactly one YAML document ([#5](https://github.com/haskell-hvr/HsYAML/pull/5)) * Fix a couple corner-cases in the YAML tokenization ([#10](https://github.com/haskell-hvr/HsYAML/pull/10)) #### 0.1.1.3 * Fix bug in float regexp being too lax in the JSON and Core schema ([#7](https://github.com/hvr/HsYAML/issues/7)) * Remove dependency on `dlist` #### 0.1.1.2 * Tolerate BOM at *each* `l-document-prefix` (rather than only at the first one encountered in a YAML stream) * Workaround broken `mtl-2.2.2` bundled in GHC 8.4.1 ([#1](https://github.com/hvr/HsYAML/issues/1)) * Relax to GPL-2.0-or-later #### 0.1.1.1 * Reject (illegal) non-scalar code-points in UTF-32 streams * Tolerate BOM at start of stream * Disambiguate choice in `l-any-document` production regarding token separation of `c-directives-end` * Fix `c-indentation-indicator(n)` grammar production when auto-detecting indentation in the presence of empty leading lines; also reject (illegal) auto-indent-level scalars with leading more-indented all-space lines * Complete character escape rules for double-quoted scalars * Minor optimizations ### 0.1.1.0 * `Data.YAML` module promoted from `TrustWorthy` to `Safe` * Add `FromYAML Natural` instance * Add `MonadFail`, `Alternative`, and `MonadPlus` instances for `Data.YAML.Parser` * Add `Data.YAML.decodeStrict` function * Export `Data.YAML.typeMismatch` helper function ## 0.1.0.0 * First version. Released on an unsuspecting world. HsYAML-0.1.2.0/src/0000755000000000000000000000000013467630442011673 5ustar0000000000000000HsYAML-0.1.2.0/src/Util.hs0000644000000000000000000000460113467630442013145 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE Safe #-} {-# LANGUAGE ScopedTypeVariables #-} {-# OPTIONS_GHC -fno-warn-unused-imports #-} -- | -- Copyright: © Herbert Valerio Riedel 2015-2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- module Util ( liftEither , readMaybe , readEither , fromIntegerMaybe , module X ) where import Control.Applicative as X import Control.Monad as X import Data.Int as X import Data.Word as X import Numeric.Natural as X (Natural) #if !MIN_VERSION_mtl(2,2,2) || (__GLASGOW_HASKELL__ == 804 && __GLASGOW_HASKELL_PATCHLEVEL1__ < 2) import Control.Monad.Except (MonadError (throwError)) #else import Control.Monad.Except (liftEither) #endif import Control.Monad.Identity as X import Data.Char as X (chr, ord) import Data.Map as X (Map) import Data.Monoid as X (Monoid (mappend, mempty)) import Data.Set as X (Set) import Data.Text as X (Text) import Text.ParserCombinators.ReadP as P import Text.Read -- GHC 8.4.1 shipped with a phony `mtl-2.2.2` and so we have to assume -- pessimistically that on GHC 8.4.1 only, mtl-2.2.2 may be broken (even if it was reinstalled) #if !MIN_VERSION_mtl(2,2,2) || (__GLASGOW_HASKELL__ == 804 && __GLASGOW_HASKELL_PATCHLEVEL1__ < 2) liftEither :: MonadError e m => Either e a -> m a liftEither = either throwError return #endif #if !MIN_VERSION_base(4,6,0) readMaybe :: Read a => String -> Maybe a readMaybe = either (const Nothing) id . readEither readEither :: Read a => String -> Either String a readEither s = case [ x | (x,"") <- readPrec_to_S read' minPrec s ] of [x] -> Right x [] -> Left "Prelude.read: no parse" _ -> Left "Prelude.read: ambiguous parse" where read' = do x <- readPrec Text.Read.lift P.skipSpaces return x #endif fromIntegerMaybe :: forall n . (Integral n, Bounded n) => Integer -> Maybe n fromIntegerMaybe j | l <= j, j <= u = Just (fromInteger j) | otherwise = Nothing where u = toInteger (maxBound :: n) l = toInteger (minBound :: n) HsYAML-0.1.2.0/src/Data/0000755000000000000000000000000013467630442012544 5ustar0000000000000000HsYAML-0.1.2.0/src/Data/YAML.hs0000644000000000000000000004335313467630442013652 0ustar0000000000000000{-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE RecordWildCards #-} {-# LANGUAGE Safe #-} {-# LANGUAGE CPP #-} -- | -- Copyright: © Herbert Valerio Riedel 2015-2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- -- Document oriented [YAML](http://yaml.org/spec/1.2/spec.html) parsing API inspired by [aeson](http://hackage.haskell.org/package/aeson). -- -- === Overview -- -- The diagram below depicts the standard layers of a [YAML 1.2](http://yaml.org/spec/1.2/spec.html) processor. This module covers the upper /Native/ and /Representation/ layers, whereas the "Data.YAML.Event" and "Data.YAML.Token" modules provide access to the lower /Serialization/ and /Presentation/ layers respectively. -- -- <> -- -- === Quick Start Tutorial -- -- Let's assume we want to decode (i.e. /load/) a simple YAML document -- -- > - name: Erik Weisz -- > age: 52 -- > magic: True -- > - name: Mina Crandon -- > age: 53 -- -- into a native Haskell data structure of type @[Person]@, i.e. a list of 'Person' records. -- -- The code below shows how to manually define a @Person@ record type together with a 'FromYAML' instance: -- -- > {-# LANGUAGE OverloadedStrings #-} -- > -- > import Data.YAML -- > -- > data Person = Person -- > { name :: Text -- > , age :: Int -- > , magic :: Bool -- > } deriving Show -- > -- > instance FromYAML Person where -- > parseYAML = withMap "Person" $ \m -> Person -- > <$> m .: "name" -- > <*> m .: "age" -- > <*> m .:? "magic" .!= False -- -- And now we can 'decode' the YAML document like so: -- -- >>> decode "- name: Erik Weisz\n age: 52\n magic: True\n- name: Mina Crandon\n age: 53" :: Either String [[Person]] -- Right [[Person {name = "Erik Weisz", age = 52, magic = True},Person {name = "Mina Crandon", age = 53, magic = False}]] -- -- module Data.YAML ( -- * Typeclass-based resolving/decoding decode , decode1 , decodeStrict , decode1Strict , FromYAML(..) , Parser , parseEither , typeMismatch -- ** Accessors for YAML 'Mapping's , Mapping , (.:), (.:?), (.:!), (.!=) -- ** Prism-style parsers , withSeq , withBool , withFloat , withInt , withNull , withStr , withMap -- * \"Concrete\" AST , decodeNode , decodeNode' , Doc(..) , Node(..) , Scalar(..) -- * YAML 1.2 Schema resolvers , SchemaResolver(..) , failsafeSchemaResolver , jsonSchemaResolver , coreSchemaResolver -- * Generalised AST construction , decodeLoader , Loader(..) , NodeId ) where import qualified Control.Monad.Fail as Fail import qualified Data.ByteString as BS import qualified Data.ByteString.Lazy as BS.L import qualified Data.Map as Map import Data.Maybe (listToMaybe) import qualified Data.Text as T import Data.YAML.Event (Tag, isUntagged, tagToText) import Data.YAML.Loader import Data.YAML.Schema import Util -- | YAML Document tree/graph newtype Doc n = Doc n deriving (Eq,Ord,Show) -- | YAML Document node data Node = Scalar !Scalar | Mapping !Tag Mapping | Sequence !Tag [Node] | Anchor !NodeId !Node deriving (Eq,Ord,Show) -- | YAML mapping type Mapping = Map Node Node -- | Retrieve value in 'Mapping' indexed by a @!!str@ 'Text' key. -- -- This parser fails if the key doesn't exist. (.:) :: FromYAML a => Mapping -> Text -> Parser a m .: k = maybe (fail $ "key " ++ show k ++ " not found") parseYAML (Map.lookup (Scalar (SStr k)) m) -- | Retrieve optional value in 'Mapping' indexed by a @!!str@ 'Text' key. -- -- 'Nothing' is returned if the key is missing or points to a @tag:yaml.org,2002:null@ node. -- This combinator only fails if the key exists but cannot be converted to the required type. -- -- See also '.:!'. (.:?) :: FromYAML a => Mapping -> Text -> Parser (Maybe a) m .:? k = maybe (pure Nothing) parseYAML (Map.lookup (Scalar (SStr k)) m) -- | Retrieve optional value in 'Mapping' indexed by a @!!str@ 'Text' key. -- -- 'Nothing' is returned if the key is missing. -- This combinator only fails if the key exists but cannot be converted to the required type. -- -- __NOTE__: This is a variant of '.:?' which doesn't map a @tag:yaml.org,2002:null@ node to 'Nothing'. (.:!) :: FromYAML a => Mapping -> Text -> Parser (Maybe a) m .:! k = maybe (pure Nothing) (fmap Just . parseYAML) (Map.lookup (Scalar (SStr k)) m) -- | Defaulting helper to be used with '.:?' or '.:!'. (.!=) :: Parser (Maybe a) -> a -> Parser a mv .!= def = fmap (maybe def id) mv -- | Parse and decode YAML document(s) into 'Node' graphs -- -- This is a convenience wrapper over `decodeNode'` -- -- > decodeNode = decodeNode' coreSchemaResolver False False -- -- In other words, -- -- * Use the YAML 1.2 Core schema for resolving -- * Don't create 'Anchor' nodes -- * Disallow cyclic anchor references -- decodeNode :: BS.L.ByteString -> Either String [Doc Node] decodeNode = decodeNode' coreSchemaResolver False False -- | Customizable variant of 'decodeNode' -- decodeNode' :: SchemaResolver -- ^ YAML Schema resolver to use -> Bool -- ^ Whether to emit anchor nodes -> Bool -- ^ Whether to allow cyclic references -> BS.L.ByteString -- ^ YAML document to parse -> Either String [Doc Node] decodeNode' SchemaResolver{..} anchorNodes allowCycles bs0 = map Doc <$> runIdentity (decodeLoader failsafeLoader bs0) where failsafeLoader = Loader { yScalar = \t s v -> pure $ fmap Scalar (schemaResolverScalar t s v) , ySequence = \t vs -> pure $ schemaResolverSequence t >>= \t' -> Right (Sequence t' vs) , yMapping = \t kvs -> pure $ schemaResolverMapping t >>= \t' -> Right (Mapping t' (Map.fromList kvs)) , yAlias = if allowCycles then \_ _ n -> pure $ Right n else \_ c n -> pure $ if c then Left "cycle detected" else Right n , yAnchor = if anchorNodes then \j n -> pure $ Right (Anchor j n) else \_ n -> pure $ Right n } ---------------------------------------------------------------------------- -- | YAML Parser 'Monad' used by 'FromYAML' -- -- See also 'parseEither' or 'decode' newtype Parser a = P { unP :: Either String a } instance Functor Parser where fmap f (P x) = P (fmap f x) x <$ P (Right _) = P (Right x) _ <$ P (Left e) = P (Left e) instance Applicative Parser where pure = P . Right P (Left e) <*> _ = P (Left e) P (Right f) <*> P r = P (fmap f r) P (Left e) *> _ = P (Left e) P (Right _) *> p = p instance Monad Parser where return = pure P m >>= k = P (m >>= unP . k) (>>) = (*>) #if !(MIN_VERSION_base(4,13,0)) fail = Fail.fail #endif -- | @since 0.1.1.0 instance Fail.MonadFail Parser where fail = P . Left -- | @since 0.1.1.0 instance Alternative Parser where empty = fail "empty" P (Left _) <|> y = y x <|> _ = x -- | @since 0.1.1.0 instance MonadPlus Parser where mzero = empty mplus = (<|>) -- | Run 'Parser' -- -- A common use-case is 'parseEither' 'parseYAML'. parseEither :: Parser a -> Either String a parseEither = unP -- | Informative failure helper -- -- This is typically used in fall-through cases of 'parseYAML' like so -- -- > instance FromYAML ... where -- > parseYAML ... = ... -- > parseYAML node = typeMismatch "SomeThing" node -- -- @since 0.1.1.0 typeMismatch :: String -- ^ descriptive name of expected data -> Node -- ^ actual node -> Parser a typeMismatch expected node = fail ("expected " ++ expected ++ " instead of " ++ got) where got = case node of Scalar (SBool _) -> "!!bool" Scalar (SInt _) -> "!!int" Scalar SNull -> "!!null" Scalar (SStr _) -> "!!str" Scalar (SFloat _) -> "!!float" Scalar (SUnknown t v) | isUntagged t -> tagged t ++ show v | otherwise -> "(unsupported) " ++ tagged t ++ "scalar" (Anchor _ _) -> "anchor" (Mapping t _) -> tagged t ++ " mapping" (Sequence t _) -> tagged t ++ " sequence" tagged t0 = case tagToText t0 of Nothing -> "non-specifically ? tagged (i.e. unresolved) " Just t -> T.unpack t ++ " tagged" -- | A type into which YAML nodes can be converted/deserialized class FromYAML a where parseYAML :: Node -> Parser a -- | Operate on @tag:yaml.org,2002:null@ node (or fail) withNull :: String -> Parser a -> Node -> Parser a withNull _ f (Scalar SNull) = f withNull expected _ v = typeMismatch expected v -- | Trivial instance instance FromYAML Node where parseYAML = pure instance FromYAML Bool where parseYAML = withBool "!!bool" pure -- | Operate on @tag:yaml.org,2002:bool@ node (or fail) withBool :: String -> (Bool -> Parser a) -> Node -> Parser a withBool _ f (Scalar (SBool b)) = f b withBool expected _ v = typeMismatch expected v instance FromYAML Text where parseYAML = withStr "!!str" pure -- | Operate on @tag:yaml.org,2002:str@ node (or fail) withStr :: String -> (Text -> Parser a) -> Node -> Parser a withStr _ f (Scalar (SStr b)) = f b withStr expected _ v = typeMismatch expected v instance FromYAML Integer where parseYAML = withInt "!!int" pure -- | Operate on @tag:yaml.org,2002:int@ node (or fail) withInt :: String -> (Integer -> Parser a) -> Node -> Parser a withInt _ f (Scalar (SInt b)) = f b withInt expected _ v = typeMismatch expected v -- | @since 0.1.0.0 instance FromYAML Natural where parseYAML = withInt "!!int" $ \b -> if b < 0 then fail ("!!int " ++ show b ++ " out of range for 'Natural'") else pure (fromInteger b) -- helper for fixed-width integers {-# INLINE parseInt #-} parseInt :: (Integral a, Bounded a) => [Char] -> Node -> Parser a parseInt name = withInt "!!int" $ \b -> maybe (fail $ "!!int " ++ show b ++ " out of range for '" ++ name ++ "'") pure $ fromIntegerMaybe b instance FromYAML Int where parseYAML = parseInt "Int" instance FromYAML Int8 where parseYAML = parseInt "Int8" instance FromYAML Int16 where parseYAML = parseInt "Int16" instance FromYAML Int32 where parseYAML = parseInt "Int32" instance FromYAML Int64 where parseYAML = parseInt "Int64" instance FromYAML Word where parseYAML = parseInt "Word" instance FromYAML Word8 where parseYAML = parseInt "Word8" instance FromYAML Word16 where parseYAML = parseInt "Word16" instance FromYAML Word32 where parseYAML = parseInt "Word32" instance FromYAML Word64 where parseYAML = parseInt "Word64" instance FromYAML Double where parseYAML = withFloat "!!float" pure -- | Operate on @tag:yaml.org,2002:float@ node (or fail) withFloat :: String -> (Double -> Parser a) -> Node -> Parser a withFloat _ f (Scalar (SFloat b)) = f b withFloat expected _ v = typeMismatch expected v instance (Ord k, FromYAML k, FromYAML v) => FromYAML (Map k v) where parseYAML = withMap "!!map" $ \xs -> Map.fromList <$> mapM (\(a,b) -> (,) <$> parseYAML a <*> parseYAML b) (Map.toList xs) -- | Operate on @tag:yaml.org,2002:seq@ node (or fail) withMap :: String -> (Mapping -> Parser a) -> Node -> Parser a withMap _ f (Mapping tag xs) | tag == tagMap = f xs withMap expected _ v = typeMismatch expected v instance FromYAML v => FromYAML [v] where parseYAML = withSeq "!!seq" (mapM parseYAML) -- | Operate on @tag:yaml.org,2002:seq@ node (or fail) withSeq :: String -> ([Node] -> Parser a) -> Node -> Parser a withSeq _ f (Sequence tag xs) | tag == tagSeq = f xs withSeq expected _ v = typeMismatch expected v instance FromYAML a => FromYAML (Maybe a) where parseYAML (Scalar SNull) = pure Nothing parseYAML j = Just <$> parseYAML j ---------------------------------------------------------------------------- instance (FromYAML a, FromYAML b) => FromYAML (a,b) where parseYAML = withSeq "!!seq" $ \xs -> case xs of [a,b] -> (,) <$> parseYAML a <*> parseYAML b _ -> fail ("expected 2-sequence but got " ++ show (length xs) ++ "-sequence instead") instance (FromYAML a, FromYAML b, FromYAML c) => FromYAML (a,b,c) where parseYAML = withSeq "!!seq" $ \xs -> case xs of [a,b,c] -> (,,) <$> parseYAML a <*> parseYAML b <*> parseYAML c _ -> fail ("expected 3-sequence but got " ++ show (length xs) ++ "-sequence instead") instance (FromYAML a, FromYAML b, FromYAML c, FromYAML d) => FromYAML (a,b,c,d) where parseYAML = withSeq "!!seq" $ \xs -> case xs of [a,b,c,d] -> (,,,) <$> parseYAML a <*> parseYAML b <*> parseYAML c <*> parseYAML d _ -> fail ("expected 4-sequence but got " ++ show (length xs) ++ "-sequence instead") instance (FromYAML a, FromYAML b, FromYAML c, FromYAML d, FromYAML e) => FromYAML (a,b,c,d,e) where parseYAML = withSeq "!!seq" $ \xs -> case xs of [a,b,c,d,e] -> (,,,,) <$> parseYAML a <*> parseYAML b <*> parseYAML c <*> parseYAML d <*> parseYAML e _ -> fail ("expected 5-sequence but got " ++ show (length xs) ++ "-sequence instead") instance (FromYAML a, FromYAML b, FromYAML c, FromYAML d, FromYAML e, FromYAML f) => FromYAML (a,b,c,d,e,f) where parseYAML = withSeq "!!seq" $ \xs -> case xs of [a,b,c,d,e,f] -> (,,,,,) <$> parseYAML a <*> parseYAML b <*> parseYAML c <*> parseYAML d <*> parseYAML e <*> parseYAML f _ -> fail ("expected 6-sequence but got " ++ show (length xs) ++ "-sequence instead") instance (FromYAML a, FromYAML b, FromYAML c, FromYAML d, FromYAML e, FromYAML f, FromYAML g) => FromYAML (a,b,c,d,e,f,g) where parseYAML = withSeq "!!seq" $ \xs -> case xs of [a,b,c,d,e,f,g] -> (,,,,,,) <$> parseYAML a <*> parseYAML b <*> parseYAML c <*> parseYAML d <*> parseYAML e <*> parseYAML f <*> parseYAML g _ -> fail ("expected 7-sequence but got " ++ show (length xs) ++ "-sequence instead") -- | Decode YAML document(s) using the YAML 1.2 Core schema -- -- Each document contained in the YAML stream produce one element of -- the response list. Here's an example of decoding two concatenated -- YAML documents: -- -- >>> decode "Foo\n---\nBar" :: Either String [Text] -- Right ["Foo","Bar"] -- -- Note that an empty stream doesn't contain any (non-comment) -- document nodes, and therefore results in an empty result list: -- -- >>> decode "# just a comment" :: Either String [Text] -- Right [] -- -- 'decode' uses the same settings as 'decodeNode' for tag-resolving. If -- you need a different custom parsing configuration, you need to -- combine 'parseEither' and `decodeNode'` yourself. -- -- The 'decode' as well as the 'decodeNode' functions supports -- decoding from YAML streams using the UTF-8, UTF-16 (LE or BE), or -- UTF-32 (LE or BE) encoding (which is auto-detected). -- decode :: FromYAML v => BS.L.ByteString -> Either String [v] decode bs0 = decodeNode bs0 >>= mapM (parseEither . parseYAML . (\(Doc x) -> x)) -- | Convenience wrapper over 'decode' expecting exactly one YAML document -- -- >>> decode1 "---\nBar\n..." :: Either String Text -- Right "Bar" -- -- >>> decode1 "Foo\n---\nBar" :: Either String Text -- Left "unexpected multiple YAML documents" -- -- >>> decode1 "# Just a comment" :: Either String Text -- Left "empty YAML stream" -- -- @since 0.1.2.0 decode1 :: FromYAML v => BS.L.ByteString -> Either String v decode1 text = do vs <- decode text case vs of [] -> Left "empty YAML stream" [v] -> Right v _ -> Left "unexpected multiple YAML documents" -- | Like 'decode' but takes a strict 'BS.ByteString' -- -- @since 0.1.1.0 decodeStrict :: FromYAML v => BS.ByteString -> Either String [v] decodeStrict = decode . BS.L.fromChunks . (:[]) -- | Like 'decode1' but takes a strict 'BS.ByteString' -- -- @since 0.1.2.0 decode1Strict :: FromYAML v => BS.ByteString -> Either String v decode1Strict text = do vs <- decodeStrict text maybe (Left "expected unique") Right $ listToMaybe vs HsYAML-0.1.2.0/src/Data/DList.hs0000644000000000000000000000102413467630442014114 0ustar0000000000000000{-# LANGUAGE Safe #-} -- | -- Copyright: © Herbert Valerio Riedel 2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- -- Minimal API-compatible rip-off of @Data.DList@ module Data.DList ( DList , empty , singleton , append , toList ) where newtype DList a = DList ([a] -> [a]) toList :: DList a -> [a] toList (DList dl) = dl [] singleton :: a -> DList a singleton x = DList (x:) empty :: DList a empty = DList id append :: DList a -> DList a -> DList a append (DList xs) (DList ys) = DList (xs . ys) HsYAML-0.1.2.0/src/Data/YAML/0000755000000000000000000000000013467630442013306 5ustar0000000000000000HsYAML-0.1.2.0/src/Data/YAML/Event.hs0000644000000000000000000004767513467630442014746 0ustar0000000000000000{-# LANGUAGE RecordWildCards #-} {-# LANGUAGE Safe #-} -- | -- Copyright: © Herbert Valerio Riedel 2015-2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- -- Event-stream oriented YAML parsing API module Data.YAML.Event ( parseEvents , EvStream , Event(..) , Style(..) , Tag, untagged, isUntagged, tagToText, mkTag , Anchor , Pos(..) ) where import qualified Data.ByteString.Lazy as BS.L import Data.Char import qualified Data.Map as Map import qualified Data.Text as T import qualified Data.YAML.Token as Y import Numeric (readHex) import Util -- TODO: consider also non-essential attributes -- | YAML Event Types -- -- The events correspond to the ones from [LibYAML](http://pyyaml.org/wiki/LibYAML) -- -- The grammar below defines well-formed streams of 'Event's: -- -- @ -- stream ::= 'StreamStart' document* 'StreamEnd' -- document ::= 'DocumentStart' node 'DocumentEnd' -- node ::= 'Alias' -- | 'Scalar' -- | sequence -- | mapping -- sequence ::= 'SequenceStart' node* 'SequenceEnd' -- mapping ::= 'MappingStart' (node node)* 'MappingEnd' -- @ data Event = StreamStart | StreamEnd | DocumentStart !Bool | DocumentEnd !Bool | Alias !Anchor | Scalar !(Maybe Anchor) !Tag !Style !Text | SequenceStart !(Maybe Anchor) !Tag | SequenceEnd | MappingStart !(Maybe Anchor) !Tag | MappingEnd deriving (Show, Eq) -- | YAML Anchor identifiers type Anchor = Text -- | YAML Tags newtype Tag = Tag (Maybe Text) deriving (Eq,Ord) instance Show Tag where show (Tag x) = show x -- | Convert 'Tag' to its string representation -- -- Returns 'Nothing' for 'untagged' tagToText :: Tag -> Maybe T.Text tagToText (Tag x) = x -- | An \"untagged\" YAML tag untagged :: Tag untagged = Tag Nothing -- | Equivalent to @(== 'untagged')@ isUntagged :: Tag -> Bool isUntagged (Tag Nothing) = True isUntagged _ = False -- | Construct YAML tag mkTag :: String -> Tag mkTag "" = error "mkTag" mkTag "!" = Tag (Just $! T.pack "!") mkTag s = Tag (Just $! tagUnescape s) where tagUnescape = T.pack . go where go [] = [] go ('%':h:l:cs) | Just c <- decodeL1 [h,l] = c : go cs go (c:cs) = c : go cs mkTag' :: String -> Tag mkTag' "" = error "mkTag'" mkTag' s = Tag (Just $! T.pack s) mkTag'' :: String -> Tag mkTag'' "" = error "mkTag''" mkTag'' s = Tag (Just $! T.pack ("tag:yaml.org,2002:" ++ s)) -- | Event stream produced by 'parseEvents' -- -- A 'Left' value denotes parsing errors. The event stream ends -- immediately once a 'Left' value is returned. type EvStream = [Either (Pos,String) Event] -- | Position in parsed YAML source data Pos = Pos { posByteOffset :: !Int -- ^ 0-based byte offset , posCharOffset :: !Int -- ^ 0-based character (Unicode code-point) offset , posLine :: !Int -- ^ 1-based line number , posColumn :: !Int -- ^ 0-based character (Unicode code-point) column number } deriving Show tok2pos :: Y.Token -> Pos tok2pos Y.Token { Y.tByteOffset = posByteOffset, Y.tCharOffset = posCharOffset, Y.tLine = posLine, Y.tLineChar = posColumn } = Pos {..} -- | 'Scalar' node style data Style = Plain | SingleQuoted | DoubleQuoted | Literal | Folded deriving (Eq,Ord,Show) -- internal type TagHandle = Text type Props = (Maybe Text,Tag) getHandle :: [Y.Token] -> Maybe (TagHandle,[Y.Token]) getHandle toks0 = do Y.Token { Y.tCode = Y.BeginHandle } : toks1 <- Just toks0 (hs,Y.Token { Y.tCode = Y.EndHandle } : toks2) <- Just $ span (\Y.Token { Y.tCode = c } -> c `elem` [Y.Indicator,Y.Meta]) toks1 pure (T.pack $ concatMap Y.tText hs, toks2) getUriTag :: [Y.Token] -> Maybe (Text,[Y.Token]) getUriTag toks0 = do Y.Token { Y.tCode = Y.BeginTag } : toks1 <- Just toks0 (hs,Y.Token { Y.tCode = Y.EndTag } : toks2) <- Just $ span (\Y.Token { Y.tCode = c } -> c `elem` [Y.Indicator,Y.Meta]) toks1 pure (T.pack $ concatMap Y.tText hs, toks2) {- WARNING: the code that follows will make you cry; a safety pig is provided below for your benefit. _ _._ _..._ .-', _.._(`)) '-. ` ' /-._.-' ',/ ) \ '. / _ _ | \ | a a / | \ .-. ; '-('' ).-' ,' ; '-; | .' \ \ / | 7 .__ _.-\ \ | | | ``/ /` / /,_| | /,_/ / /,_/ '`-' -} -- | Parse YAML 'Event's from a lazy 'BS.L.ByteString'. -- -- The input 'BS.L.ByteString' is expected to have a YAML 1.2 stream -- using the UTF-8, UTF-16 (LE or BE), or UTF-32 (LE or BE) encodings -- (which will be auto-detected). parseEvents :: BS.L.ByteString -> EvStream parseEvents = \bs0 -> Right StreamStart : (go0 mempty $ stripComments $ filter (not . isWhite) $ Y.tokenize bs0 False) where isTCode tc = (== tc) . Y.tCode skipPast tc (t : ts) | isTCode tc t = ts | otherwise = skipPast tc ts skipPast _ [] = error "the impossible happened" -- non-content whitespace isWhite :: Y.Token -> Bool isWhite (Y.Token { Y.tCode = Y.Bom }) = True -- BOMs can occur at each doc-start! isWhite (Y.Token { Y.tCode = Y.White }) = True isWhite (Y.Token { Y.tCode = Y.Indent }) = True isWhite (Y.Token { Y.tCode = Y.Break }) = True isWhite _ = False goDir :: Map TagHandle Text -> [Y.Token] -> EvStream goDir m (Y.Token { Y.tCode = Y.Indicator, Y.tText = "%" } : Y.Token { Y.tCode = Y.Meta, Y.tText = "YAML" } : Y.Token { Y.tCode = Y.Meta } : Y.Token { Y.tCode = Y.EndDirective } : rest) = go0 m rest goDir m (Y.Token { Y.tCode = Y.Indicator, Y.tText = "%" } : Y.Token { Y.tCode = Y.Meta, Y.tText = "TAG" } : rest) | Just (h, rest') <- getHandle rest , Just (t, rest'') <- getUriTag rest' = go0 (Map.insert h t m) (skipPast Y.EndDirective rest'') goDir m (Y.Token { Y.tCode = Y.Indicator, Y.tText = "%" } : Y.Token { Y.tCode = Y.Meta, Y.tText = l } : rest) | l `notElem` ["TAG","YAML"] = go0 m (skipPast Y.EndDirective rest) goDir _ xs = err xs go0 :: Map.Map TagHandle Text -> Tok2EvStream go0 _ [] = [Right StreamEnd] go0 _ (Y.Token { Y.tCode = Y.White } : _) = error "the impossible happened" go0 m (Y.Token { Y.tCode = Y.Indicator } : rest) = go0 m rest -- ignore indicators here go0 m (Y.Token { Y.tCode = Y.DirectivesEnd } : rest) = go0 m rest go0 m (Y.Token { Y.tCode = Y.BeginDocument } : Y.Token { Y.tCode = Y.DirectivesEnd } : rest) = Right (DocumentStart True) : go0 m rest -- hack go0 m (Y.Token { Y.tCode = Y.BeginDocument } : rest@(Y.Token { Y.tCode = Y.BeginDirective } : _)) = Right (DocumentStart True) : go0 m rest -- hack go0 m (Y.Token { Y.tCode = Y.BeginDocument } : rest) = Right (DocumentStart False) : go0 m rest go0 _ (Y.Token { Y.tCode = Y.EndDocument } : Y.Token { Y.tCode = Y.DocumentEnd } : rest) = Right (DocumentEnd True) : go0 mempty rest go0 _ (Y.Token { Y.tCode = Y.EndDocument } : rest) = Right (DocumentEnd False) : go0 mempty rest go0 m (Y.Token { Y.tCode = Y.DocumentEnd } : rest) = go0 m rest -- should not occur go0 m (Y.Token { Y.tCode = Y.BeginNode } : rest) = goNode0 m rest (go0 m) go0 m (Y.Token { Y.tCode = Y.BeginDirective } : rest) = goDir m rest go0 _ xs = err xs err :: Tok2EvStream err (tok@Y.Token { Y.tCode = Y.Error, Y.tText = msg } : _) = [Left (tok2pos tok, msg)] err (tok@Y.Token { Y.tCode = Y.Unparsed, Y.tText = txt } : _) = [Left (tok2pos tok, ("Lexical error near " ++ show txt))] err (tok@Y.Token { Y.tCode = code } : _) = [Left (tok2pos tok, ("Parse failure near " ++ show code ++ " token"))] err [] = [Left ((Pos (-1) (-1) (-1) (-1)), "Unexpected end of token stream")] goNode0 :: Map TagHandle Text -> Tok2EvStreamCont goNode0 tagmap = goNode where goNode :: Tok2EvStreamCont goNode (Y.Token { Y.tCode = Y.BeginScalar } : rest) cont = goScalar (mempty,untagged) rest (flip goNodeEnd cont) goNode (Y.Token { Y.tCode = Y.BeginSequence } : rest) cont = Right (SequenceStart Nothing untagged) : goSeq rest (flip goNodeEnd cont) goNode (Y.Token { Y.tCode = Y.BeginMapping } : rest) cont = Right (MappingStart Nothing untagged) : goMap rest (flip goNodeEnd cont) goNode (Y.Token { Y.tCode = Y.BeginProperties } : rest) cont = goProp (mempty,untagged) rest (\p rest' -> goNode' p rest' cont) goNode (Y.Token { Y.tCode = Y.BeginAlias } : Y.Token { Y.tCode = Y.Indicator } : Y.Token { Y.tCode = Y.Meta, Y.tText = anchor } : Y.Token { Y.tCode = Y.EndAlias } : Y.Token { Y.tCode = Y.EndNode } : rest) cont = Right (Alias (T.pack anchor)) : cont rest goNode xs _cont = err xs goNode' :: Props -> Tok2EvStreamCont goNode' props (Y.Token { Y.tCode = Y.BeginScalar } : rest) cont = goScalar props rest (flip goNodeEnd cont) goNode' (manchor,mtag) (Y.Token { Y.tCode = Y.BeginSequence } : rest) cont = Right (SequenceStart manchor mtag) : goSeq rest (flip goNodeEnd cont) goNode' (manchor,mtag) (Y.Token { Y.tCode = Y.BeginMapping } : rest) cont = Right (MappingStart manchor mtag) : goMap rest (flip goNodeEnd cont) goNode' _ xs _cont = err xs goNodeEnd :: Tok2EvStreamCont goNodeEnd (Y.Token { Y.tCode = Y.EndNode } : rest) cont = cont rest goNodeEnd xs _cont = err xs goProp :: Props -> [Y.Token] -> (Props -> [Y.Token] -> EvStream) -> EvStream goProp props (Y.Token { Y.tCode = Y.EndProperties } : rest) cont = cont props rest goProp props (Y.Token { Y.tCode = Y.BeginAnchor } : rest) cont = goAnchor props rest (\x y -> goProp x y cont) goProp props (Y.Token { Y.tCode = Y.BeginTag } : rest) cont = goTag props rest (\x y -> goProp x y cont) goProp _props xs _cont = err xs goAnchor :: Props -> [Y.Token] -> (Props -> [Y.Token] -> EvStream) -> EvStream goAnchor props (Y.Token { Y.tCode = Y.Indicator } : rest) cont = goAnchor props rest cont goAnchor (_,tag) (Y.Token { Y.tCode = Y.Meta, Y.tText = anchor } : rest) cont = goAnchor (Just $! T.pack anchor,tag) rest cont goAnchor props (Y.Token { Y.tCode = Y.EndAnchor } : rest) cont = cont props rest goAnchor _ xs _ = err xs goTag :: Props -> [Y.Token] -> (Props -> [Y.Token] -> EvStream) -> EvStream goTag (anchor,_) (Y.Token { Y.tCode = Y.Indicator, Y.tText = "!" } : Y.Token { Y.tCode = Y.EndTag } : rest) cont = cont (anchor,mkTag' "!") rest goTag (anchor,_) (Y.Token { Y.tCode = Y.BeginHandle } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "!" } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "!" } : Y.Token { Y.tCode = Y.EndHandle } : Y.Token { Y.tCode = Y.Meta, Y.tText = tag } : Y.Token { Y.tCode = Y.EndTag } : rest) cont | Just t' <- Map.lookup (T.pack ("!!")) tagmap = cont (anchor,mkTag (T.unpack t' ++ tag)) rest | otherwise = cont (anchor,mkTag'' tag) rest goTag (anchor,_) (Y.Token { Y.tCode = Y.Indicator, Y.tText = "!" } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "<" } : Y.Token { Y.tCode = Y.Meta, Y.tText = tag } : Y.Token { Y.tCode = Y.Indicator, Y.tText = ">" } : Y.Token { Y.tCode = Y.EndTag } : rest) cont = cont (anchor,mkTag tag) rest goTag (anchor,_) xs@(Y.Token { Y.tCode = Y.BeginHandle } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "!" } : Y.Token { Y.tCode = Y.Meta, Y.tText = h } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "!" } : Y.Token { Y.tCode = Y.EndHandle } : Y.Token { Y.tCode = Y.Meta, Y.tText = tag } : Y.Token { Y.tCode = Y.EndTag } : rest) cont | Just t' <- Map.lookup (T.pack ("!" ++ h ++ "!")) tagmap = cont (anchor,mkTag (T.unpack t' ++ tag)) rest | otherwise = err xs goTag (anchor,_) (Y.Token { Y.tCode = Y.BeginHandle } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "!" } : Y.Token { Y.tCode = Y.EndHandle } : Y.Token { Y.tCode = Y.Meta, Y.tText = tag } : Y.Token { Y.tCode = Y.EndTag } : rest) cont | Just t' <- Map.lookup (T.pack ("!")) tagmap = cont (anchor,mkTag (T.unpack t' ++ tag)) rest | otherwise = cont (anchor,mkTag' ('!' : tag)) rest -- unresolved goTag _ xs _ = err xs goScalar :: Props -> Tok2EvStreamCont goScalar (manchor,tag) toks0 cont = go0 False Plain toks0 where go0 ii sty (Y.Token { Y.tCode = Y.Indicator, Y.tText = ind } : rest) | "'" <- ind = go' ii "" SingleQuoted rest | "\"" <- ind = go' ii "" DoubleQuoted rest | "|" <- ind = go0 True Literal rest | ">" <- ind = go0 True Folded rest | "+" <- ind = go0 ii sty rest | "-" <- ind = go0 ii sty rest | [c] <- ind, '1' <= c, c <= '9' = go0 False sty rest go0 ii sty (Y.Token { Y.tCode = Y.Text, Y.tText = t } : rest) = go' ii t sty rest go0 ii sty (Y.Token { Y.tCode = Y.LineFold } : rest) = go' ii " " sty rest go0 ii sty (Y.Token { Y.tCode = Y.LineFeed } : rest) = go' ii "\n" sty rest go0 _ sty (Y.Token { Y.tCode = Y.EndScalar } : rest) = Right (Scalar manchor tag sty mempty) : cont rest go0 _ _ xs = err xs ---------------------------------------------------------------------------- go' ii acc sty (Y.Token { Y.tCode = Y.Text, Y.tText = t } : rest) = go' ii (acc ++ t) sty rest go' ii acc sty (Y.Token { Y.tCode = Y.LineFold } : rest) = go' ii (acc ++ " ") sty rest go' ii acc sty (Y.Token { Y.tCode = Y.LineFeed } : rest) = go' ii (acc ++ "\n") sty rest go' ii acc sty@SingleQuoted (Y.Token { Y.tCode = Y.BeginEscape } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "'" } : Y.Token { Y.tCode = Y.Meta, Y.tText = "'" } : Y.Token { Y.tCode = Y.EndEscape } : rest) = go' ii (acc ++ "'") sty rest go' ii acc sty@SingleQuoted (Y.Token { Y.tCode = Y.Indicator, Y.tText = "'" } : rest) = go' ii acc sty rest go' ii acc sty@DoubleQuoted (Y.Token { Y.tCode = Y.BeginEscape } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "\\" } : -- Y.Token { Y.tCode = Y.Break } : Y.Token { Y.tCode = Y.EndEscape } : rest) = go' ii acc sty rest -- line continuation escape go' ii acc sty@DoubleQuoted (Y.Token { Y.tCode = Y.BeginEscape } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "\\" } : Y.Token { Y.tCode = Y.Meta, Y.tText = t } : Y.Token { Y.tCode = Y.EndEscape } : rest) | Just t' <- unescape t = go' ii (acc ++ t') sty rest go' ii acc sty@DoubleQuoted (Y.Token { Y.tCode = Y.BeginEscape } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "\\" } : Y.Token { Y.tCode = Y.Indicator, Y.tText = pfx } : Y.Token { Y.tCode = Y.Meta, Y.tText = ucode } : Y.Token { Y.tCode = Y.EndEscape } : rest) | pfx == "U", Just c <- decodeCP2 ucode = go' ii (acc ++ [c]) sty rest | pfx == "u", Just c <- decodeCP ucode = go' ii (acc ++ [c]) sty rest | pfx == "x", Just c <- decodeL1 ucode = go' ii (acc ++ [c]) sty rest go' ii acc sty@DoubleQuoted (Y.Token { Y.tCode = Y.Indicator, Y.tText = "\"" } : rest) = go' ii acc sty rest go' ii acc sty (t@Y.Token { Y.tCode = Y.EndScalar } : rest) | ii, hasLeadingSpace acc = [Left (tok2pos t, "leading empty lines contain more spaces than the first non-empty line in scalar")] | otherwise = Right (Scalar manchor tag sty (T.pack acc)) : cont rest go' _ _ _ xs | False = error (show xs) go' _ _ _ xs = err xs hasLeadingSpace (' ':_) = True hasLeadingSpace ('\n':cs) = hasLeadingSpace cs hasLeadingSpace _ = False goSeq :: Tok2EvStreamCont goSeq (Y.Token { Y.tCode = Y.EndSequence } : rest) cont = Right SequenceEnd : cont rest goSeq (Y.Token { Y.tCode = Y.BeginNode } : rest) cont = goNode rest (flip goSeq cont) goSeq (Y.Token { Y.tCode = Y.BeginMapping } : rest) cont = Right (MappingStart Nothing untagged) : goMap rest (flip goSeq cont) goSeq (Y.Token { Y.tCode = Y.Indicator } : rest) cont = goSeq rest cont -- goSeq xs _cont = error (show xs) goSeq xs _cont = err xs goMap :: Tok2EvStreamCont goMap (Y.Token { Y.tCode = Y.EndMapping } : rest) cont = Right MappingEnd : cont rest goMap (Y.Token { Y.tCode = Y.BeginPair } : rest) cont = goPair1 rest (flip goMap cont) goMap (Y.Token { Y.tCode = Y.Indicator } : rest) cont = goMap rest cont goMap xs _cont = err xs goPair1 (Y.Token { Y.tCode = Y.BeginNode } : rest) cont = goNode rest (flip goPair2 cont) goPair1 (Y.Token { Y.tCode = Y.Indicator } : rest) cont = goPair1 rest cont goPair1 xs _cont = err xs goPair2 (Y.Token { Y.tCode = Y.BeginNode } : rest) cont = goNode rest (flip goPairEnd cont) goPair2 (Y.Token { Y.tCode = Y.Indicator } : rest) cont = goPair2 rest cont goPair2 xs _cont = err xs goPairEnd (Y.Token { Y.tCode = Y.EndPair } : rest) cont = cont rest goPairEnd xs _cont = err xs stripComments :: [Y.Token] -> [Y.Token] stripComments (Y.Token { Y.tCode = Y.BeginComment } : rest) = skip rest where skip (Y.Token { Y.tCode = Y.EndComment } : rest') = stripComments rest' skip (_ : rest') = skip rest' skip [] = error "the impossible happened" stripComments (t : rest) = t : stripComments rest stripComments [] = [] type Tok2EvStream = [Y.Token] -> EvStream type Tok2EvStreamCont = [Y.Token] -> Cont EvStream [Y.Token] type Cont r a = (a -> r) -> r -- decode 8-hex-digit unicode code-point decodeCP2 :: String -> Maybe Char decodeCP2 s = case s of [_,_,_,_,_,_,_,_] | all isHexDigit s , [(j, "")] <- readHex s -> Just (chr (fromInteger j)) _ -> Nothing -- decode 4-hex-digit unicode code-point decodeCP :: String -> Maybe Char decodeCP s = case s of [_,_,_,_] | all isHexDigit s , [(j, "")] <- readHex s -> Just (chr (fromInteger j)) _ -> Nothing -- decode 2-hex-digit latin1 code-point decodeL1 :: String -> Maybe Char decodeL1 s = case s of [_,_] | all isHexDigit s , [(j, "")] <- readHex s -> Just (chr (fromInteger j)) _ -> Nothing -- decode C-style escapes unescape :: String -> Maybe String unescape [c] = Map.lookup c m where m = Map.fromList [ (k,[v]) | (k,v) <- escapes ] escapes :: [(Char,Char)] escapes = [ ('0', '\0') , ('a', '\x7') , ('b', '\x8') , ('\x9', '\x9') , ('t', '\x9') , ('n', '\xa') , ('v', '\xb') , ('f', '\xc') , ('r', '\xd') , ('e', '\x1b') , (' ', ' ') , ('"', '"') , ('/', '/') , ('\\', '\\') , ('N', '\x85') , ('_', '\xa0') , ('L', '\x2028') , ('P', '\x2029') ] unescape _ = Nothing HsYAML-0.1.2.0/src/Data/YAML/Token.hs0000644000000000000000000023420213467630442014725 0ustar0000000000000000{-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE FunctionalDependencies #-} {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE PostfixOperators #-} {-# LANGUAGE Safe #-} {-# LANGUAGE TypeSynonymInstances #-} {-# OPTIONS_GHC -fno-warn-missing-signatures #-} {-# OPTIONS_GHC -fno-warn-name-shadowing #-} {-# OPTIONS_GHC -fno-warn-incomplete-patterns #-} -- FIXME -- | -- Copyright: © Oren Ben-Kiki 2007, -- © Herbert Valerio Riedel 2015-2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- -- Tokenizer for the YAML 1.2 syntax as defined in . -- module Data.YAML.Token ( tokenize , Token(..) , Code(..) ) where import qualified Data.ByteString.Lazy.Char8 as BLC import qualified Data.DList as D import Prelude hiding ((*), (+), (-), (/), (^)) import qualified Prelude import Data.YAML.Token.Encoding (Encoding (..), decode) import Util hiding (empty) import qualified Util -- * Generic operators -- -- ** Numeric operators -- -- We rename the four numerical operators @+@ @-@ @*@ @\/@ to start with @.@ -- (@.+@, @.-@, @.*@, @.\/@). This allows us to use the originals for BNF -- notation (we also hijack the @^@ operator). This is not a generally -- recommended practice. It is justified in this case since we have very little -- arithmetic operations, and a lot of BNF rules which this makes extremely -- readable. infixl 6 .+ -- | \".+\" is the numeric addition (we use \"+\" for postfix \"one or more\"). (.+) :: Int -> Int -> Int (.+) = (Prelude.+) infixl 6 .- -- | \".-\" is the numeric subtraction (we use \"-\" for infix \"and not\"). (.-) :: Int -> Int -> Int (.-) = (Prelude.-) {- infixl 7 .* -- | \".*\" is the numeric multiplication (we use \"*\" for postfix \"zero or -- more\"). (.*) :: Int -> Int -> Int (.*) = (Prelude.*) -} -- ** Record field access -- -- We also define @^.@ for record access for increased readability. infixl 8 ^. -- | @record ^. field@ is the same as @field record@, but is more readable. -- -- NB: This trivially emulates the @lens@ operator (^.) :: record -> (record -> value) -> value record ^. field = field record -- * Result tokens -- -- The parsing result is a stream of tokens rather than a parse tree. The idea -- is to convert the YAML input into \"byte codes\". These byte codes are -- intended to be written into a byte codes file (or more likely a UNIX pipe) -- for further processing. -- | 'Token' codes. data Code = Bom -- ^ BOM, contains \"@TF8@\", \"@TF16LE@\", \"@TF32BE@\", etc. | Text -- ^ Content text characters. | Meta -- ^ Non-content (meta) text characters. | Break -- ^ Separation line break. | LineFeed -- ^ Line break normalized to content line feed. | LineFold -- ^ Line break folded to content space. | Indicator -- ^ Character indicating structure. | White -- ^ Separation white space. | Indent -- ^ Indentation spaces. | DirectivesEnd -- ^ Document start marker. | DocumentEnd -- ^ Document end marker. | BeginEscape -- ^ Begins escape sequence. | EndEscape -- ^ Ends escape sequence. | BeginComment -- ^ Begins comment. | EndComment -- ^ Ends comment. | BeginDirective -- ^ Begins directive. | EndDirective -- ^ Ends directive. | BeginTag -- ^ Begins tag. | EndTag -- ^ Ends tag. | BeginHandle -- ^ Begins tag handle. | EndHandle -- ^ Ends tag handle. | BeginAnchor -- ^ Begins anchor. | EndAnchor -- ^ Ends anchor. | BeginProperties -- ^ Begins node properties. | EndProperties -- ^ Ends node properties. | BeginAlias -- ^ Begins alias. | EndAlias -- ^ Ends alias. | BeginScalar -- ^ Begins scalar content. | EndScalar -- ^ Ends scalar content. | BeginSequence -- ^ Begins sequence content. | EndSequence -- ^ Ends sequence content. | BeginMapping -- ^ Begins mapping content. | EndMapping -- ^ Ends mapping content. | BeginPair -- ^ Begins mapping key:value pair. | EndPair -- ^ Ends mapping key:value pair. | BeginNode -- ^ Begins complete node. | EndNode -- ^ Ends complete node. | BeginDocument -- ^ Begins document. | EndDocument -- ^ Ends document. | BeginStream -- ^ Begins YAML stream. | EndStream -- ^ Ends YAML stream. | Error -- ^ Parsing error at this point. | Unparsed -- ^ Unparsed due to errors (or at end of test). | Detected -- ^ Detected parameter (for testing). deriving (Show,Eq) {- -- | @show code@ converts a 'Code' to the one-character YEAST token code char. -- The list of byte codes is also documented in the @yaml2yeast@ program. instance Show Code where show code = case code of Bom -> "U" Text -> "T" Meta -> "t" Break -> "b" LineFeed -> "L" LineFold -> "l" Indicator -> "I" White -> "w" Indent -> "i" DirectivesEnd -> "K" DocumentEnd -> "k" BeginEscape -> "E" EndEscape -> "e" BeginComment -> "C" EndComment -> "c" BeginDirective -> "D" EndDirective -> "d" BeginTag -> "G" EndTag -> "g" BeginHandle -> "H" EndHandle -> "h" BeginAnchor -> "A" EndAnchor -> "a" BeginProperties -> "P" EndProperties -> "p" BeginAlias -> "R" EndAlias -> "r" BeginScalar -> "S" EndScalar -> "s" BeginSequence -> "Q" EndSequence -> "q" BeginMapping -> "M" EndMapping -> "m" BeginNode -> "N" EndNode -> "n" BeginPair -> "X" EndPair -> "x" BeginDocument -> "O" EndDocument -> "o" Error -> "!" Unparsed -> "-" Detected -> "$" -} -- | Parsed token. data Token = Token { tByteOffset :: !Int, -- ^ 0-base byte offset in stream. tCharOffset :: !Int, -- ^ 0-base character offset in stream. tLine :: !Int, -- ^ 1-based line number. tLineChar :: !Int, -- ^ 0-based character in line. tCode :: !Code, -- ^ Specific token 'Code'. tText :: !String -- ^ Contained input chars, if any. } deriving Show -- * Parsing framework -- -- Haskell has no shortage of parsing frameworks. We use our own because: -- -- * Most available frameworks are inappropriate because of their focus on -- building a parse tree, and completing all of it before any of it is -- accessible to the caller. We return a stream of tokens, and would like -- its head to be accessible as soon as possible to allow for streaming. To -- do this with bounded memory usage we use a combination of continuation -- passing style and difference lists for the collected tokens. -- -- * Haskell makes it so easy to roll your own parsing framework. We need some -- specialized machinery (limited lookahead, forbidden patterns). It is -- possible to build these on top of existing frameworks but the end result -- isn't much shorter than rolling our own. -- -- Since we roll our own framework we don't bother with making it generalized, -- so we maintain a single 'State' type rather than having a generic one that -- contains a polymorphic \"UserState\" field etc. -- | A 'Parser' is basically a function computing a 'Reply'. newtype Parser result = Parser (State -> Reply result) applyParser :: Parser result -> State -> Reply result applyParser (Parser p) s = p s -- | The 'Result' of each invocation is either an error, the actual result, or -- a continuation for computing the actual result. data Result result = Failed String -- ^ Parsing aborted with a failure. | Result result -- ^ Parsing completed with a result. | More (Parser result) -- ^ Parsing is ongoing with a continuation. -- Showing a 'Result' is only used in debugging. instance (Show result) => Show (Result result) where show result = case result of Failed message -> "Failed " ++ message Result result -> "Result " ++ (show result) More _ -> "More" -- | Each invication of a 'Parser' yields a 'Reply'. The 'Result' is only one -- part of the 'Reply'. data Reply result = Reply { rResult :: !(Result result), -- ^ Parsing result. rTokens :: !(D.DList Token), -- ^ Tokens generated by the parser. rCommit :: !(Maybe Decision), -- ^ Commitment to a decision point. rState :: !State -- ^ The updated parser state. } -- Showing a 'State' is only used in debugging. instance (Show result) => Show (Reply result) where show reply = "Result: " ++ (show $ reply^.rResult) ++ ", Tokens: " ++ (show $ D.toList $ reply^.rTokens) ++ ", Commit: " ++ (show $ reply^.rCommit) ++ ", State: { " ++ (show $ reply^.rState) ++ "}" -- A 'Pattern' is a parser that doesn't have an (interesting) result. type Pattern = Parser () -- ** Parsing state -- | The internal parser state. We don't bother with parameterising it with a -- \"UserState\", we just bundle the generic and specific fields together (not -- that it is that easy to draw the line - is @sLine@ generic or specific?). data State = State { sEncoding :: !Encoding, -- ^ The input UTF encoding. sDecision :: !Decision, -- ^ Current decision name. sLimit :: !Int, -- ^ Lookahead characters limit. sForbidden :: !(Maybe Pattern), -- ^ Pattern we must not enter into. sIsPeek :: !Bool, -- ^ Disables token generation. sIsSol :: !Bool, -- ^ Is at start of line? sChars :: ![Char], -- ^ (Reversed) characters collected for a token. sCharsByteOffset :: !Int, -- ^ Byte offset of first collected character. sCharsCharOffset :: !Int, -- ^ Char offset of first collected character. sCharsLine :: !Int, -- ^ Line of first collected character. sCharsLineChar :: !Int, -- ^ Character in line of first collected character. sByteOffset :: !Int, -- ^ Offset in bytes in the input. sCharOffset :: !Int, -- ^ Offset in characters in the input. sLine :: !Int, -- ^ Builds on YAML's line break definition. sLineChar :: !Int, -- ^ Character number in line. sCode :: !Code, -- ^ Of token we are collecting chars for. sLast :: !Char, -- ^ Last matched character. sInput :: ![(Int, Char)] -- ^ The decoded input characters. } -- Showing a 'State' is only used in debugging. Note that forcing dump of -- @sInput@ will disable streaming it. instance Show State where show state = "Encoding: " ++ (show $ state^.sEncoding) ++ ", Decision: " ++ (show $ state^.sDecision) ++ ", Limit: " ++ (show $ state^.sLimit) ++ ", IsPeek: " ++ (show $ state^.sIsPeek) ++ ", IsSol: " ++ (show $ state^.sIsSol) ++ ", Chars: >>>" ++ (reverse $ state^.sChars) ++ "<<<" ++ ", CharsByteOffset: " ++ (show $ state^.sCharsByteOffset) ++ ", CharsCharOffset: " ++ (show $ state^.sCharsCharOffset) ++ ", CharsLine: " ++ (show $ state^.sCharsLine) ++ ", CharsLineChar: " ++ (show $ state^.sCharsLineChar) ++ ", ByteOffset: " ++ (show $ state^.sByteOffset) ++ ", CharOffset: " ++ (show $ state^.sCharOffset) ++ ", Line: " ++ (show $ state^.sLine) ++ ", LineChar: " ++ (show $ state^.sLineChar) ++ ", Code: " ++ (show $ state^.sCode) ++ ", Last: " ++ (show $ state^.sLast) -- ++ ", Input: >>>" ++ (show $ state^.sInput) ++ "<<<" -- | @initialState name input@ returns an initial 'State' for parsing the -- /input/ (with /name/ for error messages). initialState :: BLC.ByteString -> State initialState input = State { sEncoding = encoding , sDecision = DeNone , sLimit = -1 , sForbidden = Nothing , sIsPeek = False , sIsSol = True , sChars = [] , sCharsByteOffset = -1 , sCharsCharOffset = -1 , sCharsLine = -1 , sCharsLineChar = -1 , sByteOffset = 0 , sCharOffset = 0 , sLine = 1 , sLineChar = 0 , sCode = Unparsed , sLast = ' ' , sInput = decoded } where (encoding, decoded) = decode input -- *** Setters -- -- We need four setter functions to pass them around as arguments. For some -- reason, Haskell only generates getter functions. -- | @setLimit limit state@ sets the @sLimit@ field to /limit/. setLimit :: Int -> State -> State setLimit limit state = state { sLimit = limit } -- | @setForbidden forbidden state@ sets the @sForbidden@ field to /forbidden/. setForbidden :: Maybe Pattern -> State -> State setForbidden forbidden state = state { sForbidden = forbidden } -- | @setCode code state@ sets the @sCode@ field to /code/. setCode :: Code -> State -> State setCode code state = state { sCode = code } -- ** Implicit parsers -- -- It is tedious to have to wrap each expected character (or character range) -- in an explicit 'Parse' constructor. We let Haskell do that for us using a -- 'Match' class. -- | @Match parameter result@ specifies that we can convert the /parameter/ to -- a 'Parser' returning the /result/. class Match parameter result | parameter -> result where match :: parameter -> Parser result -- | We don't need to convert a 'Parser', it already is one. instance Match (Parser result) result where match = id -- | We convert 'Char' to a parser for a character (that returns nothing). instance Match Char () where match code = nextIf (== code) -- | We convert a 'Char' tuple to a parser for a character range (that returns -- nothing). instance Match (Char, Char) () where match (low, high) = nextIf $ \ code -> low <= code && code <= high -- | We convert 'String' to a parser for a sequence of characters (that returns -- nothing). instance Match String () where match = foldr (&) empty -- ** Reply constructors -- | @returnReply state result@ prepares a 'Reply' with the specified /state/ -- and /result/. returnReply :: State -> result -> Reply result returnReply state result = Reply { rResult = Result result, rTokens = D.empty, rCommit = Nothing, rState = state } -- | @tokenReply state token@ returns a 'Reply' containing the /state/ and -- /token/. Any collected characters are cleared (either there are none, or we -- put them in this token, or we don't want them). tokenReply :: State -> Token -> Reply () tokenReply state token = Reply { rResult = Result (), rTokens = D.singleton token, rCommit = Nothing, rState = state { sCharsByteOffset = -1, sCharsCharOffset = -1, sCharsLine = -1, sCharsLineChar = -1, sChars = [] } } -- | @failReply state message@ prepares a 'Reply' with the specified /state/ -- and error /message/. failReply :: State -> String -> Reply result failReply state message = Reply { rResult = Failed message, rTokens = D.empty, rCommit = Nothing, rState = state } -- | @unexpectedReply state@ returns a @failReply@ for an unexpected character. unexpectedReply :: State -> Reply result unexpectedReply state = case state^.sInput of ((_, char):_) -> failReply state $ "Unexpected '" ++ [char] ++ "'" [] -> failReply state "Unexpected end of input" instance Functor Parser where fmap g f = Parser $ \state -> let reply = applyParser f state in case reply^.rResult of Failed message -> reply { rResult = Failed message } Result x -> reply { rResult = Result (g x) } More parser -> reply { rResult = More $ fmap g parser } instance Applicative Parser where pure result = Parser $ \state -> returnReply state result (<*>) = ap left *> right = Parser $ \state -> let reply = applyParser left state in case reply^.rResult of Failed message -> reply { rResult = Failed message } Result _ -> reply { rResult = More right } More parser -> reply { rResult = More $ parser *> right } -- | Allow using the @do@ notation for our parsers, which makes for short and -- sweet @do@ syntax when we want to examine the results (we typically don't). instance Monad Parser where -- @return result@ does just that - return a /result/. return = pure -- @left >>= right@ applies the /left/ parser, and if it didn't fail -- applies the /right/ one (well, the one /right/ returns). left >>= right = Parser $ \state -> let reply = applyParser left state in case reply^.rResult of Failed message -> reply { rResult = Failed message } Result value -> reply { rResult = More $ right value } More parser -> reply { rResult = More $ parser >>= right } (>>) = (*>) -- | @fail message@ does just that - fails with a /message/. pfail :: String -> Parser a pfail message = Parser $ \state -> failReply state message -- ** Parsing operators -- -- Here we reap the benefits of renaming the numerical operators. The Operator -- precedence, in decreasing strength: -- -- @repeated % n@, @repeated <% n@, @match - rejected@, @match ! decision@, -- @match ?! decision@, @choice ^ (first \/ second)@. -- -- @match - first - second@ is @(match - first) - second@. -- -- @first & second & third@ is @first & (second & third)@. Note that @first - -- rejected & second@ is @(first - rejected) & second@, etc. -- -- @match \/ alternative \/ otherwise@ is @match \/ (alternative \/ -- otherwise)@. Note that @first & second \/ third@ is @(first & second) \/ -- third@. -- -- @( match *)@, @(match +)@, @(match ?)@, @(match ?)@, @(match -- >!)@, @(match ? infix 0 >! -- | @parser % n@ repeats /parser/ exactly /n/ times. (%) :: (Match match result) => match -> Int -> Pattern parser % n | n <= 0 = empty | n > 0 = parser' *> (parser' % n .- 1) where parser' = match parser -- | @parser <% n@ matches fewer than /n/ occurrences of /parser/. (<%) :: (Match match result) => match -> Int -> Pattern parser <% n | n < 1 = pfail "Fewer than 0 repetitions" | n == 1 = reject parser Nothing | n > 1 = DeLess ^ ( ((parser ! DeLess) *> (parser <% n .- 1)) <|> empty ) data Decision = DeNone -- "" | DeStar -- "*" | DeLess -- "<%" | DeDirective | DeDoc | DeEscape | DeEscaped | DeFold | DeKey | DeHeader | DeMore | DeNode | DePair deriving (Show,Eq) -- | @decision ^ (option \/ option \/ ...)@ provides a /decision/ name to the -- choice about to be made, to allow to @commit@ to it. (^) :: (Match match result) => Decision -> match -> Parser result decision ^ parser = choice decision $ match parser -- | @parser ! decision@ commits to /decision/ (in an option) after -- successfully matching the /parser/. (!) :: (Match match result) => match -> Decision -> Pattern parser ! decision = match parser *> commit decision -- | @parser ?! decision@ commits to /decision/ (in an option) if the current -- position matches /parser/, without consuming any characters. (?!) :: (Match match result) => match -> Decision -> Pattern parser ?! decision = peek parser *> commit decision -- | @lookbehind match -> Parser result (?@ matches the current point without consuming any characters -- if it matches the lookahead parser (positive lookahead) (>?) :: (Match match result) => match -> Parser result (>?) lookahead = peek lookahead -- | @lookahead >?@ matches the current point without consuming any characters -- if it matches the lookahead parser (negative lookahead) (>!) :: (Match match result) => match -> Pattern (>!) lookahead = reject lookahead Nothing -- | @parser - rejected@ matches /parser/, except if /rejected/ matches at this -- point. (-) :: (Match match1 result1, Match match2 result2) => match1 -> match2 -> Parser result1 parser - rejected = reject rejected Nothing *> match parser -- | @before & after@ parses /before/ and, if it succeeds, parses /after/. This -- basically invokes the monad's @>>=@ (bind) method. (&) :: (Match match1 result1, Match match2 result2) => match1 -> match2 -> Parser result2 before & after = match before *> match after -- | @first \/ second@ tries to parse /first/, and failing that parses -- /second/, unless /first/ has committed in which case is fails immediately. (/) :: (Match match1 result, Match match2 result) => match1 -> match2 -> Parser result first / second = Parser $ applyParser (match first <|> match second) -- | @(optional ?)@ tries to match /parser/, otherwise does nothing. (?) :: (Match match result) => match -> Pattern (?) optional = (match optional *> empty) <|> empty -- | @(parser *)@ matches zero or more occurrences of /repeat/, as long as each -- one actually consumes input characters. (*) :: (Match match result) => match -> Pattern (*) parser = DeStar ^ zomParser where zomParser = ((parser ! DeStar) *> match zomParser) <|> empty -- | @(parser +)@ matches one or more occurrences of /parser/, as long as each -- one actually consumed input characters. (+) :: (Match match result) => match -> Pattern (+) parser = match parser *> (parser *) -- ** Basic parsers -- | @first <|> second@ tries to parse /first/, and failing that parses -- /second/, unless /first/ has committed in which case is fails immediately. instance Alternative Parser where empty = pfail "empty" left <|> right = Parser $ \state -> decideParser state D.empty left right state where decideParser point tokens left right state = let reply = applyParser left state tokens' = D.append tokens $ reply^.rTokens in case (reply^.rResult, reply^.rCommit) of (Failed _, _) -> Reply { rState = point, rTokens = D.empty, rResult = More right, rCommit = Nothing } (Result _, _) -> reply { rTokens = tokens' } (More _, Just _) -> reply { rTokens = tokens' } (More left', Nothing) -> decideParser point tokens' left' right (reply^.rState) -- | @choice decision parser@ provides a /decision/ name to the choice about to -- be made in /parser/, to allow to @commit@ to it. choice :: Decision -> Parser result -> Parser result choice decision parser = Parser $ \ state -> applyParser (choiceParser (state^.sDecision) decision parser) state { sDecision = decision } where choiceParser parentDecision makingDecision parser = Parser $ \ state -> let reply = applyParser parser state commit' = case reply^.rCommit of Nothing -> Nothing Just decision | decision == makingDecision -> Nothing | otherwise -> reply^.rCommit reply' = case reply^.rResult of More parser' -> reply { rCommit = commit', rResult = More $ choiceParser parentDecision makingDecision parser' } _ -> reply { rCommit = commit', rState = (reply^.rState) { sDecision = parentDecision } } in reply' -- | @parser ``recovery`` pattern@ parses the specified /parser/; if it fails, -- it continues to the /recovery/ parser to recover. recovery :: (Match match1 result) => match1 -> Parser result -> Parser result recovery pattern recover = Parser $ \ state -> let reply = applyParser (match pattern) state in if state^.sIsPeek then reply else case reply^.rResult of Result _ -> reply More more -> reply { rResult = More $ more `recovery` recover } Failed message -> reply { rResult = More $ fake Error message *> unparsed *> recover } where unparsed = Parser $ \ state -> applyParser (match finishToken) $ state { sCode = Unparsed } -- | @prev parser@ succeeds if /parser/ matches at the previous character. It -- does not consume any input. prev :: (Match match result) => match -> Parser result prev parser = Parser $ \ state -> prevParser state (match parser) state { sIsPeek = True, sInput = (-1, state^.sLast) : state^.sInput } where prevParser point parser state = let reply = applyParser parser state in case reply^.rResult of Failed message -> failReply point message Result value -> returnReply point value More parser' -> prevParser point parser' $ reply^.rState -- | @peek parser@ succeeds if /parser/ matches at this point, but does not -- consume any input. peek :: (Match match result) => match -> Parser result peek parser = Parser $ \ state -> peekParser state (match parser) state { sIsPeek = True } where peekParser point parser state = let reply = applyParser parser state in case reply^.rResult of Failed message -> failReply point message Result value -> returnReply point value More parser' -> peekParser point parser' $ reply^.rState -- | @reject parser name@ fails if /parser/ matches at this point, and does -- nothing otherwise. If /name/ is provided, it is used in the error message, -- otherwise the messages uses the current character. reject :: (Match match result) => match -> Maybe String -> Pattern reject parser name = Parser $ \ state -> rejectParser state name (match parser) state { sIsPeek = True } where rejectParser point name parser state = let reply = applyParser parser state in case reply^.rResult of Failed _message -> returnReply point () Result _value -> case name of Nothing -> unexpectedReply point Just text -> failReply point $ "Unexpected " ++ text More parser' -> rejectParser point name parser' $ reply^.rState -- | @upto parser@ consumes all the character up to and not including the next -- point where the specified parser is a match. upto :: Pattern -> Pattern upto parser = ( ( parser >!) *> nextIf (const True) *) -- | @nonEmpty parser@ succeeds if /parser/ matches some non-empty input -- characters at this point. nonEmpty :: (Match match result) => match -> Parser result nonEmpty parser = Parser $ \ state -> applyParser (nonEmptyParser (state^.sCharOffset) (match parser)) state where nonEmptyParser offset parser = Parser $ \ state -> let reply = applyParser parser state state' = reply^.rState in case reply^.rResult of Failed _message -> reply Result _value -> if state'^.sCharOffset > offset then reply else failReply state' "Matched empty pattern" More parser' -> reply { rResult = More $ nonEmptyParser offset parser' } -- | @empty@ always matches without consuming any input. empty :: Pattern empty = return () -- | @eof@ matches the end of the input. eof :: Pattern eof = Parser $ \ state -> if state^.sInput == [] then returnReply state () else unexpectedReply state -- | @sol@ matches the start of a line. sol :: Pattern sol = Parser $ \ state -> if state^.sIsSol then returnReply state () else failReply state "Expected start of line" -- ** State manipulation pseudo-parsers -- | @commit decision@ commits the parser to all the decisions up to the most -- recent parent /decision/. This makes all tokens generated in this parsing -- path immediately available to the caller. commit :: Decision -> Pattern commit decision = Parser $ \ state -> Reply { rState = state, rTokens = D.empty, rResult = Result (), rCommit = Just decision } -- | @nextLine@ increments @sLine@ counter and resets @sLineChar@. nextLine :: Pattern nextLine = Parser $ \ state -> returnReply state { sIsSol = True, sLine = state^.sLine .+ 1, sLineChar = 0 } () -- | @with setField getField value parser@ invokes the specified /parser/ with -- the value of the specified field set to /value/ for the duration of the -- invocation, using the /setField/ and /getField/ functions to manipulate it. with :: (value -> State -> State) -> (State -> value) -> value -> Parser result -> Parser result with setField getField value parser = Parser $ \ state -> let value' = getField state Parser parser' = value' `seq` withParser value' parser in parser' $ setField value state where withParser parentValue parser = Parser $ \ state -> let reply = applyParser parser state in case reply^.rResult of Failed _ -> reply { rState = setField parentValue $ reply^.rState } Result _ -> reply { rState = setField parentValue $ reply^.rState } More parser' -> reply { rResult = More $ withParser parentValue parser' } -- | @parser ``forbidding`` pattern@ parses the specified /parser/ ensuring -- that it does not contain anything matching the /forbidden/ parser. forbidding :: (Match match1 result1) => match1 -> Parser result1 -> Parser result1 forbidding parser forbidden = with setForbidden sForbidden (Just $ forbidden *> empty) (match parser) -- | @parser ``limitedTo`` limit@ parses the specified /parser/ -- ensuring that it does not consume more than the /limit/ input chars. limitedTo :: (Match match result) => match -> Int -> Parser result limitedTo parser limit = with setLimit sLimit limit (match parser) -- ** Consuming input characters -- | @nextIf test@ fails if the current position matches the 'State' forbidden -- pattern or if the 'State' lookahead limit is reached. Otherwise it consumes -- (and buffers) the next input char if it satisfies /test/. nextIf :: (Char -> Bool) -> Pattern nextIf test = Parser $ \ state -> case state^.sForbidden of Nothing -> limitedNextIf state Just parser -> let reply = applyParser (reject parser $ Just "forbidden pattern") state { sForbidden = Nothing } in case reply^.rResult of Failed _ -> reply Result _ -> limitedNextIf state where limitedNextIf state = case state^.sLimit of -1 -> consumeNextIf state 0 -> failReply state "Lookahead limit reached" _limit -> consumeNextIf state { sLimit = state^.sLimit .- 1 } consumeNextIf state = case state^.sInput of ((offset, char):rest) | test char -> let chars = if state^.sIsPeek then [] else char:(state^.sChars) byte_offset = charsOf sByteOffset sCharsByteOffset char_offset = charsOf sCharOffset sCharsCharOffset line = charsOf sLine sCharsLine line_char = charsOf sLineChar sCharsLineChar is_sol = char == '\xFEFF' && state^.sIsSol state' = state { sInput = rest, sLast = char, sChars = chars, sCharsByteOffset = byte_offset, sCharsCharOffset = char_offset, sCharsLine = line, sCharsLineChar = line_char, sIsSol = is_sol, sByteOffset = offset, sCharOffset = state^.sCharOffset .+ 1, sLineChar = state^.sLineChar .+ 1 } in returnReply state' () | otherwise -> unexpectedReply state [] -> unexpectedReply state where charsOf field charsField | state^.sIsPeek = -1 | state^.sChars == [] = state^.field | otherwise = state^.charsField -- ** Producing tokens -- | @finishToken@ places all collected text into a new token and begins a new -- one, or does nothing if there are no collected characters. finishToken :: Pattern finishToken = Parser $ \ state -> let state' = state { sChars = [], sCharsByteOffset = -1, sCharsCharOffset = -1, sCharsLine = -1, sCharsLineChar = -1 } in if state^.sIsPeek then returnReply state' () else case state^.sChars of [] -> returnReply state' () chars@(_:_) -> tokenReply state' Token { tByteOffset = state^.sCharsByteOffset, tCharOffset = state^.sCharsCharOffset, tLine = state^.sCharsLine, tLineChar = state^.sCharsLineChar, tCode = state^.sCode, tText = reverse chars } -- | @wrap parser@ invokes the /parser/, ensures any unclaimed input characters -- are wrapped into a token (only happens when testing productions), ensures no -- input is left unparsed, and returns the parser's result. wrap :: (Match match result) => match -> Parser result wrap parser = do result <- match parser finishToken eof return result -- | @token code parser@ places all text matched by /parser/ into a 'Token' with -- the specified /code/ (unless it is empty). Note it collects the text even if -- there is an error. token :: (Match match result) => Code -> match -> Pattern token code parser = finishToken & with setCode sCode code (parser & finishToken) -- | @fake code text@ creates a token with the specified /code/ and \"fake\" -- /text/ characters, instead of whatever characters are collected so far. fake :: Code -> String -> Pattern fake code text = Parser $ \ state -> if state^.sIsPeek then returnReply state () else tokenReply state Token { tByteOffset = value state sByteOffset sCharsByteOffset, tCharOffset = value state sCharOffset sCharsCharOffset, tLine = value state sLine sCharsLine, tLineChar = value state sLineChar sCharsLineChar, tCode = code, tText = text } where value state field1 field2 = if field2 state == -1 then field1 state else field2 state -- | @meta parser@ collects the text matched by the specified /parser/ into a -- | @Meta@ token. meta :: (Match match result) => match -> Pattern meta parser = token Meta parser -- | @indicator code@ collects the text matched by the specified /parser/ into an -- @Indicator@ token. indicator :: (Match match result) => match -> Pattern indicator parser = token Indicator $ parser -- | @text parser@ collects the text matched by the specified /parser/ into a -- @Text@ token. text :: (Match match result) => match -> Pattern text parser = token Text parser -- | @emptyToken code@ returns an empty token. emptyToken :: Code -> Pattern emptyToken code = finishToken & parser code where parser code = Parser $ \ state -> if state^.sIsPeek then returnReply state () else tokenReply state Token { tByteOffset = state^.sByteOffset, tCharOffset = state^.sCharOffset, tLine = state^.sLine, tLineChar = state^.sLineChar, tCode = code, tText = "" } -- | @wrapTokens beginCode endCode parser@ wraps the specified /parser/ with -- matching /beginCode/ and /endCode/ tokens. wrapTokens :: Code -> Code -> Pattern -> Pattern wrapTokens beginCode endCode pattern = emptyToken beginCode & prefixErrorWith pattern (emptyToken endCode) & emptyToken endCode -- | @prefixErrorWith pattern prefix@ will invoke the @prefix@ parser if an -- error is detected during the @pattern@ parser, and then return the error. prefixErrorWith :: (Match match result) => match -> Pattern -> Parser result prefixErrorWith pattern prefix = Parser $ \ state -> let reply = applyParser (match pattern) state in case reply^.rResult of Result _ -> reply More more -> reply { rResult = More $ prefixErrorWith more prefix } Failed message -> reply { rResult = More $ prefix & (pfail message :: Parser result) } -- * Production parameters -- | Production context. data Context = BlockOut -- ^ Outside block sequence. | BlockIn -- ^ Inside block sequence. | FlowOut -- ^ Outside flow collection. | FlowIn -- ^ Inside flow collection. | BlockKey -- ^ Implicit block key. | FlowKey -- ^ Implicit flow key. -- | @show context@ converts a 'Context' to a 'String'. instance Show Context where show context = case context of BlockOut -> "block-out" BlockIn -> "block-in" FlowOut -> "flow-out" FlowIn -> "flow-in" BlockKey -> "block-key" FlowKey -> "flow-key" -- | @read context@ converts a 'String' to a 'Context'. We trust our callers to -- convert any @-@ characters into @_@ to allow the built-in @lex@ function to -- handle the names as single identifiers. instance Read Context where readsPrec _ text = [ ((r word), tail) | (word, tail) <- lex text ] where r word = case word of "block_out" -> BlockOut "block_in" -> BlockIn "flow_out" -> FlowOut "flow_in" -> FlowIn "block_key" -> BlockKey "flow_key" -> FlowKey _ -> error $ "unknown context: " ++ word -- | Chomp method. data Chomp = Strip -- ^ Remove all trailing line breaks. | Clip -- ^ Keep first trailing line break. | Keep -- ^ Keep all trailing line breaks. -- | @show chomp@ converts a 'Chomp' to a 'String'. instance Show Chomp where show chomp = case chomp of Strip -> "strip" Clip -> "clip" Keep -> "keep" -- | @read chomp@ converts a 'String' to a 'Chomp'. instance Read Chomp where readsPrec _ text = [ ((r word), tail) | (word, tail) <- lex text ] where r word = case word of "strip" -> Strip "clip" -> Clip "keep" -> Keep _ -> error $ "unknown chomp: " ++ word -- * Tokenizers -- -- We encapsulate the 'Parser' inside a 'Tokenizer'. This allows us to hide the -- implementation details from our callers. -- | 'Tokenizer' converts a input text into a list of 'Token'. Errors -- are reported as tokens with the @Error@ 'Code', and the unparsed text -- following an error may be attached as a final token (if the @Bool@ is -- @True@). Note that tokens are available \"immediately\", allowing for -- streaming of large YAML files with memory requirements depending only on the -- YAML nesting level. type Tokenizer = BLC.ByteString -> Bool -> [Token] -- | @patternTokenizer pattern@ converts the /pattern/ to a simple 'Tokenizer'. patternTokenizer :: Pattern -> Tokenizer patternTokenizer pattern input withFollowing = D.toList $ patternParser (wrap pattern) (initialState input) where patternParser parser state = let reply = applyParser parser state tokens = commitBugs reply state' = reply^.rState in case reply^.rResult of Failed message -> errorTokens tokens state' message withFollowing Result _ -> tokens More parser' -> D.append tokens $ patternParser parser' state' -- | @errorTokens tokens state message withFollowing@ appends an @Error@ token -- with the specified /message/ at the end of /tokens/, and if /withFollowing/ -- also appends the unparsed text following the error as a final @Unparsed@ -- token. errorTokens :: D.DList Token -> State -> String -> Bool -> D.DList Token errorTokens tokens state message withFollowing = let tokens' = D.append tokens $ D.singleton Token { tByteOffset = state^.sByteOffset, tCharOffset = state^.sCharOffset, tLine = state^.sLine, tLineChar = state^.sLineChar, tCode = Error, tText = message } in if withFollowing && state^.sInput /= [] then D.append tokens' $ D.singleton Token { tByteOffset = state^.sByteOffset, tCharOffset = state^.sCharOffset, tLine = state^.sLine, tLineChar = state^.sLineChar, tCode = Unparsed, tText = map snd $ state^.sInput } else tokens' -- | @commitBugs reply@ inserts an error token if a commit was made outside a -- named choice. This should never happen outside tests. commitBugs :: Reply result -> D.DList Token commitBugs reply = let tokens = reply^.rTokens state = reply^.rState in case reply^.rCommit of Nothing -> tokens Just commit -> D.append tokens $ D.singleton Token { tByteOffset = state^.sByteOffset, tCharOffset = state^.sCharOffset, tLine = state^.sLine, tLineChar = state^.sLineChar, tCode = Error, tText = "Commit to " ++ show commit ++ " was made outside it" } -- | @'tokenize' input emit_unparsed@ converts the Unicode /input/ -- (using the UTF-8, UTF-16 (LE or BE), or UTF-32 (LE or BE) encoding) -- to a list of 'Token' according to the YAML 1.2 specification. -- -- Errors are reported as tokens with @'Error' :: 'Code'@, and the -- unparsed text following an error may be attached as a final 'Unparsed' token -- (if the /emit_unparsed/ argument is @True@). Note that tokens are available -- \"immediately\", allowing for streaming of large YAML files with -- memory requirements depending only on the YAML nesting level. tokenize :: BLC.ByteString -> Bool -> [Token] tokenize = patternTokenizer l_yaml_stream -- * Productions -- ** BNF compatibility helpers -- | @detect_utf_encoding@ doesn't actually detect the encoding, we just call it -- this way to make the productions compatible with the spec. Instead it simply -- reports the encoding (which was already detected when we started parsing). bom :: Match match1 result1 => match1 -> Parser () bom code = code & (Parser $ \ state -> let text = case state^.sEncoding of UTF8 -> "TF-8" UTF16LE -> "TF-16LE" UTF16BE -> "TF-16BE" UTF32LE -> "TF-32LE" UTF32BE -> "TF-32BE" in applyParser (fake Bom text) state) -- | @na@ is the \"non-applicable\" indentation value. We use Haskell's laziness -- to verify it really is never used. na :: Int na = error "Accessing non-applicable indentation" -- | @asInteger@ returns the last consumed character, which is assumed to be a -- decimal digit, as an integer. asInteger :: Parser Int asInteger = Parser $ \ state -> returnReply state $ ord (state^.sLast) .- 48 -- | @result value@ is the same as /return value/ except that we give the -- Haskell type deduction the additional boost it needs to figure out this is -- wrapped in a 'Parser'. result :: result -> Parser result result = return ---------------------------------------------------------------------------- -- ** Spec productions -- -- These are copied directly from the spec, with the sprinkling of -- additional token and decision point directives. -- 5.1 Character Set c_printable {- 1 -} = '\x9' / '\xA' / '\xD' / ('\x20', '\x7E') / '\x85' / ('\xA0', '\xD7FF') / ('\xE000', '\xFFFD') / ('\x10000', '\x10FFFF') nb_json {- 2 -} = '\x9' / ('\x20', '\x10FFFF') -- 5.2 Character Encodings c_byte_order_mark {- 3 -} = bom '\xFEFF' -- 5.3 Indicator Characters c_sequence_entry {- 4 -} = indicator '-' c_mapping_key {- 5 -} = indicator '?' c_mapping_value {- 6 -} = indicator ':' c_collect_entry {- 7 -} = indicator ',' c_sequence_start {- 8 -} = indicator '[' c_sequence_end {- 9 -} = indicator ']' c_mapping_start {- 10 -} = indicator '{' c_mapping_end {- 11 -} = indicator '}' c_comment {- 12 -} = indicator '#' c_anchor {- 13 -} = indicator '&' c_alias {- 14 -} = indicator '*' c_tag {- 15 -} = indicator '!' c_literal {- 16 -} = indicator '|' c_folded {- 17 -} = indicator '>' c_single_quote {- 18 -} = indicator '\'' c_double_quote {- 19 -} = indicator '"' c_directive {- 20 -} = indicator '%' c_reserved {- 21 -} = indicator ( '@' / '`' ) c_indicator {- 22 -} = c_sequence_entry / c_mapping_key / c_mapping_value / c_collect_entry / c_sequence_start / c_sequence_end / c_mapping_start / c_mapping_end / c_comment / c_anchor / c_alias / c_tag / c_literal / c_folded / c_single_quote / c_double_quote / c_directive / c_reserved c_flow_indicator {- 23 -} = c_collect_entry / c_sequence_start / c_sequence_end / c_mapping_start / c_mapping_end -- 5.4 Line Break Characters b_line_feed {- 24 -} = '\xA' b_carriage_return {- 25 -} = '\xD' b_char {- 26 -} = b_line_feed / b_carriage_return nb_char {- 27 -} = c_printable - b_char - c_byte_order_mark b_break {- 28 -} = ( b_carriage_return & b_line_feed / b_carriage_return / b_line_feed ) & nextLine b_as_line_feed {- 29 -} = token LineFeed b_break b_non_content {- 30 -} = token Break b_break -- 5.5 White Space Characters s_space {- 31 -} = '\x20' s_tab {- 32 -} = '\x9' s_white {- 33 -} = s_space / s_tab ns_char {- 34 -} = nb_char - s_white -- 5.6 Miscellaneous Characters ns_dec_digit {- 35 -} = ('\x30', '\x39') ns_hex_digit {- 36 -} = ns_dec_digit / ('\x41', '\x46') / ('\x61', '\x66') ns_ascii_letter {- 37 -} = ('\x41', '\x5A') / ('\x61', '\x7A') ns_word_char {- 38 -} = ns_dec_digit / ns_ascii_letter / '-' ns_uri_char {- 39 -} = DeEscape ^ ( '%' ! DeEscape & ns_hex_digit & ns_hex_digit / ns_word_char / '#' / ';' / '/' / '?' / ':' / '@' / '&' / '=' / '+' / '$' / ',' / '_' / '.' / '!' / '~' / '*' / '\'' / '(' / ')' / '[' / ']' ) ns_tag_char {- 40 -} = ns_uri_char - c_tag - c_flow_indicator -- 5.7 Escaped Characters c_escape {- 41 -} = indicator '\\' ns_esc_null {- 42 -} = meta '0' ns_esc_bell {- 43 -} = meta 'a' ns_esc_backspace {- 44 -} = meta 'b' ns_esc_horizontal_tab {- 45 -} = meta ( 't' / '\x9' ) ns_esc_line_feed {- 46 -} = meta 'n' ns_esc_vertical_tab {- 47 -} = meta 'v' ns_esc_form_feed {- 48 -} = meta 'f' ns_esc_carriage_return {- 49 -} = meta 'r' ns_esc_escape {- 50 -} = meta 'e' ns_esc_space {- 51 -} = meta '\x20' ns_esc_double_quote {- 52 -} = meta '"' ns_esc_slash {- 53 -} = meta '/' ns_esc_backslash {- 54 -} = meta '\\' ns_esc_next_line {- 55 -} = meta 'N' ns_esc_non_breaking_space {- 56 -} = meta '_' ns_esc_line_separator {- 57 -} = meta 'L' ns_esc_paragraph_separator {- 58 -} = meta 'P' ns_esc_8_bit {- 59 -} = indicator 'x' ! DeEscaped & meta ( ns_hex_digit % 2 ) ns_esc_16_bit {- 60 -} = indicator 'u' ! DeEscaped & meta ( ns_hex_digit % 4 ) ns_esc_32_bit {- 61 -} = indicator 'U' ! DeEscaped & meta ( ns_hex_digit % 8 ) c_ns_esc_char {- 62 -} = wrapTokens BeginEscape EndEscape $ c_escape ! DeEscape & DeEscaped ^ ( ns_esc_null / ns_esc_bell / ns_esc_backspace / ns_esc_horizontal_tab / ns_esc_line_feed / ns_esc_vertical_tab / ns_esc_form_feed / ns_esc_carriage_return / ns_esc_escape / ns_esc_space / ns_esc_double_quote / ns_esc_slash / ns_esc_backslash / ns_esc_next_line / ns_esc_non_breaking_space / ns_esc_line_separator / ns_esc_paragraph_separator / ns_esc_8_bit / ns_esc_16_bit / ns_esc_32_bit ) -- 6.1 Indentation Spaces s_indent n {- 63 -} = token Indent ( s_space % n ) s_indent_lt n {- 64 -} = token Indent ( s_space <% n ) s_indent_le n {- 65 -} = token Indent ( s_space <% (n .+ 1) ) -- 6.2 Separation Spaces s_separate_in_line {- 66 -} = token White ( s_white +) / sol -- 6.3 Line Prefixes s_line_prefix n c {- 67 -} = case c of BlockOut -> s_block_line_prefix n BlockIn -> s_block_line_prefix n FlowOut -> s_flow_line_prefix n FlowIn -> s_flow_line_prefix n s_block_line_prefix n {- 68 -} = s_indent n s_flow_line_prefix n {- 69 -} = s_indent n & ( s_separate_in_line ?) -- 6.4 Empty Lines l_empty n c {- 70 -} = ( s_line_prefix n c / s_indent_lt n ) & b_as_line_feed -- 6.5 Line Folding b_l_trimmed n c {- 71 -} = b_non_content & ( l_empty n c +) b_as_space {- 72 -} = token LineFold b_break b_l_folded n c {- 73 -} = b_l_trimmed n c / b_as_space s_flow_folded n {- 74 -} = ( s_separate_in_line ?) & b_l_folded n FlowIn & s_flow_line_prefix n -- 6.6 Comments c_nb_comment_text {- 75 -} = wrapTokens BeginComment EndComment $ c_comment & meta ( nb_char *) b_comment {- 76 -} = b_non_content / eof s_b_comment {- 77 -} = ( s_separate_in_line & ( c_nb_comment_text ?) ?) & b_comment l_comment {- 78 -} = s_separate_in_line & ( c_nb_comment_text ?) & b_comment s_l_comments {- 79 -} = ( s_b_comment / sol ) & ( nonEmpty l_comment *) -- 6.7 Separation Lines s_separate n c {- 80 -} = case c of BlockOut -> s_separate_lines n BlockIn -> s_separate_lines n FlowOut -> s_separate_lines n FlowIn -> s_separate_lines n BlockKey -> s_separate_in_line FlowKey -> s_separate_in_line s_separate_lines n {- 81 -} = s_l_comments & s_flow_line_prefix n / s_separate_in_line -- 6.8 Directives l_directive {- 82 -} = ( wrapTokens BeginDirective EndDirective $ c_directive ! DeDoc & DeDirective ^ ( ns_yaml_directive / ns_tag_directive / ns_reserved_directive ) ) & s_l_comments ns_reserved_directive {- 83 -} = ns_directive_name & ( s_separate_in_line & ns_directive_parameter *) ns_directive_name {- 84 -} = meta ( ns_char +) ns_directive_parameter {- 85 -} = meta ( ns_char +) -- 6.8.1 Yaml Directives ns_yaml_directive {- 86 -} = meta [ 'Y', 'A', 'M', 'L' ] ! DeDirective & s_separate_in_line & ns_yaml_version ns_yaml_version {- 87 -} = meta ( ( ns_dec_digit +) & '.' & ( ns_dec_digit +) ) -- 6.8.2 Tag Directives ns_tag_directive {- 88 -} = meta [ 'T', 'A', 'G' ] ! DeDirective & s_separate_in_line & c_tag_handle & s_separate_in_line & ns_tag_prefix -- 6.8.2.1 Tag Handles c_tag_handle {- 89 -} = c_named_tag_handle / c_secondary_tag_handle / c_primary_tag_handle c_primary_tag_handle {- 90 -} = wrapTokens BeginHandle EndHandle $ c_tag c_secondary_tag_handle {- 91 -} = wrapTokens BeginHandle EndHandle $ c_tag & c_tag c_named_tag_handle {- 92 -} = wrapTokens BeginHandle EndHandle $ c_tag & meta ( ns_word_char +) & c_tag -- 6.8.2.2 Tag Prefixes ns_tag_prefix {- 93 -} = wrapTokens BeginTag EndTag $ ( c_ns_local_tag_prefix / ns_global_tag_prefix ) c_ns_local_tag_prefix {- 94 -} = c_tag & meta ( ns_uri_char *) ns_global_tag_prefix {- 95 -} = meta ( ns_tag_char & ( ns_uri_char *) ) -- 6.9 Node Properties c_ns_properties n c {- 96 -} = wrapTokens BeginProperties EndProperties $ ( c_ns_tag_property & ( s_separate n c & c_ns_anchor_property ?) ) / ( c_ns_anchor_property & ( s_separate n c & c_ns_tag_property ?) ) -- 6.9.1 Node Tags c_ns_tag_property {- 97 -} = wrapTokens BeginTag EndTag $ c_verbatim_tag / c_ns_shorthand_tag / c_non_specific_tag c_verbatim_tag {- 98 -} = c_tag & indicator '<' & meta ( ns_uri_char +) & indicator '>' c_ns_shorthand_tag {- 99 -} = c_tag_handle & meta ( ns_tag_char +) c_non_specific_tag {- 100 -} = c_tag -- 6.9.2 Node Anchors c_ns_anchor_property {- 101 -} = wrapTokens BeginAnchor EndAnchor $ c_anchor & ns_anchor_name ns_anchor_char {- 102 -} = ns_char - c_flow_indicator ns_anchor_name {- 103 -} = meta ( ns_anchor_char +) -- 7.1 Alias Nodes c_ns_alias_node {- 104 -} = wrapTokens BeginAlias EndAlias $ c_alias ! DeNode & ns_anchor_name -- 7.2 Empty Nodes e_scalar {- 105 -} = wrapTokens BeginScalar EndScalar empty e_node {- 106 -} = wrapTokens BeginNode EndNode e_scalar -- 7.3.1 Double Quoted Style nb_double_char {- 107 -} = DeEscape ^ ( c_ns_esc_char / ( nb_json - c_escape - c_double_quote ) ) ns_double_char {- 108 -} = nb_double_char - s_white c_double_quoted n c {- 109 -} = wrapTokens BeginScalar EndScalar $ c_double_quote ! DeNode & text ( nb_double_text n c ) & c_double_quote nb_double_text n c {- 110 -} = case c of FlowOut -> nb_double_multi_line n FlowIn -> nb_double_multi_line n BlockKey -> nb_double_one_line FlowKey -> nb_double_one_line nb_double_one_line {- 111 -} = ( nb_double_char *) s_double_escaped n {- 112 -} = ( s_white *) & wrapTokens BeginEscape EndEscape ( c_escape ! DeEscape & b_non_content ) & ( l_empty n FlowIn *) & s_flow_line_prefix n s_double_break n {- 113 -} = DeEscape ^ ( s_double_escaped n / s_flow_folded n ) nb_ns_double_in_line {- 114 -} = ( ( s_white *) & ns_double_char *) s_double_next_line n {- 115 -} = s_double_break n & ( ns_double_char & nb_ns_double_in_line & ( s_double_next_line n / ( s_white *) ) ?) nb_double_multi_line n {- 116 -} = nb_ns_double_in_line & ( s_double_next_line n / ( s_white *) ) -- 7.3.2 Single Quoted Style c_quoted_quote {- 117 -} = wrapTokens BeginEscape EndEscape $ c_single_quote ! DeEscape & meta '\'' nb_single_char {- 118 -} = DeEscape ^ ( c_quoted_quote / ( nb_json - c_single_quote ) ) ns_single_char {- 119 -} = nb_single_char - s_white c_single_quoted n c {- 120 -} = wrapTokens BeginScalar EndScalar $ c_single_quote ! DeNode & text ( nb_single_text n c ) & c_single_quote nb_single_text n c {- 121 -} = case c of FlowOut -> nb_single_multi_line n FlowIn -> nb_single_multi_line n BlockKey -> nb_single_one_line FlowKey -> nb_single_one_line nb_single_one_line {- 122 -} = ( nb_single_char *) nb_ns_single_in_line {- 123 -} = ( ( s_white *) & ns_single_char *) s_single_next_line n {- 124 -} = s_flow_folded n & ( ns_single_char & nb_ns_single_in_line & ( s_single_next_line n / ( s_white *) ) ?) nb_single_multi_line n {- 125 -} = nb_ns_single_in_line & ( s_single_next_line n / ( s_white *) ) -- 7.3.3 Plain Style ns_plain_first _c {- 126 -} = ns_char - c_indicator / ( ':' / '?' / '-' ) & ( (ns_plain_safe _c) >?) ns_plain_safe c {- 127 -} = case c of FlowOut -> ns_plain_safe_out FlowIn -> ns_plain_safe_in BlockKey -> ns_plain_safe_out FlowKey -> ns_plain_safe_in ns_plain_safe_out {- 128 -} = ns_char ns_plain_safe_in {- 129 -} = ns_char - c_flow_indicator ns_plain_char c {- 130 -} = ns_plain_safe c - ':' - '#' / ( ns_char ?) ns_plain n c {- 131 -} = wrapTokens BeginScalar EndScalar $ text (case c of FlowOut -> ns_plain_multi_line n c FlowIn -> ns_plain_multi_line n c BlockKey -> ns_plain_one_line c FlowKey -> ns_plain_one_line c) nb_ns_plain_in_line c {- 132 -} = ( ( s_white *) & ns_plain_char c *) ns_plain_one_line c {- 133 -} = ns_plain_first c ! DeNode & nb_ns_plain_in_line c s_ns_plain_next_line n c {- 134 -} = s_flow_folded n & ns_plain_char c & nb_ns_plain_in_line c ns_plain_multi_line n c {- 135 -} = ns_plain_one_line c & ( s_ns_plain_next_line n c *) -- 7.4 Flow Collection Styles in_flow c {- 136 -} = case c of FlowOut -> FlowIn FlowIn -> FlowIn BlockKey -> FlowKey FlowKey -> FlowKey -- 7.4.1 Flow Sequences c_flow_sequence n c {- 137 -} = wrapTokens BeginSequence EndSequence $ c_sequence_start ! DeNode & ( s_separate n c ?) & ( ns_s_flow_seq_entries n (in_flow c) ?) & c_sequence_end ns_s_flow_seq_entries n c {- 138 -} = ns_flow_seq_entry n c & ( s_separate n c ?) & ( c_collect_entry & ( s_separate n c ?) & ( ns_s_flow_seq_entries n c ?) ?) ns_flow_seq_entry n c {- 139 -} = DePair ^ ( ns_flow_pair n c / DeNode ^ ns_flow_node n c ) -- 7.4.2 Flow Mappings c_flow_mapping n c {- 140 -} = wrapTokens BeginMapping EndMapping $ c_mapping_start ! DeNode & ( s_separate n c ?) & ( ns_s_flow_map_entries n (in_flow c) ?) & c_mapping_end ns_s_flow_map_entries n c {- 141 -} = ns_flow_map_entry n c & ( s_separate n c ?) & ( c_collect_entry & ( s_separate n c ?) & ( ns_s_flow_map_entries n c ?) ?) ns_flow_map_entry n c {- 142 -} = wrapTokens BeginPair EndPair $ DeKey ^ ( ( c_mapping_key ! DeKey & s_separate n c & ns_flow_map_explicit_entry n c ) / ns_flow_map_implicit_entry n c ) ns_flow_map_explicit_entry n c {- 143 -} = ns_flow_map_implicit_entry n c / ( e_node & e_node ) ns_flow_map_implicit_entry n c {- 144 -} = DePair ^ ( ns_flow_map_yaml_key_entry n c / c_ns_flow_map_empty_key_entry n c / c_ns_flow_map_json_key_entry n c ) ns_flow_map_yaml_key_entry n c {- 145 -} = ( DeNode ^ ns_flow_yaml_node n c ) ! DePair & ( ( ( s_separate n c ?) & c_ns_flow_map_separate_value n c ) / e_node ) c_ns_flow_map_empty_key_entry n c {- 146 -} = e_node & c_ns_flow_map_separate_value n c c_ns_flow_map_separate_value n c {- 147 -} = c_mapping_value & ( (ns_plain_safe c) >!) ! DePair & ( ( s_separate n c & ns_flow_node n c ) / e_node ) c_ns_flow_map_json_key_entry n c {- 148 -} = ( DeNode ^ c_flow_json_node n c ) ! DePair & ( ( ( s_separate n c ?) & c_ns_flow_map_adjacent_value n c ) / e_node ) c_ns_flow_map_adjacent_value n c {- 149 -} = c_mapping_value ! DePair & ( ( ( s_separate n c ?) & ns_flow_node n c ) / e_node ) ns_flow_pair n c {- 150 -} = wrapTokens BeginMapping EndMapping $ wrapTokens BeginPair EndPair $ ( ( c_mapping_key ! DePair & s_separate n c & ns_flow_map_explicit_entry n c ) / ns_flow_pair_entry n c ) ns_flow_pair_entry n c {- 151 -} = ( ns_flow_pair_yaml_key_entry n c / c_ns_flow_map_empty_key_entry n c / c_ns_flow_pair_json_key_entry n c ) ns_flow_pair_yaml_key_entry n c {- 152 -} = ns_s_implicit_yaml_key FlowKey & c_ns_flow_map_separate_value n c c_ns_flow_pair_json_key_entry n c {- 153 -} = c_s_implicit_json_key FlowKey & c_ns_flow_map_adjacent_value n c ns_s_implicit_yaml_key c {- 154 -} = ( DeNode ^ ( ns_flow_yaml_node na c ) & ( s_separate_in_line ?) ) `limitedTo` 1024 c_s_implicit_json_key c {- 155 -} = ( DeNode ^ ( c_flow_json_node na c ) & ( s_separate_in_line ?) ) `limitedTo` 1024 -- 7.5 Flow Nodes ns_flow_yaml_content n c {- 156 -} = ns_plain n c c_flow_json_content n c {- 157 -} = c_flow_sequence n c / c_flow_mapping n c / c_single_quoted n c / c_double_quoted n c ns_flow_content n c {- 158 -} = ns_flow_yaml_content n c / c_flow_json_content n c ns_flow_yaml_node n c {- 159 -} = wrapTokens BeginNode EndNode $ c_ns_alias_node / ns_flow_yaml_content n c / ( c_ns_properties n c & ( ( s_separate n c & ns_flow_yaml_content n c ) / e_scalar ) ) c_flow_json_node n c {- 160 -} = wrapTokens BeginNode EndNode $ ( c_ns_properties n c & s_separate n c ?) & c_flow_json_content n c ns_flow_node n c {- 161 -} = wrapTokens BeginNode EndNode $ c_ns_alias_node / ns_flow_content n c / ( c_ns_properties n c & ( ( s_separate n c & ns_flow_content n c ) / e_scalar ) ) -- 8.1.1 Block Scalar Headers c_b_block_header n {- 162 -} = DeHeader ^ ( do m <- c_indentation_indicator n t <- c_chomping_indicator ( s_white / b_char ) ?! DeHeader s_b_comment result (m, t) / do t <- c_chomping_indicator m <- c_indentation_indicator n s_b_comment result (m, t) ) -- 8.1.1.1 Block Indentation Indicator c_indentation_indicator n {- 163 -} = fmap fixup (indicator ( ns_dec_digit - '0' ) & asInteger) / detect_scalar_indentation n where fixup | n == -1 = (.+ 1) -- compensate for anomaly at left-most n | otherwise = id detect_scalar_indentation n = peek $ ( nb_char *) -- originally: -- & ( b_non_content & ( l_empty n BlockIn *) ?) & ( b_break & ( (s_space *) & b_break *) ?) & count_spaces (-n) count_spaces n = (s_space & count_spaces (n .+ 1)) / result (max 1 n) -- 8.1.1.2 Chomping Indicator c_chomping_indicator {- 164 -} = indicator '-' & result Strip / indicator '+' & result Keep / result Clip end_block_scalar t = case t of Strip -> emptyToken EndScalar Clip -> emptyToken EndScalar Keep -> empty b_chomped_last t {- 165 -} = case t of Strip -> emptyToken EndScalar & b_non_content Clip -> b_as_line_feed & emptyToken EndScalar Keep -> b_as_line_feed l_chomped_empty n t {- 166 -} = case t of Strip -> l_strip_empty n Clip -> l_strip_empty n Keep -> l_keep_empty n l_strip_empty n {- 167 -} = ( s_indent_le n & b_non_content *) & ( l_trail_comments n ?) l_keep_empty n {- 168 -} = ( l_empty n BlockIn *) & emptyToken EndScalar & ( l_trail_comments n ?) l_trail_comments n {- 169 -} = s_indent_lt n & c_nb_comment_text & b_comment & ( nonEmpty l_comment *) -- 8.1.2 Literal Style c_l__literal n {- 170 -} = do emptyToken BeginScalar c_literal ! DeNode (m, t) <- c_b_block_header n `prefixErrorWith` emptyToken EndScalar text ( l_literal_content (n .+ m) t ) l_nb_literal_text n {- 171 -} = ( l_empty n BlockIn *) & s_indent n & ( nb_char +) b_nb_literal_next n {- 172 -} = b_as_line_feed & l_nb_literal_text n l_literal_content n t {- 173 -} = ( ( l_nb_literal_text n & ( b_nb_literal_next n *) & b_chomped_last t ) / end_block_scalar t ) & l_chomped_empty n t -- 8.1.3 Folded Style c_l__folded n {- 174 -} = do emptyToken BeginScalar c_folded ! DeNode (m, t) <- c_b_block_header n `prefixErrorWith` emptyToken EndScalar text ( l_folded_content (n .+ m) t ) s_nb_folded_text n {- 175 -} = s_indent n & ns_char ! DeFold & ( nb_char *) l_nb_folded_lines n {- 176 -} = s_nb_folded_text n & ( b_l_folded n BlockIn & s_nb_folded_text n *) s_nb_spaced_text n {- 177 -} = s_indent n & s_white ! DeFold & ( nb_char *) b_l_spaced n {- 178 -} = b_as_line_feed & ( l_empty n BlockIn *) l_nb_spaced_lines n {- 179 -} = s_nb_spaced_text n & ( b_l_spaced n & s_nb_spaced_text n *) l_nb_same_lines n {- 180 -} = ( l_empty n BlockIn *) & DeFold ^ ( l_nb_folded_lines n / l_nb_spaced_lines n ) l_nb_diff_lines n {- 181 -} = l_nb_same_lines n & ( b_as_line_feed & l_nb_same_lines n *) l_folded_content n t {- 182 -} = ( ( l_nb_diff_lines n & b_chomped_last t ) / end_block_scalar t ) & l_chomped_empty n t -- 8.2.1 Block Sequences detect_collection_indentation n = peek $ ( nonEmpty l_comment* ) & count_spaces (-n) detect_inline_indentation = peek $ count_spaces 0 l__block_sequence n {- 183 -} = do m <- detect_collection_indentation n wrapTokens BeginSequence EndSequence $ ( s_indent (n .+ m) & c_l_block_seq_entry (n .+ m) +) c_l_block_seq_entry n {- 184 -} = c_sequence_entry & ( ns_char >!) ! DeNode & s_l__block_indented n BlockIn s_l__block_indented n c {- 185 -} = do m <- detect_inline_indentation DeNode ^ ( ( s_indent m & ( ns_l_in_line_sequence (n .+ 1 .+ m) / ns_l_in_line_mapping (n .+ 1 .+ m) ) ) / s_l__block_node n c / ( e_node & ( s_l_comments ?) & unparsed (n .+ 1) ) ) `recovery` unparsed (n .+ 1) ns_l_in_line_sequence n {- 186 -} = wrapTokens BeginNode EndNode $ wrapTokens BeginSequence EndSequence $ c_l_block_seq_entry n & ( s_indent n & c_l_block_seq_entry n *) -- 8.2.2 Block Mappings l__block_mapping n = {- 187 -} do m <- detect_collection_indentation n wrapTokens BeginMapping EndMapping $ ( s_indent (n .+ m) & ns_l_block_map_entry (n .+ m) +) ns_l_block_map_entry n {- 188 -} = wrapTokens BeginPair EndPair $ c_l_block_map_explicit_entry n / ns_l_block_map_implicit_entry n c_l_block_map_explicit_entry n {- 189 -} = c_l_block_map_explicit_key n & ( l_block_map_explicit_value n / e_node ) c_l_block_map_explicit_key n {- 190 -} = c_mapping_key & ( ns_char >!) ! DeNode & s_l__block_indented n BlockOut l_block_map_explicit_value n {- 191 -} = s_indent n & c_mapping_value & s_l__block_indented n BlockOut ns_l_block_map_implicit_entry n {- 192 -} = ( ns_s_block_map_implicit_key / e_node ) & c_l_block_map_implicit_value n ns_s_block_map_implicit_key {- 193 -} = c_s_implicit_json_key BlockKey / ns_s_implicit_yaml_key BlockKey c_l_block_map_implicit_value n {- 194 -} = c_mapping_value ! DeNode & ( ( s_l__block_node n BlockOut / ( e_node & ( s_l_comments ?) & unparsed (n .+ 1) ) ) `recovery` unparsed (n .+ 1) ) ns_l_in_line_mapping n {- 195 -} = wrapTokens BeginNode EndNode $ wrapTokens BeginMapping EndMapping $ ns_l_block_map_entry n & ( s_indent n & ns_l_block_map_entry n *) -- 8.2.3 Block Nodes unparsed n = ( sol / unparsed_text & unparsed_break ) & ( nonEmpty ( unparsed_indent n & unparsed_text & unparsed_break ) *) unparsed_indent n = token Unparsed ( s_space % n ) unparsed_text = token Unparsed ( upto ( eof / c_forbidden / b_break ) ) unparsed_break = eof / peek c_forbidden / token Unparsed b_break / empty s_l__block_node n c {- 196 -} = s_l__block_in_block n c / s_l__flow_in_block n s_l__flow_in_block n {- 197 -} = s_separate (n .+ 1) FlowOut & ns_flow_node (n .+ 1) FlowOut & s_l_comments s_l__block_in_block n c {- 198 -} = wrapTokens BeginNode EndNode $ ( s_l__block_scalar n c / s_l__block_collection n c ) s_l__block_scalar n c {- 199 -} = s_separate (n .+ 1) c & ( c_ns_properties (n .+ 1) c & s_separate (n .+ 1) c ?) & ( c_l__literal n / c_l__folded n ) s_l__block_collection n c {- 200 -} = ( s_separate (n .+ 1) c & c_ns_properties (n .+ 1) c & ( s_l_comments >?) ?) & s_l_comments & ( l__block_sequence (seq_spaces n c) / l__block_mapping n ) seq_spaces n c {- 201 -} = case c of BlockOut -> n .- 1 BlockIn -> n -- 9.1.1 Document Prefix l_document_prefix {- 202 -} = ( c_byte_order_mark ?) & ( nonEmpty l_comment *) -- 9.1.2 Document Markers c_directives_end {- 203 -} = token DirectivesEnd [ '-', '-', '-' ] c_document_end {- 204 -} = token DocumentEnd [ '.', '.', '.' ] l_document_suffix {- 205 -} = c_document_end & s_l_comments c_forbidden {- 206 -} = sol & ( c_directives_end / c_document_end ) & ( b_char / s_white / eof ) -- 9.1.3 Explicit Documents l_bare_document {- 207 -} = DeNode ^ s_l__block_node (-1) BlockIn `forbidding` c_forbidden -- 9.1.4 Explicit Documents l_explicit_document {- 208 -} = ( c_directives_end & ( b_char / s_white / eof >?)) ! DeDoc & ( ( l_bare_document / e_node & ( s_l_comments ?) & unparsed 0 ) `recovery` unparsed 0 ) -- 9.1.5 Directives Documents l_directives_document {- 209 -} = ( l_directive +) & l_explicit_document -- 9.2 Streams: l_any_document {- 210 -} = wrapTokens BeginDocument EndDocument $ DeDoc ^ ( l_directives_document / l_explicit_document / l_bare_document ) `recovery` unparsed 0 l_yaml_stream {- 211 -} = ( nonEmpty l_document_prefix *) & ( eof / ( c_document_end & ( b_char / s_white / eof ) >?) / l_any_document ) & ( nonEmpty ( DeMore ^ ( ( l_document_suffix ! DeMore +) & ( nonEmpty l_document_prefix *) & ( eof / l_any_document ) / ( nonEmpty l_document_prefix *) & DeDoc ^ ( wrapTokens BeginDocument EndDocument l_explicit_document ?) ) ) *) HsYAML-0.1.2.0/src/Data/YAML/Loader.hs0000644000000000000000000001460513467630442015056 0ustar0000000000000000{-# LANGUAGE GeneralizedNewtypeDeriving #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE RecordWildCards #-} {-# LANGUAGE RecursiveDo #-} {-# LANGUAGE ScopedTypeVariables #-} {-# LANGUAGE Trustworthy #-} -- | -- Copyright: © Herbert Valerio Riedel 2015-2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- module Data.YAML.Loader ( decodeLoader , Loader(..) , NodeId ) where import Control.Monad.Except import Control.Monad.State import qualified Data.ByteString.Lazy as BS.L import qualified Data.Map as Map import qualified Data.Set as Set import Data.YAML.Event (Tag) import qualified Data.YAML.Event as YE import Util -- | Unique identifier for identifying nodes -- -- This is allows to observe the alias/anchor-reference structure type NodeId = Word -- | Structure defining how to construct a document tree/graph -- data Loader m n = Loader { yScalar :: Tag -> YE.Style -> Text -> m (Either String n) , ySequence :: Tag -> [n] -> m (Either String n) , yMapping :: Tag -> [(n,n)] -> m (Either String n) , yAlias :: NodeId -> Bool -> n -> m (Either String n) , yAnchor :: NodeId -> n -> m (Either String n) } -- | Generalised document tree/graph construction -- -- This doesn't yet perform any tag resolution (thus all scalars are -- represented as 'Text' values). See also 'decodeNode' for a more -- convenient interface. {-# INLINEABLE decodeLoader #-} decodeLoader :: forall n m . MonadFix m => Loader m n -> BS.L.ByteString -> m (Either String [n]) decodeLoader Loader{..} bs0 = do case sequence . YE.parseEvents $ bs0 of Left (pos,err) | YE.posCharOffset pos < 0 -> return (Left err) | otherwise -> return (Left $ ":" ++ show (YE.posLine pos) ++ ":" ++ show (YE.posColumn pos) ++ ": " ++ err) Right evs -> runParserT goStream evs where goStream :: PT n m [n] goStream = do _ <- satisfy (== YE.StreamStart) ds <- manyUnless (== YE.StreamEnd) goDoc eof return ds goDoc :: PT n m n goDoc = do _ <- satisfy isDocStart modify $ \s0 -> s0 { sDict = mempty, sCycle = mempty } n <- goNode _ <- satisfy isDocEnd return n getNewNid :: PT n m Word getNewNid = state $ \s0 -> let i0 = sIdCnt s0 in (i0, s0 { sIdCnt = i0+1 }) returnNode :: (Maybe YE.Anchor) -> Either String n -> PT n m n returnNode _ (Left err) = throwError err returnNode Nothing (Right node) = return node returnNode (Just a) (Right node) = do nid <- getNewNid node0 <- lift $ yAnchor nid node node' <- liftEither node0 modify $ \s0 -> s0 { sDict = Map.insert a (nid,node') (sDict s0) } return node' registerAnchor :: Maybe YE.Anchor -> PT n m n -> PT n m n registerAnchor Nothing pn = pn registerAnchor (Just a) pn = do modify $ \s0 -> s0 { sCycle = Set.insert a (sCycle s0) } nid <- getNewNid mdo modify $ \s0 -> s0 { sDict = Map.insert a (nid,n) (sDict s0) } n0 <- pn n1 <- lift $ yAnchor nid n0 n <- liftEither n1 return n exitAnchor :: Maybe YE.Anchor -> PT n m () exitAnchor Nothing = return () exitAnchor (Just a) = modify $ \s0 -> s0 { sCycle = Set.delete a (sCycle s0) } goNode :: PT n m n goNode = do n <- satisfy (const True) case n of YE.Scalar manc tag sty val -> do exitAnchor manc n' <- lift $ yScalar tag sty val returnNode manc $! n' YE.SequenceStart manc tag -> registerAnchor manc $ do ns <- manyUnless (== YE.SequenceEnd) goNode exitAnchor manc liftEither =<< (lift $ ySequence tag ns) YE.MappingStart manc tag -> registerAnchor manc $ do kvs <- manyUnless (== YE.MappingEnd) (liftM2 (,) goNode goNode) exitAnchor manc liftEither =<< (lift $ yMapping tag kvs) YE.Alias a -> do d <- gets sDict cy <- gets sCycle case Map.lookup a d of Nothing -> throwError ("anchor not found: " ++ show a) Just (nid,n') -> liftEither =<< (lift $ yAlias nid (Set.member a cy) n') _ -> throwError "goNode: unexpected event" ---------------------------------------------------------------------------- -- small parser framework data S n = S { sEvs :: [YE.Event] , sDict :: Map YE.Anchor (Word,n) , sCycle :: Set YE.Anchor , sIdCnt :: !Word } newtype PT n m a = PT (StateT (S n) (ExceptT String m) a) deriving ( Functor , Applicative , Monad , MonadState (S n) , MonadError String , MonadFix ) instance MonadTrans (PT n) where lift = PT . lift . lift runParserT :: Monad m => PT n m a -> [YE.Event] -> m (Either String a) runParserT (PT act) s0 = runExceptT $ evalStateT act (S s0 mempty mempty 0) satisfy :: Monad m => (YE.Event -> Bool) -> PT n m YE.Event satisfy p = do s0 <- get case sEvs s0 of [] -> throwError "satisfy: premature eof" (ev:rest) | p ev -> do put (s0 { sEvs = rest}) return ev | otherwise -> throwError ("satisfy: predicate failed " ++ show ev) peek :: Monad m => PT n m (Maybe YE.Event) peek = do s0 <- get case sEvs s0 of [] -> return Nothing (ev:_) -> return (Just ev) peek1 :: Monad m => PT n m YE.Event peek1 = maybe (throwError "peek1: premature eof") return =<< peek anyEv :: Monad m => PT n m YE.Event anyEv = satisfy (const True) eof :: Monad m => PT n m () eof = do s0 <- get case sEvs s0 of [] -> return () _ -> throwError "eof expected" -- NB: consumes the end-event manyUnless :: Monad m => (YE.Event -> Bool) -> PT n m a -> PT n m [a] manyUnless p act = do t0 <- peek1 if p t0 then anyEv >> return [] else liftM2 (:) act (manyUnless p act) {- tryError :: MonadError e m => m a -> m (Either e a) tryError act = catchError (Right <$> act) (pure . Left) -} isDocStart :: YE.Event -> Bool isDocStart (YE.DocumentStart _) = True isDocStart _ = False isDocEnd :: YE.Event -> Bool isDocEnd (YE.DocumentEnd _) = True isDocEnd _ = False HsYAML-0.1.2.0/src/Data/YAML/Schema.hs0000644000000000000000000002552413467630442015052 0ustar0000000000000000{-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE RecordWildCards #-} {-# LANGUAGE Safe #-} {-# LANGUAGE ScopedTypeVariables #-} -- | -- Copyright: © Herbert Valerio Riedel 2015-2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- -- YAML 1.2 Schema resolvers -- module Data.YAML.Schema ( SchemaResolver(..) , failsafeSchemaResolver , jsonSchemaResolver , coreSchemaResolver , Scalar(..) , tagNull, tagBool, tagStr, tagInt, tagFloat, tagSeq, tagMap ) where import Control.Monad.Except import qualified Data.Char as C import qualified Data.Map as Map import qualified Data.Set as Set import qualified Data.Text as T import Numeric (readHex, readOct) import Text.Parsec as P import Text.Parsec.Text import Data.YAML.Event (Tag, isUntagged, mkTag, untagged) import qualified Data.YAML.Event as YE import Util -- | Primitive scalar types as defined in YAML 1.2 data Scalar = SNull -- ^ @tag:yaml.org,2002:null@ | SBool !Bool -- ^ @tag:yaml.org,2002:bool@ | SFloat !Double -- ^ @tag:yaml.org,2002:float@ | SInt !Integer -- ^ @tag:yaml.org,2002:int@ | SStr !Text -- ^ @tag:yaml.org,2002:str@ | SUnknown !Tag !Text -- ^ unknown/unsupported tag or untagged (thus unresolved) scalar deriving (Eq,Ord,Show) -- | Definition of a [YAML 1.2 Schema](http://yaml.org/spec/1.2/spec.html#Schema) -- -- A YAML schema defines how implicit tags are resolved to concrete tags and how data is represented textually in YAML. data SchemaResolver = SchemaResolver { schemaResolverScalar :: Tag -> YE.Style -> T.Text -> Either String Scalar , schemaResolverSequence :: Tag -> Either String Tag , schemaResolverMapping :: Tag -> Either String Tag } data ScalarTag = ScalarBangTag -- ^ non-specific ! tag | ScalarQMarkTag -- ^ non-specific ? tag | ScalarTag !Tag -- ^ specific tag -- common logic for 'schemaResolverScalar' scalarTag :: (ScalarTag -> T.Text -> Either String Scalar) -> Tag -> YE.Style -> T.Text -> Either String Scalar scalarTag f tag sty val = f tag' val where tag' = case sty of YE.Plain | tag == untagged -> ScalarQMarkTag -- implicit ? tag _ | tag == untagged -> ScalarBangTag -- implicit ! tag | tag == tagBang -> ScalarBangTag -- explicit ! tag | otherwise -> ScalarTag tag -- | \"Failsafe\" schema resolver as specified -- in [YAML 1.2 / 10.1.2. Tag Resolution](http://yaml.org/spec/1.2/spec.html#id2803036) failsafeSchemaResolver :: SchemaResolver failsafeSchemaResolver = SchemaResolver{..} where -- scalars schemaResolverScalar = scalarTag go where go ScalarBangTag v = Right (SStr v) go (ScalarTag t) v | t == tagStr = Right (SStr v) | otherwise = Right (SUnknown t v) go ScalarQMarkTag v = Right (SUnknown untagged v) -- leave unresolved -- mappings schemaResolverMapping t | t == tagBang = Right tagMap | otherwise = Right t -- sequences schemaResolverSequence t | t == tagBang = Right tagSeq | otherwise = Right t -- | Strict JSON schema resolver as specified -- in [YAML 1.2 / 10.2.2. Tag Resolution](http://yaml.org/spec/1.2/spec.html#id2804356) jsonSchemaResolver :: SchemaResolver jsonSchemaResolver = SchemaResolver{..} where -- scalars schemaResolverScalar = scalarTag go where go ScalarBangTag v = Right (SStr v) go (ScalarTag t) v | t == tagStr = Right (SStr v) | t == tagNull = if isNullLiteral v then Right SNull else Left ("invalid !!null " ++ show v) | t == tagInt = maybe (Left $ "invalid !!int " ++ show v) (Right . SInt) $ jsonDecodeInt v | t == tagFloat = maybe (Left $ "invalid !!float " ++ show v) (Right . SFloat) $ jsonDecodeFloat v | t == tagBool = maybe (Left $ "invalid !!bool " ++ show v) (Right . SBool) $ jsonDecodeBool v | otherwise = Right (SUnknown t v) -- unknown specific tag go ScalarQMarkTag v | isNullLiteral v = Right SNull | Just b <- jsonDecodeBool v = Right $! SBool b | Just i <- jsonDecodeInt v = Right $! SInt i | Just f <- jsonDecodeFloat v = Right $! SFloat f | otherwise = Right (SUnknown untagged v) -- leave unresolved -- FIXME: YAML 1.2 spec requires an error here isNullLiteral = (== "null") -- mappings schemaResolverMapping t | t == tagBang = Right tagMap | isUntagged t = Right tagMap | otherwise = Right t -- sequences schemaResolverSequence t | t == tagBang = Right tagSeq | isUntagged t = Right tagSeq | otherwise = Right t -- | Core schema resolver as specified -- in [YAML 1.2 / 10.3.2. Tag Resolution](http://yaml.org/spec/1.2/spec.html#id2805071) coreSchemaResolver :: SchemaResolver coreSchemaResolver = SchemaResolver{..} where -- scalars schemaResolverScalar = scalarTag go where go ScalarBangTag v = Right (SStr v) go (ScalarTag t) v | t == tagStr = Right (SStr v) | t == tagNull = if isNullLiteral v then Right SNull else Left ("invalid !!null " ++ show v) | t == tagInt = maybe (Left $ "invalid !!int " ++ show v) (Right . SInt) $ coreDecodeInt v | t == tagFloat = maybe (Left $ "invalid !!float " ++ show v) (Right . SFloat) $ coreDecodeFloat v | t == tagBool = maybe (Left $ "invalid !!bool " ++ show v) (Right . SBool) $ coreDecodeBool v | otherwise = Right (SUnknown t v) -- unknown specific tag go ScalarQMarkTag v | isNullLiteral v = Right SNull | Just b <- coreDecodeBool v = Right $! SBool b | Just i <- coreDecodeInt v = Right $! SInt i | Just f <- coreDecodeFloat v = Right $! SFloat f | otherwise = Right (SStr v) -- map to !!str by default isNullLiteral = flip Set.member (Set.fromList [ "", "null", "NULL", "Null", "~" ]) -- mappings schemaResolverMapping t | t == tagBang = Right tagMap | isUntagged t = Right tagMap | otherwise = Right t -- sequences schemaResolverSequence t | t == tagBang = Right tagSeq | isUntagged t = Right tagSeq | otherwise = Right t -- | @tag:yaml.org,2002:bool@ (JSON Schema) jsonDecodeBool :: T.Text -> Maybe Bool jsonDecodeBool "false" = Just False jsonDecodeBool "true" = Just True jsonDecodeBool _ = Nothing -- | @tag:yaml.org,2002:bool@ (Core Schema) coreDecodeBool :: T.Text -> Maybe Bool coreDecodeBool = flip Map.lookup $ Map.fromList [ ("true", True) , ("True", True) , ("TRUE", True) , ("false", False) , ("False", False) , ("FALSE", False) ] -- | @tag:yaml.org,2002:int@ according to JSON Schema -- -- > 0 | -? [1-9] [0-9]* jsonDecodeInt :: T.Text -> Maybe Integer jsonDecodeInt t | T.null t = Nothing jsonDecodeInt "0" = Just 0 jsonDecodeInt t = do -- [-]? [1-9] [0-9]* let tabs | T.isPrefixOf "-" t = T.tail t | otherwise = t guard (not (T.null tabs)) guard (T.head tabs /= '0') guard (T.all C.isDigit tabs) readMaybe (T.unpack t) -- | @tag:yaml.org,2002:int@ according to Core Schema -- -- > [-+]? [0-9]+ (Base 10) -- > 0o [0-7]+ (Base 8) -- > 0x [0-9a-fA-F]+ (Base 16) -- coreDecodeInt :: T.Text -> Maybe Integer coreDecodeInt t | T.null t = Nothing -- > 0x [0-9a-fA-F]+ (Base 16) | Just rest <- T.stripPrefix "0x" t , T.all C.isHexDigit rest , [(j,"")] <- readHex (T.unpack rest) = Just $! j -- 0o [0-7]+ (Base 8) | Just rest <- T.stripPrefix "0o" t , T.all C.isOctDigit rest , [(j,"")] <- readOct (T.unpack rest) = Just $! j -- [-+]? [0-9]+ (Base 10) | T.all C.isDigit t = Just $! read (T.unpack t) | Just rest <- T.stripPrefix "+" t , not (T.null rest) , T.all C.isDigit rest = Just $! read (T.unpack rest) | Just rest <- T.stripPrefix "-" t , not (T.null rest) , T.all C.isDigit rest = Just $! read (T.unpack t) | otherwise = Nothing -- | @tag:yaml.org,2002:float@ according to JSON Schema -- -- > -? ( 0 | [1-9] [0-9]* ) ( \. [0-9]* )? ( [eE] [-+]? [0-9]+ )? -- jsonDecodeFloat :: T.Text -> Maybe Double jsonDecodeFloat = either (const Nothing) Just . parse float "" where float :: Parser Double float = do -- -? p0 <- option "" ("-" <$ char '-') -- ( 0 | [1-9] [0-9]* ) p1 <- do d <- digit if (d /= '0') then (d:) <$> P.many digit else pure [d] -- ( \. [0-9]* )? p2 <- option "" $ (:) <$> char '.' <*> option "0" (many1 digit) -- ( [eE] [-+]? [0-9]+ )? p3 <- option "" $ do void (char 'e' P.<|> char 'E') s <- option "" (("-" <$ char '-') P.<|> ("" <$ char '+')) d <- P.many1 digit pure ("e" ++ s ++ d) eof let t' = p0++p1++p2++p3 pure $! read t' -- | @tag:yaml.org,2002:float@ according to Core Schema -- -- > [-+]? ( \. [0-9]+ | [0-9]+ ( \. [0-9]* )? ) ( [eE] [-+]? [0-9]+ )? -- coreDecodeFloat :: T.Text -> Maybe Double coreDecodeFloat t | Just j <- Map.lookup t literals = Just j -- short-cut | otherwise = either (const Nothing) Just . parse float "" $ t where float :: Parser Double float = do -- [-+]? p0 <- option "" (("-" <$ char '-') P.<|> "" <$ char '+') -- ( \. [0-9]+ | [0-9]+ ( \. [0-9]* )? ) p1 <- (char '.' *> (("0."++) <$> many1 digit)) P.<|> do d1 <- many1 digit d2 <- option "" $ (:) <$> char '.' <*> option "0" (many1 digit) pure (d1++d2) -- ( [eE] [-+]? [0-9]+ )? p2 <- option "" $ do void (char 'e' P.<|> char 'E') s <- option "" (("-" <$ char '-') P.<|> ("" <$ char '+')) d <- P.many1 digit pure ("e" ++ s ++ d) eof let t' = p0++p1++p2 pure $! read t' literals = Map.fromList [ ("0" , 0) , (".nan", (0/0)) , (".NaN", (0/0)) , (".NAN", (0/0)) , (".inf", (1/0)) , (".Inf", (1/0)) , (".INF", (1/0)) , ("+.inf", (1/0)) , ("+.Inf", (1/0)) , ("+.INF", (1/0)) , ("-.inf", (-1/0)) , ("-.Inf", (-1/0)) , ("-.INF", (-1/0)) ] tagNull, tagBool, tagStr, tagInt, tagFloat, tagSeq, tagMap, tagBang :: Tag tagNull = mkTag "tag:yaml.org,2002:null" tagStr = mkTag "tag:yaml.org,2002:str" tagInt = mkTag "tag:yaml.org,2002:int" tagFloat = mkTag "tag:yaml.org,2002:float" tagBool = mkTag "tag:yaml.org,2002:bool" tagSeq = mkTag "tag:yaml.org,2002:seq" tagMap = mkTag "tag:yaml.org,2002:map" tagBang = mkTag "!" HsYAML-0.1.2.0/src/Data/YAML/Token/0000755000000000000000000000000013467630442014366 5ustar0000000000000000HsYAML-0.1.2.0/src/Data/YAML/Token/Encoding.hs0000644000000000000000000002610613467630442016455 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE Safe #-} -- | -- Copyright: © Oren Ben-Kiki 2007, -- © Herbert Valerio Riedel 2015-2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- -- UTF decoding -- -- This really should be factored out to the standard libraries. Since it isn't -- there, we get to tailor it exactly to our needs. We use lazy byte strings as -- input, which should give reasonable I\/O performance when reading large -- files. The output is a normal 'Char' list which is easy to work with and -- should be efficient enough as long as the 'Parser' does its job right. -- module Data.YAML.Token.Encoding ( decode , Encoding(..) ) where import qualified Data.ByteString.Lazy as BL import qualified Data.ByteString.Lazy.Char8 as BLC import Util -- | Recognized Unicode encodings. As of YAML 1.2 UTF-32 is also required. data Encoding = UTF8 -- ^ UTF-8 encoding (or ASCII) | UTF16LE -- ^ UTF-16 little endian | UTF16BE -- ^ UTF-16 big endian | UTF32LE -- ^ UTF-32 little endian | UTF32BE -- ^ UTF-32 big endian -- | @show encoding@ converts an 'Encoding' to the encoding name (with a "-") -- as used by most programs. instance Show Encoding where show UTF8 = "UTF-8" show UTF16LE = "UTF-16LE" show UTF16BE = "UTF-16BE" show UTF32LE = "UTF-32LE" show UTF32BE = "UTF-32BE" -- | @decode bytes@ automatically detects the 'Encoding' used and converts the -- /bytes/ to Unicode characters, with byte offsets. Note the offset is for -- past end of the character, not its beginning. decode :: BLC.ByteString -> (Encoding, [(Int, Char)]) decode text = (encoding, undoEncoding encoding text) where encoding = detectEncoding $ BLC.unpack $ BLC.take 4 text -- | @detectEncoding text@ examines the first few chars (bytes) of the /text/ -- to deduce the Unicode encoding used according to the YAML spec. detectEncoding :: [Char] -> Encoding detectEncoding text = case text of '\x00' : '\x00' : '\xFE' : '\xFF' : _ -> UTF32BE '\x00' : '\x00' : '\x00' : _ : _ -> UTF32BE '\xFF' : '\xFE' : '\x00' : '\x00' : _ -> UTF32LE _ : '\x00' : '\x00' : '\x00' : _ -> UTF32LE '\xFE' : '\xFF' : _ -> UTF16BE '\x00' : _ : _ -> UTF16BE '\xFF' : '\xFE' : _ -> UTF16LE _ : '\x00' : _ -> UTF16LE '\xEF' : '\xBB' : '\xBF' : _ -> UTF8 _ -> UTF8 -- | @undoEncoding encoding bytes@ converts a /bytes/ stream to Unicode -- characters according to the /encoding/. undoEncoding :: Encoding -> BLC.ByteString -> [(Int, Char)] undoEncoding encoding bytes = case encoding of UTF8 -> undoUTF8 bytes 0 UTF16LE -> combinePairs $ undoUTF16LE bytes 0 UTF16BE -> combinePairs $ undoUTF16BE bytes 0 UTF32LE -> validateScalars $ undoUTF32LE bytes 0 UTF32BE -> validateScalars $ undoUTF32BE bytes 0 where validateScalars [] = [] validateScalars (x@(_,c):rest) | '\xD800' <= c, c <= '\xDFFF' = error "UTF-32 stream contains invalid surrogate code-point" | otherwise = x : validateScalars rest -- ** UTF-32 decoding -- | @hasFewerThan bytes n@ checks whether there are fewer than /n/ /bytes/ -- left to read. hasFewerThan :: Int -> BLC.ByteString -> Bool hasFewerThan n bytes | n == 1 = BLC.null bytes | n > 1 = BLC.null bytes || hasFewerThan (n - 1) (BLC.tail bytes) | otherwise = False -- | @undoUTF32LE bytes offset@ decoded a UTF-32LE /bytes/ stream to Unicode -- chars. undoUTF32LE :: BLC.ByteString -> Int -> [(Int, Char)] undoUTF32LE bytes offset | BLC.null bytes = [] | hasFewerThan 4 bytes = error "UTF-32LE input contains invalid number of bytes" | otherwise = let first = BLC.head bytes bytes' = BLC.tail bytes second = BLC.head bytes' bytes'' = BLC.tail bytes' third = BLC.head bytes'' bytes''' = BLC.tail bytes'' fourth = BLC.head bytes''' rest = BLC.tail bytes''' in (offset + 4, chr $ (ord first) + 256 * ((ord second) + 256 * ((ord third) + 256 * ((ord fourth))))):(undoUTF32LE rest $ offset + 4) -- | @undoUTF32BE bytes offset@ decoded a UTF-32BE /bytes/ stream to Unicode -- chars. undoUTF32BE :: BLC.ByteString -> Int -> [(Int, Char)] undoUTF32BE bytes offset | BLC.null bytes = [] | hasFewerThan 4 bytes = error "UTF-32BE input contains invalid number of bytes" | otherwise = let first = BLC.head bytes bytes' = BLC.tail bytes second = BLC.head bytes' bytes'' = BLC.tail bytes' third = BLC.head bytes'' bytes''' = BLC.tail bytes'' fourth = BLC.head bytes''' rest = BLC.tail bytes''' in (offset + 4, chr $ (ord fourth) + 256 * ((ord third) + 256 * ((ord second) + 256 * ((ord first))))):(undoUTF32BE rest $ offset + 4) -- ** UTF-16 decoding -- | @combinePairs chars@ converts each pair of UTF-16 surrogate characters to a -- single Unicode character. combinePairs :: [(Int, Char)] -> [(Int, Char)] combinePairs [] = [] combinePairs (head'@(_, head_char):tail') | '\xD800' <= head_char && head_char <= '\xDBFF' = combineLead head' tail' | '\xDC00' <= head_char && head_char <= '\xDFFF' = error "UTF-16 contains trail surrogate without lead surrogate" | otherwise = head':(combinePairs tail') -- | @combineLead lead rest@ combines the /lead/ surrogate with the head of the -- /rest/ of the input chars, assumed to be a /trail/ surrogate, and continues -- combining surrogate pairs. combineLead :: (Int, Char) -> [(Int, Char)] -> [(Int, Char)] combineLead _lead [] = error "UTF-16 contains lead surrogate as final character" combineLead (_, lead_char) ((trail_offset, trail_char):rest) | '\xDC00' <= trail_char && trail_char <= '\xDFFF' = (trail_offset, combineSurrogates lead_char trail_char):combinePairs rest | otherwise = error "UTF-16 contains lead surrogate without trail surrogate" -- | @surrogateOffset@ is copied from the Unicode FAQs. surrogateOffset :: Int surrogateOffset = 0x10000 - (0xD800 * 1024) - 0xDC00 -- | @combineSurrogates lead trail@ combines two UTF-16 surrogates into a single -- Unicode character. combineSurrogates :: Char -> Char -> Char combineSurrogates lead trail = chr $ (ord lead) * 1024 + (ord trail) + surrogateOffset -- | @undoUTF18LE bytes offset@ decoded a UTF-16LE /bytes/ stream to Unicode -- chars. undoUTF16LE :: BLC.ByteString -> Int -> [(Int, Char)] undoUTF16LE bytes offset | BLC.null bytes = [] | hasFewerThan 2 bytes = error "UTF-16LE input contains odd number of bytes" | otherwise = let low = BLC.head bytes bytes' = BLC.tail bytes high = BLC.head bytes' rest = BLC.tail bytes' in (offset + 2, chr $ (ord low) + (ord high) * 256):(undoUTF16LE rest $ offset + 2) -- | @undoUTF18BE bytes offset@ decoded a UTF-16BE /bytes/ stream to Unicode -- chars. undoUTF16BE :: BLC.ByteString -> Int -> [(Int, Char)] undoUTF16BE bytes offset | BLC.null bytes = [] | hasFewerThan 2 bytes = error "UTF-16BE input contains odd number of bytes" | otherwise = let high = BLC.head bytes bytes' = BLC.tail bytes low = BLC.head bytes' rest = BLC.tail bytes' in (offset + 2, chr $ (ord low) + (ord high) * 256):(undoUTF16BE rest $ offset + 2) -- ** UTF-8 decoding -- | @undoUTF8 bytes offset@ decoded a UTF-8 /bytes/ stream to Unicode chars. undoUTF8 :: BLC.ByteString -> Int -> [(Int, Char)] undoUTF8 bytes offset = undoUTF8' (BL.unpack bytes) offset w2c :: Word8 -> Char w2c = chr . fromIntegral w2i :: Word8 -> Int w2i = fromIntegral undoUTF8' :: [Word8] -> Int -> [(Int, Char)] undoUTF8' [] _ = [] undoUTF8' (first:rest) !offset | first < 0x80 = (offset', c) : undoUTF8' rest offset' where !offset' = offset + 1 !c = w2c first undoUTF8' (first:rest) !offset | first < 0xC0 = error "UTF-8 input contains invalid first byte" | first < 0xE0 = decodeTwoUTF8 first offset rest | first < 0xF0 = decodeThreeUTF8 first offset rest | first < 0xF8 = decodeFourUTF8 first offset rest | otherwise = error "UTF-8 input contains invalid first byte" -- | @decodeTwoUTF8 first offset bytes@ decodes a two-byte UTF-8 character, -- where the /first/ byte is already available and the second is the head of -- the /bytes/, and then continues to undo the UTF-8 encoding. decodeTwoUTF8 :: Word8 -> Int -> [Word8] -> [(Int, Char)] decodeTwoUTF8 first offset (second:rest) | second < 0x80 || 0xBF < second = error $ "UTF-8 double byte char has invalid second byte" | otherwise = (offset', c) : undoUTF8' rest offset' where !offset' = offset + 2 !c = chr ((w2i first - 0xc0) * 0x40 + (w2i second - 0x80)) decodeTwoUTF8 _ _ [] = error "UTF-8 double byte char is missing second byte at eof" -- | @decodeThreeUTF8 first offset bytes@ decodes a three-byte UTF-8 character, -- where the /first/ byte is already available and the second and third are the -- head of the /bytes/, and then continues to undo the UTF-8 encoding. decodeThreeUTF8 :: Word8 -> Int -> [Word8] -> [(Int, Char)] decodeThreeUTF8 first offset (second:third:rest) | second < 0x80 || 0xBF < second = error "UTF-8 triple byte char has invalid second byte" | third < 0x80 || 0xBF < third = error "UTF-8 triple byte char has invalid third byte" | otherwise = (offset', c): undoUTF8' rest offset' where !offset' = offset + 3 !c = chr(((w2i first) - 0xE0) * 0x1000 + ((w2i second) - 0x80) * 0x40 + ((w2i third) - 0x80)) decodeThreeUTF8 _ _ _ =error "UTF-8 triple byte char is missing bytes at eof" -- | @decodeFourUTF8 first offset bytes@ decodes a four-byte UTF-8 character, -- where the /first/ byte is already available and the second, third and fourth -- are the head of the /bytes/, and then continues to undo the UTF-8 encoding. decodeFourUTF8 :: Word8 -> Int -> [Word8] -> [(Int, Char)] decodeFourUTF8 first offset (second:third:fourth:rest) | second < 0x80 || 0xBF < second = error "UTF-8 quad byte char has invalid second byte" | third < 0x80 || 0xBF < third = error "UTF-8 quad byte char has invalid third byte" | third < 0x80 || 0xBF < third = error "UTF-8 quad byte char has invalid fourth byte" | otherwise = (offset', c) : undoUTF8' rest offset' where !offset' = offset + 4 !c = chr(((w2i first) - 0xF0) * 0x40000 + ((w2i second) - 0x80) * 0x1000 + ((w2i third) - 0x80) * 0x40 + ((w2i fourth) - 0x80)) decodeFourUTF8 _ _ _ = error "UTF-8 quad byte char is missing bytes at eof" HsYAML-0.1.2.0/src-test/0000755000000000000000000000000013467630442012650 5ustar0000000000000000HsYAML-0.1.2.0/src-test/Main.hs0000644000000000000000000003060313467630442014072 0ustar0000000000000000{-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE RecordWildCards #-} -- | -- Copyright: © Herbert Valerio Riedel 2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- module Main where import Control.Monad import Control.Monad.Identity import qualified Data.ByteString.Char8 as BS import qualified Data.ByteString.Lazy.Char8 as BS.L import Data.Int (Int64) import Data.List (groupBy) import Data.Maybe import System.Directory import System.Environment import System.Exit import System.FilePath import System.IO import Text.Printf (printf) import Text.Read import qualified Data.Aeson.Micro as J import Data.Map (Map) import qualified Data.Map as Map import Data.Text (Text) import qualified Data.Text as T import qualified Data.Text.Encoding as T import qualified Data.Text.IO as T import Data.YAML as Y import Data.YAML.Event as YE import qualified Data.YAML.Token as YT import qualified TML main :: IO () main = do args <- getArgs case args of ("yaml2event":args') | null args' -> cmdYaml2Event | otherwise -> do hPutStrLn stderr "unexpected arguments passed to yaml2event sub-command" exitFailure ("yaml2token":args') | null args' -> cmdYaml2Token | otherwise -> do hPutStrLn stderr "unexpected arguments passed to yaml2token sub-command" exitFailure ("yaml2json":args') | null args' -> cmdYaml2Json | otherwise -> do hPutStrLn stderr "unexpected arguments passed to yaml2json sub-command" exitFailure ("run-tml":args') -> cmdRunTml args' ("testml-compiler":args') -> cmdTestmlCompiler args' _ -> do hPutStrLn stderr "usage: yaml-test []" hPutStrLn stderr "" hPutStrLn stderr "Commands:" hPutStrLn stderr "" hPutStrLn stderr " yaml2event reads YAML stream from STDIN and dumps events to STDOUT" hPutStrLn stderr " yaml2json reads YAML stream from STDIN and dumps JSON to STDOUT" hPutStrLn stderr " run-tml run/validate YAML-specific .tml file(s)" hPutStrLn stderr " testml-compiler emulate testml-compiler" exitFailure cmdYaml2Token :: IO () cmdYaml2Token = do inYamlDat <- BS.L.getContents forM_ (groupBy (\x y -> YT.tLine x == YT.tLine y) $ YT.tokenize inYamlDat False) $ \lgrp -> do forM_ lgrp $ \YT.Token{..} -> do let tText' | null tText = "" | any (== ' ') tText = replicate tLineChar ' ' ++ show tText | otherwise = replicate (tLineChar+1) ' ' ++ tail (init (show tText)) hPutStrLn stdout $ printf ":%d:%d: %-15s| %s" tLine tLineChar (show tCode) tText' hPutStrLn stdout "" hFlush stdout cmdYaml2Event :: IO () cmdYaml2Event = do inYamlDat <- BS.L.getContents forM_ (parseEvents inYamlDat) $ \ev -> case ev of Left (ofs,msg) -> do case msg of "" -> hPutStrLn stderr ("parsing error near byte offset " ++ show ofs) _ -> hPutStrLn stderr ("parsing error near byte offset " ++ show ofs ++ " (" ++ msg ++ ")") exitFailure Right event -> do hPutStrLn stdout (ev2str event) hFlush stdout -- | 'J.Value' look-alike data Value' = Object' (Map Text Value') | Array' [Value'] | String' !Text | NumberD' !Double | NumberI' !Integer | Bool' !Bool | Null' deriving Show toProperValue :: Value' -> J.Value toProperValue v = case v of Null' -> J.Null String' t -> J.String t NumberD' x -> J.Number x NumberI' x -> J.Number (fromInteger x) Bool' b -> J.Bool b Array' xs -> J.Array (map toProperValue xs) Object' xs -> J.Object (fmap toProperValue xs) instance FromYAML Value' where parseYAML (Y.Scalar s) = case s of SNull -> pure Null' SBool b -> pure (Bool' b) SFloat x -> pure (NumberD' x) SInt x -> pure (NumberI' x) SStr t -> pure (String' t) SUnknown _ t -> pure (String' t) -- HACK parseYAML (Y.Sequence _ xs) = Array' <$> mapM parseYAML xs parseYAML (Y.Mapping _ m) = Object' . Map.fromList <$> mapM parseKV (Map.toList m) where parseKV :: (Y.Node,Y.Node) -> Parser (Text,Value') parseKV (k,v) = (,) <$> parseK k <*> parseYAML v -- for numbers and !!null we apply implicit conversions parseK n = do k <- parseYAML n case k of NumberI' t -> pure (T.pack (show t)) NumberD' t -> pure (T.pack (show t)) String' t -> pure t Null' -> pure "" -- we stringify the key with an added risk of nameclashing _ -> pure $ T.decodeUtf8 $ J.encodeStrict $ toProperValue k -- _ -> fail ("dictionary entry had non-string key " ++ show k) decodeAeson :: BS.L.ByteString -> Either String [J.Value] decodeAeson = fmap (map toProperValue) . decode -- | Try to convert 'Double' into 'Int64', return 'Nothing' if not -- representable loss-free as integral 'Int64' value. doubleToInt64 :: Double -> Maybe Int64 doubleToInt64 x | fromInteger x' == x , x' <= toInteger (maxBound :: Int64) , x' >= toInteger (minBound :: Int64) = Just (fromIntegral x') | otherwise = Nothing where x' = round x decodeNumber :: T.Text -> Maybe Double decodeNumber = readMaybe . T.unpack -- fixme cmdYaml2Json :: IO () cmdYaml2Json = do inYamlDat <- BS.L.getContents case decodeAeson inYamlDat of Left e -> fail e Right vs -> do forM_ vs $ \v -> BS.L.putStrLn (J.encode v) return () unescapeSpcTab :: T.Text -> T.Text unescapeSpcTab = T.replace "" " " . T.replace "" "\t" data TestPass = PassExpErr -- ^ expected parse fail | PassEvs -- ^ events ok | PassEvsJson -- ^ events+json ok deriving (Eq,Ord,Show) data TestFail = FailParse -- ^ unexpected parse fail | FailSuccess -- ^ unexpected parse success | FailEvs -- ^ events wrong/mismatched | FailJson -- ^ JSON wrong/mismatched deriving (Eq,Ord,Show) data TestRes = Pass !TestPass | Fail !TestFail deriving (Eq,Ord,Show) cmdRunTml :: [FilePath] -> IO () cmdRunTml args = do results <- forM args $ \fn -> do tml <- BS.readFile fn hPutStr stdout (fn ++ " : ") hFlush stdout TML.Document _ blocks <- either (fail . T.unpack) pure $ TML.parse fn (T.decodeUtf8 tml) forM blocks $ \(TML.Block label points) -> do let dats = [ (k,v) | TML.PointStr k v <- points ] let isErr = isJust (lookup "error" dats) Just inYamlDat = BS.L.fromStrict . T.encodeUtf8 . unescapeSpcTab <$> lookup "in-yaml" dats Just testEvDat = lines . T.unpack . unescapeSpcTab <$> lookup "test-event" dats mInJsonDat :: Maybe [J.Value] mInJsonDat = (maybe (error ("invalid JSON in " ++ show fn)) id . J.decodeStrictN . T.encodeUtf8) <$> lookup "in-json" dats case sequence (parseEvents inYamlDat) of Left err | isErr -> do putStrLn "OK! (error)" pure (Pass PassExpErr) | otherwise -> do putStrLn "FAIL!" putStrLn "" putStrLn "----------------------------------------------------------------------------" putStrLn' (T.unpack label) putStrLn "" putStrLn' (show err) putStrLn "" putStrLn' (show testEvDat) putStrLn "" BS.L.putStr inYamlDat putStrLn "" testParse inYamlDat putStrLn "" -- forM_ (parseEvents inYamlDat) (putStrLn' . show) putStrLn "" putStrLn "----------------------------------------------------------------------------" putStrLn "" pure (Fail FailParse) Right evs' -> do let evs'' = map ev2str evs' if evs'' == testEvDat then do case mInJsonDat of Nothing -> do putStrLn "OK!" pure (Pass PassEvs) Just inJsonDat -> do iutJson <- either fail pure $ decodeAeson inYamlDat if iutJson == inJsonDat then do putStrLn "OK! (+JSON)" pure (Pass PassEvsJson) else do putStrLn "FAIL! (bad JSON)" putStrLn' ("ref = " ++ show inJsonDat) putStrLn' ("iut = " ++ show iutJson) pure (Fail FailJson) else do if isErr then putStrLn "FAIL! (unexpected parser success)" else putStrLn "FAIL!" putStrLn "" putStrLn "----------------------------------------------------------------------------" putStrLn' (T.unpack label) putStrLn "" putStrLn' ("ref = " ++ show testEvDat) putStrLn' ("iut = " ++ show evs'') putStrLn "" BS.L.putStr inYamlDat putStrLn "" testParse inYamlDat putStrLn "" -- forM_ (parseEvents inYamlDat) (putStrLn' . show) putStrLn "" putStrLn "----------------------------------------------------------------------------" putStrLn "" pure (Fail (if isErr then FailSuccess else FailEvs)) putStrLn "" let ok = length [ () | Pass _ <- results' ] nok = length [ () | Fail _ <- results' ] stat j = show $ Map.findWithDefault 0 j $ Map.fromListWith (+) [ (k,1::Int) | k <- results' ] results' = concat results putStrLn $ concat [ "done -- passed: ", show ok , " (ev: ", stat (Pass PassEvs), ", ev+json: ", stat (Pass PassEvsJson), ", err: ", stat (Pass PassExpErr), ") / " , "failed: ", show nok , " (err: ", stat (Fail FailParse), ", ev:", stat (Fail FailEvs), ", json:", stat (Fail FailJson), ", ok:", stat (Fail FailSuccess), ")" ] -- | Incomplete proof-of-concept 'testml-compiler' operation cmdTestmlCompiler :: [FilePath] -> IO () cmdTestmlCompiler [fn0] = do (fn,raw) <- case fn0 of "-" -> (,) "" <$> T.getContents _ -> (,) fn0 <$> T.readFile fn0 case TML.parse fn raw of Left e -> T.hPutStrLn stderr e >> exitFailure Right doc -> BS.putStrLn (J.encodeStrict doc) cmdTestmlCompiler _ = do hPutStrLn stderr ("Usage: yaml-test testml-compiler [ | - ]") exitFailure putStrLn' :: String -> IO () putStrLn' msg = putStrLn (" " ++ msg) ev2str :: Event -> String ev2str StreamStart = "+STR" ev2str (DocumentStart True) = "+DOC ---" ev2str (DocumentStart False) = "+DOC" ev2str MappingEnd = "-MAP" ev2str (MappingStart manc mtag) = "+MAP" ++ ancTagStr manc mtag ev2str SequenceEnd = "-SEQ" ev2str (SequenceStart manc mtag) = "+SEQ" ++ ancTagStr manc mtag ev2str (DocumentEnd True) = "-DOC ..." ev2str (DocumentEnd False) = "-DOC" ev2str StreamEnd = "-STR" ev2str (Alias a) = "=ALI *" ++ T.unpack a ev2str (YE.Scalar manc mtag sty v) = "=VAL" ++ ancTagStr manc mtag ++ v' where v' = case sty of Plain -> " :" ++ quote2 v DoubleQuoted -> " \"" ++ quote2 v Literal -> " |" ++ quote2 v Folded -> " >" ++ quote2 v SingleQuoted -> " '" ++ quote2 v ancTagStr manc mtag = anc' ++ tag' where anc' = case manc of Nothing -> "" Just anc -> " &" ++ T.unpack anc tag' = case tagToText mtag of Nothing -> "" Just t -> " <" ++ T.unpack t ++ ">" quote2 :: T.Text -> String quote2 = concatMap go . T.unpack where go c | c == '\n' = "\\n" | c == '\t' = "\\t" | c == '\b' = "\\b" | c == '\r' = "\\r" | c == '\\' = "\\\\" | otherwise = [c] testParse :: BS.L.ByteString -> IO () testParse bs0 = mapM_ (putStrLn' . showT) $ YT.tokenize bs0 False where showT :: YT.Token -> String showT t = replicate (YT.tLineChar t) ' ' ++ show (YT.tText t) ++ " " ++ show (YT.tCode t) HsYAML-0.1.2.0/src-test/TML.hs0000644000000000000000000003007613467630442013646 0ustar0000000000000000{-# LANGUAGE OverloadedStrings #-} -- | -- Copyright: © Herbert Valerio Riedel 2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- -- Incomplete TestML 0.3.0 parser module TML ( TML.parse , Document(..) , Block(..) , Point(..) , PseudoId(..) , Code(..) , AssertOp(..) , CodeExpr(..) , CodeObject(..) , FunCall(..) ) where import Data.ByteString (ByteString) import qualified Data.ByteString.Char8 as B import Control.Applicative hiding (many, some) import Control.Monad import qualified Data.Aeson.Micro as J import qualified Data.ByteString as BS import qualified Data.Char as C import Data.List import qualified Data.Map as Map import Data.Maybe import qualified Data.Text as T import qualified Data.Text.Encoding as T import qualified Data.Text.IO as T import Data.Void import System.Environment import Text.Megaparsec import Text.Megaparsec.Char import qualified Text.Megaparsec.Char.Lexer as L import Text.Megaparsec.Expr type Parser = Parsec Void T.Text parse :: String -> T.Text -> Either T.Text Document parse fn raw = either (Left . T.pack . parseErrorPretty' raw) (Right .process_pseudo) (Text.Megaparsec.parse testml_document fn raw) ---------------------------------------------------------------------------- data Document = Document [Code] [Block] deriving Show instance J.ToJSON Document where toJSON (Document code dat) = J.object [ "testml" J..= ("0.3.0" :: T.Text) , "code" J..= code , "data" J..= dat ] data Block = Block !T.Text [Point] deriving Show instance J.ToJSON Block where toJSON (Block label points) = J.object [ "label" J..= label , "point" J..= J.object (map f points) ] where f (PointStr k v) = k J..= v f (PointPseudo k) = (T.pack (show k)) J..= True f (PointInt k v) = k J..= v data Point = PointStr !T.Text !T.Text | PointInt !T.Text !Integer | PointPseudo !PseudoId deriving Show instance J.ToJSON Code where toJSON (CodeAssignmentStmt lhs rhs) = J.Array [J.String "=", J.String lhs, J.toJSON rhs] toJSON stmt@(CodeExpressionStmt lhs massert) | pobjs@(_:_) <- pointObjsInExpr stmt = J.Array [ J.String "%()" , J.Array [ J.String ("*" `mappend` p) | p <- pobjs ] , expr' ] | otherwise = expr' where expr' = case massert of Just (op,rhs) -> J.toJSON (op,lhs,rhs) Nothing -> J.toJSON lhs data Code = CodeAssignmentStmt !T.Text !CodeExpr | CodeExpressionStmt !CodeExpr !(Maybe (AssertOp,CodeExpr)) | CodeImportStmt [T.Text] deriving Show instance J.ToJSON AssertOp where toJSON AssertEq = J.String "==" toJSON AssertHas = J.String "~~" toJSON AssertLike = J.String "=~" data AssertOp = AssertEq | AssertHas | AssertLike deriving Show instance J.ToJSON CodeExpr where toJSON (CodeExpr obj []) = J.toJSON obj toJSON (CodeExpr obj fns) = J.Array $ [J.String ".", J.toJSON obj] ++ map J.toJSON fns data CodeExpr = CodeExpr !CodeObject [FunCall] deriving Show instance J.ToJSON CodeObject where toJSON (StrObj s) = J.String s toJSON (NumObj n) = J.Number n toJSON (PointObj j) = J.Array [J.String "*", J.String j] toJSON (CallObj fn) = J.toJSON fn data CodeObject = StrObj !T.Text | CallObj !FunCall | NumObj !Double | PointObj !T.Text deriving Show instance J.ToJSON FunCall where toJSON (FunCall fn args) = J.Array (J.String fn : map J.toJSON args) data FunCall = FunCall !T.Text [CodeExpr] deriving Show pointObjsInExpr :: Code -> [T.Text] pointObjsInExpr co = nub $ case co of CodeAssignmentStmt _ expr -> goExpr expr CodeExpressionStmt e1 Nothing -> goExpr e1 CodeExpressionStmt e1 (Just (_,e2)) -> goExpr e1 ++ goExpr e2 where goExpr (CodeExpr obj fns) = goObj obj ++ concatMap goFun fns goFun (FunCall _ exprs) = concatMap goExpr exprs goObj (PointObj j) = [j] goObj (CallObj fn) = goFun fn goObj (StrObj _) = [] goObj (NumObj _) = [] testml_document :: Parser Document testml_document = Document <$> code_section <*> data_section <* eof pseudo_point_name :: Parser PseudoId pseudo_point_name = choice [ HEAD <$ string "HEAD" , LAST <$ string "LAST" , ONLY <$ string "ONLY" , SKIP <$ string "SKIP" , TODO <$ string "TODO" , DIFF <$ string "DIFF" ] data PseudoId = HEAD | LAST | ONLY | SKIP | TODO | DIFF deriving (Eq,Show) process_pseudo :: Document -> Document process_pseudo (Document code bs0) = Document code (go bs0) where go blocks | Just b <- find isOnly blocks' = [b] | Just bs <- goHead blocks' = bs | Just bs <- goLast [] blocks' = bs | otherwise = blocks' where blocks' = filter (not . isSkip) blocks isOnly b = ONLY `elem` pseudos b isSkip b = SKIP `elem` pseudos b isHead b = HEAD `elem` pseudos b isLast b = LAST `elem` pseudos b pseudos (Block _ ps) = [ k | PointPseudo k <- ps ] goHead [] = Nothing goHead (b:bs) | isHead b = Just (b:bs) | otherwise = goHead bs goLast acc [] = Nothing goLast acc (b:bs) | isLast b = Just $ reverse (b:bs) | otherwise = goLast (b:acc) bs code_section :: Parser [Code] code_section = do xs <- many code_statement pure (catMaybes xs) where code_statement = choice [ Nothing <$ comment_lines , Just <$> import_directive , Just <$> assignment_statement , Just <$> expression_statement ] import_directive = do string "%Import" ws mods <- module_name `sepBy1` ws ws0 eol pure $! CodeImportStmt mods module_name :: Parser T.Text module_name = T.pack <$> some alphaNumChar assignment_statement = do v <- try $ do v' <- identifier_name ws void (char '=') <|> void (string "||=") -- FIXME ws pure v' e <- code_expression eol pure (CodeAssignmentStmt v e) expression_statement = do -- TODO: expression-label -- optional (double_string >> char ':' >> ws0) -- TODO: pick-expression lhs <- code_expression ws op <- choice [ AssertEq <$ string "==" , AssertHas <$ string "~~" , AssertLike <$ string "=~" ] ws rhs <- code_expression optional $ do ws0 char ':' double_string eol pure (CodeExpressionStmt lhs (Just (op,rhs))) code_expression :: Parser CodeExpr code_expression = CodeExpr <$> code_object <*> many function_call -- quoted string double_string :: Parser T.Text double_string = do char '"' str <- many (noneOf ("\n\"\\" :: [Char]) <|> (char '\\' >> (unesc <$> oneOf ("\\\"0nt" :: [Char])))) char '"' pure $! (T.pack str) where unesc '0' = '\0' unesc 'n' = '\n' unesc 't' = '\t' unesc c = c single_string :: Parser T.Text single_string = do char '\'' str <- many (noneOf ("\n'\\" :: [Char]) <|> (char '\\' >> (oneOf ("\\'" :: [Char])))) char '\'' pure $! (T.pack str) function_call :: Parser FunCall function_call = do char '.' call_object call_object :: Parser FunCall call_object = FunCall <$> identifier_name <*> optional' [] (between (char '(') (char ')') $ code_expression `sepBy1` (char ',' >> ws0)) optional' def p = do x <- optional p case x of Nothing -> pure def Just y -> pure y code_object :: Parser CodeObject code_object = choice [ mkPoint <$> char '*' <*> lowerChar <*> many (lowerChar <|> digitChar <|> char '-' <|> char '_') , mkNum <$> optional (char '-') <*> some digitChar <*> optional (char '.' >> some digitChar) , CallObj <$> call_object , StrObj <$> single_string , StrObj <$> double_string ] "code-object" where mkPoint _ c cs = PointObj $! (T.pack (c:cs)) mkNum msign ds1 mds2 = NumObj $! (read $ (maybe id (:) msign) ds1 ++ (maybe "" ('.':) mds2)) data_section :: Parser [Block] data_section = many block_definition where block_definition = do -- block_heading string "===" *> ws l <- T.pack <$> manyTill anyChar eol -- TODO: user_defined ps <- many point_definition pure (Block l ps) point_definition = do string "---" *> ws j <- eitherP identifier_user pseudo_point_name filters <- maybe [] id <$> optional filter_spec let single = do _ <- char ':' *> ws x <- T.pack <$> manyTill anyChar eol -- consume and ignore any point_lines _ <- point_lines pure $! case j of Left j' -> mkSinglePointVal j' (transformPoint True filters x) Right j' -> PointPseudo j' -- is this allowed? multi = do ws0 *> eol x <- point_lines pure $! case j of Left j' -> PointStr j' (transformPoint False filters x) Right j' -> PointPseudo j' single <|> multi filter_spec = between (char '(') (char ')') $ many (oneOf ("<#+-~/@" :: [Char])) mkSinglePointVal k v | T.all C.isDigit v = PointInt k (read (T.unpack v)) | otherwise = PointStr k v point_lines :: Parser T.Text point_lines = T.pack . unlines <$> go where go = many (notFollowedBy point_boundary *> manyTill anyChar eol) point_boundary :: Parser () point_boundary = void (string "---") <|> void (string "===") <|> eof identifier_user :: Parser T.Text identifier_user = do x <- (:) <$> lowerChar <*> many alphaNumChar xs <- many ((:) <$> char '-' <*> some alphaNumChar) pure $! T.pack (concat (x:xs)) identifier_name :: Parser T.Text identifier_name = do x <- (:) <$> letterChar <*> many alphaNumChar xs <- many ((:) <$> char '-' <*> some alphaNumChar) pure $! T.pack (concat (x:xs)) ws :: Parser () ws = void $ takeWhile1P (Just "BLANK") (\c -> c == ' ' || c == '\t') ws0 :: Parser () ws0 = void $ takeWhileP (Just "BLANK") (\c -> c == ' ' || c == '\t') blank_line :: Parser () blank_line = (try (ws0 <* eol) <|> try (ws <* eof)) "blank-line" comment_line :: Parser () comment_line = (char '#' *> takeWhileP Nothing (/= '\n') *> void eol) "comment-line" comment_lines :: Parser () comment_lines = void (some (comment_line <|> blank_line)) stripTrailEols :: T.Text -> T.Text stripTrailEols = go where go t | T.isSuffixOf "\n\n" t = go (T.init t) | T.isSuffixOf "\r\n\r\n" t = go (T.init (T.init t)) | t == "\n" = "" | otherwise = t -- 'undent' stripPrefixInd :: T.Text -> T.Text stripPrefixInd = T.unlines . map go . T.lines where go t | T.isPrefixOf " " t = T.drop 4 t | T.isPrefixOf " " t = T.drop 3 t | T.isPrefixOf " " t = T.drop 2 t | T.isPrefixOf " " t = T.drop 1 t | otherwise = t stripComments :: T.Text -> T.Text stripComments = T.unlines . filter (not . T.isPrefixOf "#") . T.lines transformPoint :: Bool -> [Char] -> T.Text -> T.Text transformPoint single mods0 -- TODO: backslash = go mods0 . (if keepBlanks then id else stripTrailEols) . (if keepComments then id else stripComments) where keepBlanks = single || ('+' `elem` mods0) keepComments = single || ('#' `elem` mods0) go [] = id go ('<':xs) | single = error "invalid filter for point-single" | otherwise = go xs . stripPrefixInd go ('+':xs) = go xs -- negative flag go ('#':xs) = go xs -- negative flag go ('-':xs) = go xs . T.dropWhileEnd C.isSpace go (c:_) = error ("unknown filter " ++ show c)