HsYAML-0.2.1.4/ChangeLog.md

See also http://pvp.haskell.org/faq

### 0.2.1.4

_2024-04-25_

* Drop support for GHC 7
* Testsuite: relax lower bounds to accommodate LTS 11.22 (GHC 8.2) for the new Stack CI
* Tested with GHC 8.0 - 9.10.0 (alpha3)

### 0.2.1.3

_2023-10-14_

* Pacify `x-partial` warning of GHC 9.8
* Tested with GHC 7.10 - 9.8.1

### 0.2.1.2

_2023-09-29_

* Add `default-extensions: TypeOperators` to silence a warning under GHC ≥ 9.4
* Support latest versions of dependencies
* Tested with GHC 7.10 - 9.8.0

### 0.2.1.1

_2022-05-11_

* Compatibility with `mtl-2.3`
* Tested with GHC 7.4 - 9.2

### 0.2.1.0

* Define `Functor Doc` instance ([#33](https://github.com/haskell-hvr/HsYAML/issues/33))
* New `withScalar` function; also define `ToYAML Scalar` and `FromYAML Scalar` instances
* Export the `Pair` type synonym from `Data.YAML` ([#31](https://github.com/haskell-hvr/HsYAML/issues/31))
* New `Data.YAML.prettyPosWithSource` function for pretty-printing source locations (i.e. `Pos` values)
* Export the `docRoot :: Doc n -> n` field accessor for convenience ([#32](https://github.com/haskell-hvr/HsYAML/issues/32))

## 0.2.0.0

This release incorporates the work from [Vijay Tadikamalla's GSOC 2019 project](https://vijayphoenix.github.io/blog/gsoc-the-conclusion/). Highlights of this major release include support for emitting YAML as well as direct access to source locations throughout the parsing pipeline for improved error reporting.
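The location-aware error reporting highlighted above can be sketched as follows — a minimal example, assuming the 0.2-series signatures `decodeNode :: ByteString -> Either (Pos, String) [Doc (Node Pos)]` and `prettyPosWithSource :: Pos -> ByteString -> String -> String`; the malformed input is made up for illustration:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString.Lazy.Char8 as BL
import           Data.YAML

main :: IO ()
main = do
  let src = BL.pack "foo: 1\nbar: [1, 2\n"   -- deliberately malformed YAML
  case decodeNode src of
    -- On failure we get the position of the error, which can be
    -- rendered against the original source text.
    Left (pos, msg) -> putStrLn (prettyPosWithSource pos src " error" ++ msg)
    Right docs      -> mapM_ (print . docRoot) docs
```

Inside a `FromYAML` parser, `failAtNode` plays a similar role: unlike plain `fail`, it attaches the location of a specific node to the error.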
* Changes in the `Data.YAML` module
  * YAML 1.2 schema encoders ([#21](https://github.com/haskell-hvr/HsYAML/pull/21))
  * New `ToYAML` class for encoding Haskell data types from which YAML nodes can be constructed ([#20](https://github.com/haskell-hvr/HsYAML/pull/20))
  * New functions `encodeNode` and `encodeNode'` for constructing ASTs
  * New functions `encode`, `encode1`, `encodeStrict`, and `encode1Strict` supporting typeclass-based dumping
  * Various `ToYAML` instances and other API additions
  * `typeMismatch` now shows the source location in error messages ([#19](https://github.com/haskell-hvr/HsYAML/pull/19))
  * New location-aware `failAtNode` alternative to `fail`
* Changes in the `Data.YAML.Event` module
  * Preserve and round-trip comments at the event level ([#24](https://github.com/haskell-hvr/HsYAML/pull/24))
    * New `Comment` event to preserve comments while parsing
    * Additional machinery to preserve and round-trip comments
    * Fixes issue [#22](https://github.com/haskell-hvr/HsYAML/issues/22)
  * New `EvPos` type recording events together with their corresponding position ([#19](https://github.com/haskell-hvr/HsYAML/pull/19))
  * Preserve flow mappings and flow sequences ([#18](https://github.com/haskell-hvr/HsYAML/pull/18))
  * Features to preserve literal/folded scalar style ([#15](https://github.com/haskell-hvr/HsYAML/pull/15))
    * New `Chomp` type denoting the block chomping indicator
    * New `IndentOfs` type denoting the block indentation indicator
  * New `NodeStyle` type denoting flow/block style
    * The `Event(SequenceStart,MappingStart)` constructors now record a `NodeStyle`
    * The `Style` type was renamed to `ScalarStyle`
  * New `writeEvents` and `writeEventsText` functions
  * `Event(DocumentStart)` now records the YAML directive
  * The event parser now rejects duplicate/unsupported YAML/TAG directives, as mandated by the YAML 1.2 specification
* Move some schema-related definitions from `Data.YAML` into the new `Data.YAML.Schema` module
* Make `decode`, `decode1`, `decodeStrict`, `decode1Strict`,
  `decodeNode`, and `decodeNode'` treat duplicate keys (under the respective YAML schema) in YAML mappings as a loader error (controllable via the new `schemaResolverMappingDuplicates` schema property)
* Define `Generic` and `NFData` instances for most types
* Fix `X38W` testcase ([#13](https://github.com/haskell-hvr/HsYAML/issues/13), [#14](https://github.com/haskell-hvr/HsYAML/issues/14))

---

#### 0.1.1.3

* Fix bug in the float regexp being too lax in the JSON and Core schemas ([#7](https://github.com/hvr/HsYAML/issues/7))
* Remove dependency on `dlist`

#### 0.1.1.2

* Tolerate a BOM at *each* `l-document-prefix` (rather than only at the first one encountered in a YAML stream)
* Work around the broken `mtl-2.2.2` bundled with GHC 8.4.1 ([#1](https://github.com/hvr/HsYAML/issues/1))
* Relax license to GPL-2.0-or-later

#### 0.1.1.1

* Reject (illegal) non-scalar code points in UTF-32 streams
* Tolerate a BOM at the start of a stream
* Disambiguate the choice in the `l-any-document` production regarding token separation of `c-directives-end`
* Fix the `c-indentation-indicator(n)` grammar production when auto-detecting indentation in the presence of empty leading lines; also reject (illegal) auto-indent-level scalars with leading more-indented all-space lines
* Complete the character escape rules for double-quoted scalars
* Minor optimizations

### 0.1.1.0

* `Data.YAML` module promoted from `Trustworthy` to `Safe`
* Add `FromYAML Natural` instance
* Add `MonadFail`, `Alternative`, and `MonadPlus` instances for `Data.YAML.Parser`
* Add `Data.YAML.decodeStrict` function
* Export the `Data.YAML.typeMismatch` helper function

## 0.1.0.0

* First version. Released on an unsuspecting world.
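As a quick illustration of the typeclass-based API that the 0.2 series completes (`withMap`, `.:`, `mapping`, `.=`, `decode1`, and `encode1` are from `Data.YAML`; the `Person` type is a hypothetical example, not part of the package):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString.Lazy as BL
import           Data.Text (Text)
import           Data.YAML

-- Hypothetical record used to illustrate decoding and encoding.
data Person = Person { name :: Text, age :: Int } deriving (Eq, Show)

instance FromYAML Person where
  parseYAML = withMap "Person" $ \m ->
    Person <$> m .: "name"
           <*> m .: "age"

instance ToYAML Person where
  toYAML (Person n a) = mapping [ "name" .= n, "age" .= a ]

-- decode1 expects exactly one YAML document in the input;
-- encode1 emits exactly one.
roundTrip :: Person -> Either (Pos, String) Person
roundTrip = decode1 . encode1
```

On failure, `decode1` returns the error position as a `Pos`, which `prettyPosWithSource` can render against the input.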
HsYAML-0.2.1.4/HsYAML.cabal

cabal-version:       1.14
name:                HsYAML
version:             0.2.1.4

synopsis:            Pure Haskell YAML 1.2 processor
homepage:            https://github.com/haskell-hvr/HsYAML
bug-reports:         https://github.com/haskell-hvr/HsYAML/issues
license:             GPL-2
X-SPDX-License-Identifier: GPL-2.0-or-later
license-files:       LICENSE.GPLv2 LICENSE.GPLv3
author:              Herbert Valerio Riedel
maintainer:          https://github.com/haskell-hvr/HsYAML
copyright:           2015-2018 Herbert Valerio Riedel, 2007-2008 Oren Ben-Kiki
category:            Text
build-type:          Simple

tested-with:
  GHC == 9.10.0
  GHC == 9.8.2
  GHC == 9.6.5
  GHC == 9.4.8
  GHC == 9.2.8
  GHC == 9.0.2
  GHC == 8.10.7
  GHC == 8.8.4
  GHC == 8.6.5
  GHC == 8.4.4
  GHC == 8.2.2
  GHC == 8.0.2

description:
  @HsYAML@ is a [YAML 1.2](http://yaml.org/spec/1.2/spec.html) processor, i.e. a library for parsing and serializing YAML documents.
  .
  Features of @HsYAML@ include:
  .
  * Pure Haskell implementation with a small dependency footprint and an emphasis on strict compliance with the [YAML 1.2 specification](http://yaml.org/spec/1.2/spec.html).
  * Direct decoding to native Haskell types via an (@aeson@-inspired) typeclass-based API (see "Data.YAML").
  * Round-tripping that preserves ordering, anchors, and comments at the event level.
  * Support for constructing custom YAML node graph representations (including support for cyclic YAML data structures).
  * Support for the standard (untyped) /Failsafe/, (strict) /JSON/, and (flexible) /Core/ \"schemas\" providing implicit typing rules as defined in the YAML 1.2 specification (including support for user-defined custom schemas; see "Data.YAML.Schema").
  * Support for emitting YAML using the /Failsafe/, (strict) /JSON/, and (flexible) /Core/ \"schemas\" (including support for user-defined custom encoding schemas; see "Data.YAML.Schema").
  * Event-based API resembling LibYAML's event-based API (see "Data.YAML.Event").
  * Low-level API access to the lexical token-based scanner (see "Data.YAML.Token").
  .
  See also the companion package which allows decoding and encoding YAML by leveraging @aeson@'s 'FromJSON' and 'ToJSON' instances.

extra-source-files:  ChangeLog.md

source-repository head
  type:     git
  location: https://github.com/haskell-hvr/HsYAML.git

flag exe
  description: Enable @exe:yaml-test@ component
  manual:      True
  default:     False

library
  hs-source-dirs:      src
  exposed-modules:     Data.YAML
                     , Data.YAML.Schema
                     , Data.YAML.Event
                     , Data.YAML.Token
  other-modules:       Data.YAML.Loader
                     , Data.YAML.Dumper
                     , Data.YAML.Internal
                     , Data.YAML.Event.Internal
                     , Data.YAML.Event.Writer
                     , Data.YAML.Pos
                     , Data.YAML.Schema.Internal
                     , Data.YAML.Token.Encoding
                     , Util
                     , Data.DList

  default-language:    Haskell2010
  default-extensions:  TypeOperators
  other-extensions:    DeriveGeneric
                       FlexibleContexts
                       FlexibleInstances
                       FunctionalDependencies
                       MultiParamTypeClasses
                       OverloadedStrings
                       PostfixOperators
                       RecordWildCards
                       RecursiveDo
                       Safe
                       ScopedTypeVariables
                       Trustworthy
                       TypeSynonymInstances

  build-depends:       base         >= 4.9      && < 5
                     , bytestring   >= 0.9      && < 0.13
                     , containers   >= 0.4.2    && < 0.8
                     , deepseq      >= 1.3.0    && < 1.6
                     , text         >= 1.2.3    && < 2.2
                     , mtl          >= 2.2.1    && < 2.4
                     , parsec       >= 3.1.13.0 && < 3.2
                     , transformers >= 0.4      && < 0.7

  ghc-options:         -Wall -Wcompat

executable yaml-test
  hs-source-dirs:      src-test
  main-is:             Main.hs
  other-modules:       TML
  default-language:    Haskell2010

  if flag(exe)
    build-depends:     HsYAML
                       -- inherited constraints
                     , bytestring >= 0.10.8.0
                     , base
                     , text
                     , containers
                     , mtl
                       -- non-inherited
                     , megaparsec >= 7.0 && < 10
                     , microaeson == 0.1.*
                     , filepath   >= 1.4 && < 1.6
                     , directory  >= 1.2 && < 1.4
  else
    buildable: False

  ghc-options:         -rtsopts

test-suite tests
  default-language:    Haskell2010
  type:                exitcode-stdio-1.0
  hs-source-dirs:      tests
  main-is:             Tests.hs
  ghc-options:         -rtsopts

  build-depends:       HsYAML
                       -- inherited constraints
                     , bytestring >= 0.10.8.0
                     , base
                     , text
                     , containers
                     , mtl
                       -- non-inherited
                       -- lower bounds chosen from lts-11.22 (GHC 8.2)
                     , QuickCheck       >=
2.10.1  && < 2.16
                     , tasty            >= 1.0.1.1 && < 1.6
                     , tasty-quickcheck >= 0.9.2   && < 0.11

HsYAML-0.2.1.4/LICENSE.GPLv2

                    GNU GENERAL PUBLIC LICENSE
                       Version 2, June 1991

 Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.

                            Preamble

  The licenses for most software are designed to take away your freedom to share and change it.  By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users.  This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it.  (Some other Free Software Foundation software is covered by the GNU Lesser General Public License instead.)  You can apply it to your programs, too.

  When we speak of free software, we are referring to freedom, not price.  Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things.

  To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it.

  For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have.  You must make sure that they, too, receive or can get the source code.  And you must show them these terms so they know their rights.
  We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software.

  Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software.  If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations.

  Finally, any free program is threatened constantly by software patents.  We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary.  To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all.

  The precise terms and conditions for copying, distribution and modification follow.

                    GNU GENERAL PUBLIC LICENSE
   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

  0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License.  The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language.  (Hereinafter, translation is included without limitation in the term "modification".)  Each licensee is addressed as "you".

Activities other than copying, distribution and modification are not covered by this License; they are outside its scope.
  The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does.

  1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program.

You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee.

  2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions:

    a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change.

    b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License.

    c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License.
    (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.)

These requirements apply to the modified work as a whole.  If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works.  But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it.

Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program.

In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License.

  3.
You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following:

    a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or,

    b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or,

    c) Accompany it with the information you received as to the offer to distribute corresponding source code.  (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.)

The source code for a work means the preferred form of the work for making modifications to it.  For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable.  However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.
If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code.

  4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License.  Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.

  5. You are not required to accept this License, since you have not signed it.  However, nothing else grants you permission to modify or distribute the Program or its derivative works.  These actions are prohibited by law if you do not accept this License.  Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it.

  6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions.  You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License.

  7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License.
If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all.  For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program.

If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances.

It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices.  Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice.

This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License.

  8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded.  In such case, this License incorporates the limitation as if written in the body of this License.

  9.
The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time.  Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.

Each version is given a distinguishing version number.  If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation.  If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation.

  10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission.  For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this.  Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally.

                            NO WARRANTY

  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

  12.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

                     END OF TERMS AND CONDITIONS

            How to Apply These Terms to Your New Programs

  If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.

  To do so, attach the following notices to the program.  It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found.

    <one line to give the program's name and a brief idea of what it does.>
    Copyright (C) <year>  <name of author>

    This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

    This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.

    You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this when it starts in an interactive mode:

    Gnomovision version 69, Copyright (C) year name of author
    Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
    This is free software, and you are welcome to redistribute it
    under certain conditions; type `show c' for details.

The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License.  Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program.

You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary.  Here is a sample; alter the names:

  Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker.

  <signature of Ty Coon>, 1 April 1989
  Ty Coon, President of Vice

This General Public License does not permit incorporating your program into proprietary programs.  If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library.  If this is what you want to do, use the GNU Lesser General Public License instead of this License.

HsYAML-0.2.1.4/LICENSE.GPLv3

                    GNU GENERAL PUBLIC LICENSE
                       Version 3, 29 June 2007

 Copyright (C) 2007 Free Software Foundation, Inc.
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.

                            Preamble

  The GNU General Public License is a free, copyleft license for software and other kinds of works.

  The licenses for most software and other practical works are designed to take away your freedom to share and change the works.
By contrast, the GNU General Public License is intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users.  We, the Free Software Foundation, use the GNU General Public License for most of our software; it applies also to any other work released this way by its authors.  You can apply it to your programs, too.

  When we speak of free software, we are referring to freedom, not price.  Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things.

  To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights.  Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others.

  For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received.  You must make sure that they, too, receive or can get the source code.  And you must show them these terms so they know their rights.

  Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it.

  For the developers' and authors' protection, the GPL clearly explains that there is no warranty for this free software.  For both users' and authors' sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions.
  Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so.  This is fundamentally incompatible with the aim of protecting users' freedom to change the software.  The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable.  Therefore, we have designed this version of the GPL to prohibit the practice for those products.  If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users.

  Finally, every program is threatened constantly by software patents. States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary.  To prevent this, the GPL assures that patents cannot be used to render the program non-free.

  The precise terms and conditions for copying, distribution and modification follow.

                       TERMS AND CONDITIONS

  0. Definitions.

  "This License" refers to version 3 of the GNU General Public License.

  "Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks.

  "The Program" refers to any copyrightable work licensed under this License.  Each licensee is addressed as "you".  "Licensees" and "recipients" may be individuals or organizations.

  To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy.  The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work.

  A "covered work" means either the unmodified Program or a work based on the Program.
  To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy.  Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well.

  To "convey" a work means any kind of propagation that enables other parties to make or receive copies.  Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying.

  An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License.  If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion.

  1. Source Code.

  The "source code" for a work means the preferred form of the work for making modifications to it.  "Object code" means any non-source form of a work.

  A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language.

  The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form.
A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it. The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work. The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source. The Corresponding Source for a work in source code form is that same work. 2. Basic Permissions. All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law. You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. 
You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you. Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary. 3. Protecting Users' Legal Rights From Anti-Circumvention Law. No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures. When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures. 4. Conveying Verbatim Copies. You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program. 
You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee. 5. Conveying Modified Source Versions. You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions: a) The work must carry prominent notices stating that you modified it, and giving a relevant date. b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices". c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it. d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so. A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate. 6. Conveying Non-Source Forms. 
You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways: a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange. b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge. c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b. d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. 
Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements. e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d. A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work. A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product. "Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made. 
If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM). The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network. Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying. 7. Additional Terms. "Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions. 
When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission. Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms: a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or d) Limiting the use for publicity purposes of names of licensors or authors of the material; or e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors. All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. 
If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying. If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms. Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way. 8. Termination. You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11). However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation. Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice. Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. 
If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10. 9. Acceptance Not Required for Having Copies. You are not required to accept this License in order to receive or run a copy of the Program. Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so. 10. Automatic Licensing of Downstream Recipients. Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License. An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts. You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. 
For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it. 11. Patents. A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. The work thus licensed is called the contributor's "contributor version". A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License. Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version. In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party. 
If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. "Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid. If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it. A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. 
You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007. Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law. 12. No Surrender of Others' Freedom. If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program. 13. Use with the GNU Affero General Public License. Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. 
The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such. 14. Revised Versions of this License. The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation. If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program. Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version. 15. Disclaimer of Warranty. THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. 
SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 16. Limitation of Liability. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 17. Interpretation of Sections 15 and 16. If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. 
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see . Also add information on how to contact you by electronic and paper mail. If the program does terminal interaction, make it output a short notice like this when it starts in an interactive mode: Copyright (C) This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, your program's commands might be different; for a GUI interface, you would use an "about box". You should also get your employer (if you work as a programmer) or school, if any, to sign a "copyright disclaimer" for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see . The GNU General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. But first, please read . 
HsYAML-0.2.1.4/Setup.hs0000644000000000000000000000005607346545000012537 0ustar0000000000000000import Distribution.Simple main = defaultMain HsYAML-0.2.1.4/src-test/0000755000000000000000000000000007346545000012646 5ustar0000000000000000HsYAML-0.2.1.4/src-test/Main.hs0000644000000000000000000007255407346545000014103 0ustar0000000000000000{-# LANGUAGE LambdaCase #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE RecordWildCards #-} -- | -- Copyright: © Herbert Valerio Riedel 2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- module Main where import Control.Monad import Control.Monad.Identity import qualified Data.ByteString.Char8 as BS import qualified Data.ByteString.Lazy.Char8 as BS.L import Data.Int (Int64) import Data.List (groupBy) import Data.Maybe import System.Directory import System.Environment import System.Exit import System.FilePath import System.IO import Text.Printf (printf) import Text.Read import qualified Data.Aeson.Micro as J import Data.Map (Map) import qualified Data.Map as Map import Data.Text (Text) import qualified Data.Text as T import qualified Data.Text.Encoding as T import qualified Data.Text.IO as T import Data.YAML as Y import Data.YAML.Event as YE import Data.YAML.Schema as Y import qualified Data.YAML.Token as YT import qualified TML main :: IO () main = do args <- getArgs case args of ("yaml2event":args') | null args' -> cmdYaml2Event | otherwise -> do hPutStrLn stderr "unexpected arguments passed to yaml2event sub-command" exitFailure ("yaml2event-pos":args') | null args' -> cmdYaml2EventPos | otherwise -> do hPutStrLn stderr "unexpected arguments passed to yaml2event sub-command" exitFailure ("yaml2yaml-validate":args') | null args' -> cmdYaml2YamlVal | otherwise -> do hPutStrLn stderr "unexpected arguments passed to check sub-command" exitFailure ("yaml2event0":args') | null args' -> cmdYaml2Event0 | otherwise -> do hPutStrLn stderr "unexpected arguments passed to yaml2event0 sub-command" exitFailure ("yaml2token":args') | null 
args' -> cmdYaml2Token | otherwise -> do hPutStrLn stderr "unexpected arguments passed to yaml2token sub-command" exitFailure ("yaml2token0":args') | null args' -> cmdYaml2Token0 | otherwise -> do hPutStrLn stderr "unexpected arguments passed to yaml2token0 sub-command" exitFailure ("yaml2json":args') | null args' -> cmdYaml2Json | otherwise -> do hPutStrLn stderr "unexpected arguments passed to yaml2json sub-command" exitFailure ("yaml2yaml":args') | null args' -> cmdYaml2Yaml | otherwise -> do hPutStrLn stderr "unexpected arguments passed to yaml2yaml sub-command" exitFailure ("yaml2yaml-":args') | null args' -> cmdYaml2Yaml' | otherwise -> do hPutStrLn stderr "unexpected arguments passed to yaml2yaml- sub-command" exitFailure ("yaml2yaml-dump": args') | null args' -> cmdDumpYAML | otherwise -> do hPutStrLn stderr "unexpected arguments passed to yaml2event sub-command" exitFailure ("yaml2node":args') | null args' -> cmdPrintNode | otherwise -> do hPutStrLn stderr "unexpected arguments passed to yaml2event sub-command" exitFailure ("run-tml":args') -> cmdRunTml args' ("run-tml2":args') -> cmdRunTml' args' -- Temp function for check comment round-trip ("testml-compiler":args') -> cmdTestmlCompiler args' _ -> do hPutStrLn stderr "usage: yaml-test []" hPutStrLn stderr "" hPutStrLn stderr "Commands:" hPutStrLn stderr "" hPutStrLn stderr " yaml2token reads YAML stream from STDIN and dumps tokens to STDOUT" hPutStrLn stderr " yaml2token0 reads YAML stream from STDIN and prints count of tokens to STDOUT" hPutStrLn stderr " yaml2event reads YAML stream from STDIN and dumps events to STDOUT" hPutStrLn stderr " yaml2event0 reads YAML stream from STDIN and prints count of events to STDOUT" hPutStrLn stderr " yaml2event-pos reads YAML stream from STDIN and dumps events & position to STDOUT" hPutStrLn stderr " yaml2json reads YAML stream from STDIN and dumps JSON to STDOUT" hPutStrLn stderr " yaml2yaml reads YAML stream from STDIN and dumps YAML to STDOUT (non-streaming 
version)" hPutStrLn stderr " yaml2yaml- reads YAML stream from STDIN and dumps YAML to STDOUT (streaming version)" hPutStrLn stderr " yaml2yaml-validate reads YAML stream from STDIN and dumps YAML to STDOUT and also outputs the no. of differences and differences after a round-trip" hPutStrLn stderr " yaml2node reads YAML stream from STDIN and dumps YAML Nodes to STDOUT" hPutStrLn stderr " yaml2yaml-dump reads YAML stream from STDIN and dumps YAML to STDOUT after a complete round-trip" hPutStrLn stderr " run-tml run/validate YAML-specific .tml file(s)" hPutStrLn stderr " run-tml2 run/validate YAML-specific .tml file(s) while preserving comments" hPutStrLn stderr " testml-compiler emulate testml-compiler" exitFailure cmdYaml2Token :: IO () cmdYaml2Token = do inYamlDat <- BS.L.getContents forM_ (groupBy (\x y -> YT.tLine x == YT.tLine y) $ YT.tokenize inYamlDat False) $ \lgrp -> do forM_ lgrp $ \YT.Token{..} -> do let tText' | null tText = "" | any (== ' ') tText = replicate tLineChar ' ' ++ show tText | otherwise = replicate (tLineChar+1) ' ' ++ drop 1 (init (show tText)) hPutStrLn stdout $ printf ":%d:%d: %-15s| %s" tLine tLineChar (show tCode) tText' hPutStrLn stdout "" hFlush stdout cmdYaml2Token0 :: IO () cmdYaml2Token0 = do inYamlDat <- BS.L.getContents print (length (YT.tokenize inYamlDat False)) cmdYaml2Yaml :: IO () cmdYaml2Yaml = do inYamlDat <- BS.L.getContents case sequence $ parseEvents inYamlDat of Left (ofs,msg) -> do hPutStrLn stderr ("Parsing error near byte offset " ++ show ofs ++ if null msg then "" else " (" ++ msg ++ ")") exitFailure Right events -> do BS.L.hPutStr stdout (writeEvents YT.UTF8 (map eEvent events)) hFlush stdout -- lazy streaming version cmdYaml2Yaml' :: IO () cmdYaml2Yaml' = do inYamlDat <- BS.L.getContents BS.L.hPutStr stdout $ writeEvents YT.UTF8 $ parseEvents' inYamlDat hFlush stdout where parseEvents' = map (either (\(ofs,msg) -> error ("parsing error near byte offset " ++ show ofs ++ " (" ++ msg ++ ")")) (\evPos -> eEvent 
evPos)). filter (not. isComment). parseEvents cmdYaml2Event :: IO () cmdYaml2Event = do inYamlDat <- BS.L.getContents forM_ (parseEvents inYamlDat) $ \ev -> case ev of Left (ofs,msg) -> do hPutStrLn stderr ("Parsing error near byte offset " ++ show ofs ++ if null msg then "" else " (" ++ msg ++ ")") exitFailure Right event -> do hPutStrLn stdout (ev2str True (eEvent event)) hFlush stdout cmdYaml2EventPos :: IO () cmdYaml2EventPos = do inYamlDat <- BS.L.getContents let inYamlDatTxt = T.decodeUtf8 (BS.L.toStrict inYamlDat) inYamlDatLns = T.lines inYamlDatTxt maxLine = length inYamlDatLns forM_ (parseEvents inYamlDat) $ \ev -> case ev of Left (ofs,msg) -> do hPutStrLn stderr (prettyPosWithSource ofs inYamlDat (" error [" ++ show ofs ++ "]") ++ msg) exitFailure Right event -> do let Pos{..} = ePos event putStrLn (prettyPosWithSource (ePos event) inYamlDat ("\t" ++ ev2str True (eEvent event))) cmdYaml2Event0 :: IO () cmdYaml2Event0 = do inYamlDat <- BS.L.getContents print (length (parseEvents' inYamlDat)) where parseEvents' = map (either (\(ofs,msg) -> error ("parsing error near byte offset " ++ show ofs ++ " (" ++ msg ++ ")")) id) . 
parseEvents cmdYaml2YamlVal :: IO() cmdYaml2YamlVal = do inYamlDat <- BS.L.getContents case sequence $ parseEvents inYamlDat of Left (ofs,msg) -> do hPutStrLn stderr ("Parsing error near byte offset " ++ show ofs ++ if null msg then "" else " (" ++ msg ++ ")") exitFailure Right oldEvents -> do let output = writeEvents YT.UTF8 (map eEvent oldEvents) BS.L.hPutStr stdout output hFlush stdout case sequence (parseEvents output) of Left (ofs',msg') -> do hPutStrLn stderr ("Parsing error in the generated YAML stream near byte offset " ++ show ofs' ++ if null msg' then "" else " (" ++ msg' ++ ")") exitFailure Right newEvents -> do hPutStrLn stdout $ printf "\nInput Event Stream Length: %d\nOutput Event Stream Length: %d\n" (length oldEvents) (length newEvents) let diffList = filter (uncurry (/=)) $ zipWith (\a b -> (eEvent a, eEvent b)) oldEvents newEvents hPutStrLn stdout $ printf "No of difference detected: %d\n" $ length diffList forM_ diffList $ \(old,new) -> do hPutStrLn stdout $ "Input > " ++ show old hPutStrLn stdout $ "Output < " ++ show new cmdPrintNode :: IO() cmdPrintNode = do str <- BS.L.getContents case decode str :: Either (Pos, String) [Node Pos] of Left (pos, s) -> do hPutStrLn stdout s hFlush stdout Right nodeSeq -> forM_ nodeSeq $ \node -> do printNode node putStrLn "" cmdDumpYAML :: IO() cmdDumpYAML = do str <- BS.L.getContents case decode str :: Either (Pos, String) [Node Pos] of Left (pos, str) -> do hPutStrLn stdout str hFlush stdout Right nodes -> do BS.L.hPutStrLn stdout $ encode nodes hFlush stdout -- | 'J.Value' look-alike data Value' = Object' (Map Text Value') | Array' [Value'] | String' !Text | NumberD' !Double | NumberI' !Integer | Bool' !Bool | Null' deriving Show toProperValue :: Value' -> J.Value toProperValue v = case v of Null' -> J.Null String' t -> J.String t NumberD' x -> J.Number x NumberI' x -> J.Number (fromInteger x) Bool' b -> J.Bool b Array' xs -> J.Array (map toProperValue xs) Object' xs -> J.Object (fmap toProperValue xs) 
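-- The 'Value'' type above mirrors a JSON value while keeping integers and
-- floats distinct, and 'toProperValue' only widens integers to 'Double' at
-- the JSON boundary. Below is a hedged, self-contained sketch of that
-- conversion pattern; 'JValue' and 'V' are illustrative stand-ins invented
-- for this example, not the actual aeson-micro or HsYAML types.

```haskell
-- Minimal stand-in for a JSON AST (an assumption, not the real
-- Data.Aeson.Micro type).
data JValue = JNull | JBool Bool | JNum Double | JArr [JValue]
  deriving (Eq, Show)

-- Minimal stand-in for Value': integers stay exact until conversion.
data V = VNull | VBool Bool | VInt Integer | VArr [V]

-- Convert to the JSON-like AST; integers widen to Double here,
-- mirroring the NumberI' case of toProperValue.
toJ :: V -> JValue
toJ VNull     = JNull
toJ (VBool b) = JBool b
toJ (VInt i)  = JNum (fromInteger i)
toJ (VArr xs) = JArr (map toJ xs)

main :: IO ()
main = print (toJ (VArr [VInt 1, VBool True, VNull]))
```

Keeping the integer/float distinction in the intermediate type is what lets the YAML side round-trip exact integers even though JSON numbers are doubles.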
instance FromYAML Value' where parseYAML (Y.Scalar _ s) = case s of SNull -> pure Null' SBool b -> pure (Bool' b) SFloat x -> pure (NumberD' x) SInt x -> pure (NumberI' x) SStr t -> pure (String' t) SUnknown _ t -> pure (String' t) -- HACK parseYAML (Y.Sequence _ _ xs) = Array' <$> mapM parseYAML xs parseYAML (Y.Mapping _ _ m) = Object' . Map.fromList <$> mapM parseKV (Map.toList m) where parseKV :: (Y.Node Pos,Y.Node Pos) -> Parser (Text,Value') parseKV (k,v) = (,) <$> parseK k <*> parseYAML v -- for numbers and !!null we apply implicit conversions parseK n = do k <- parseYAML n case k of NumberI' t -> pure (T.pack (show t)) NumberD' t -> pure (T.pack (show t)) String' t -> pure t Null' -> pure "" -- we stringify the key with an added risk of nameclashing _ -> pure $ T.decodeUtf8 $ J.encodeStrict $ toProperValue k -- _ -> fail ("dictionary entry had non-string key " ++ show k) decodeAeson :: BS.L.ByteString -> Either (Pos,String) [J.Value] decodeAeson = fmap (map toProperValue) . decode' where -- TODO decode' :: FromYAML v => BS.L.ByteString -> Either (Pos,String) [v] decode' bs0 = case decodeNode' coreSchemaResolver { schemaResolverMappingDuplicates = True } False False bs0 of Left (pos, err) -> Left (pos, err) Right a -> Right a >>= mapM (parseEither . parseYAML . (\(Doc x) -> x)) -- | Try to convert 'Double' into 'Int64', return 'Nothing' if not -- representable loss-free as integral 'Int64' value. doubleToInt64 :: Double -> Maybe Int64 doubleToInt64 x | fromInteger x' == x , x' <= toInteger (maxBound :: Int64) , x' >= toInteger (minBound :: Int64) = Just (fromIntegral x') | otherwise = Nothing where x' = round x decodeNumber :: T.Text -> Maybe Double decodeNumber = readMaybe . T.unpack -- fixme cmdYaml2Json :: IO () cmdYaml2Json = do inYamlDat <- BS.L.getContents case decodeAeson inYamlDat of Left (_, e) -> fail e Right vs -> do forM_ vs $ \v -> BS.L.putStrLn (J.encode v) return () unescapeSpcTab :: T.Text -> T.Text unescapeSpcTab = T.replace "" " " . 
T.replace "" "\t" data TestPass = PassExpErr -- ^ expected parse fail | PassEvs -- ^ events ok | PassEvsJson -- ^ events+json ok | PassEvsJsonYaml deriving (Eq,Ord,Show) data TestFail = FailParse -- ^ unexpected parse fail | FailSuccess -- ^ unexpected parse success | FailEvs -- ^ events wrong/mismatched | FailJson -- ^ JSON wrong/mismatched | FailYaml -- ^ YAML wrong/mismatched deriving (Eq,Ord,Show) data TestRes = Pass !TestPass | Fail !TestFail deriving (Eq,Ord,Show) cmdRunTml :: [FilePath] -> IO () cmdRunTml args = do results <- forM args $ \fn -> do tml <- BS.readFile fn hPutStr stdout (fn ++ " : ") hFlush stdout TML.Document _ blocks <- either (fail . T.unpack) pure $ TML.parse fn (T.decodeUtf8 tml) forM blocks $ \(TML.Block label points) -> do let dats = [ (k,v) | TML.PointStr k v <- points ] let isErr = isJust (lookup "error" dats) Just inYamlDat = BS.L.fromStrict . T.encodeUtf8 . unescapeSpcTab <$> lookup "in-yaml" dats Just testEvDat = lines . T.unpack . unescapeSpcTab <$> lookup "test-event" dats mInJsonDat :: Maybe [J.Value] mInJsonDat = (maybe (error ("invalid JSON in " ++ show fn)) id . J.decodeStrictN . T.encodeUtf8) <$> lookup "in-json" dats mOutYamlDat = BS.L.fromStrict . T.encodeUtf8 . unescapeSpcTab <$> lookup "out-yaml" dats case sequence $ filter (not. isComment) (parseEvents inYamlDat) of Left err | isErr -> do putStrLn "OK! (error)" pure (Pass PassExpErr) | otherwise -> do putStrLn "FAIL!" putStrLn "" putStrLn "----------------------------------------------------------------------------" putStrLn' (T.unpack label) putStrLn "" putStrLn' (show err) putStrLn "" putStrLn' (show testEvDat) putStrLn "" BS.L.putStr inYamlDat putStrLn "" testParse inYamlDat putStrLn "" -- forM_ (parseEvents inYamlDat) (putStrLn' . 
show) putStrLn "" putStrLn "----------------------------------------------------------------------------" putStrLn "" pure (Fail FailParse) Right evs' -> do let events = map eEvent evs' evs'' = map (ev2str False) events if evs'' == testEvDat then do let outYamlDatIut = writeEvents YT.UTF8 (map toBlockStyle events) where toBlockStyle ev = case ev of SequenceStart a b _ -> SequenceStart a b Block MappingStart a b _ -> MappingStart a b Block otherwise -> ev Right ev = sequence $ filter (not. isComment) (parseEvents outYamlDatIut) outYamlEvsIut = either (const []) (map (ev2str False)) (Right (map eEvent ev)) unless (outYamlEvsIut == evs'') $ do putStrLn' ("\nWARNING: (iut /= ref)") putStrLn' ("iut[yaml] = " ++ show outYamlDatIut) putStrLn' ("ref[raw-evs] = " ++ show evs') putStrLn' ("ref[evs] = " ++ show evs'') putStrLn' ("iut[evs] = " ++ show outYamlEvsIut) putStrLn "" case mInJsonDat of Nothing -> do putStrLn "OK!" pure (Pass PassEvs) Just inJsonDat -> do iutJson <- either (fail. snd) pure $ decodeAeson inYamlDat if iutJson == inJsonDat then do case mOutYamlDat of Nothing -> do putStrLn "OK! (+JSON)" pure (Pass PassEvsJson) Just outYamlDat -> do case () of _ | outYamlDat == outYamlDatIut -> do putStrLn "OK! (+JSON+YAML)" pure (Pass PassEvsJsonYaml) | otherwise -> do putStrLn $ if outYamlEvsIut == evs'' then "OK (+JSON-YAML)" else "FAIL! (bad out-YAML)" putStrLn' ("ref = " ++ show outYamlDat) putStrLn' ("iut = " ++ show outYamlDatIut) putStrLn "" putStrLn' ("ref = " ++ show evs'') putStrLn' ("iut = " ++ show outYamlEvsIut) case outYamlEvsIut == evs'' of True -> do putStrLn' ("(iut == ref)") pure (Pass PassEvsJson) False -> pure (Fail FailYaml) else do putStrLn "FAIL! (bad JSON)" putStrLn' ("ref = " ++ show inJsonDat) putStrLn' ("iut = " ++ show iutJson) pure (Fail FailJson) else do if isErr then putStrLn "FAIL! (unexpected parser success)" else putStrLn "FAIL!" 
putStrLn "" putStrLn "----------------------------------------------------------------------------" putStrLn' (T.unpack label) putStrLn "" putStrLn' ("ref = " ++ show testEvDat) putStrLn' ("iut = " ++ show evs'') putStrLn "" BS.L.putStr inYamlDat putStrLn "" testParse inYamlDat putStrLn "" -- forM_ (parseEvents inYamlDat) (putStrLn' . show) putStrLn "" putStrLn "----------------------------------------------------------------------------" putStrLn "" pure (Fail (if isErr then FailSuccess else FailEvs)) putStrLn "" let ok = length [ () | Pass _ <- results' ] nok = length [ () | Fail _ <- results' ] stat j = show $ Map.findWithDefault 0 j $ Map.fromListWith (+) [ (k,1::Int) | k <- results' ] results' = concat results putStrLn $ concat [ "done -- passed: ", show ok , " (ev: ", stat (Pass PassEvs), ", ev+json: ", stat (Pass PassEvsJson), ", ev+json+yaml: ", stat (Pass PassEvsJsonYaml), ", err: ", stat (Pass PassExpErr), ") / " , "failed: ", show nok , " (err: ", stat (Fail FailParse), ", ev:", stat (Fail FailEvs), ", json:", stat (Fail FailJson), ", yaml:", stat (Fail FailYaml), ", ok:", stat (Fail FailSuccess), ")" ] cmdRunTml' :: [FilePath] -> IO () cmdRunTml' args = do results <- forM args $ \fn -> do tml <- BS.readFile fn hPutStr stdout (fn ++ " : ") hFlush stdout TML.Document _ blocks <- either (fail . T.unpack) pure $ TML.parse fn (T.decodeUtf8 tml) forM blocks $ \(TML.Block label points) -> do let dats = [ (k,v) | TML.PointStr k v <- points ] let isErr = isJust (lookup "error" dats) Just inYamlDat = BS.L.fromStrict . T.encodeUtf8 . unescapeSpcTab <$> lookup "in-yaml" dats Just testEvDat = lines . T.unpack . unescapeSpcTab <$> lookup "test-event" dats mInJsonDat :: Maybe [J.Value] mInJsonDat = (maybe (error ("invalid JSON in " ++ show fn)) id . J.decodeStrictN . T.encodeUtf8) <$> lookup "in-json" dats mOutYamlDat = BS.L.fromStrict . T.encodeUtf8 . 
unescapeSpcTab <$> lookup "out-yaml" dats case sequence $ parseEvents inYamlDat of -- allow parsing with comments Left err | isErr -> do putStrLn "OK! (error)" pure (Pass PassExpErr) | otherwise -> do putStrLn "FAIL!" putStrLn "" putStrLn "----------------------------------------------------------------------------" putStrLn' (T.unpack label) putStrLn "" putStrLn' (show err) putStrLn "" putStrLn' (show testEvDat) putStrLn "" BS.L.putStr inYamlDat putStrLn "" testParse inYamlDat putStrLn "" -- forM_ (parseEvents inYamlDat) (putStrLn' . show) putStrLn "" putStrLn "----------------------------------------------------------------------------" putStrLn "" pure (Fail FailParse) Right evs' -> do let events = map eEvent (filter (not. isComment'. eEvent) evs') -- filter comments before comparing evs'' = map (ev2str False) events if evs'' == testEvDat then do let outYamlDatIut = writeEvents YT.UTF8 (map eEvent evs') -- Allow both block and flow style -- let outYamlDatIut = writeEvents YT.UTF8 (map (toBlockStyle. eEvent) evs') -- Allow only Block style -- where toBlockStyle ev = case ev of -- SequenceStart a b _ -> SequenceStart a b Block -- MappingStart a b _ -> MappingStart a b Block -- otherwise -> ev Right ev = sequence $ parseEvents outYamlDatIut outYamlEvsIut = either (const []) (map (ev2str False)) (Right (map eEvent (filter (not. isComment'. eEvent) ev))) unless (outYamlEvsIut == evs'') $ do putStrLn' ("\nWARNING: (iut /= ref)") putStrLn' ("iut[yaml] = " ++ show outYamlDatIut) putStrLn' ("ref[raw-evs] = " ++ show evs') putStrLn' ("ref[evs] = " ++ show evs'') putStrLn' ("iut[evs] = " ++ show outYamlEvsIut) putStrLn "" case mInJsonDat of Nothing -> do putStrLn "OK!" pure (Pass PassEvs) Just inJsonDat -> do iutJson <- either (fail. snd) pure $ decodeAeson inYamlDat if iutJson == inJsonDat then do case mOutYamlDat of Nothing -> do putStrLn "OK! (+JSON)" pure (Pass PassEvsJson) Just outYamlDat -> do case () of _ | outYamlDat == outYamlDatIut -> do putStrLn "OK! 
(+JSON+YAML)" pure (Pass PassEvsJsonYaml) | otherwise -> do putStrLn $ if outYamlEvsIut == evs'' then "OK (+JSON-YAML)" else "FAIL! (bad out-YAML)" putStrLn' ("ref = " ++ show outYamlDat) putStrLn' ("iut = " ++ show outYamlDatIut) putStrLn "" putStrLn' ("ref = " ++ show evs'') putStrLn' ("iut = " ++ show outYamlEvsIut) case outYamlEvsIut == evs'' of True -> do putStrLn' ("(iut == ref)") pure (Pass PassEvsJson) False -> pure (Fail FailYaml) else do putStrLn "FAIL! (bad JSON)" putStrLn' ("ref = " ++ show inJsonDat) putStrLn' ("iut = " ++ show iutJson) pure (Fail FailJson) else do if isErr then putStrLn "FAIL! (unexpected parser success)" else putStrLn "FAIL!" putStrLn "" putStrLn "----------------------------------------------------------------------------" putStrLn' (T.unpack label) putStrLn "" putStrLn' ("ref = " ++ show testEvDat) putStrLn' ("iut = " ++ show evs'') putStrLn "" BS.L.putStr inYamlDat putStrLn "" testParse inYamlDat putStrLn "" -- forM_ (parseEvents inYamlDat) (putStrLn' . 
show) putStrLn "" putStrLn "----------------------------------------------------------------------------" putStrLn "" pure (Fail (if isErr then FailSuccess else FailEvs)) putStrLn "" let ok = length [ () | Pass _ <- results' ] nok = length [ () | Fail _ <- results' ] stat j = show $ Map.findWithDefault 0 j $ Map.fromListWith (+) [ (k,1::Int) | k <- results' ] results' = concat results putStrLn $ concat [ "done -- passed: ", show ok , " (ev: ", stat (Pass PassEvs), ", ev+json: ", stat (Pass PassEvsJson), ", ev+json+yaml: ", stat (Pass PassEvsJsonYaml), ", err: ", stat (Pass PassExpErr), ") / " , "failed: ", show nok , " (err: ", stat (Fail FailParse), ", ev:", stat (Fail FailEvs), ", json:", stat (Fail FailJson), ", yaml:", stat (Fail FailYaml), ", ok:", stat (Fail FailSuccess), ")" ] -- | Incomplete proof-of-concept 'testml-compiler' operation cmdTestmlCompiler :: [FilePath] -> IO () cmdTestmlCompiler [fn0] = do (fn,raw) <- case fn0 of "-" -> (,) "<stdin>" <$> T.getContents _ -> (,) fn0 <$> T.readFile fn0 case TML.parse fn raw of Left e -> T.hPutStrLn stderr e >> exitFailure Right doc -> BS.putStrLn (J.encodeStrict doc) cmdTestmlCompiler _ = do hPutStrLn stderr ("Usage: yaml-test testml-compiler [ <file> | - ]") exitFailure putStrLn' :: String -> IO () putStrLn' msg = putStrLn (" " ++ msg) printNode :: Node loc -> IO () printNode node = case node of (Y.Scalar _ a) -> hPutStrLn stdout $ "Scalar " ++ show a (Y.Mapping _ a b) -> do hPutStrLn stdout $ "Mapping " ++ show a printMap b (Y.Sequence _ a b) -> do hPutStrLn stdout $ "Sequence " ++ show a mapM_ printNode b (Y.Anchor _ a b) -> do hPutStr stdout $ "Anchor " ++ show a ++ " " printNode b printMap :: Map (Node loc) (Node loc) -> IO () printMap b = forM_ (Map.toList b) $ \(k,v) -> do hPutStr stdout "Key: " printNode k hPutStr stdout "Value: " printNode v isComment evPos = case evPos of Right (YE.EvPos {eEvent = (YE.Comment _), ePos = _}) -> True _ -> False isComment' ev = case ev of (Comment _) -> True _ -> False ev2str :: Bool 
-> Event -> String ev2str withColSty = \case StreamStart -> "+STR" DocumentStart NoDirEndMarker-> "+DOC" DocumentStart _ -> "+DOC ---" MappingEnd -> "-MAP" (MappingStart manc mtag Flow) | withColSty -> "+MAP {}" ++ ancTagStr manc mtag (MappingStart manc mtag _) -> "+MAP" ++ ancTagStr manc mtag SequenceEnd -> "-SEQ" (SequenceStart manc mtag Flow) | withColSty -> "+SEQ []" ++ ancTagStr manc mtag SequenceStart manc mtag _ -> "+SEQ" ++ ancTagStr manc mtag DocumentEnd True -> "-DOC ..." DocumentEnd False -> "-DOC" StreamEnd -> "-STR" Alias a -> "=ALI *" ++ T.unpack a YE.Scalar manc mtag sty v -> "=VAL" ++ ancTagStr manc mtag ++ styStr sty ++ quote2 v Comment comment -> "=COMMENT "++ quote2 comment where styStr = \case Plain -> " :" DoubleQuoted -> " \"" Literal _ _ -> " |" Folded _ _ -> " >" SingleQuoted -> " '" ancTagStr manc mtag = anc' ++ tag' where anc' = case manc of Nothing -> "" Just anc -> " &" ++ T.unpack anc tag' = case tagToText mtag of Nothing -> "" Just t -> " <" ++ T.unpack t ++ ">" quote2 :: T.Text -> String quote2 = concatMap go . T.unpack where go c | c == '\n' = "\\n" | c == '\t' = "\\t" | c == '\b' = "\\b" | c == '\r' = "\\r" | c == '\\' = "\\\\" | otherwise = [c] testParse :: BS.L.ByteString -> IO () testParse bs0 = mapM_ (putStrLn' . showT) $ YT.tokenize bs0 False where showT :: YT.Token -> String showT t = replicate (YT.tLineChar t) ' ' ++ show (YT.tText t) ++ " " ++ show (YT.tCode t) HsYAML-0.2.1.4/src-test/TML.hs0000644000000000000000000003003207346545000013634 0ustar0000000000000000{-# LANGUAGE OverloadedStrings #-} -- | -- Copyright: © Herbert Valerio Riedel 2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- -- Incomplete TestML 0.3.0 parser module TML ( TML.parse , Document(..) , Block(..) , Point(..) , PseudoId(..) , Code(..) , AssertOp(..) , CodeExpr(..) , CodeObject(..) , FunCall(..) 
) where import Data.ByteString (ByteString) import qualified Data.ByteString.Char8 as B import Control.Applicative hiding (many, some) import Control.Monad import qualified Data.Aeson.Micro as J import qualified Data.ByteString as BS import qualified Data.Char as C import Data.List import qualified Data.Map as Map import Data.Maybe import qualified Data.Text as T import qualified Data.Text.Encoding as T import qualified Data.Text.IO as T import Data.Void import System.Environment import Text.Megaparsec import Text.Megaparsec.Char import qualified Text.Megaparsec.Char.Lexer as L type Parser = Parsec Void T.Text parse :: String -> T.Text -> Either T.Text Document parse fn raw = either (Left . T.pack . errorBundlePretty) (Right .process_pseudo) (Text.Megaparsec.parse testml_document fn raw) ---------------------------------------------------------------------------- data Document = Document [Code] [Block] deriving Show instance J.ToJSON Document where toJSON (Document code dat) = J.object [ "testml" J..= ("0.3.0" :: T.Text) , "code" J..= code , "data" J..= dat ] data Block = Block !T.Text [Point] deriving Show instance J.ToJSON Block where toJSON (Block label points) = J.object [ "label" J..= label , "point" J..= J.object (map f points) ] where f (PointStr k v) = k J..= v f (PointPseudo k) = (T.pack (show k)) J..= True f (PointInt k v) = k J..= v data Point = PointStr !T.Text !T.Text | PointInt !T.Text !Integer | PointPseudo !PseudoId deriving Show instance J.ToJSON Code where toJSON (CodeAssignmentStmt lhs rhs) = J.Array [J.String "=", J.String lhs, J.toJSON rhs] toJSON stmt@(CodeExpressionStmt lhs massert) | pobjs@(_:_) <- pointObjsInExpr stmt = J.Array [ J.String "%()" , J.Array [ J.String ("*" `mappend` p) | p <- pobjs ] , expr' ] | otherwise = expr' where expr' = case massert of Just (op,rhs) -> J.toJSON (op,lhs,rhs) Nothing -> J.toJSON lhs data Code = CodeAssignmentStmt !T.Text !CodeExpr | CodeExpressionStmt !CodeExpr !(Maybe (AssertOp,CodeExpr)) | 
CodeImportStmt [T.Text] deriving Show instance J.ToJSON AssertOp where toJSON AssertEq = J.String "==" toJSON AssertHas = J.String "~~" toJSON AssertLike = J.String "=~" data AssertOp = AssertEq | AssertHas | AssertLike deriving Show instance J.ToJSON CodeExpr where toJSON (CodeExpr obj []) = J.toJSON obj toJSON (CodeExpr obj fns) = J.Array $ [J.String ".", J.toJSON obj] ++ map J.toJSON fns data CodeExpr = CodeExpr !CodeObject [FunCall] deriving Show instance J.ToJSON CodeObject where toJSON (StrObj s) = J.String s toJSON (NumObj n) = J.Number n toJSON (PointObj j) = J.Array [J.String "*", J.String j] toJSON (CallObj fn) = J.toJSON fn data CodeObject = StrObj !T.Text | CallObj !FunCall | NumObj !Double | PointObj !T.Text deriving Show instance J.ToJSON FunCall where toJSON (FunCall fn args) = J.Array (J.String fn : map J.toJSON args) data FunCall = FunCall !T.Text [CodeExpr] deriving Show pointObjsInExpr :: Code -> [T.Text] pointObjsInExpr co = nub $ case co of CodeAssignmentStmt _ expr -> goExpr expr CodeExpressionStmt e1 Nothing -> goExpr e1 CodeExpressionStmt e1 (Just (_,e2)) -> goExpr e1 ++ goExpr e2 where goExpr (CodeExpr obj fns) = goObj obj ++ concatMap goFun fns goFun (FunCall _ exprs) = concatMap goExpr exprs goObj (PointObj j) = [j] goObj (CallObj fn) = goFun fn goObj (StrObj _) = [] goObj (NumObj _) = [] testml_document :: Parser Document testml_document = Document <$> code_section <*> data_section <* eof pseudo_point_name :: Parser PseudoId pseudo_point_name = choice [ HEAD <$ string "HEAD" , LAST <$ string "LAST" , ONLY <$ string "ONLY" , SKIP <$ string "SKIP" , TODO <$ string "TODO" , DIFF <$ string "DIFF" ] data PseudoId = HEAD | LAST | ONLY | SKIP | TODO | DIFF deriving (Eq,Show) process_pseudo :: Document -> Document process_pseudo (Document code bs0) = Document code (go bs0) where go blocks | Just b <- find isOnly blocks' = [b] | Just bs <- goHead blocks' = bs | Just bs <- goLast [] blocks' = bs | otherwise = blocks' where blocks' = filter (not . 
isSkip) blocks isOnly b = ONLY `elem` pseudos b isSkip b = SKIP `elem` pseudos b isHead b = HEAD `elem` pseudos b isLast b = LAST `elem` pseudos b pseudos (Block _ ps) = [ k | PointPseudo k <- ps ] goHead [] = Nothing goHead (b:bs) | isHead b = Just (b:bs) | otherwise = goHead bs goLast acc [] = Nothing goLast acc (b:bs) | isLast b = Just $ reverse (b:bs) | otherwise = goLast (b:acc) bs code_section :: Parser [Code] code_section = do xs <- many code_statement pure (catMaybes xs) where code_statement = choice [ Nothing <$ comment_lines , Just <$> import_directive , Just <$> assignment_statement , Just <$> expression_statement ] import_directive = do string "%Import" ws mods <- module_name `sepBy1` ws ws0 eol pure $! CodeImportStmt mods module_name :: Parser T.Text module_name = T.pack <$> some alphaNumChar assignment_statement = do v <- try $ do v' <- identifier_name ws void (char '=') <|> void (string "||=") -- FIXME ws pure v' e <- code_expression eol pure (CodeAssignmentStmt v e) expression_statement = do -- TODO: expression-label -- optional (double_string >> char ':' >> ws0) -- TODO: pick-expression lhs <- code_expression ws op <- choice [ AssertEq <$ string "==" , AssertHas <$ string "~~" , AssertLike <$ string "=~" ] ws rhs <- code_expression optional $ do ws0 char ':' double_string eol pure (CodeExpressionStmt lhs (Just (op,rhs))) code_expression :: Parser CodeExpr code_expression = CodeExpr <$> code_object <*> many function_call -- quoted string double_string :: Parser T.Text double_string = do char '"' str <- many (noneOf ("\n\"\\" :: [Char]) <|> (char '\\' >> (unesc <$> oneOf ("\\\"0nt" :: [Char])))) char '"' pure $! (T.pack str) where unesc '0' = '\0' unesc 'n' = '\n' unesc 't' = '\t' unesc c = c single_string :: Parser T.Text single_string = do char '\'' str <- many (noneOf ("\n'\\" :: [Char]) <|> (char '\\' >> (oneOf ("\\'" :: [Char])))) char '\'' pure $! (T.pack str) function_call :: Parser FunCall function_call = do char '.' 
call_object call_object :: Parser FunCall call_object = FunCall <$> identifier_name <*> optional' [] (between (char '(') (char ')') $ code_expression `sepBy1` (char ',' >> ws0)) optional' def p = do x <- optional p case x of Nothing -> pure def Just y -> pure y code_object :: Parser CodeObject code_object = choice [ mkPoint <$> char '*' <*> lowerChar <*> many (lowerChar <|> digitChar <|> char '-' <|> char '_') , mkNum <$> optional (char '-') <*> some digitChar <*> optional (char '.' >> some digitChar) , CallObj <$> call_object , StrObj <$> single_string , StrObj <$> double_string ] "code-object" where mkPoint _ c cs = PointObj $! (T.pack (c:cs)) mkNum msign ds1 mds2 = NumObj $! (read $ (maybe id (:) msign) ds1 ++ (maybe "" ('.':) mds2)) data_section :: Parser [Block] data_section = many block_definition where block_definition = do -- block_heading string "===" *> ws l <- T.pack <$> manyTill anySingle eol -- TODO: user_defined ps <- many point_definition pure (Block l ps) point_definition = do string "---" *> ws j <- eitherP identifier_user pseudo_point_name filters <- maybe [] id <$> optional filter_spec let single = do _ <- char ':' *> ws x <- T.pack <$> manyTill anySingle eol -- consume and ignore any point_lines _ <- point_lines pure $! case j of Left j' -> mkSinglePointVal j' (transformPoint True filters x) Right j' -> PointPseudo j' -- is this allowed? multi = do ws0 *> eol x <- point_lines pure $! case j of Left j' -> PointStr j' (transformPoint False filters x) Right j' -> PointPseudo j' single <|> multi filter_spec = between (char '(') (char ')') $ many (oneOf ("<#+-~/@" :: [Char])) mkSinglePointVal k v | T.all C.isDigit v = PointInt k (read (T.unpack v)) | otherwise = PointStr k v point_lines :: Parser T.Text point_lines = T.pack . 
unlines <$> go where go = many (notFollowedBy point_boundary *> manyTill anySingle eol) point_boundary :: Parser () point_boundary = void (string "---") <|> void (string "===") <|> eof identifier_user :: Parser T.Text identifier_user = do x <- (:) <$> lowerChar <*> many alphaNumChar xs <- many ((:) <$> char '-' <*> some alphaNumChar) pure $! T.pack (concat (x:xs)) identifier_name :: Parser T.Text identifier_name = do x <- (:) <$> letterChar <*> many alphaNumChar xs <- many ((:) <$> char '-' <*> some alphaNumChar) pure $! T.pack (concat (x:xs)) ws :: Parser () ws = void $ takeWhile1P (Just "BLANK") (\c -> c == ' ' || c == '\t') ws0 :: Parser () ws0 = void $ takeWhileP (Just "BLANK") (\c -> c == ' ' || c == '\t') blank_line :: Parser () blank_line = (try (ws0 <* eol) <|> try (ws <* eof)) "blank-line" comment_line :: Parser () comment_line = (char '#' *> takeWhileP Nothing (/= '\n') *> void eol) "comment-line" comment_lines :: Parser () comment_lines = void (some (comment_line <|> blank_line)) stripTrailEols :: T.Text -> T.Text stripTrailEols = go where go t | T.isSuffixOf "\n\n" t = go (T.init t) | T.isSuffixOf "\r\n\r\n" t = go (T.init (T.init t)) | t == "\n" = "" | otherwise = t -- 'undent' stripPrefixInd :: T.Text -> T.Text stripPrefixInd = T.unlines . map go . T.lines where go t | T.isPrefixOf " " t = T.drop 4 t | T.isPrefixOf " " t = T.drop 3 t | T.isPrefixOf " " t = T.drop 2 t | T.isPrefixOf " " t = T.drop 1 t | otherwise = t stripComments :: T.Text -> T.Text stripComments = T.unlines . filter (not . T.isPrefixOf "#") . T.lines transformPoint :: Bool -> [Char] -> T.Text -> T.Text transformPoint single mods0 -- TODO: backslash = go mods0 . (if keepBlanks then id else stripTrailEols) . (if keepComments then id else stripComments) where keepBlanks = single || ('+' `elem` mods0) keepComments = single || ('#' `elem` mods0) go [] = id go ('<':xs) | single = error "invalid filter for point-single" | otherwise = go xs . 
stripPrefixInd go ('+':xs) = go xs -- negative flag go ('#':xs) = go xs -- negative flag go ('-':xs) = go xs . T.dropWhileEnd C.isSpace go (c:_) = error ("unknown filter " ++ show c) HsYAML-0.2.1.4/src/Data/0000755000000000000000000000000007346545000012542 5ustar0000000000000000HsYAML-0.2.1.4/src/Data/DList.hs0000644000000000000000000000145507346545000014122 0ustar0000000000000000{-# LANGUAGE Safe #-} -- | -- Copyright: © Herbert Valerio Riedel 2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- -- Minimal API-compatible rip-off of @Data.DList@ module Data.DList ( DList , empty , singleton , append , toList ) where -- | A difference list is a function that, given a list, returns the original -- contents of the difference list prepended to the given list. newtype DList a = DList ([a] -> [a]) -- | Convert a dlist to a list toList :: DList a -> [a] toList (DList dl) = dl [] -- | Create dlist with a single element singleton :: a -> DList a singleton x = DList (x:) -- | Create a dlist containing no elements empty :: DList a empty = DList id -- | O(1). Append dlists append :: DList a -> DList a -> DList a append (DList xs) (DList ys) = DList (xs . ys) HsYAML-0.2.1.4/src/Data/YAML.hs0000644000000000000000000007210407346545000013644 0ustar0000000000000000{-# LANGUAGE CPP #-} {-# LANGUAGE GADTs #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE RecordWildCards #-} {-# LANGUAGE Safe #-} -- | -- Copyright: © Herbert Valerio Riedel 2015-2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- -- Document oriented [YAML](http://yaml.org/spec/1.2/spec.html) parsing API inspired by [aeson](http://hackage.haskell.org/package/aeson). module Data.YAML ( -- * Overview -- $overview -- * Quick Start Tutorial -- $start -- ** Decoding/Loading YAML document -- $loading -- ** Encoding/dumping -- $dumping -- * Typeclass-based resolving/decoding decode , decode1 , decodeStrict , decode1Strict , FromYAML(..) 
, Parser , parseEither , failAtNode , typeMismatch -- ** Accessors for YAML t'Mapping's , Mapping , (.:), (.:?), (.:!), (.!=) -- * Typeclass-based dumping , encode , encode1 , encodeStrict , encode1Strict , ToYAML(..) -- ** Accessors for encoding t'Mapping's , Pair , mapping , (.=) -- ** Prism-style parsers , withScalar , withSeq , withBool , withFloat , withInt , withNull , withStr , withMap -- * \"Concrete\" AST , decodeNode , decodeNode' , encodeNode , encodeNode' , Doc(Doc,docRoot) , Node(..) , Scalar(..) -- * Source locations , Pos(..) , prettyPosWithSource -- * YAML 1.2 Schema resolvers -- -- | See also "Data.YAML.Schema" , SchemaResolver , failsafeSchemaResolver , jsonSchemaResolver , coreSchemaResolver -- * YAML 1.2 Schema encoders -- -- | See also "Data.YAML.Schema" , SchemaEncoder , failsafeSchemaEncoder , jsonSchemaEncoder , coreSchemaEncoder -- * Generalised AST construction , decodeLoader , Loader(..) , LoaderT , NodeId ) where import qualified Control.Monad.Fail as Fail import qualified Data.ByteString as BS import qualified Data.ByteString.Lazy as BS.L import qualified Data.Map as Map import qualified Data.Text as T import Data.YAML.Dumper import Data.YAML.Event (isUntagged, tagToText) import Data.YAML.Internal import Data.YAML.Loader import Data.YAML.Pos import Data.YAML.Schema.Internal import Util -- $overview -- -- The diagram below depicts the standard layers of a [YAML 1.2](http://yaml.org/spec/1.2/spec.html) processor. This module covers the upper /Native/ and /Representation/ layers, whereas the "Data.YAML.Event" and "Data.YAML.Token" modules provide access to the lower /Serialization/ and /Presentation/ layers respectively. -- -- <> -- -- $start -- -- This section contains basic information on the different ways to work with YAML data using this library. -- -- $loading -- -- We address the process of loading data from a YAML document as decoding. -- -- Let's assume we want to decode (i.e. 
/load/) a simple YAML document -- -- > - name: Erik Weisz -- > age: 52 -- > magic: True -- > - name: Mina Crandon -- > age: 53 -- -- into a native Haskell data structure of type @[Person]@, i.e. a list of @Person@ records. -- -- The code below shows how to manually define a @Person@ record type together with a 'FromYAML' instance: -- -- > {-# LANGUAGE OverloadedStrings #-} -- > -- > import Data.YAML -- > -- > data Person = Person -- > { name :: Text -- > , age :: Int -- > , magic :: Bool -- > } deriving Show -- > -- > instance FromYAML Person where -- > parseYAML = withMap "Person" $ \m -> Person -- > <$> m .: "name" -- > <*> m .: "age" -- > <*> m .:? "magic" .!= False -- -- And now we can 'decode' the YAML document like so: -- -- >>> decode "- name: Erik Weisz\n age: 52\n magic: True\n- name: Mina Crandon\n age: 53" :: Either (Pos,String) [[Person]] -- Right [[Person {name = "Erik Weisz", age = 52, magic = True},Person {name = "Mina Crandon", age = 53, magic = False}]] -- -- There are predefined 'FromYAML' instances for many types. -- -- The example below shows decoding multiple YAML documents into a list of 'Int' lists: -- -- >>> decode "---\n- 1\n- 2\n- 3\n---\n- 4\n- 5\n- 6" :: Either (Pos,String) [[Int]] -- Right [[1,2,3],[4,5,6]] -- -- If you are expecting exactly one YAML document then you can use the convenience function 'decode1': -- -- >>> decode1 "- 1\n- 2\n- 3\n" :: Either (Pos,String) [Int] -- Right [1,2,3] -- -- == Working with AST -- -- Sometimes we want to work with YAML data directly, without first converting it to a custom data type. -- -- We can easily do that by using the 'Node' type, which is an instance of 'FromYAML' and can represent an arbitrary YAML AST (abstract syntax tree). 
For example, -- -- >>> decode1 "Name: Vijay" :: Either (Pos,String) (Node Pos) -- Right (Mapping (Pos {posByteOffset = 0, posCharOffset = 0, posLine = 1, posColumn = 0}) Just "tag:yaml.org,2002:map" (fromList [(Scalar (Pos {posByteOffset = 0, posCharOffset = 0, posLine = 1, posColumn = 0}) (SStr "Name"),Scalar (Pos {posByteOffset = 6, posCharOffset = 6, posLine = 1, posColumn = 6}) (SStr "Vijay"))])) -- -- The type parameter 'Pos' is used to indicate the position of each YAML 'Node' in the document. -- So using the 'Node' type we can easily decode any YAML document. -- -- == Pretty-printing source locations -- -- Syntax errors or even conversion errors are reported with a source location, e.g. -- -- >>> decode "- name: Erik Weisz\n age: 52\n magic: True\n- name: Mina Crandon\n age: young" :: Either (Pos,String) [[Person]] -- Left (Pos {posByteOffset = 71, posCharOffset = 71, posLine = 5, posColumn = 7},"expected !!int instead of !!str") -- -- While accurate, this isn't a very convenient error representation. Instead we can use the 'prettyPosWithSource' helper function to create a more convenient error report, like so: -- -- @ -- readPersons :: FilePath -> IO [Person] -- readPersons fname = do -- raw <- BS.L.readFile fname -- case 'decode1' raw of -- Left (loc,emsg) -> do -- hPutStrLn stderr (fname ++ ":" ++ 'prettyPosWithSource' loc raw " error" ++ emsg) -- pure [] -- Right persons -> pure persons -- @ -- -- which will then print errors in a common form such as -- -- > people.yaml:5:7: error -- > | -- > 5 | age: young -- > | ^ -- > expected !!int instead of !!str -- -- | Retrieve value in t'Mapping' indexed by a @!!str@ 'Text' key. -- -- This parser fails if the key doesn't exist. (.:) :: FromYAML a => Mapping Pos -> Text -> Parser a m .: k = maybe (fail $ "key " ++ show k ++ " not found") parseYAML (Map.lookup (Scalar fakePos (SStr k)) m) -- | Retrieve optional value in t'Mapping' indexed by a @!!str@ 'Text' key.
-- -- 'Nothing' is returned if the key is missing or points to a @tag:yaml.org,2002:null@ node. -- This combinator only fails if the key exists but cannot be converted to the required type. -- -- See also '.:!'. (.:?) :: FromYAML a => Mapping Pos -> Text -> Parser (Maybe a) m .:? k = maybe (pure Nothing) parseYAML (Map.lookup (Scalar fakePos (SStr k)) m) -- | Retrieve optional value in t'Mapping' indexed by a @!!str@ 'Text' key. -- -- 'Nothing' is returned if the key is missing. -- This combinator only fails if the key exists but cannot be converted to the required type. -- -- __NOTE__: This is a variant of '.:?' which doesn't map a @tag:yaml.org,2002:null@ node to 'Nothing'. (.:!) :: FromYAML a => Mapping Pos -> Text -> Parser (Maybe a) m .:! k = maybe (pure Nothing) (fmap Just . parseYAML) (Map.lookup (Scalar fakePos (SStr k)) m) -- | Defaulting helper to be used with '.:?' or '.:!'. (.!=) :: Parser (Maybe a) -> a -> Parser a mv .!= def = fmap (maybe def id) mv fakePos :: Pos fakePos = Pos { posByteOffset = -1 , posCharOffset = -1 , posLine = 1 , posColumn = 0 } -- | Parse and decode YAML document(s) into 'Node' graphs -- -- This is a convenience wrapper over `decodeNode'`, i.e. 
-- -- @ -- decodeNode = decodeNode' 'coreSchemaResolver' False False -- @ -- -- In other words, -- -- * Use the YAML 1.2 Core schema for resolving -- * Don't create 'Anchor' nodes -- * Disallow cyclic anchor references -- -- @since 0.2.0 -- decodeNode :: BS.L.ByteString -> Either (Pos, String) [Doc (Node Pos)] decodeNode = decodeNode' coreSchemaResolver False False -- | Customizable variant of 'decodeNode' -- -- @since 0.2.0 -- decodeNode' :: SchemaResolver -- ^ YAML Schema resolver to use -> Bool -- ^ Whether to emit anchor nodes -> Bool -- ^ Whether to allow cyclic references -> BS.L.ByteString -- ^ YAML document to parse -> Either (Pos, String) [Doc (Node Pos)] decodeNode' SchemaResolver{..} anchorNodes allowCycles bs0 = map Doc <$> runIdentity (decodeLoader failsafeLoader bs0) where failsafeLoader = Loader { yScalar = \t s v pos-> pure $ case schemaResolverScalar t s v of Left e -> Left (pos,e) Right v' -> Right (Scalar pos v') , ySequence = \t vs pos -> pure $ case schemaResolverSequence t of Left e -> Left (pos,e) Right t' -> Right (Sequence pos t' vs) , yMapping = \t kvs pos-> pure $ case schemaResolverMapping t of Left e -> Left (pos,e) Right t' -> Mapping pos t' <$> mkMap kvs , yAlias = if allowCycles then \_ _ n _-> pure $ Right n else \_ c n pos -> pure $ if c then Left (pos,"cycle detected") else Right n , yAnchor = if anchorNodes then \j n pos -> pure $ Right (Anchor pos j n) else \_ n _ -> pure $ Right n } mkMap :: [(Node Pos, Node Pos)] -> Either (Pos, String) (Map (Node Pos) (Node Pos)) mkMap kvs | schemaResolverMappingDuplicates = Right $! 
Map.fromList kvs | otherwise = case mapFromListNoDupes kvs of Left (k,_) -> Left (nodeLoc k,"Duplicate key in mapping: " ++ show k) Right m -> Right m ---------------------------------------------------------------------------- -- | YAML Parser 'Monad' used by 'FromYAML' -- -- See also 'parseEither' or 'decode' newtype Parser a = P { unP :: Either (Pos, String) a } instance Functor Parser where fmap f (P x) = P (fmap f x) x <$ P (Right _) = P (Right x) _ <$ P (Left e) = P (Left e) instance Applicative Parser where pure = P . Right P (Left e) <*> _ = P (Left e) P (Right f) <*> P r = P (fmap f r) P (Left e) *> _ = P (Left e) P (Right _) *> p = p instance Monad Parser where return = pure P m >>= k = P (m >>= unP . k) (>>) = (*>) #if !(MIN_VERSION_base(4,13,0)) fail = Fail.fail #endif -- | @since 0.1.1.0 -- -- __NOTE__: 'fail' doesn't convey proper position information unless used within the @with*@-style helpers; consequently it's recommended to use 'failAtNode' when /not/ covered by the location scope of a @with*@-style combinator. instance Fail.MonadFail Parser where fail s = P (Left (fakePos, s)) -- | Trigger parsing failure located at a specific 'Node' -- -- @since 0.2.0.0 failAtNode :: Node Pos -> String -> Parser a failAtNode n s = P (Left (nodeLoc n, s)) -- | @since 0.1.1.0 instance Alternative Parser where empty = fail "empty" P (Left _) <|> y = y x <|> _ = x -- | @since 0.1.1.0 instance MonadPlus Parser where mzero = empty mplus = (<|>) -- | Run 'Parser' -- -- A common use-case is 'parseEither' 'parseYAML'. parseEither :: Parser a -> Either (Pos, String) a parseEither = unP -- | Informative failure helper -- -- This is typically used in fall-through cases of 'parseYAML' like so -- -- > instance FromYAML ... where -- > parseYAML ... = ... 
-- > parseYAML node = typeMismatch "SomeThing" node -- -- @since 0.1.1.0 typeMismatch :: String -- ^ descriptive name of expected data -> Node Pos -- ^ actual node -> Parser a typeMismatch expected node = failAtNode node ("expected " ++ expected ++ " instead of " ++ got) where got = case node of Scalar _ (SBool _) -> "!!bool" Scalar _ (SInt _) -> "!!int" Scalar _ SNull -> "!!null" Scalar _ (SStr _) -> "!!str" Scalar _ (SFloat _) -> "!!float" Scalar _ (SUnknown t v) | isUntagged t -> tagged t ++ show v | otherwise -> "(unsupported) " ++ tagged t ++ "scalar" Anchor _ _ _ -> "anchor" Mapping _ t _ -> tagged t ++ " mapping" Sequence _ t _ -> tagged t ++ " sequence" tagged t0 = case tagToText t0 of Nothing -> "non-specifically ? tagged (i.e. unresolved) " Just t -> T.unpack t ++ " tagged" -- | A type into which YAML nodes can be converted/deserialized class FromYAML a where parseYAML :: Node Pos -> Parser a -- This helper fixes up 'fakePos' locations to a better guess; this is -- mostly used by the with*-style combinators {-# INLINE fixupFailPos #-} fixupFailPos :: Pos -> Parser a -> Parser a fixupFailPos pos (P (Left (pos0,emsg))) | pos0 == fakePos = P (Left (pos,emsg)) fixupFailPos _ p = p -- | Operate on @tag:yaml.org,2002:null@ node (or fail) withNull :: String -> Parser a -> Node Pos -> Parser a withNull _ f (Scalar pos SNull) = fixupFailPos pos f withNull expected _ v = typeMismatch expected v -- | Operate on t'Scalar' node (or fail) -- -- @since 0.2.1 withScalar :: String -> (Scalar -> Parser a) -> Node Pos -> Parser a withScalar _ f (Scalar pos sca) = fixupFailPos pos (f sca) withScalar expected _ v = typeMismatch expected v -- | Trivial instance instance (loc ~ Pos) => FromYAML (Node loc) where parseYAML = pure -- | @since 0.2.1 instance FromYAML Scalar where parseYAML = withScalar "scalar" pure instance FromYAML Bool where parseYAML = withBool "!!bool" pure -- | Operate on @tag:yaml.org,2002:bool@ node (or fail) withBool :: String -> (Bool -> Parser a) -> Node 
Pos -> Parser a withBool _ f (Scalar pos (SBool b)) = fixupFailPos pos (f b) withBool expected _ v = typeMismatch expected v instance FromYAML Text where parseYAML = withStr "!!str" pure -- | Operate on @tag:yaml.org,2002:str@ node (or fail) withStr :: String -> (Text -> Parser a) -> Node Pos -> Parser a withStr _ f (Scalar pos (SStr b)) = fixupFailPos pos (f b) withStr expected _ v = typeMismatch expected v instance FromYAML Integer where parseYAML = withInt "!!int" pure -- | Operate on @tag:yaml.org,2002:int@ node (or fail) withInt :: String -> (Integer -> Parser a) -> Node Pos -> Parser a withInt _ f (Scalar pos (SInt b)) = fixupFailPos pos (f b) withInt expected _ v = typeMismatch expected v -- | @since 0.1.1.0 instance FromYAML Natural where parseYAML = withInt "!!int" $ \b -> if b < 0 then fail ("!!int " ++ show b ++ " out of range for 'Natural'") else pure (fromInteger b) -- helper for fixed-width integers {-# INLINE parseInt #-} parseInt :: (Integral a, Bounded a) => [Char] -> Node Pos -> Parser a parseInt name = withInt "!!int" $ \b -> maybe (fail $ "!!int " ++ show b ++ " out of range for '" ++ name ++ "'") pure $ fromIntegerMaybe b instance FromYAML Int where parseYAML = parseInt "Int" instance FromYAML Int8 where parseYAML = parseInt "Int8" instance FromYAML Int16 where parseYAML = parseInt "Int16" instance FromYAML Int32 where parseYAML = parseInt "Int32" instance FromYAML Int64 where parseYAML = parseInt "Int64" instance FromYAML Word where parseYAML = parseInt "Word" instance FromYAML Word8 where parseYAML = parseInt "Word8" instance FromYAML Word16 where parseYAML = parseInt "Word16" instance FromYAML Word32 where parseYAML = parseInt "Word32" instance FromYAML Word64 where parseYAML = parseInt "Word64" instance FromYAML Double where parseYAML = withFloat "!!float" pure -- | Operate on @tag:yaml.org,2002:float@ node (or fail) withFloat :: String -> (Double -> Parser a) -> Node Pos -> Parser a withFloat _ f (Scalar pos (SFloat b)) = fixupFailPos pos 
(f b) withFloat expected _ v = typeMismatch expected v instance (Ord k, FromYAML k, FromYAML v) => FromYAML (Map k v) where parseYAML = withMap "!!map" $ \xs -> Map.fromList <$> mapM (\(a,b) -> (,) <$> parseYAML a <*> parseYAML b) (Map.toList xs) -- | Operate on @tag:yaml.org,2002:map@ node (or fail) withMap :: String -> (Mapping Pos -> Parser a) -> Node Pos -> Parser a withMap _ f (Mapping pos tag xs) | tag == tagMap = fixupFailPos pos (f xs) withMap expected _ v = typeMismatch expected v instance FromYAML v => FromYAML [v] where parseYAML = withSeq "!!seq" (mapM parseYAML) -- | Operate on @tag:yaml.org,2002:seq@ node (or fail) withSeq :: String -> ([Node Pos] -> Parser a) -> Node Pos-> Parser a withSeq _ f (Sequence pos tag xs) | tag == tagSeq = fixupFailPos pos (f xs) withSeq expected _ v = typeMismatch expected v instance FromYAML a => FromYAML (Maybe a) where parseYAML (Scalar _ SNull) = pure Nothing parseYAML j = Just <$> parseYAML j ---------------------------------------------------------------------------- instance (FromYAML a, FromYAML b) => FromYAML (a,b) where parseYAML = withSeq "!!seq" $ \xs -> case xs of [a,b] -> (,) <$> parseYAML a <*> parseYAML b _ -> fail ("expected 2-sequence but got " ++ show (length xs) ++ "-sequence instead") instance (FromYAML a, FromYAML b, FromYAML c) => FromYAML (a,b,c) where parseYAML = withSeq "!!seq" $ \xs -> case xs of [a,b,c] -> (,,) <$> parseYAML a <*> parseYAML b <*> parseYAML c _ -> fail ("expected 3-sequence but got " ++ show (length xs) ++ "-sequence instead") instance (FromYAML a, FromYAML b, FromYAML c, FromYAML d) => FromYAML (a,b,c,d) where parseYAML = withSeq "!!seq" $ \xs -> case xs of [a,b,c,d] -> (,,,) <$> parseYAML a <*> parseYAML b <*> parseYAML c <*> parseYAML d _ -> fail ("expected 4-sequence but got " ++ show (length xs) ++ "-sequence instead") instance (FromYAML a, FromYAML b, FromYAML c, FromYAML d, FromYAML e) => FromYAML (a,b,c,d,e) where parseYAML = withSeq "!!seq" $ \xs -> case xs of 
[a,b,c,d,e] -> (,,,,) <$> parseYAML a <*> parseYAML b <*> parseYAML c <*> parseYAML d <*> parseYAML e _ -> fail ("expected 5-sequence but got " ++ show (length xs) ++ "-sequence instead") instance (FromYAML a, FromYAML b, FromYAML c, FromYAML d, FromYAML e, FromYAML f) => FromYAML (a,b,c,d,e,f) where parseYAML = withSeq "!!seq" $ \xs -> case xs of [a,b,c,d,e,f] -> (,,,,,) <$> parseYAML a <*> parseYAML b <*> parseYAML c <*> parseYAML d <*> parseYAML e <*> parseYAML f _ -> fail ("expected 6-sequence but got " ++ show (length xs) ++ "-sequence instead") instance (FromYAML a, FromYAML b, FromYAML c, FromYAML d, FromYAML e, FromYAML f, FromYAML g) => FromYAML (a,b,c,d,e,f,g) where parseYAML = withSeq "!!seq" $ \xs -> case xs of [a,b,c,d,e,f,g] -> (,,,,,,) <$> parseYAML a <*> parseYAML b <*> parseYAML c <*> parseYAML d <*> parseYAML e <*> parseYAML f <*> parseYAML g _ -> fail ("expected 7-sequence but got " ++ show (length xs) ++ "-sequence instead") -- | Decode YAML document(s) using the YAML 1.2 Core schema -- -- Each document contained in the YAML stream produces one element of -- the resulting list. Here's an example of decoding two concatenated -- YAML documents: -- -- >>> decode "Foo\n---\nBar" :: Either (Pos,String) [Text] -- Right ["Foo","Bar"] -- -- Note that an empty stream doesn't contain any (non-comment) -- document nodes, and therefore results in an empty result list: -- -- >>> decode "# just a comment" :: Either (Pos,String) [Text] -- Right [] -- -- 'decode' uses the same settings as 'decodeNode' for tag-resolving. If -- you need a different custom parsing configuration, you need to -- combine 'parseEither' and `decodeNode'` yourself. -- -- Both the 'decode' and 'decodeNode' functions support -- decoding from YAML streams using the UTF-8, UTF-16 (LE or BE), or -- UTF-32 (LE or BE) encoding (which is auto-detected).
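-- -- As a minimal sketch of such a combination (the helper name @decodeWithJsonSchema@ is hypothetical, and we assume the JSON schema is wanted instead of the Core schema): -- -- @ -- decodeWithJsonSchema :: 'FromYAML' v => BS.L.ByteString -> Either ('Pos', String) [v] -- decodeWithJsonSchema bs = 'decodeNode'' 'jsonSchemaResolver' False False bs -- >>= mapM ('parseEither' . 'parseYAML' . 'docRoot') -- @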
-- -- @since 0.2.0 -- decode :: FromYAML v => BS.L.ByteString -> Either (Pos, String) [v] decode bs0 = decodeNode bs0 >>= mapM (parseEither . parseYAML . (\(Doc x) -> x)) -- | Convenience wrapper over 'decode' expecting exactly one YAML document -- -- >>> decode1 "---\nBar\n..." :: Either (Pos,String) Text -- Right "Bar" -- -- >>> decode1 "Foo\n---\nBar" :: Either (Pos,String) Text -- Left (Pos {posByteOffset = 8, posCharOffset = 8, posLine = 3, posColumn = 0},"unexpected multiple YAML documents") -- -- >>> decode1 "# Just a comment" :: Either (Pos,String) Text -- Left (Pos {posByteOffset = 0, posCharOffset = 0, posLine = 1, posColumn = 0},"empty YAML stream") -- -- @since 0.2.0 -- decode1 :: FromYAML v => BS.L.ByteString -> Either (Pos, String) v decode1 bs0 = do docs <- decodeNode bs0 case docs of [] -> Left (Pos { posByteOffset = 0, posCharOffset = 0, posLine = 1, posColumn = 0 }, "empty YAML stream") [Doc v] -> parseEither $ parseYAML $ v (_:Doc n:_) -> Left (nodeLoc n, "unexpected multiple YAML documents") -- | Like 'decode' but takes a strict 'BS.ByteString' -- -- @since 0.2.0 -- decodeStrict :: FromYAML v => BS.ByteString -> Either (Pos, String) [v] decodeStrict = decode . BS.L.fromChunks . (:[]) -- | Like 'decode1' but takes a strict 'BS.ByteString' -- -- @since 0.2.0 -- decode1Strict :: FromYAML v => BS.ByteString -> Either (Pos, String) v decode1Strict = decode1 . BS.L.fromChunks . (:[]) -- $dumping -- -- We address the process of dumping information from Haskell data types to YAML documents as encoding. -- -- Suppose we want to 'encode' a Haskell data type @Person@ -- -- @ -- data Person = Person -- { name :: Text -- , age :: Int -- } deriving Show -- @ -- -- To 'encode' data, we need to define a 'ToYAML' instance.
-- -- @ -- -- instance 'ToYAML' Person where -- \-- this generates a 'Node' -- 'toYAML' (Person n a) = 'mapping' [ "name" .= n, "age" .= a] -- -- @ -- -- We can now 'encode' a value like so: -- -- >>> encode [Person {name = "Vijay", age = 19}] -- "age: 19\nname: Vijay\n" -- -- There are predefined 'ToYAML' instances for many types. Here's an example of encoding a more complex Haskell data structure: -- -- >>> encode1 $ toYAML ([1,2,3], Map.fromList [(1, 2)]) -- "- - 1\n - 2\n - 3\n- 1: 2\n" -- -- | A type from which YAML nodes can be constructed -- -- @since 0.2.0.0 class ToYAML a where -- | Convert a Haskell data type to a YAML 'Node'. toYAML :: a -> Node () instance Loc loc => ToYAML (Node loc) where toYAML = toUnit instance ToYAML Bool where toYAML = Scalar () . SBool instance ToYAML Double where toYAML = Scalar () . SFloat instance ToYAML Int where toYAML = Scalar () . SInt . toInteger instance ToYAML Int8 where toYAML = Scalar () . SInt . toInteger instance ToYAML Int16 where toYAML = Scalar () . SInt . toInteger instance ToYAML Int32 where toYAML = Scalar () . SInt . toInteger instance ToYAML Int64 where toYAML = Scalar () . SInt . toInteger instance ToYAML Word where toYAML = Scalar () . SInt . toInteger instance ToYAML Word8 where toYAML = Scalar () . SInt . toInteger instance ToYAML Word16 where toYAML = Scalar () . SInt . toInteger instance ToYAML Word32 where toYAML = Scalar () . SInt . toInteger instance ToYAML Word64 where toYAML = Scalar () . SInt . toInteger instance ToYAML Natural where toYAML = Scalar () . SInt . toInteger instance ToYAML Integer where toYAML = Scalar () . SInt instance ToYAML Text where toYAML = Scalar () .
SStr -- | @since 0.2.1 instance ToYAML Scalar where toYAML = Scalar () instance ToYAML a => ToYAML (Maybe a) where toYAML Nothing = Scalar () SNull toYAML (Just a) = toYAML a -- instance (ToYAML a, ToYAML b) => ToYAML (Either a b) where -- toYAML (Left a) = toYAML a -- toYAML (Right b) = toYAML b instance ToYAML a => ToYAML [a] where toYAML = Sequence () tagSeq . map toYAML instance (Ord k, ToYAML k, ToYAML v) => ToYAML (Map k v) where toYAML kv = Mapping () tagMap (Map.fromList $ map (\(k,v) -> (toYAML k , toYAML v)) (Map.toList kv)) instance (ToYAML a, ToYAML b) => ToYAML (a, b) where toYAML (a,b) = toYAML [toYAML a, toYAML b] instance (ToYAML a, ToYAML b, ToYAML c) => ToYAML (a, b, c) where toYAML (a,b,c) = toYAML [toYAML a, toYAML b, toYAML c] instance (ToYAML a, ToYAML b, ToYAML c, ToYAML d) => ToYAML (a, b, c, d) where toYAML (a,b,c,d) = toYAML [toYAML a, toYAML b, toYAML c, toYAML d] instance (ToYAML a, ToYAML b, ToYAML c, ToYAML d, ToYAML e) => ToYAML (a, b, c, d, e) where toYAML (a,b,c,d,e) = toYAML [toYAML a, toYAML b, toYAML c, toYAML d, toYAML e] instance (ToYAML a, ToYAML b, ToYAML c, ToYAML d, ToYAML e, ToYAML f) => ToYAML (a, b, c, d, e, f) where toYAML (a,b,c,d,e,f) = toYAML [toYAML a, toYAML b, toYAML c, toYAML d, toYAML e, toYAML f] instance (ToYAML a, ToYAML b, ToYAML c, ToYAML d, ToYAML e, ToYAML f, ToYAML g) => ToYAML (a, b, c, d, e, f, g) where toYAML (a,b,c,d,e,f,g) = toYAML [toYAML a, toYAML b, toYAML c, toYAML d, toYAML e, toYAML f, toYAML g] -- | Serialize YAML Node(s) using the YAML 1.2 Core schema to a lazy 'Data.YAML.Token.UTF8' encoded 'BS.L.ByteString'. -- -- Each YAML Node produces exactly one YAML Document. -- -- Here is an example of encoding a list of strings to produce a list of YAML Documents -- -- >>> encode (["Document 1", "Document 2"] :: [Text]) -- "Document 1\n...\nDocument 2\n" -- -- If we treat the above list of strings as a single sequence then we will produce a single YAML Document having a single sequence. 
-- -- >>> encode ([["Document 1", "Document 2"]] :: [[Text]]) -- "- Document 1\n- Document 2\n" -- -- Alternatively, if you only need a single YAML document in a YAML stream you might want to use the convenience function 'encode1'; or, if you need more control over the encoding, see 'encodeNode''. -- -- @since 0.2.0 encode :: ToYAML v => [v] -> BS.L.ByteString encode vList = encodeNode $ map (Doc . toYAML) vList -- | Convenience wrapper over 'encode' taking exactly one YAML Node. -- Hence it will always output exactly one YAML Document. -- -- Here is an example of encoding a list of strings to produce exactly one YAML Document: -- -- >>> encode1 (["Document 1", "Document 2"] :: [Text]) -- "- Document 1\n- Document 2\n" -- -- @since 0.2.0 encode1 :: ToYAML v => v -> BS.L.ByteString encode1 a = encode [a] -- | Like 'encode' but outputs a strict 'BS.ByteString' -- -- @since 0.2.0 encodeStrict :: ToYAML v => [v] -> BS.ByteString encodeStrict = bsToStrict . encode -- | Like 'encode1' but outputs a strict 'BS.ByteString' -- -- @since 0.2.0 encode1Strict :: ToYAML v => v -> BS.ByteString encode1Strict = bsToStrict . encode1 -- Internal helper class Loc loc where toUnit :: Functor f => f loc -> f () toUnit = (() <$) instance Loc Pos instance Loc () where toUnit = id -- | Represents a key-value pair in YAML t'Mapping's -- -- See also '.=' and 'mapping' -- -- @since 0.2.1 type Pair = (Node (), Node ()) -- | @since 0.2.0 (.=) :: ToYAML a => Text -> a -> Pair name .= node = (toYAML name, toYAML node) -- | @since 0.2.0 mapping :: [Pair] -> Node () mapping = Mapping () tagMap .
Map.fromList HsYAML-0.2.1.4/src/Data/YAML/Dumper.hs {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE RecordWildCards #-} {-# LANGUAGE Safe #-} -- | -- Copyright: © Herbert Valerio Riedel 2015-2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- module Data.YAML.Dumper ( encodeNode , encodeNode' ) where import Data.YAML.Event.Internal as YE import Data.YAML.Event.Writer (writeEvents) import Data.YAML.Internal as YI import Data.YAML.Schema.Internal as YS import qualified Data.ByteString.Lazy as BS.L import qualified Data.Map as Map import qualified Data.Text as T -- internal type EvList = [Either String Event] type Node2EvList = [Node ()] -> EvList -- | Dump YAML Nodes as a lazy 'UTF8' encoded 'BS.L.ByteString' -- -- Each YAML 'Node' is emitted as an individual YAML Document where each Document is terminated by a 'DocumentEnd' indicator. -- -- This is a convenience wrapper over `encodeNode'` -- -- @since 0.2.0 encodeNode :: [Doc (Node ())] -> BS.L.ByteString encodeNode = encodeNode' coreSchemaEncoder UTF8 -- | Customizable variant of 'encodeNode' -- -- __NOTE__: A leading BOM will be emitted for all encodings /other than/ 'UTF8'.
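-- -- A minimal sketch of a direct invocation (assuming the 'UTF16LE' 'Encoding' constructor and the 'coreSchemaEncoder'; the name @out@ is illustrative only) could look like: -- -- @ -- out :: BS.L.ByteString -- out = encodeNode' 'coreSchemaEncoder' UTF16LE [Doc (YI.Scalar () (YS.SStr "hi"))] -- @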
-- -- @since 0.2.0 encodeNode' :: SchemaEncoder -> Encoding -> [Doc (Node ())] -> BS.L.ByteString encodeNode' SchemaEncoder{..} encoding nodes = writeEvents encoding $ map getEvent (dumpEvents (map docRoot nodes)) where getEvent :: Either String Event -> Event getEvent = \x -> case x of Right ev -> ev Left str -> error str dumpEvents :: Node2EvList dumpEvents nodes' = Right StreamStart: go0 nodes' where go0 :: [Node ()] -> EvList go0 [] = [Right StreamEnd] go0 n = Right (DocumentStart NoDirEndMarker): goNode (0 :: Int) n (\ev -> go0 ev) goNode :: Int -> [Node ()] -> Node2EvList -> EvList goNode _ [] _ = [Left "Dumper: unexpected pattern in goNode"] goNode lvl (node: rest) cont = case node of YI.Scalar _ scalar -> goScalar scalar Nothing: isDocEnd lvl rest cont Mapping _ tag m -> Right (MappingStart Nothing (getTag schemaEncoderMapping tag) Block) : goMap (lvl + 1) m rest cont Sequence _ tag s -> Right (SequenceStart Nothing (getTag schemaEncoderSequence tag) Block) : goSeq (lvl + 1) s rest cont Anchor _ nid n -> goAnchor lvl nid n rest cont goScalar :: YS.Scalar -> Maybe Anchor -> Either String Event goScalar s anc = case schemaEncoderScalar s of Right (t, sty, text) -> Right (YE.Scalar anc t sty text) Left err -> Left err goMap :: Int -> Mapping () -> [Node ()] -> Node2EvList -> EvList goMap lvl m rest cont = case (mapToList m) of [] -> Right MappingEnd : isDocEnd (lvl - 1) rest cont list -> goNode lvl list g where g [] = Right MappingEnd : isDocEnd (lvl - 1) rest cont g rest' = goNode lvl rest' g mapToList = Map.foldrWithKey (\k v a -> k : v : a) [] goSeq :: Int -> [Node ()] -> [Node ()] -> Node2EvList -> EvList goSeq lvl [] rest cont = Right SequenceEnd : isDocEnd (lvl - 1) rest cont goSeq lvl nod rest cont = goNode lvl nod g where g [] = Right SequenceEnd : isDocEnd (lvl - 1) rest cont g rest' = goNode lvl rest' g goAnchor :: Int -> NodeId -> Node () -> [Node ()] -> Node2EvList -> EvList goAnchor lvl nid nod rest cont = case nod of YI.Scalar _ scalar -> 
goScalar scalar (ancName nid): isDocEnd lvl rest cont Mapping _ tag m -> Right (MappingStart (ancName nid) (getTag schemaEncoderMapping tag) Block) : goMap (lvl + 1) m rest cont Sequence _ tag s -> Right (SequenceStart (ancName nid) (getTag schemaEncoderSequence tag) Block) : goSeq (lvl + 1) s rest cont Anchor _ _ _ -> Left "Anchor has an anchor node" : (cont rest) isDocEnd :: Int -> [Node ()] -> Node2EvList -> EvList isDocEnd lvl rest cont = if lvl == 0 then Right (DocumentEnd (rest /= [])): (cont rest) else (cont rest) ancName :: NodeId -> Maybe Anchor ancName nid | nid == (0-1) = Nothing | otherwise = Just $! T.pack ("a" ++ show nid) getTag :: (Tag -> Either String Tag) -> Tag -> Tag getTag f tag = case f tag of Right t -> t Left err -> error err HsYAML-0.2.1.4/src/Data/YAML/Event.hs {-# LANGUAGE RecordWildCards #-} {-# LANGUAGE Safe #-} -- | -- Copyright: © Herbert Valerio Riedel 2015-2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- -- Event-stream oriented YAML parsing and serializing API module Data.YAML.Event ( -- * Tutorial -- $start -- ** Parsing YAML Documents -- $parsing parseEvents -- ** Serializing Events to YAML Character Stream -- $serialize , writeEvents , writeEventsText -- ** How to comment your YAML document for best results -- $commenting -- ** Event-stream Internals , EvStream , Event(..) , EvPos(..) , Directives(..) , ScalarStyle(..) , NodeStyle(..) , Chomp(..) , IndentOfs(..) , Tag, untagged, isUntagged, tagToText, mkTag , Anchor , Pos(..) ) where import Data.YAML.Event.Internal import Data.YAML.Event.Writer (writeEvents, writeEventsText) import qualified Data.ByteString.Lazy as BS.L import qualified Data.Char as C import qualified Data.Map as Map import qualified Data.Text as T import qualified Data.YAML.Token as Y import Numeric (readHex) import Util -- | Construct YAML tag mkTag :: String -> Tag mkTag "" = error "mkTag" mkTag "!" = Tag (Just $!
T.pack "!") mkTag s = Tag (Just $! tagUnescape s) where tagUnescape = T.pack . go where go [] = [] go ('%':h:l:cs) | Just c <- decodeL1 [h,l] = c : go cs go (c:cs) = c : go cs mkTag' :: String -> Tag mkTag' "" = error "mkTag'" mkTag' s = Tag (Just $! T.pack s) mkTag'' :: String -> Tag mkTag'' "" = error "mkTag''" mkTag'' s = Tag (Just $! T.pack ("tag:yaml.org,2002:" ++ s)) -- Returns the position corresponding to the 'Token' tok2pos :: Y.Token -> Pos tok2pos Y.Token { Y.tByteOffset = posByteOffset, Y.tCharOffset = posCharOffset, Y.tLine = posLine, Y.tLineChar = posColumn } = Pos {..} -- Construct a 'EvPos' from the given 'Event' and 'Pos' getEvPos :: Event -> Y.Token -> EvPos getEvPos ev tok = EvPos { eEvent = ev , ePos = tok2pos tok } -- Initial position('Pos' corresponding to the 'StreamStart') initPos :: Pos initPos = Pos { posByteOffset = 0 , posCharOffset = 0 , posLine = 1 , posColumn = 0 } -- internal type TagHandle = Text type Props = (Maybe Text,Tag) getHandle :: [Y.Token] -> Maybe (TagHandle,[Y.Token]) getHandle toks0 = do Y.Token { Y.tCode = Y.BeginHandle } : toks1 <- Just toks0 (hs,Y.Token { Y.tCode = Y.EndHandle } : toks2) <- Just $ span (\Y.Token { Y.tCode = c } -> c `elem` [Y.Indicator,Y.Meta]) toks1 pure (T.pack $ concatMap Y.tText hs, toks2) getUriTag :: [Y.Token] -> Maybe (Text,[Y.Token]) getUriTag toks0 = do Y.Token { Y.tCode = Y.BeginTag } : toks1 <- Just toks0 (hs,Y.Token { Y.tCode = Y.EndTag } : toks2) <- Just $ span (\Y.Token { Y.tCode = c } -> c `elem` [Y.Indicator,Y.Meta]) toks1 pure (T.pack $ concatMap Y.tText hs, toks2) {- WARNING: the code that follows will make you cry; a safety pig is provided below for your benefit. _ _._ _..._ .-', _.._(`)) '-. ` ' /-._.-' ',/ ) \ '. / _ _ | \ | a a / | \ .-. ; '-('' ).-' ,' ; '-; | .' 
\ \ / | 7 .__ _.-\ \ | | | ``/ /` / /,_| | /,_/ / /,_/ '`-' -} fixUpEOS :: EvStream -> EvStream fixUpEOS = go initPos where go :: Pos -> EvStream -> EvStream go _ [] = [] go p [Right (EvPos StreamEnd _)] = [Right (EvPos StreamEnd p)] go _ (e@(Right (EvPos _ p)):es) = e : go p es go _ (e@(Left (p,_)):es) = e : go p es -- | Parse YAML 'Event's from a lazy 'BS.L.ByteString'. -- -- The parsed Events allow us to round-trip at the event-level while preserving many features and presentation details like -- 'Comment's, 'ScalarStyle', 'NodeStyle', 'Anchor's, and the 'Directives' marker along with the YAML document version, -- the 'Chomp'ing indicator, the indentation indicator ('IndentOfs'), ordering, etc. -- It does not preserve non-content whitespace. -- -- The input 'BS.L.ByteString' is expected to have a YAML 1.2 stream -- using the UTF-8, UTF-16 (LE or BE), or UTF-32 (LE or BE) encodings -- (which will be auto-detected). -- parseEvents :: BS.L.ByteString -> EvStream parseEvents = \bs0 -> fixUpEOS $ Right (EvPos StreamStart initPos) : (go0 $ filter (not . isWhite) $ Y.tokenize bs0 False) where isTCode tc = (== tc) . Y.tCode skipPast tc (t : ts) | isTCode tc t = ts | otherwise = skipPast tc ts skipPast _ [] = error "the impossible happened" -- non-content whitespace isWhite :: Y.Token -> Bool isWhite (Y.Token { Y.tCode = Y.Bom }) = True -- BOMs can occur at each doc-start!
isWhite (Y.Token { Y.tCode = Y.White }) = True isWhite (Y.Token { Y.tCode = Y.Indent }) = True isWhite (Y.Token { Y.tCode = Y.Break }) = True isWhite _ = False go0 :: Tok2EvStream go0 [] = [Right (EvPos StreamEnd initPos {- fixed up by fixUpEOS -} )] go0 toks0@(Y.Token { Y.tCode = Y.BeginComment} : _) = goComment toks0 go0 go0 toks0@(Y.Token { Y.tCode = Y.BeginDocument } : _) = go1 dinfo0 toks0 go0 (Y.Token { Y.tCode = Y.DocumentEnd } : rest) = go0 rest -- stray/redundant document-end markers cause this go0 xs = err xs go1 :: DInfo -> Tok2EvStream go1 m (Y.Token { Y.tCode = Y.BeginDocument } : rest) = goDirs m rest go1 _ (tok@Y.Token { Y.tCode = Y.EndDocument } : Y.Token { Y.tCode = Y.DocumentEnd } : rest) = ( Right (getEvPos (DocumentEnd True) tok )): go0 rest go1 _ (tok@Y.Token { Y.tCode = Y.EndDocument } : rest) = ( Right (getEvPos (DocumentEnd False) tok )) : go0 rest go1 m toks0@(Y.Token { Y.tCode = Y.BeginComment} : _) = goComment toks0 (go1 m) go1 m (Y.Token { Y.tCode = Y.BeginNode } : rest) = goNode0 m rest (go1 m) go1 _ xs = err xs -- consume {Begin,End}Directives and emit DocumentStart event goDirs :: DInfo -> Tok2EvStream goDirs m (Y.Token { Y.tCode = Y.BeginDirective } : rest) = goDir1 m rest goDirs m toks0@(Y.Token { Y.tCode = Y.BeginComment} : _) = goComment toks0 (goDirs m) goDirs m (tok@Y.Token { Y.tCode = Y.DirectivesEnd } : rest) | Just (1,mi) <- diVer m = Right (getEvPos (DocumentStart (DirEndMarkerVersion mi)) tok) : go1 m rest | otherwise = Right (getEvPos (DocumentStart DirEndMarkerNoVersion) tok) : go1 m rest goDirs _ xs@(Y.Token { Y.tCode = Y.BeginDocument } : _) = err xs goDirs m xs@(tok : _) = Right (getEvPos (DocumentStart NoDirEndMarker) tok) : go1 m xs goDirs _ xs = err xs -- single directive goDir1 :: DInfo -> [Y.Token] -> EvStream goDir1 m toks0@(Y.Token { Y.tCode = Y.Indicator, Y.tText = "%" } : Y.Token { Y.tCode = Y.Meta, Y.tText = "YAML" } : Y.Token { Y.tCode = Y.Meta, Y.tText = v } : Y.Token { Y.tCode = Y.EndDirective } : rest) | 
diVer m /= Nothing = errMsg "Multiple %YAML directives" toks0 | Just (1,mi) <- decodeVer v = goDirs (m { diVer = Just (1,mi) }) rest -- TODO: warn for non-1.2 | otherwise = errMsg ("Unsupported YAML version " <> show v) toks0 goDir1 m toks0@(Y.Token { Y.tCode = Y.Indicator, Y.tText = "%" } : Y.Token { Y.tCode = Y.Meta, Y.tText = "TAG" } : rest) | Just (h, rest') <- getHandle rest , Just (t, rest'') <- getUriTag rest' = case mapInsertNoDupe h t (diTags m) of Just tm -> goDirs (m { diTags = tm }) (skipPast Y.EndDirective rest'') Nothing -> errMsg ("Multiple %TAG definitions for handle " <> show h) toks0 goDir1 m (Y.Token { Y.tCode = Y.Indicator, Y.tText = "%" } : Y.Token { Y.tCode = Y.Meta, Y.tText = l } : rest) | l `notElem` ["TAG","YAML"] = goDirs m (skipPast Y.EndDirective rest) goDir1 _ xs = err xs -- | Decode versions of the form @\<major\>.\<minor\>@ (e.g. @1.2@) decodeVer :: String -> Maybe (Word,Word) decodeVer s = do (lhs,'.':rhs) <- Just (break (=='.') s) (,) <$> readMaybe lhs <*> readMaybe rhs data DInfo = DInfo { diTags :: Map.Map TagHandle Text , diVer :: Maybe (Word,Word) } dinfo0 :: DInfo dinfo0 = DInfo mempty Nothing errMsg :: String -> Tok2EvStream errMsg msg (tok : _) = [Left (tok2pos tok, msg)] errMsg msg [] = [Left ((Pos (-1) (-1) (-1) (-1)), ("Unexpected end of token stream: " <> msg))] err :: Tok2EvStream err (tok@Y.Token { Y.tCode = Y.Error, Y.tText = msg } : _) = [Left (tok2pos tok, msg)] err (tok@Y.Token { Y.tCode = Y.Unparsed, Y.tText = txt } : _) = [Left (tok2pos tok, ("Lexical error near " ++ show txt))] err (tok@Y.Token { Y.tCode = code } : _) = [Left (tok2pos tok, ("Parse failure near " ++ show code ++ " token"))] err [] = [Left ((Pos (-1) (-1) (-1) (-1)), "Unexpected end of token stream")] goNode0 :: DInfo -> Tok2EvStreamCont goNode0 DInfo {..} = goNode where seqInd "[" = Flow seqInd "-" = Block seqInd _ = error "seqInd: internal error" -- impossible mapInd "{" = Flow mapInd _ = error "mapInd: internal error" -- impossible goNode :: Tok2EvStreamCont goNode
toks0@(Y.Token { Y.tCode = Y.BeginComment} : _) cont = goComment toks0 (flip goNode cont) goNode (tok@Y.Token { Y.tCode = Y.BeginScalar } : rest) cont = goScalar (tok2pos tok) (mempty,untagged) rest (flip goNodeEnd cont) goNode (tok@Y.Token { Y.tCode = Y.BeginSequence } : Y.Token { Y.tCode = Y.Indicator, Y.tText = ind } : rest) cont = Right (getEvPos (SequenceStart Nothing untagged (seqInd ind)) tok): goSeq rest (flip goNodeEnd cont) goNode (tok@Y.Token { Y.tCode = Y.BeginMapping } : Y.Token { Y.tCode = Y.Indicator, Y.tText = ind } : rest) cont = Right (getEvPos (MappingStart Nothing untagged (mapInd ind)) tok) : goMap rest (flip goNodeEnd cont) goNode (tok@Y.Token { Y.tCode = Y.BeginMapping } : rest) cont = Right (getEvPos (MappingStart Nothing untagged Block) tok) : goMap rest (flip goNodeEnd cont) goNode (Y.Token { Y.tCode = Y.BeginProperties } : rest) cont = goProp (mempty,untagged) rest (\p rest' -> goNode' p rest' cont) goNode (tok@Y.Token { Y.tCode = Y.BeginAlias } : Y.Token { Y.tCode = Y.Indicator } : Y.Token { Y.tCode = Y.Meta, Y.tText = anchor } : Y.Token { Y.tCode = Y.EndAlias } : Y.Token { Y.tCode = Y.EndNode } : rest) cont = Right (getEvPos (Alias (T.pack anchor)) tok) : cont rest goNode xs _cont = err xs goNode' :: Props -> Tok2EvStreamCont goNode' props toks0@(Y.Token { Y.tCode = Y.BeginComment} : _) cont = goComment toks0 (flip (goNode' props) cont) goNode' props (tok@Y.Token { Y.tCode = Y.BeginScalar } : rest) cont = goScalar (tok2pos tok) props rest (flip goNodeEnd cont) goNode' (manchor,mtag) (tok@Y.Token { Y.tCode = Y.BeginSequence } : Y.Token { Y.tCode = Y.Indicator, Y.tText = ind } : rest) cont = Right (getEvPos (SequenceStart manchor mtag (seqInd ind)) tok) : goSeq rest (flip goNodeEnd cont) goNode' (manchor,mtag) (tok@Y.Token { Y.tCode = Y.BeginMapping } : Y.Token { Y.tCode = Y.Indicator, Y.tText = ind } : rest) cont = Right (getEvPos (MappingStart manchor mtag (mapInd ind)) tok) : goMap rest (flip goNodeEnd cont) goNode' (manchor,mtag) 
(tok@Y.Token { Y.tCode = Y.BeginMapping } : rest) cont = Right (getEvPos (MappingStart manchor mtag Block) tok) : goMap rest (flip goNodeEnd cont) goNode' _ xs _cont = err xs goNodeEnd :: Tok2EvStreamCont goNodeEnd toks0@(Y.Token { Y.tCode = Y.BeginComment} : _) cont = goComment toks0 (flip goNodeEnd cont) goNodeEnd (Y.Token { Y.tCode = Y.EndNode } : rest) cont = cont rest goNodeEnd xs _cont = err xs goProp :: Props -> [Y.Token] -> (Props -> [Y.Token] -> EvStream) -> EvStream goProp props (Y.Token { Y.tCode = Y.EndProperties } : rest) cont = cont props rest goProp props (Y.Token { Y.tCode = Y.BeginAnchor } : rest) cont = goAnchor props rest (\x y -> goProp x y cont) goProp props (Y.Token { Y.tCode = Y.BeginTag } : rest) cont = goTag props rest (\x y -> goProp x y cont) goProp _props xs _cont = err xs goAnchor :: Props -> [Y.Token] -> (Props -> [Y.Token] -> EvStream) -> EvStream goAnchor props (Y.Token { Y.tCode = Y.Indicator } : rest) cont = goAnchor props rest cont goAnchor (_,tag) (Y.Token { Y.tCode = Y.Meta, Y.tText = anchor } : rest) cont = goAnchor (Just $! T.pack anchor,tag) rest cont goAnchor props (Y.Token { Y.tCode = Y.EndAnchor } : rest) cont = cont props rest goAnchor _ xs _ = err xs goTag :: Props -> [Y.Token] -> (Props -> [Y.Token] -> EvStream) -> EvStream goTag (anchor,_) (Y.Token { Y.tCode = Y.Indicator, Y.tText = "!" } : Y.Token { Y.tCode = Y.EndTag } : rest) cont = cont (anchor,mkTag' "!") rest goTag (anchor,_) (Y.Token { Y.tCode = Y.BeginHandle } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "!" } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "!" } : Y.Token { Y.tCode = Y.EndHandle } : Y.Token { Y.tCode = Y.Meta, Y.tText = tag } : Y.Token { Y.tCode = Y.EndTag } : rest) cont | Just t' <- Map.lookup (T.pack ("!!")) diTags = cont (anchor,mkTag (T.unpack t' ++ tag)) rest | otherwise = cont (anchor,mkTag'' tag) rest goTag (anchor,_) (Y.Token { Y.tCode = Y.Indicator, Y.tText = "!" 
} : Y.Token { Y.tCode = Y.Indicator, Y.tText = "<" } : Y.Token { Y.tCode = Y.Meta, Y.tText = tag } : Y.Token { Y.tCode = Y.Indicator, Y.tText = ">" } : Y.Token { Y.tCode = Y.EndTag } : rest) cont = cont (anchor,mkTag tag) rest goTag (anchor,_) xs@(Y.Token { Y.tCode = Y.BeginHandle } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "!" } : Y.Token { Y.tCode = Y.Meta, Y.tText = h } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "!" } : Y.Token { Y.tCode = Y.EndHandle } : Y.Token { Y.tCode = Y.Meta, Y.tText = tag } : Y.Token { Y.tCode = Y.EndTag } : rest) cont | Just t' <- Map.lookup (T.pack ("!" ++ h ++ "!")) diTags = cont (anchor,mkTag (T.unpack t' ++ tag)) rest | otherwise = err xs goTag (anchor,_) (Y.Token { Y.tCode = Y.BeginHandle } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "!" } : Y.Token { Y.tCode = Y.EndHandle } : Y.Token { Y.tCode = Y.Meta, Y.tText = tag } : Y.Token { Y.tCode = Y.EndTag } : rest) cont | Just t' <- Map.lookup (T.pack ("!")) diTags = cont (anchor,mkTag (T.unpack t' ++ tag)) rest | otherwise = cont (anchor,mkTag' ('!' 
: tag)) rest -- unresolved goTag _ xs _ = err xs goScalar :: Pos -> Props -> Tok2EvStreamCont goScalar pos0 (manchor,tag) toks0 cont = go0 False Plain toks0 where go0 ii sty (Y.Token { Y.tCode = Y.Indicator, Y.tText = ind } : rest) | "'" <- ind = go' ii "" SingleQuoted rest | "\"" <- ind = go' ii "" DoubleQuoted rest | "|" <- ind = go0 True (Literal Clip IndentAuto) rest | ">" <- ind = go0 True (Folded Clip IndentAuto) rest | "+" <- ind = go0 ii (chn sty Keep) rest | "-" <- ind = go0 ii (chn sty Strip) rest | [c] <- ind, '1' <= c, c <= '9' = go0 False (chn' sty (C.digitToInt c)) rest go0 ii sty tok@(Y.Token { Y.tCode = Y.BeginComment} : _) = goComment tok (go0 ii sty) go0 ii sty (Y.Token { Y.tCode = Y.Text, Y.tText = t } : rest) = go' ii t sty rest go0 ii sty (Y.Token { Y.tCode = Y.LineFold } : rest) = go' ii " " sty rest go0 ii sty (Y.Token { Y.tCode = Y.LineFeed } : rest) = go' ii "\n" sty rest go0 _ sty (Y.Token { Y.tCode = Y.EndScalar } : rest) = Right (EvPos (Scalar manchor tag sty mempty) pos0) : cont rest go0 _ _ xs = err xs chn :: ScalarStyle -> Chomp -> ScalarStyle chn (Literal _ digit) chmp = Literal chmp digit chn (Folded _ digit) chmp = Folded chmp digit chn _ _ = error "impossible" chn' :: ScalarStyle -> Int -> ScalarStyle chn' (Literal b _) digit = Literal b (toEnum digit) chn' (Folded b _) digit = Folded b (toEnum digit) chn' _ _ = error "impossible" ---------------------------------------------------------------------------- go' ii acc sty (Y.Token { Y.tCode = Y.Text, Y.tText = t } : rest) = go' ii (acc ++ t) sty rest go' ii acc sty (Y.Token { Y.tCode = Y.LineFold } : rest) = go' ii (acc ++ " ") sty rest go' ii acc sty (Y.Token { Y.tCode = Y.LineFeed } : rest) = go' ii (acc ++ "\n") sty rest go' ii acc sty@SingleQuoted (Y.Token { Y.tCode = Y.BeginEscape } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "'" } : Y.Token { Y.tCode = Y.Meta, Y.tText = "'" } : Y.Token { Y.tCode = Y.EndEscape } : rest) = go' ii (acc ++ "'") sty rest go' ii acc 
sty@SingleQuoted (Y.Token { Y.tCode = Y.Indicator, Y.tText = "'" } : rest) = go' ii acc sty rest go' ii acc sty@DoubleQuoted (Y.Token { Y.tCode = Y.BeginEscape } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "\\" } : -- Y.Token { Y.tCode = Y.Break } : Y.Token { Y.tCode = Y.EndEscape } : rest) = go' ii acc sty rest -- line continuation escape go' ii acc sty@DoubleQuoted (Y.Token { Y.tCode = Y.BeginEscape } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "\\" } : Y.Token { Y.tCode = Y.Meta, Y.tText = t } : Y.Token { Y.tCode = Y.EndEscape } : rest) | Just t' <- unescape t = go' ii (acc ++ t') sty rest go' ii acc sty@DoubleQuoted (Y.Token { Y.tCode = Y.BeginEscape } : Y.Token { Y.tCode = Y.Indicator, Y.tText = "\\" } : Y.Token { Y.tCode = Y.Indicator, Y.tText = pfx } : Y.Token { Y.tCode = Y.Meta, Y.tText = ucode } : Y.Token { Y.tCode = Y.EndEscape } : rest) | pfx == "U", Just c <- decodeCP2 ucode = go' ii (acc ++ [c]) sty rest | pfx == "u", Just c <- decodeCP ucode = go' ii (acc ++ [c]) sty rest | pfx == "x", Just c <- decodeL1 ucode = go' ii (acc ++ [c]) sty rest go' ii acc sty@DoubleQuoted (Y.Token { Y.tCode = Y.Indicator, Y.tText = "\"" } : rest) = go' ii acc sty rest go' ii acc sty (t@Y.Token { Y.tCode = Y.EndScalar } : rest) | ii, hasLeadingSpace acc = [Left (tok2pos t, "leading empty lines contain more spaces than the first non-empty line in scalar: " ++ show acc)] | otherwise = Right (EvPos (Scalar manchor tag sty (T.pack acc)) pos0) : cont rest go' _ _ _ xs = err xs hasLeadingSpace (' ':_) = True hasLeadingSpace ('\n':cs) = hasLeadingSpace cs hasLeadingSpace _ = False goSeq :: Tok2EvStreamCont goSeq (tok@Y.Token { Y.tCode = Y.EndSequence } : rest) cont = Right (getEvPos SequenceEnd tok): cont rest goSeq toks0@(Y.Token { Y.tCode = Y.BeginComment} : _) cont = goComment toks0 (flip goSeq cont) goSeq (Y.Token { Y.tCode = Y.BeginNode } : rest) cont = goNode rest (flip goSeq cont) goSeq (tok@Y.Token { Y.tCode = Y.BeginMapping } : Y.Token { Y.tCode = Y.Indicator, Y.tText 
= ind } : rest) cont = Right (getEvPos (MappingStart Nothing untagged (mapInd ind)) tok) : goMap rest (flip goSeq cont) goSeq (tok@Y.Token { Y.tCode = Y.BeginMapping } : rest) cont = Right (getEvPos (MappingStart Nothing untagged Block) tok) : goMap rest (flip goSeq cont) goSeq (Y.Token { Y.tCode = Y.Indicator } : rest) cont = goSeq rest cont -- goSeq xs _cont = error (show xs) goSeq xs _cont = err xs goMap :: Tok2EvStreamCont goMap (tok@Y.Token { Y.tCode = Y.EndMapping } : rest) cont = Right (getEvPos MappingEnd tok) : cont rest goMap toks0@(Y.Token { Y.tCode = Y.BeginComment} : _) cont = goComment toks0 (flip goMap cont) goMap (Y.Token { Y.tCode = Y.BeginPair } : rest) cont = goPair1 rest (flip goMap cont) goMap (Y.Token { Y.tCode = Y.Indicator } : rest) cont = goMap rest cont goMap xs _cont = err xs goPair1 (Y.Token { Y.tCode = Y.BeginNode } : rest) cont = goNode rest (flip goPair2 cont) goPair1 toks0@(Y.Token { Y.tCode = Y.BeginComment} : _) cont = goComment toks0 (flip goPair1 cont) goPair1 (Y.Token { Y.tCode = Y.Indicator } : rest) cont = goPair1 rest cont goPair1 xs _cont = err xs goPair2 toks0@(Y.Token { Y.tCode = Y.BeginComment} : _) cont = goComment toks0 (flip goPair2 cont) goPair2 (Y.Token { Y.tCode = Y.BeginNode } : rest) cont = goNode rest (flip goPairEnd cont) goPair2 (Y.Token { Y.tCode = Y.Indicator } : rest) cont = goPair2 rest cont goPair2 xs _cont = err xs goPairEnd toks0@(Y.Token { Y.tCode = Y.BeginComment} : _) cont = goComment toks0 (flip goPairEnd cont) goPairEnd (Y.Token { Y.tCode = Y.EndPair } : rest) cont = cont rest goPairEnd xs _cont = err xs goComment :: Tok2EvStreamCont goComment (tok@Y.Token { Y.tCode = Y.BeginComment} : Y.Token { Y.tCode = Y.Indicator, Y.tText = "#" } : Y.Token { Y.tCode = Y.Meta, Y.tText = comment } : Y.Token { Y.tCode = Y.EndComment } : rest) cont = (Right (getEvPos (Comment (T.pack comment)) tok)) : cont rest goComment (tok@Y.Token { Y.tCode = Y.BeginComment} : Y.Token { Y.tCode = Y.Indicator, Y.tText = "#" } : 
Y.Token { Y.tCode = Y.EndComment } : rest) cont = (Right (getEvPos (Comment T.empty) tok)) : cont rest goComment xs _cont = err xs -- stripComments :: [Y.Token] -> [Y.Token] -- stripComments (Y.Token { Y.tCode = Y.BeginComment } : rest) = skip rest -- where -- skip (Y.Token { Y.tCode = Y.EndComment } : rest') = stripComments rest' -- skip (_ : rest') = skip rest' -- skip [] = error "the impossible happened" -- stripComments (t : rest) = t : stripComments rest -- stripComments [] = [] type Tok2EvStream = [Y.Token] -> EvStream type Tok2EvStreamCont = [Y.Token] -> Cont EvStream [Y.Token] type Cont r a = (a -> r) -> r -- decode 8-hex-digit unicode code-point decodeCP2 :: String -> Maybe Char decodeCP2 s = case s of [_,_,_,_,_,_,_,_] | all C.isHexDigit s , [(j, "")] <- readHex s -> Just (chr (fromInteger j)) _ -> Nothing -- decode 4-hex-digit unicode code-point decodeCP :: String -> Maybe Char decodeCP s = case s of [_,_,_,_] | all C.isHexDigit s , [(j, "")] <- readHex s -> Just (chr (fromInteger j)) _ -> Nothing -- decode 2-hex-digit latin1 code-point decodeL1 :: String -> Maybe Char decodeL1 s = case s of [_,_] | all C.isHexDigit s , [(j, "")] <- readHex s -> Just (chr (fromInteger j)) _ -> Nothing -- decode C-style escapes unescape :: String -> Maybe String unescape [c] = Map.lookup c m where m = Map.fromList [ (k,[v]) | (k,v) <- escapes ] escapes :: [(Char,Char)] escapes = [ ('0', '\0') , ('a', '\x7') , ('b', '\x8') , ('\x9', '\x9') , ('t', '\x9') , ('n', '\xa') , ('v', '\xb') , ('f', '\xc') , ('r', '\xd') , ('e', '\x1b') , (' ', ' ') , ('"', '"') , ('/', '/') , ('\\', '\\') , ('N', '\x85') , ('_', '\xa0') , ('L', '\x2028') , ('P', '\x2029') ] unescape _ = Nothing -- -- $start -- -- The "Data.YAML" module provides an API which allows us to interact with YAML data at the cost of some presentation details. -- In contrast, this module provides an API which gives access to other significant details like 'ScalarStyle's, 'NodeStyle's, 'Comment's, etc.
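The fixed-width hex decoders above (decodeCP2, decodeCP, decodeL1) and the %YAML directive parser share a simple shape that can be exercised standalone. A minimal, base-only sketch follows; decodeHex is a hypothetical generalization for illustration and is not part of the HsYAML API, while decodeVer mirrors the directive parser shown earlier:

```haskell
-- Standalone sketch of the escape/version decoding used above.
-- decodeHex is a HYPOTHETICAL generalization of decodeCP2/decodeCP/decodeL1
-- (one function parameterized over the expected digit count);
-- decodeVer mirrors the %YAML directive parser.
import Data.Char (chr, isHexDigit)
import Numeric (readHex)
import Text.Read (readMaybe)

-- | Decode a fixed-width hex escape payload (e.g. \xNN, \uNNNN, \UNNNNNNNN)
decodeHex :: Int -> String -> Maybe Char
decodeHex width s
  | length s == width
  , all isHexDigit s
  , [(j, "")] <- readHex s :: [(Integer, String)]
  = Just (chr (fromInteger j))
  | otherwise = Nothing

-- | Decode versions of the form "major.minor"
decodeVer :: String -> Maybe (Word, Word)
decodeVer s = do
  (lhs, '.':rhs) <- Just (break (== '.') s)
  (,) <$> readMaybe lhs <*> readMaybe rhs
```

The width check is what distinguishes the three decoders in the module: a @\\u@ escape must supply exactly four hex digits, so `decodeHex 4 "41"` fails even though @41@ is valid hex.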
-- -- $parsing -- -- Suppose you want to parse this YAML document while preserving its format and comments: -- -- @ -- # Home runs -- hr: 65 -- # Runs Batted In -- rbi: 147 -- @ -- -- then you might want to use the function 'parseEvents'. -- -- The following is a reference implementation of a function using 'parseEvents'. -- It takes a YAML document as input and prints the parsed YAML 'Event's. -- -- @ -- import Control.Monad (forM_) -- import Data.YAML.Event -- import qualified Data.ByteString.Lazy as BS.L -- -- printEvents :: BS.L.ByteString -> IO () -- printEvents input = -- forM_ ('parseEvents' input) $ \ev -> case ev of -- Left _ -> error "Failed to parse" -- Right event -> print ('eEvent' event) -- @ -- -- When we pass the above-mentioned YAML document to the function /printEvents/ it outputs the following: -- -- > StreamStart -- > DocumentStart NoDirEndMarker -- > MappingStart Nothing Nothing Block -- > Comment " Home runs" -- > Scalar Nothing Nothing Plain "hr" -- > Scalar Nothing Nothing Plain "65" -- > Comment " Runs Batted In" -- > Scalar Nothing Nothing Plain "rbi" -- > Scalar Nothing Nothing Plain "147" -- > MappingEnd -- > DocumentEnd False -- > StreamEnd -- -- Notice that now we have all the necessary details in the form of 'Event's. -- -- We can now write simple functions to work with this data without losing any more details. -- -- $serialize -- -- Now, suppose you want to generate the YAML document back after manipulating the event stream; -- then you might want to use 'writeEvents'. -- -- The following function takes a YAML document as input and dumps it back to stdout after a round-trip. -- -- @ -- import Data.YAML.Event -- import System.IO (stdout, hFlush) -- import qualified Data.YAML.Token as YT -- import qualified Data.ByteString.Lazy as BS.L -- -- yaml2yaml :: BS.L.ByteString -> IO () -- yaml2yaml input = case sequence $ parseEvents input of -- Left _ -> error "Parsing Failure!"
-- Right events -> do -- BS.L.hPutStr stdout (writeEvents YT.UTF8 (map eEvent events)) -- hFlush stdout -- @ -- -- Let this be the sample document passed to the above function -- -- @ -- # This is a 'Directives' Marker -- --- -- # All 'Comment's are preserved -- date : 2019-07-12 -- bill-to : # 'Anchor' represents a map node -- &id001 -- address: -- lines: # This is a Block 'Scalar' with 'Keep' chomping Indicator and 'IndentAuto' Indentation indicator -- |+ # Extra Indentation (non-content white space) will not be preserved -- Vijay -- IIT Hyderabad -- -- -- # Trailing newlines are preserved here as they are part of the 'Scalar' node -- country : India -- ship-to : # This is an 'Alias' -- *id001 -- # Key is a 'Scalar' and Value is a Sequence -- Other Details: -- total: $ 3000 -- # 'Tag's are also preserved -- Online Payment: !!bool True -- product: -- - Item1 -- # This comment is inside a Sequence -- - Item2 -- ... -- # 'DocumentEnd' True -- # 'StreamEnd' -- @ -- -- This function outputs the following -- -- @ -- # This is a 'Directives' Marker -- --- -- # All 'Comment's are preserved -- date: 2019-07-12 -- bill-to: # 'Anchor' represents a map node -- &id001 -- address: -- lines: # This is a Block 'Scalar' with 'Keep' chomping Indicator and 'IndentAuto' Indentation indicator -- # Extra Indentation (non-content white space) will not be preserved -- |+ -- Vijay -- IIT Hyderabad -- -- -- # Trailing newlines are preserved here as they are part of the 'Scalar' node -- country: India -- ship-to: # This is an 'Alias' -- *id001 -- # Key is a 'Scalar' and Value is a Sequence -- Other Details: -- total: $ 3000 -- # 'Tag's are also preserved -- Online Payment: !!bool True -- product: -- - Item1 -- # This comment is inside a Sequence -- - Item2 -- ...
-- # 'DocumentEnd' True -- # 'StreamEnd' -- @ -- -- $commenting -- -- Round-tripping at the event level preserves all comments and their relative positions in the YAML document, but we still -- lose some information, such as the exact indentation and the position at which the comments previously appeared. -- This information can sometimes be quite important for the human perception of comments. -- Below are some guidelines to help you avoid ambiguities. -- -- 1) Always try to start your comment on a new line. This avoids most ambiguities. -- -- 2) Comments automatically get indented according to the level at which they are present. For example, -- -- Input YAML-document -- -- @ -- # Level 0 -- - a -- # Level 0 -- - - a -- # Level 1 -- - a -- - - a -- # Level 2 -- - a -- @ -- -- After a round-trip, it looks like -- -- @ -- # Level 0 -- - a -- # Level 0 -- - - a -- # Level 1 -- - a -- - - a -- # Level 2 -- - a -- @ -- -- 3) Comments immediately after a 'Scalar' node start from a new line. So avoid commenting just after a scalar ends, as it may lead to some ambiguity. For example, -- -- Input YAML-document -- -- @ -- - scalar # After scalar -- - random : scalar # After scalar -- key: 1 -- # not after scalar -- - random : scalar -- key: 1 -- - random : # not after scalar -- scalar -- # not after scalar -- key: 1 -- @ -- -- After a round-trip, it looks like -- -- @ -- - scalar -- # After scalar -- - random: scalar -- # After scalar -- key: 1 -- # not after scalar -- - random: scalar -- key: 1 -- - random: # not after scalar -- scalar -- # not after scalar -- key: 1 -- @ -- -- 4) Similarly, in flow style, avoid commenting immediately after a /comma/ (@,@) separator.
Comments immediately after a /comma/ (@,@) separator will start on a new line. -- -- Input YAML-document -- -- @ -- { -- # comment 0 -- Name: Vijay # comment 1 -- , -- # comment 2 -- age: 19, # comment 3 -- # comment 4 -- country: India # comment 5 -- } -- @ -- -- After a round-trip, it looks like -- -- @ -- { -- # comment 0 -- Name: Vijay, -- # comment 1 -- # comment 2 -- age: 19, -- # comment 3 -- # comment 4 -- country: India, -- # comment 5 -- } -- @ -- -- 5) Avoid commenting in between syntactic elements. For example, -- -- Input YAML-document -- -- @ -- ? # Complex key starts -- [ -- a, -- b -- ] -- # Complex key ends -- : # Complex Value starts -- ? # Complex key starts -- [ -- a, -- b -- ] -- # Complex key ends -- : # Simple value -- a -- # Complex value ends -- @ -- -- After a round-trip, it looks like -- -- @ -- ? # Complex key starts -- [ -- a, -- b -- ] -- : # Complex key ends -- # Complex Value starts -- -- ? # Complex key starts -- [ -- a, -- b -- ] -- : # Complex key ends -- # Simple value -- a -- # Complex value ends -- @ -- -- The above two YAML documents produce the same 'Event'-stream after parsing. -- -- So, these are some limitations of this format-preserving YAML processor. HsYAML-0.2.1.4/src/Data/YAML/Event/0000755000000000000000000000000007346545000014365 5ustar0000000000000000HsYAML-0.2.1.4/src/Data/YAML/Event/Internal.hs0000644000000000000000000001257307346545000016505 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE DeriveGeneric #-} {-# LANGUAGE Safe #-} -- | -- Copyright: © Herbert Valerio Riedel 2015-2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- module Data.YAML.Event.Internal ( EvStream , Event(..) , EvPos(..) , Directives(..) , ScalarStyle(..) , Chomp(..) , IndentOfs(..) , NodeStyle(..) , scalarNodeStyle , Tag(..), untagged, isUntagged, tagToText , Anchor , Pos(..) , Y.Encoding(..)
) where import qualified Data.Text as T import Data.YAML.Pos (Pos (..)) import qualified Data.YAML.Token as Y import Util -- | YAML Event Types -- -- The events correspond to the ones from [LibYAML](http://pyyaml.org/wiki/LibYAML) -- -- The grammar below defines well-formed streams of 'Event's: -- -- @ -- stream ::= 'StreamStart' document* 'StreamEnd' -- document ::= 'DocumentStart' node 'DocumentEnd' -- node ::= 'Alias' -- | 'Scalar' -- | 'Comment' -- | sequence -- | mapping -- sequence ::= 'SequenceStart' node* 'SequenceEnd' -- mapping ::= 'MappingStart' (node node)* 'MappingEnd' -- @ -- -- @since 0.2.0 data Event = StreamStart | StreamEnd | DocumentStart !Directives | DocumentEnd !Bool | Comment !Text | Alias !Anchor | Scalar !(Maybe Anchor) !Tag !ScalarStyle !Text | SequenceStart !(Maybe Anchor) !Tag !NodeStyle | SequenceEnd | MappingStart !(Maybe Anchor) !Tag !NodeStyle | MappingEnd deriving (Show, Eq, Generic) -- | @since 0.2.0 instance NFData Event where rnf StreamStart = () rnf StreamEnd = () rnf (DocumentStart _) = () rnf (DocumentEnd _) = () rnf (Comment _) = () rnf (Alias _) = () rnf (Scalar a _ _ _) = rnf a rnf (SequenceStart a _ _) = rnf a rnf SequenceEnd = () rnf (MappingStart a _ _) = rnf a rnf MappingEnd = () -- |'Event' with corresponding Pos in YAML stream -- -- @since 0.2.0 data EvPos = EvPos { eEvent :: !Event , ePos :: !Pos } deriving (Eq, Show, Generic) -- | @since 0.2.0 instance NFData EvPos where rnf (EvPos ev p) = rnf (ev,p) -- | Encodes document @%YAML@ directives and the directives end-marker -- -- @since 0.2.0 data Directives = NoDirEndMarker -- ^ no directives and also no @---@ marker | DirEndMarkerNoVersion -- ^ @---@ marker present, but no explicit @%YAML@ directive present | DirEndMarkerVersion !Word -- ^ @---@ marker present, as well as a @%YAML 1.mi@ version directive; the minor version @mi@ is stored in the 'Word' field. 
deriving (Show, Eq, Generic) -- | @since 0.2.0 instance NFData Directives where rnf !_ = () -- | 'Scalar'-specific node style -- -- This can be considered a more granular superset of 'NodeStyle'. -- See also 'scalarNodeStyle'. -- -- @since 0.2.0 data ScalarStyle = Plain | SingleQuoted | DoubleQuoted | Literal !Chomp !IndentOfs | Folded !Chomp !IndentOfs deriving (Eq,Ord,Show,Generic) -- | @since 0.2.0 instance NFData ScalarStyle where rnf !_ = () -- | Block Chomping Indicator -- -- @since 0.2.0 data Chomp = Strip -- ^ Remove all trailing line breaks; denoted by the @-@ chomping indicator. | Clip -- ^ Keep the first trailing line break; this is also the default behavior used if no explicit chomping indicator is specified. | Keep -- ^ Keep all trailing line breaks; denoted by the @+@ chomping indicator. deriving (Eq,Ord,Show,Generic) -- | @since 0.2.0 instance NFData Chomp where rnf !_ = () -- | Block Indentation Indicator -- -- 'IndentAuto' is the special case denoting automatic block indentation detection -- -- @since 0.2.0 data IndentOfs = IndentAuto | IndentOfs1 | IndentOfs2 | IndentOfs3 | IndentOfs4 | IndentOfs5 | IndentOfs6 | IndentOfs7 | IndentOfs8 | IndentOfs9 deriving (Eq, Ord, Show, Enum, Generic) -- | @since 0.2.0 instance NFData IndentOfs where rnf !_ = () -- | Node style -- -- @since 0.2.0 data NodeStyle = Flow | Block deriving (Eq,Ord,Show,Generic) -- | @since 0.2.0 instance NFData NodeStyle where rnf !_ = () -- | Convert 'ScalarStyle' to 'NodeStyle' -- -- @since 0.2.0 scalarNodeStyle :: ScalarStyle -> NodeStyle scalarNodeStyle Plain = Flow scalarNodeStyle SingleQuoted = Flow scalarNodeStyle DoubleQuoted = Flow scalarNodeStyle (Literal _ _) = Block scalarNodeStyle (Folded _ _) = Block -- | YAML Anchor identifiers type Anchor = Text -- | YAML Tags newtype Tag = Tag (Maybe Text) deriving (Eq,Ord,Generic) instance Show Tag where show (Tag x) = show x -- | @since 0.2.0 instance NFData Tag where rnf (Tag x) = rnf x -- | Event stream produced by 'Data.YAML.Event.parseEvents' --
-- A 'Left' value denotes parsing errors. The event stream ends -- immediately once a 'Left' value is returned. type EvStream = [Either (Pos,String) EvPos] -- | Convert 'Tag' to its string representation -- -- Returns 'Nothing' for 'untagged' tagToText :: Tag -> Maybe T.Text tagToText (Tag x) = x -- | An \"untagged\" YAML tag untagged :: Tag untagged = Tag Nothing -- | Equivalent to @(== 'untagged')@ isUntagged :: Tag -> Bool isUntagged (Tag Nothing) = True isUntagged _ = False HsYAML-0.2.1.4/src/Data/YAML/Event/Writer.hs0000644000000000000000000004757607346545000016220 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE Safe #-} -- | -- Copyright: © Herbert Valerio Riedel 2015-2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- -- Event-stream oriented YAML writer API -- module Data.YAML.Event.Writer ( writeEvents , writeEventsText ) where import Data.YAML.Event.Internal import qualified Data.ByteString.Lazy as BS.L import qualified Data.Char as C import qualified Data.Map as Map import qualified Data.Text as T import Text.Printf (printf) import qualified Data.Text.Lazy as T.L import qualified Data.Text.Lazy.Builder as T.B import qualified Data.Text.Lazy.Encoding as T.L import Util {- WARNING: the code that follows will make you cry; a safety pig is provided below for your benefit. _ _._ _..._ .-', _.._(`)) '-. ` ' /-._.-' ',/ ) \ '. / _ _ | \ | a a / | \ .-. ; '-('' ).-' ,' ; '-; | .' \ \ / | 7 .__ _.-\ \ | | | ``/ /` / /,_| | /,_/ / /,_/ '`-' -} -- | Serialise 'Event's using specified UTF encoding to a lazy 'BS.L.ByteString' -- -- __NOTE__: This function is only well-defined for valid 'Event' streams -- -- @since 0.2.0.0 writeEvents :: Encoding -> [Event] -> BS.L.ByteString writeEvents UTF8 = T.L.encodeUtf8 . writeEventsText writeEvents UTF16LE = T.L.encodeUtf16LE . T.L.cons '\xfeff' . writeEventsText writeEvents UTF16BE = T.L.encodeUtf16BE . T.L.cons '\xfeff' . 
writeEventsText writeEvents UTF32LE = T.L.encodeUtf32LE . T.L.cons '\xfeff' . writeEventsText writeEvents UTF32BE = T.L.encodeUtf32BE . T.L.cons '\xfeff' . writeEventsText -- | Serialise 'Event's to lazy 'T.L.Text' -- -- __NOTE__: This function is only well-defined for valid 'Event' streams -- -- @since 0.2.0.0 writeEventsText :: [Event] -> T.L.Text writeEventsText [] = mempty writeEventsText (StreamStart:xs) = T.B.toLazyText $ goStream xs (error "writeEvents: internal error") where -- goStream :: [Event] -> [Event] -> T.B.Builder goStream [StreamEnd] _ = mempty goStream (StreamEnd : _ : _ ) _cont = error "writeEvents: events after StreamEnd" goStream (Comment com: rest) cont = goComment (0 :: Int) True BlockIn com (goStream rest cont) goStream (DocumentStart marker : rest) cont = case marker of NoDirEndMarker -> putNode False rest (\zs -> goDoc zs cont) DirEndMarkerNoVersion -> "---" <> putNode True rest (\zs -> goDoc zs cont) DirEndMarkerVersion mi -> "%YAML 1." <> (T.B.fromString (show mi)) <> "\n---" <> putNode True rest (\zs -> goDoc zs cont) goStream (x:_) _cont = error ("writeEvents: unexpected " ++ show x ++ " (expected DocumentStart or StreamEnd)") goStream [] _cont = error ("writeEvents: unexpected end of stream (expected DocumentStart or StreamEnd)") goDoc (DocumentEnd marker : rest) cont = (if marker then "...\n" else mempty) <> goStream rest cont goDoc (Comment com: rest) cont = goComment (0 :: Int) True BlockIn com (goDoc rest cont) goDoc ys _ = error (show ys) -- unexpected s l = error ("writeEvents: unexpected " ++ show l ++ " " ++ show s) writeEventsText (x:_) = error ("writeEvents: unexpected " ++ show x ++ " (expected StreamStart)") -- | Production context -- copied from Data.YAML.Token data Context = BlockOut -- ^ Outside block sequence. | BlockIn -- ^ Inside block sequence. | BlockKey -- ^ Implicit block key. | FlowOut -- ^ Outside flow collection. | FlowIn -- ^ Inside flow collection. | FlowKey -- ^ Implicit flow key. 
deriving (Eq,Show) goComment :: Int -> Bool -> Context -> T.Text -> T.B.Builder -> T.B.Builder goComment !n !sol c comment cont = doSol <> "#" <> (T.B.fromText comment) <> doEol <> doIndent <> cont where doEol | not sol && n == 0 = mempty -- "--- " case | sol && FlowIn == c = mempty | otherwise = eol doSol | not sol && (BlockOut == c || FlowOut == c) = ws | sol = mkInd n' | otherwise = eol <> mkInd n' n' | BlockOut <- c = max 0 (n - 1) | FlowOut <- c = n + 1 | otherwise = n doIndent | BlockOut <- c = mkInd n' | FlowOut <- c = mkInd n' | otherwise = mempty putNode :: Bool -> [Event] -> ([Event] -> T.B.Builder) -> T.B.Builder putNode = \docMarker -> go (-1 :: Int) (not docMarker) BlockIn where {- s-l+block-node(n,c) [196] s-l+block-node(n,c) ::= s-l+block-in-block(n,c) | s-l+flow-in-block(n) [197] s-l+flow-in-block(n) ::= s-separate(n+1,flow-out) ns-flow-node(n+1,flow-out) s-l-comments [198] s-l+block-in-block(n,c) ::= s-l+block-scalar(n,c) | s-l+block-collection(n,c) [199] s-l+block-scalar(n,c) ::= s-separate(n+1,c) ( c-ns-properties(n+1,c) s-separate(n+1,c) )? ( c-l+literal(n) | c-l+folded(n) ) [200] s-l+block-collection(n,c) ::= ( s-separate(n+1,c) c-ns-properties(n+1,c) )? 
s-l-comments ( l+block-sequence(seq-spaces(n,c)) | l+block-mapping(n) ) [201] seq-spaces(n,c) ::= c = block-out ⇒ n-1 c = block-in ⇒ n -} go :: Int -> Bool -> Context -> [Event] -> ([Event] -> T.B.Builder) -> T.B.Builder go _ _ _ [] _cont = error ("putNode: expected node-start event instead of end-of-stream") go !n !sol c (t : rest) cont = case t of Scalar anc tag sty t' -> goStr (n+1) sol c anc tag sty t' (cont rest) SequenceStart anc tag sty -> goSeq (n+1) sol (chn sty) anc tag sty rest cont MappingStart anc tag sty -> goMap (n+1) sol (chn sty) anc tag sty rest cont Alias a -> pfx <> goAlias c a (cont rest) Comment com -> goComment (n+1) sol c com (go n sol c rest cont) _ -> error ("putNode: expected node-start event instead of " ++ show t) where pfx | sol = mempty | BlockKey <- c = mempty | FlowKey <- c = mempty | otherwise = T.B.singleton ' ' chn sty | Flow <-sty, (BlockIn == c || BlockOut == c) = FlowOut | otherwise = c goMap _ sol _ anc tag _ (MappingEnd : rest) cont = pfx $ "{}\n" <> cont rest where pfx cont' = wsSol sol <> anchorTag'' (Right ws) anc tag cont' goMap n sol c anc tag Block xs cont = case c of BlockIn | not (not sol && n == 0) -- avoid "--- " case -> wsSol sol <> anchorTag'' (Right (eol <> mkInd n)) anc tag (putKey xs putValue') _ -> anchorTag'' (Left ws) anc tag $ doEol <> g' xs where g' (MappingEnd : rest) = cont rest -- All comments should be part of the key g' ys = pfx <> putKey ys putValue' g (Comment com: rest) = goComment n True c' com (g rest) -- For trailing comments g (MappingEnd : rest) = cont rest g ys = pfx <> putKey ys putValue' pfx = if c == BlockIn || c == BlockOut || c == BlockKey then mkInd n else ws c' = if FlowIn == c then FlowKey else BlockKey doEol = case c of FlowKey -> mempty FlowIn -> mempty _ -> eol putKey zs cont2 | isSmallKey zs = go n (n == 0) c' zs (\ys -> ":" <> cont2 ys) | Comment com: rest <- zs = "?" <> ws <> goComment 0 True BlockIn com (f rest cont2) | otherwise = "?" 
<> go n False BlockIn zs (putValue cont2) f (Comment com: rest) cont2 = goComment (n + 1) True BlockIn com (f rest cont2) -- Comments should not change position in key f zs cont2 = ws <> mkInd n <> go n False BlockIn zs (putValue cont2) putValue cont2 zs | FlowIn <- c = ws <> mkInd (n - 1) <> ":" <> cont2 zs | otherwise = mkInd n <> ":" <> cont2 zs putValue' (Comment com: rest) = goComment (n + 1) False BlockOut com (ws <> putValue' rest) -- Comments should not change position in value putValue' zs = go n False (if FlowIn == c then FlowIn else BlockOut) zs g goMap n sol c anc tag Flow xs cont = wsSol sol <> anchorTag'' (Right ws) anc tag ("{" <> f xs) where f (Comment com: rest) = eol <> wsSol sol <> goComment n' True (inFlow c) com (f rest) f (MappingEnd : rest) = eol <> wsSol sol <> mkInd (n - 1) <> "}" <> doEol <> cont rest f ys = eol <> mkInd n' <> putKey ys putValue' n' = n + 1 doEol = case c of FlowKey -> mempty FlowIn -> mempty _ -> eol g (Comment com: rest) = "," <> eol <> wsSol sol <> goComment n' True (inFlow c) com (f rest) g (MappingEnd : rest) = eol <> wsSol sol <> mkInd (n - 1) <> "}" <> doEol <> cont rest g ys = "," <> eol <> mkInd n' <> putKey ys putValue' putKey zs cont2 | (Comment com: rest) <- zs = goComment n' True c com (eol <> mkInd n' <> putKey rest cont2) | isSmallKey zs = go n' (n == 0) FlowKey zs (if isComEv zs then putValue cont2 else (\ys -> ":" <> cont2 ys)) | otherwise = "?" 
<> go n False FlowIn zs (putValue cont2) putValue cont2 zs | Comment com: rest <- zs = eol <> wsSol sol <> goComment n' True (inFlow c) com (putValue cont2 rest) | otherwise = eol <> mkInd n' <> ":" <> cont2 zs putValue' zs | Comment com : rest <- zs = goComment n' False FlowOut com (putValue' rest) | otherwise = go n' False FlowIn zs g goSeq _ sol _ anc tag _ (SequenceEnd : rest) cont = pfx $ "[]\n" <> cont rest where pfx cont' = wsSol sol <> anchorTag'' (Right ws) anc tag cont' goSeq n sol c anc tag Block xs cont = case c of BlockOut -> anchorTag'' (Left ws) anc tag (eol <> if isComEv xs then "-" <> eol <> f xs else g xs) BlockIn | not sol && n == 0 {- "---" case -} -> goSeq n sol BlockOut anc tag Block xs cont | Comment com: rest <- xs -> wsSol sol <> anchorTag'' (Right (eol <> mkInd n')) anc tag ("-" <> ws <> goComment 0 True BlockIn com (f rest)) | otherwise -> wsSol sol <> anchorTag'' (Right (eol <> mkInd n')) anc tag ("-" <> go n' False BlockIn xs g) BlockKey -> error "sequence in block-key context not supported" _ -> error "Invalid Context in Block style" where n' | BlockOut <- c = max 0 (n - 1) | otherwise = n g (Comment com: rest) = goComment n' True BlockIn com (g rest) g (SequenceEnd : rest) = cont rest g ys = mkInd n' <> "-" <> go n' False BlockIn ys g f (Comment com: rest) = goComment n' True BlockIn com (f rest) f (SequenceEnd : rest) = cont rest f ys = ws <> mkInd n' <> go n' False BlockIn ys g goSeq n sol c anc tag Flow xs cont = wsSol sol <> anchorTag'' (Right ws) anc tag ("[" <> f xs) where f (Comment com: rest) = eol <> wsSol sol <> goComment n' True (inFlow c) com (f rest) f (SequenceEnd : rest) = eol <> wsSol sol <> mkInd (n - 1) <> "]" <> doEol <> cont rest f ys = eol <> mkInd n' <> go n' False (inFlow c) ys g n' = n + 1 doEol = case c of FlowKey -> mempty FlowIn -> mempty _ -> eol g (Comment com: rest) = "," <> eol <> wsSol sol <> goComment n' True (inFlow c) com (f rest) g (SequenceEnd : rest) = eol <> wsSol sol <> mkInd (n - 1) <> "]" <> 
doEol <> cont rest g ys = "," <> eol <> mkInd n' <> go n' False (inFlow c) ys g goAlias c a cont = T.B.singleton '*' <> T.B.fromText a <> sep <> cont where sep = case c of BlockIn -> eol BlockOut -> eol BlockKey -> T.B.singleton ' ' FlowIn -> mempty FlowOut -> eol FlowKey -> T.B.singleton ' ' goStr :: Int -> Bool -> Context -> Maybe Anchor -> Tag -> ScalarStyle -> Text -> T.B.Builder -> T.B.Builder goStr !n !sol c anc tag sty t cont = case sty of -- flow-style Plain -- empty scalars | t == "" -> case () of _ | Nothing <- anc, Tag Nothing <- tag -> contEol -- not even node properties | sol -> anchorTag0 anc tag (if c == BlockKey || c == FlowKey then ws <> cont else contEol) | BlockKey <- c -> anchorTag0 anc tag (ws <> cont) | FlowKey <- c -> anchorTag0 anc tag (ws <> cont) | otherwise -> anchorTag'' (Left ws) anc tag contEol Plain -> pfx $ let h [] = contEol h (x:xs) = T.B.fromText x <> f' xs where f' [] = contEol f' (y:ys) = eol <> mkInd (n+1) <> T.B.fromText y <> f' ys in h (insFoldNls (T.lines t)) -- FIXME: unquoted plain-strings can't handle leading/trailing whitespace properly SingleQuoted -> pfx $ T.B.singleton '\'' <> f (insFoldNls $ T.lines (T.replace "'" "''" t) ++ [ mempty | T.isSuffixOf "\n" t]) (T.B.singleton '\'' <> contEol) -- FIXME: leading white-space (i.e. 
SPC) before/after LF DoubleQuoted -> pfx $ T.B.singleton '"' <> T.B.fromText (escapeDQ t) <> T.B.singleton '"' <> contEol -- block style Folded chm iden -> pfx $ ">" <> goChomp chm <> goDigit iden <> g (insFoldNls' $ T.lines t) (fromEnum iden) cont Literal chm iden -> pfx $ "|" <> goChomp chm <> goDigit iden <> g (T.lines t) (fromEnum iden) cont where goDigit :: IndentOfs -> T.B.Builder goDigit iden = let ch = C.intToDigit.fromEnum $ iden in if(ch == '0') then mempty else T.B.singleton ch goChomp :: Chomp -> T.B.Builder goChomp chm = case chm of Strip -> T.B.singleton '-' Clip -> mempty Keep -> T.B.singleton '+' pfx cont' = (if sol || c == BlockKey || c == FlowKey then mempty else ws) <> anchorTag'' (Right ws) anc tag cont' doEol = case c of BlockKey -> False FlowKey -> False FlowIn -> False _ -> True contEol | doEol = eol <> cont | otherwise = cont g [] _ cont' = eol <> cont' g (x:xs) dig cont' | T.null x = eol <> g xs dig cont' | dig == 0 = eol <> (if n > 0 then mkInd n else mkInd' 1) <> T.B.fromText x <> g xs dig cont' | otherwise = eol <> mkInd (n-1) <> mkInd' dig <> T.B.fromText x <> g xs dig cont' g' [] cont' = cont' g' (x:xs) cont' = eol <> mkInd (n+1) <> T.B.fromText x <> g' xs cont' f [] cont' = cont' f (x:xs) cont' = T.B.fromText x <> g' xs cont' isSmallKey (Alias _ : _) = True isSmallKey (Scalar _ _ (Folded _ _) _: _) = False isSmallKey (Scalar _ _ (Literal _ _) _: _) = False isSmallKey (Scalar _ _ _ _ : _) = True isSmallKey (SequenceStart _ _ _ : _) = False isSmallKey (MappingStart _ _ _ : _) = False isSmallKey _ = False -- inFlow c = case c of FlowIn -> FlowIn FlowOut -> FlowIn BlockKey -> FlowKey FlowKey -> FlowKey _ -> error "Invalid context in Flow style" putTag t cont | Just t' <- T.stripPrefix "tag:yaml.org,2002:" t = "!!" <> T.B.fromText t' <> cont | "!" 
`T.isPrefixOf` t = T.B.fromText t <> cont | otherwise = "!<" <> T.B.fromText t <> T.B.singleton '>' <> cont anchorTag'' :: Either T.B.Builder T.B.Builder -> Maybe Anchor -> Tag -> T.B.Builder -> T.B.Builder anchorTag'' _ Nothing (Tag Nothing) cont = cont anchorTag'' (Right pad) Nothing (Tag (Just t)) cont = putTag t (pad <> cont) anchorTag'' (Right pad) (Just a) (Tag Nothing) cont = T.B.singleton '&' <> T.B.fromText a <> pad <> cont anchorTag'' (Right pad) (Just a) (Tag (Just t)) cont = T.B.singleton '&' <> T.B.fromText a <> T.B.singleton ' ' <> putTag t (pad <> cont) anchorTag'' (Left pad) Nothing (Tag (Just t)) cont = pad <> putTag t cont anchorTag'' (Left pad) (Just a) (Tag Nothing) cont = pad <> T.B.singleton '&' <> T.B.fromText a <> cont anchorTag'' (Left pad) (Just a) (Tag (Just t)) cont = pad <> T.B.singleton '&' <> T.B.fromText a <> T.B.singleton ' ' <> putTag t cont anchorTag0 = anchorTag'' (Left mempty) -- anchorTag = anchorTag'' (Right (T.B.singleton ' ')) -- anchorTag' = anchorTag'' (Left (T.B.singleton ' ')) isComEv :: [Event] -> Bool isComEv (Comment _: _) = True isComEv _ = False -- indentation helper mkInd :: Int -> T.B.Builder mkInd (-1) = mempty mkInd 0 = mempty mkInd 1 = " " mkInd 2 = " " mkInd 3 = " " mkInd 4 = " " mkInd l | l < 0 = error (show l) | otherwise = T.B.fromText (T.replicate l " ") mkInd' :: Int -> T.B.Builder mkInd' 1 = " " mkInd' 2 = " " mkInd' 3 = " " mkInd' 4 = " " mkInd' 5 = " " mkInd' 6 = " " mkInd' 7 = " " mkInd' 8 = " " mkInd' 9 = " " mkInd' l = error ("Impossible Indentation-level" ++ show l) eol, ws:: T.B.Builder eol = T.B.singleton '\n' ws = T.B.singleton ' ' wsSol :: Bool -> T.B.Builder wsSol sol = if sol then mempty else ws escapeDQ :: Text -> Text escapeDQ t | T.all (\c -> C.isPrint c && c /= '\\' && c /= '"') t = t | otherwise = T.concatMap escapeChar t escapeChar :: Char -> Text escapeChar c | c == '\\' = "\\\\" | c == '"' = "\\\"" | C.isPrint c = T.singleton c | Just e <- Map.lookup c emap = e | x <= 0xff = T.pack 
(printf "\\x%02x" x) | x <= 0xffff = T.pack (printf "\\u%04x" x) | otherwise = T.pack (printf "\\U%08x" x) where x = ord c emap = Map.fromList [ (v,T.pack ['\\',k]) | (k,v) <- escapes ] escapes :: [(Char,Char)] escapes = [ ('0', '\0') , ('a', '\x7') , ('b', '\x8') , ('\x9', '\x9') , ('t', '\x9') , ('n', '\xa') , ('v', '\xb') , ('f', '\xc') , ('r', '\xd') , ('e', '\x1b') , (' ', ' ') , ('"', '"') , ('/', '/') , ('\\', '\\') , ('N', '\x85') , ('_', '\xa0') , ('L', '\x2028') , ('P', '\x2029') ] -- flow style line folding -- FIXME: check single-quoted strings with leading '\n' or trailing '\n's insFoldNls :: [Text] -> [Text] insFoldNls [] = [] insFoldNls z0@(z:zs) | all T.null z0 = "" : z0 -- HACK | otherwise = z : go zs where go [] = [] go (l:ls) | T.null l = l : go' ls | otherwise = "" : l : go ls go' [] = [""] go' (l:ls) | T.null l = l : go' ls | otherwise = "" : l : go ls {- block style line folding The combined effect of the block line folding rules is that each “paragraph” is interpreted as a line, empty lines are interpreted as a line feed, and the formatting of more-indented lines is preserved. -} insFoldNls' :: [Text] -> [Text] insFoldNls' = go' where go [] = [] go (l:ls) | T.null l = l : go ls | isWhite (T.head l) = l : go' ls | otherwise = "" : l : go ls go' [] = [] go' (l:ls) | T.null l = l : go' ls | isWhite (T.head l) = l : go' ls | otherwise = l : go ls -- @s-white@ isWhite :: Char -> Bool isWhite ' ' = True isWhite '\t' = True isWhite _ = False HsYAML-0.2.1.4/src/Data/YAML/Internal.hs0000644000000000000000000000566607346545000015431 0ustar0000000000000000{-# LANGUAGE DeriveGeneric #-} {-# LANGUAGE Safe #-} -- | -- Copyright: © Herbert Valerio Riedel 2015-2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- module Data.YAML.Internal ( Node(..) , nodeLoc , NodeId , Doc(..) 
  , Mapping
  ) where

import qualified Data.Map                  as Map
import           Data.YAML.Event           (Tag)
import           Data.YAML.Loader          (NodeId)
import           Data.YAML.Schema.Internal (Scalar (..))
import           Util

-- | YAML Document tree/graph
--
-- __NOTE__: In future versions of this API meta-data about the YAML document might be included as additional fields inside 'Doc'
newtype Doc n = Doc
  { docRoot :: n -- ^ @since 0.2.1
  } deriving (Eq,Ord,Show,Generic)

-- | @since 0.2.0
instance NFData n => NFData (Doc n) where rnf (Doc n) = rnf n

-- | @since 0.2.1
instance Functor Doc where
  fmap f (Doc n) = Doc (f n)
  x <$ _ = Doc x

-- | YAML mapping
type Mapping loc = Map (Node loc) (Node loc)

-- | YAML Document node
--
-- @since 0.2.0
data Node loc
  = Scalar   !loc !Scalar
  | Mapping  !loc !Tag (Mapping loc)
  | Sequence !loc !Tag [Node loc]
  | Anchor   !loc !NodeId !(Node loc)
  deriving (Show,Generic)

nodeLoc :: Node loc -> loc
nodeLoc (Scalar   pos _)   = pos
nodeLoc (Anchor   pos _ _) = pos
nodeLoc (Mapping  pos _ _) = pos
nodeLoc (Sequence pos _ _) = pos

instance Functor Node where
  fmap f node = case node of
    Scalar   x scalar -> Scalar   (f x) scalar
    Mapping  x tag m  -> Mapping  (f x) tag (mappingFmapLoc f m)
    Sequence x tag s  -> Sequence (f x) tag (map (fmap f) s)
    Anchor   x n nod  -> Anchor   (f x) n (fmap f nod)

mappingFmapLoc :: (a -> b) -> Mapping a -> Mapping b
mappingFmapLoc f = Map.mapKeysMonotonic (fmap f) .
Map.map (fmap f)

instance Eq (Node loc) where
  Scalar   _ a   == Scalar   _ a'    = a == a'
  Mapping  _ a b == Mapping  _ a' b' = a == a' && b == b'
  Sequence _ a b == Sequence _ a' b' = a == a' && b == b'
  Anchor   _ a b == Anchor   _ a' b' = a == a' && b == b'
  _ == _ = False

instance Ord (Node loc) where
  compare (Scalar _ a)     (Scalar _ a')      = compare a a'
  compare (Scalar _ _)     (Mapping _ _ _)    = LT
  compare (Scalar _ _)     (Sequence _ _ _)   = LT
  compare (Scalar _ _)     (Anchor _ _ _)     = LT
  compare (Mapping _ _ _)  (Scalar _ _)       = GT
  compare (Mapping _ a b)  (Mapping _ a' b')  = compare (a,b) (a',b')
  compare (Mapping _ _ _)  (Sequence _ _ _)   = LT
  compare (Mapping _ _ _)  (Anchor _ _ _)     = LT
  compare (Sequence _ _ _) (Scalar _ _)       = GT
  compare (Sequence _ _ _) (Mapping _ _ _)    = GT
  compare (Sequence _ a b) (Sequence _ a' b') = compare (a,b) (a',b')
  compare (Sequence _ _ _) (Anchor _ _ _)     = LT
  compare (Anchor _ _ _)   (Scalar _ _)       = GT
  compare (Anchor _ _ _)   (Mapping _ _ _)    = GT
  compare (Anchor _ _ _)   (Sequence _ _ _)   = GT
  compare (Anchor _ a b)   (Anchor _ a' b')   = compare (a,b) (a',b')
HsYAML-0.2.1.4/src/Data/YAML/Loader.hs0000644000000000000000000001613407346545000015053 0ustar0000000000000000
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE RecordWildCards            #-}
{-# LANGUAGE RecursiveDo                #-}
{-# LANGUAGE ScopedTypeVariables        #-}
{-# LANGUAGE Trustworthy                #-}

-- |
-- Copyright: © Herbert Valerio Riedel 2015-2018
-- SPDX-License-Identifier: GPL-2.0-or-later
--
module Data.YAML.Loader
  ( decodeLoader
  , Loader(..)
  , LoaderT
  , NodeId
  ) where

import           Control.Monad.State  (MonadState(..), gets, modify, StateT, evalStateT, state)
import           Control.Monad.Trans  (MonadTrans(..))
import qualified Data.ByteString.Lazy as BS.L
import qualified Data.Map             as Map
import qualified Data.Set             as Set
import           Data.YAML.Event      (Tag)
import qualified Data.YAML.Event      as YE
import           Util

-- | Unique identifier for identifying nodes
--
-- This allows one to observe the alias/anchor-reference structure
type NodeId = Word

-- | Structure defining how to construct a document tree/graph
--
-- @since 0.2.0
--
data Loader m n = Loader
  { yScalar   :: Tag -> YE.ScalarStyle -> Text -> LoaderT m n
  , ySequence :: Tag -> [n] -> LoaderT m n
  , yMapping  :: Tag -> [(n,n)] -> LoaderT m n
  , yAlias    :: NodeId -> Bool -> n -> LoaderT m n
  , yAnchor   :: NodeId -> n -> LoaderT m n
  }

-- | Helper type for 'Loader'
--
-- @since 0.2.0
type LoaderT m n = YE.Pos -> m (Either (YE.Pos,String) n)

-- TODO: newtype LoaderT m n = LoaderT { runLoaderT :: YE.Pos -> m (Either String n) }

-- | Generalised document tree/graph construction
--
-- This doesn't yet perform any tag resolution (thus all scalars are
-- represented as 'Text' values). See also 'Data.YAML.decodeNode' for a more
-- convenient interface.
--
-- @since 0.2.0
{-# INLINEABLE decodeLoader #-}
decodeLoader :: forall n m . MonadFix m => Loader m n -> BS.L.ByteString -> m (Either (YE.Pos, String) [n])
decodeLoader Loader{..} bs0 = do
  case sequence $ filter (not.
isComment) (YE.parseEvents bs0) of Left (pos,err) -> return $ Left (pos,err) Right evs -> runParserT goStream evs where isComment evPos = case evPos of Right (YE.EvPos {eEvent = (YE.Comment _), ePos = _}) -> True _ -> False goStream :: PT n m [n] goStream = do _ <- satisfy (== YE.StreamStart) ds <- manyUnless (== YE.StreamEnd) goDoc eof return ds goDoc :: PT n m n goDoc = do _ <- satisfy isDocStart modify $ \s0 -> s0 { sDict = mempty, sCycle = mempty } n <- goNode _ <- satisfy isDocEnd return n getNewNid :: PT n m Word getNewNid = state $ \s0 -> let i0 = sIdCnt s0 in (i0, s0 { sIdCnt = i0+1 }) returnNode :: YE.Pos -> Maybe YE.Anchor -> Either (YE.Pos, String) n -> PT n m n returnNode _ _ (Left err) = throwError err returnNode _ Nothing (Right node) = return node returnNode pos (Just a) (Right node) = do nid <- getNewNid node' <- liftEither' =<< lift (yAnchor nid node pos) modify $ \s0 -> s0 { sDict = Map.insert a (nid,node') (sDict s0) } return node' registerAnchor :: YE.Pos -> Maybe YE.Anchor -> PT n m n -> PT n m n registerAnchor _ Nothing pn = pn registerAnchor pos (Just a) pn = do modify $ \s0 -> s0 { sCycle = Set.insert a (sCycle s0) } nid <- getNewNid mdo modify $ \s0 -> s0 { sDict = Map.insert a (nid,n) (sDict s0) } n0 <- pn n <- liftEither' =<< lift (yAnchor nid n0 pos) return n exitAnchor :: Maybe YE.Anchor -> PT n m () exitAnchor Nothing = return () exitAnchor (Just a) = modify $ \s0 -> s0 { sCycle = Set.delete a (sCycle s0) } goNode :: PT n m n goNode = do n <- anyEv let pos = YE.ePos n case YE.eEvent n of YE.Scalar manc tag sty val -> do exitAnchor manc n' <- lift (yScalar tag sty val pos) returnNode pos manc $! 
n' YE.SequenceStart manc tag _ -> registerAnchor pos manc $ do ns <- manyUnless (== YE.SequenceEnd) goNode exitAnchor manc liftEither' =<< lift (ySequence tag ns pos) YE.MappingStart manc tag _ -> registerAnchor pos manc $ do kvs <- manyUnless (== YE.MappingEnd) (liftM2 (,) goNode goNode) exitAnchor manc liftEither' =<< lift (yMapping tag kvs pos) YE.Alias a -> do d <- gets sDict cy <- gets sCycle case Map.lookup a d of Nothing -> throwError (pos, ("anchor not found: " ++ show a)) Just (nid,n') -> liftEither' =<< lift (yAlias nid (Set.member a cy) n' pos) _ -> throwError (pos, "goNode: unexpected event") ---------------------------------------------------------------------------- -- small parser framework data S n = S { sEvs :: [YE.EvPos] , sDict :: Map YE.Anchor (Word,n) , sCycle :: Set YE.Anchor , sIdCnt :: !Word } newtype PT n m a = PT (StateT (S n) (ExceptT (YE.Pos, String) m) a) deriving ( Functor , Applicative , Monad , MonadState (S n) , MonadError (YE.Pos, String) , MonadFix ) instance MonadTrans (PT n) where lift = PT . lift . 
lift runParserT :: Monad m => PT n m a -> [YE.EvPos] -> m (Either (YE.Pos, String) a) runParserT (PT act) s0 = runExceptT $ evalStateT act (S s0 mempty mempty 0) satisfy :: Monad m => (YE.Event -> Bool) -> PT n m YE.EvPos satisfy p = do s0 <- get case sEvs s0 of [] -> throwError (fakePos, "satisfy: premature eof") (ev:rest) | p (YE.eEvent ev) -> do put (s0 { sEvs = rest}) return ev | otherwise -> throwError (YE.ePos ev, ("satisfy: predicate failed " ++ show ev)) peek :: Monad m => PT n m (Maybe YE.EvPos) peek = do s0 <- get case sEvs s0 of [] -> return Nothing (ev:_) -> return (Just ev) peek1 :: Monad m => PT n m YE.EvPos peek1 = maybe (throwError (fakePos,"peek1: premature eof")) return =<< peek anyEv :: Monad m => PT n m YE.EvPos anyEv = satisfy (const True) eof :: Monad m => PT n m () eof = do s0 <- get case sEvs s0 of [] -> return () (ev:_) -> throwError (YE.ePos ev, "eof expected") -- NB: consumes the end-event manyUnless :: Monad m => (YE.Event -> Bool) -> PT n m a -> PT n m [a] manyUnless p act = do t0 <- peek1 if p (YE.eEvent t0) then anyEv >> return [] else liftM2 (:) act (manyUnless p act) {- tryError :: MonadError e m => m a -> m (Either e a) tryError act = catchError (Right <$> act) (pure . Left) -} isDocStart :: YE.Event -> Bool isDocStart (YE.DocumentStart _) = True isDocStart _ = False isDocEnd :: YE.Event -> Bool isDocEnd (YE.DocumentEnd _) = True isDocEnd _ = False fakePos :: YE.Pos fakePos = YE.Pos { posByteOffset = -1 , posCharOffset = -1 , posLine = 1 , posColumn = 0 } HsYAML-0.2.1.4/src/Data/YAML/Pos.hs0000644000000000000000000000547207346545000014411 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE DeriveGeneric #-} {-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE RecordWildCards #-} {-# LANGUAGE Safe #-} -- | -- Copyright: © Herbert Valerio Riedel 2015-2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- module Data.YAML.Pos ( Pos(..) 
  , prettyPosWithSource
  ) where

import qualified Data.ByteString.Lazy     as BL
import qualified Data.YAML.Token.Encoding as Enc
import           Util

-- | Position in parsed YAML source
--
-- See also 'prettyPosWithSource'.
--
-- __NOTE__: if 'posCharOffset' is negative the 'Pos' value doesn't refer to a proper location; this may be emitted in corner cases when no proper location can be inferred.
data Pos = Pos
  { posByteOffset :: !Int -- ^ 0-based byte offset
  , posCharOffset :: !Int -- ^ 0-based character (Unicode code-point) offset
  , posLine       :: !Int -- ^ 1-based line number
  , posColumn     :: !Int -- ^ 0-based character (Unicode code-point) column number
  } deriving (Eq, Show, Generic)

-- | @since 0.2.0
instance NFData Pos where rnf !_ = ()

-- | Pretty prints a 'Pos' together with the line the 'Pos' refers to and the column position.
--
-- The input 'BL.ByteString' must be the same that was passed to the
-- YAML decoding function that produced the 'Pos' value. The 'String'
-- argument is inserted right after the @::@ in the
-- first line. The pretty-printed position result 'String' will be
-- terminated by a trailing newline.
--
-- For instance,
--
-- @
-- 'prettyPosWithSource' somePos someInput " error" ++ "unexpected character\n"
-- @
--
-- results in
--
-- > 11:7: error
-- >    |
-- > 11 | foo: | bar
-- >    |       ^
-- > unexpected character
--
-- @since 0.2.1
prettyPosWithSource :: Pos -> BL.ByteString -> String -> String
prettyPosWithSource Pos{..} source msg
  | posCharOffset < 0 || posByteOffset < 0 = "0:0:" ++ msg ++ "\n" -- improper location
  | otherwise = unlines
      [ show posLine ++ ":" ++ show posColumn ++ ":" ++ msg
      , lpfx
      , lnostr ++ "| " ++ line
      , lpfx ++ replicate posColumn ' ' ++ "^"
      ]
  where
    lnostr = " " ++ show posLine ++ " "
    lpfx   = (' ' <$ lnostr) ++ "| "
    (_,lstart) = findLineStartByByteOffset posByteOffset source
    line = map snd $ takeWhile (not . isNL .
snd) lstart
    isNL c = c == '\r' || c == '\n'

findLineStartByByteOffset :: Int -> BL.ByteString -> (Int,[(Int,Char)])
findLineStartByByteOffset bofs0 input = go 0 inputChars inputChars
  where
    (_,inputChars) = Enc.decode input

    go lsOfs lsChars [] = (lsOfs,lsChars)
    go lsOfs lsChars ((ofs',_):_)
      | bofs0 < ofs' = (lsOfs,lsChars)
    go _ _ ((_,'\r'):(ofs','\n'):rest) = go ofs' rest rest
    go _ _ ((ofs','\r'):rest)          = go ofs' rest rest
    go _ _ ((ofs','\n'):rest)          = go ofs' rest rest
    go lsOfs lsChars (_:rest)          = go lsOfs lsChars rest
HsYAML-0.2.1.4/src/Data/YAML/Schema.hs0000644000000000000000000000305507346545000015043 0ustar0000000000000000
{-# LANGUAGE OverloadedStrings   #-}
{-# LANGUAGE RecordWildCards     #-}
{-# LANGUAGE Safe                #-}
{-# LANGUAGE ScopedTypeVariables #-}

-- |
-- Copyright: © Herbert Valerio Riedel 2015-2018
-- SPDX-License-Identifier: GPL-2.0-or-later
--
-- Predefined YAML 1.2 Schema resolvers and encoders as well as support for defining custom resolvers and encoders.
--
-- @since 0.2.0.0
module Data.YAML.Schema
    ( -- * Schema resolvers
      -- ** YAML 1.2 Schema resolvers
      SchemaResolver(..)
    , failsafeSchemaResolver
    , jsonSchemaResolver
    , coreSchemaResolver

      -- * Schema encoders
      -- ** YAML 1.2 Schema encoders
    , SchemaEncoder(..)
    , failsafeSchemaEncoder
    , jsonSchemaEncoder
    , coreSchemaEncoder

      -- ** Custom Schema encoding
      --
      -- | According to YAML 1.2 the recommended default 'SchemaEncoder' is 'coreSchemaEncoder' under which 'Scalar's are encoded as follows:
      --
      -- * Strings which are made of plain characters (see 'isPlainChar'), unambiguous (see 'isAmbiguous') and do not contain any leading/trailing spaces are encoded as 'Data.YAML.Event.Plain' 'Scalar'.
      --
      -- * All other strings are encoded using the 'Data.YAML.Event.DoubleQuoted' style
      --
      -- * Booleans are encoded using 'encodeBool'
      --
      -- * Double values are encoded using 'encodeDouble'
      --
      -- * Integral values are encoded using 'encodeInt'
      --
    , setScalarStyle
    , isPlainChar
    , isAmbiguous
    , encodeDouble
    , encodeBool
    , encodeInt
    ) where

import Data.YAML.Schema.Internal
HsYAML-0.2.1.4/src/Data/YAML/Schema/0000755000000000000000000000000007346545000014504 5ustar0000000000000000
HsYAML-0.2.1.4/src/Data/YAML/Schema/Internal.hs0000644000000000000000000004441307346545000016622 0ustar0000000000000000
{-# LANGUAGE DeriveGeneric       #-}
{-# LANGUAGE OverloadedStrings   #-}
{-# LANGUAGE RecordWildCards     #-}
{-# LANGUAGE Safe                #-}
{-# LANGUAGE ScopedTypeVariables #-}

-- |
-- Copyright: © Herbert Valerio Riedel 2015-2018
-- SPDX-License-Identifier: GPL-2.0-or-later
--
-- YAML 1.2 Schema resolvers and encoders
--
module Data.YAML.Schema.Internal
    ( SchemaResolver(..)
    , failsafeSchemaResolver
    , jsonSchemaResolver
    , coreSchemaResolver
    , Scalar(..)
    , SchemaEncoder(..)
, failsafeSchemaEncoder , jsonSchemaEncoder , coreSchemaEncoder , tagNull, tagBool, tagStr, tagInt, tagFloat, tagSeq, tagMap , isPlainChar , isAmbiguous, defaultSchemaEncoder, setScalarStyle , encodeDouble, encodeBool, encodeInt ) where import qualified Data.Char as C import qualified Data.Map as Map import qualified Data.Set as Set import qualified Data.Text as T import Numeric (readHex, readOct) import Text.Parsec as P import Text.Parsec.Text import Data.YAML.Event (ScalarStyle (..), Tag, isUntagged, mkTag, untagged) import qualified Data.YAML.Event as YE import Util -- | Primitive scalar types as defined in YAML 1.2 data Scalar = SNull -- ^ @tag:yaml.org,2002:null@ | SBool !Bool -- ^ @tag:yaml.org,2002:bool@ | SFloat !Double -- ^ @tag:yaml.org,2002:float@ | SInt !Integer -- ^ @tag:yaml.org,2002:int@ | SStr !Text -- ^ @tag:yaml.org,2002:str@ | SUnknown !Tag !Text -- ^ unknown/unsupported tag or untagged (thus unresolved) scalar deriving (Eq,Ord,Show,Generic) -- | @since 0.2.0 instance NFData Scalar where rnf SNull = () rnf (SBool _) = () rnf (SFloat _) = () rnf (SInt _) = () rnf (SStr _) = () rnf (SUnknown t _) = rnf t -- | Definition of a [YAML 1.2 Schema](http://yaml.org/spec/1.2/spec.html#Schema) -- -- A YAML schema defines how implicit tags are resolved to concrete tags and how data is represented textually in YAML. data SchemaResolver = SchemaResolver { schemaResolverScalar :: Tag -> YE.ScalarStyle -> T.Text -> Either String Scalar , schemaResolverSequence :: Tag -> Either String Tag , schemaResolverMapping :: Tag -> Either String Tag , schemaResolverMappingDuplicates :: Bool -- TODO: use something different from 'Bool' } data ScalarTag = ScalarBangTag -- ^ non-specific ! tag | ScalarQMarkTag -- ^ non-specific ? 
tag | ScalarTag !Tag -- ^ specific tag -- common logic for 'schemaResolverScalar' scalarTag :: (ScalarTag -> T.Text -> Either String Scalar) -> Tag -> YE.ScalarStyle -> T.Text -> Either String Scalar scalarTag f tag sty val = f tag' val where tag' = case sty of YE.Plain | tag == untagged -> ScalarQMarkTag -- implicit ? tag _ | tag == untagged -> ScalarBangTag -- implicit ! tag | tag == tagBang -> ScalarBangTag -- explicit ! tag | otherwise -> ScalarTag tag -- | \"Failsafe\" schema resolver as specified -- in [YAML 1.2 / 10.1.2. Tag Resolution](http://yaml.org/spec/1.2/spec.html#id2803036) failsafeSchemaResolver :: SchemaResolver failsafeSchemaResolver = SchemaResolver{..} where -- scalars schemaResolverScalar = scalarTag go where go ScalarBangTag v = Right (SStr v) go (ScalarTag t) v | t == tagStr = Right (SStr v) | otherwise = Right (SUnknown t v) go ScalarQMarkTag v = Right (SUnknown untagged v) -- leave unresolved -- mappings schemaResolverMapping t | t == tagBang = Right tagMap | otherwise = Right t schemaResolverMappingDuplicates = False -- sequences schemaResolverSequence t | t == tagBang = Right tagSeq | otherwise = Right t -- | Strict JSON schema resolver as specified -- in [YAML 1.2 / 10.2.2. Tag Resolution](http://yaml.org/spec/1.2/spec.html#id2804356) jsonSchemaResolver :: SchemaResolver jsonSchemaResolver = SchemaResolver{..} where -- scalars schemaResolverScalar = scalarTag go where go ScalarBangTag v = Right (SStr v) go (ScalarTag t) v | t == tagStr = Right (SStr v) | t == tagNull = if isNullLiteral v then Right SNull else Left ("invalid !!null " ++ show v) | t == tagInt = maybe (Left $ "invalid !!int " ++ show v) (Right . SInt) $ jsonDecodeInt v | t == tagFloat = maybe (Left $ "invalid !!float " ++ show v) (Right . SFloat) $ jsonDecodeFloat v | t == tagBool = maybe (Left $ "invalid !!bool " ++ show v) (Right . 
SBool) $ jsonDecodeBool v | otherwise = Right (SUnknown t v) -- unknown specific tag go ScalarQMarkTag v | isNullLiteral v = Right SNull | Just b <- jsonDecodeBool v = Right $! SBool b | Just i <- jsonDecodeInt v = Right $! SInt i | Just f <- jsonDecodeFloat v = Right $! SFloat f | otherwise = Right (SUnknown untagged v) -- leave unresolved -- FIXME: YAML 1.2 spec requires an error here isNullLiteral = (== "null") -- mappings schemaResolverMapping t | t == tagBang = Right tagMap | isUntagged t = Right tagMap | otherwise = Right t schemaResolverMappingDuplicates = False -- sequences schemaResolverSequence t | t == tagBang = Right tagSeq | isUntagged t = Right tagSeq | otherwise = Right t -- | Core schema resolver as specified -- in [YAML 1.2 / 10.3.2. Tag Resolution](http://yaml.org/spec/1.2/spec.html#id2805071) coreSchemaResolver :: SchemaResolver coreSchemaResolver = SchemaResolver{..} where -- scalars schemaResolverScalar = scalarTag go where go ScalarBangTag v = Right (SStr v) go (ScalarTag t) v | t == tagStr = Right (SStr v) | t == tagNull = if isNullLiteral v then Right SNull else Left ("invalid !!null " ++ show v) | t == tagInt = maybe (Left $ "invalid !!int " ++ show v) (Right . SInt) $ coreDecodeInt v | t == tagFloat = maybe (Left $ "invalid !!float " ++ show v) (Right . SFloat) $ coreDecodeFloat v | t == tagBool = maybe (Left $ "invalid !!bool " ++ show v) (Right . SBool) $ coreDecodeBool v | otherwise = Right (SUnknown t v) -- unknown specific tag go ScalarQMarkTag v | isNullLiteral v = Right SNull | Just b <- coreDecodeBool v = Right $! SBool b | Just i <- coreDecodeInt v = Right $! SInt i | Just f <- coreDecodeFloat v = Right $! 
SFloat f | otherwise = Right (SStr v) -- map to !!str by default isNullLiteral = flip Set.member (Set.fromList [ "", "null", "NULL", "Null", "~" ]) -- mappings schemaResolverMapping t | t == tagBang = Right tagMap | isUntagged t = Right tagMap | otherwise = Right t schemaResolverMappingDuplicates = False -- sequences schemaResolverSequence t | t == tagBang = Right tagSeq | isUntagged t = Right tagSeq | otherwise = Right t -- | @tag:yaml.org,2002:bool@ (JSON Schema) jsonDecodeBool :: T.Text -> Maybe Bool jsonDecodeBool "false" = Just False jsonDecodeBool "true" = Just True jsonDecodeBool _ = Nothing -- | @tag:yaml.org,2002:bool@ (Core Schema) coreDecodeBool :: T.Text -> Maybe Bool coreDecodeBool = flip Map.lookup $ Map.fromList [ ("true", True) , ("True", True) , ("TRUE", True) , ("false", False) , ("False", False) , ("FALSE", False) ] -- | @tag:yaml.org,2002:int@ according to JSON Schema -- -- > 0 | -? [1-9] [0-9]* jsonDecodeInt :: T.Text -> Maybe Integer jsonDecodeInt t | T.null t = Nothing jsonDecodeInt "0" = Just 0 jsonDecodeInt t = do -- [-]? [1-9] [0-9]* let tabs | T.isPrefixOf "-" t = T.tail t | otherwise = t guard (not (T.null tabs)) guard (T.head tabs /= '0') guard (T.all C.isDigit tabs) readMaybe (T.unpack t) -- | @tag:yaml.org,2002:int@ according to Core Schema -- -- > [-+]? [0-9]+ (Base 10) -- > 0o [0-7]+ (Base 8) -- > 0x [0-9a-fA-F]+ (Base 16) -- coreDecodeInt :: T.Text -> Maybe Integer coreDecodeInt t | T.null t = Nothing -- > 0x [0-9a-fA-F]+ (Base 16) | Just rest <- T.stripPrefix "0x" t , T.all C.isHexDigit rest , [(j,"")] <- readHex (T.unpack rest) = Just $! j -- 0o [0-7]+ (Base 8) | Just rest <- T.stripPrefix "0o" t , T.all C.isOctDigit rest , [(j,"")] <- readOct (T.unpack rest) = Just $! j -- [-+]? [0-9]+ (Base 10) | T.all C.isDigit t = Just $! read (T.unpack t) | Just rest <- T.stripPrefix "+" t , not (T.null rest) , T.all C.isDigit rest = Just $! 
read (T.unpack rest) | Just rest <- T.stripPrefix "-" t , not (T.null rest) , T.all C.isDigit rest = Just $! read (T.unpack t) | otherwise = Nothing -- | @tag:yaml.org,2002:float@ according to JSON Schema -- -- > -? ( 0 | [1-9] [0-9]* ) ( \. [0-9]* )? ( [eE] [-+]? [0-9]+ )? -- jsonDecodeFloat :: T.Text -> Maybe Double jsonDecodeFloat = either (const Nothing) Just . parse float "" where float :: Parser Double float = do -- -? p0 <- option "" ("-" <$ char '-') -- ( 0 | [1-9] [0-9]* ) p1 <- do d <- digit if (d /= '0') then (d:) <$> P.many digit else pure [d] -- ( \. [0-9]* )? p2 <- option "" $ (:) <$> char '.' <*> option "0" (many1 digit) -- ( [eE] [-+]? [0-9]+ )? p3 <- option "" $ do void (char 'e' P.<|> char 'E') s <- option "" (("-" <$ char '-') P.<|> ("" <$ char '+')) d <- P.many1 digit pure ("e" ++ s ++ d) eof let t' = p0++p1++p2++p3 pure $! read t' -- | @tag:yaml.org,2002:float@ according to Core Schema -- -- > [-+]? ( \. [0-9]+ | [0-9]+ ( \. [0-9]* )? ) ( [eE] [-+]? [0-9]+ )? -- coreDecodeFloat :: T.Text -> Maybe Double coreDecodeFloat t | Just j <- Map.lookup t literals = Just j -- short-cut | otherwise = either (const Nothing) Just . parse float "" $ t where float :: Parser Double float = do -- [-+]? p0 <- option "" (("-" <$ char '-') P.<|> "" <$ char '+') -- ( \. [0-9]+ | [0-9]+ ( \. [0-9]* )? ) p1 <- (char '.' *> (("0."++) <$> many1 digit)) P.<|> do d1 <- many1 digit d2 <- option "" $ (:) <$> char '.' <*> option "0" (many1 digit) pure (d1++d2) -- ( [eE] [-+]? [0-9]+ )? p2 <- option "" $ do void (char 'e' P.<|> char 'E') s <- option "" (("-" <$ char '-') P.<|> ("" <$ char '+')) d <- P.many1 digit pure ("e" ++ s ++ d) eof let t' = p0++p1++p2 pure $! 
read t' literals = Map.fromList [ ("0" , 0) , (".nan", (0/0)) , (".NaN", (0/0)) , (".NAN", (0/0)) , (".inf", (1/0)) , (".Inf", (1/0)) , (".INF", (1/0)) , ("+.inf", (1/0)) , ("+.Inf", (1/0)) , ("+.INF", (1/0)) , ("-.inf", (-1/0)) , ("-.Inf", (-1/0)) , ("-.INF", (-1/0)) ] -- | Some tags specified in YAML 1.2 tagNull, tagBool, tagStr, tagInt, tagFloat, tagSeq, tagMap, tagBang :: Tag tagNull = mkTag "tag:yaml.org,2002:null" tagStr = mkTag "tag:yaml.org,2002:str" tagInt = mkTag "tag:yaml.org,2002:int" tagFloat = mkTag "tag:yaml.org,2002:float" tagBool = mkTag "tag:yaml.org,2002:bool" tagSeq = mkTag "tag:yaml.org,2002:seq" tagMap = mkTag "tag:yaml.org,2002:map" tagBang = mkTag "!" -- | @since 0.2.0 data SchemaEncoder = SchemaEncoder { schemaEncoderScalar :: Scalar -> Either String (Tag, ScalarStyle, T.Text) , schemaEncoderSequence :: Tag -> Either String Tag , schemaEncoderMapping :: Tag -> Either String Tag } mappingTag :: Tag -> Either String Tag mappingTag t | t == tagMap = Right untagged | otherwise = Right t seqTag :: Tag -> Either String Tag seqTag t | t == tagSeq = Right untagged | otherwise = Right t -- | \"Failsafe\" schema encoder as specified -- in [YAML 1.2 / 10.1.2. Tag Resolution](http://yaml.org/spec/1.2/spec.html#id2803036) -- -- @since 0.2.0 failsafeSchemaEncoder :: SchemaEncoder failsafeSchemaEncoder = SchemaEncoder{..} where schemaEncoderScalar s = case s of SNull -> Left "SNull scalar type not supported in failsafeSchemaEncoder" SBool _ -> Left "SBool scalar type not supported in failsafeSchemaEncoder" SFloat _ -> Left "SFloat scalar type not supported in failsafeSchemaEncoder" SInt _ -> Left "SInt scalar type not supported in failsafeSchemaEncoder" SStr text -> failEncodeStr text SUnknown t v -> Right (t, DoubleQuoted, v) schemaEncoderMapping = mappingTag schemaEncoderSequence = seqTag -- | Strict JSON schema encoder as specified -- in [YAML 1.2 / 10.2.2. 
Tag Resolution](http://yaml.org/spec/1.2/spec.html#id2804356) -- -- @since 0.2.0 jsonSchemaEncoder :: SchemaEncoder jsonSchemaEncoder = SchemaEncoder{..} where schemaEncoderScalar s = case s of SNull -> Right (untagged, Plain, "null") SBool bool -> Right (untagged, Plain, encodeBool bool) SFloat double -> Right (untagged, Plain, encodeDouble double) SInt int -> Right (untagged, Plain, encodeInt int) SStr text -> jsonEncodeStr text SUnknown _ _ -> Left "SUnknown scalar type not supported in jsonSchemaEncoder" schemaEncoderMapping = mappingTag schemaEncoderSequence = seqTag -- | Core schema encoder as specified -- in [YAML 1.2 / 10.3.2. Tag Resolution](http://yaml.org/spec/1.2/spec.html#id2805071) -- -- @since 0.2.0 coreSchemaEncoder :: SchemaEncoder coreSchemaEncoder = SchemaEncoder{..} where schemaEncoderScalar s = case s of SNull -> Right (untagged, Plain, "null") SBool bool -> Right (untagged, Plain, encodeBool bool) SFloat double -> Right (untagged, Plain, encodeDouble double) SInt int -> Right (untagged, Plain, encodeInt int) SStr text -> coreEncodeStr text SUnknown t v -> Right (t, DoubleQuoted, v) schemaEncoderMapping = mappingTag schemaEncoderSequence = seqTag -- | Encode Boolean -- -- @since 0.2.0 encodeBool :: Bool -> T.Text encodeBool b = if b then "true" else "false" -- | Encode Double -- -- @since 0.2.0 encodeDouble :: Double -> T.Text encodeDouble d | d /= d = ".nan" | d == (1/0) = ".inf" | d == (-1/0) = "-.inf" | otherwise = T.pack . show $ d -- | Encode Integer -- -- @since 0.2.0 encodeInt :: Integer -> T.Text encodeInt = T.pack . show failEncodeStr :: T.Text -> Either String (Tag, ScalarStyle, T.Text) failEncodeStr t | T.isPrefixOf " " t = Right (untagged, DoubleQuoted, t) | T.last t == ' ' = Right (untagged, DoubleQuoted, t) | T.any (not. 
isPlainChar) t = Right (untagged, DoubleQuoted, t)
  | otherwise                  = Right (untagged, Plain, t)

jsonEncodeStr :: T.Text -> Either String (Tag, ScalarStyle, T.Text)
jsonEncodeStr t
  | T.null t                         = Right (untagged, DoubleQuoted, t)
  | T.isPrefixOf " " t               = Right (untagged, DoubleQuoted, t)
  | T.last t == ' '                  = Right (untagged, DoubleQuoted, t)
  | T.any (not . isPlainChar) t      = Right (untagged, DoubleQuoted, t)
  | isAmbiguous jsonSchemaResolver t = Right (untagged, DoubleQuoted, t)
  | otherwise                        = Right (untagged, Plain, t)

coreEncodeStr :: T.Text -> Either String (Tag, ScalarStyle, T.Text)
coreEncodeStr t
  | T.null t                         = Right (untagged, DoubleQuoted, t)
  | T.isPrefixOf " " t               = Right (untagged, DoubleQuoted, t)
  | T.last t == ' '                  = Right (untagged, DoubleQuoted, t)
  | T.any (not . isPlainChar) t      = Right (untagged, DoubleQuoted, t)
  | isAmbiguous coreSchemaResolver t = Right (untagged, DoubleQuoted, t)
  | otherwise                        = Right (untagged, Plain, t)

-- | These are some characters which can be used in 'Plain' 'Scalar's safely without any quotes (see ).
--
-- __NOTE__: This does not mean that other characters (like @"\\n"@ and other special characters like @"-?:,[]{}#&*!,>%\@`\"\'"@) cannot be used in 'Plain' 'Scalar's.
--
-- @since 0.2.0
isPlainChar :: Char -> Bool
isPlainChar c = C.isAlphaNum c || c `elem` (" ~$^+=%@`\\'\"" :: String)

-- | Returns True if the string can be decoded by the given 'SchemaResolver'
-- into a 'Scalar' which is not of type 'SStr'.
--
-- >>> isAmbiguous coreSchemaResolver "true"
-- True
--
-- >>> isAmbiguous failsafeSchemaResolver "true"
-- False
--
-- @since 0.2.0
isAmbiguous :: SchemaResolver -> T.Text -> Bool
isAmbiguous SchemaResolver{..} t = case schemaResolverScalar untagged Plain t of
    Left err       -> error err
    Right (SStr _) -> False
    Right _        -> True

-- | According to YAML 1.2, 'coreSchemaEncoder' is the default 'SchemaEncoder'.
--
-- By default, 'Scalar's are encoded as follows:
--
-- * Strings which are made of plain characters (see 'isPlainChar'), are unambiguous (see 'isAmbiguous') and do not contain any leading/trailing spaces are encoded as 'Plain' 'Scalar's.
--
-- * All other strings are encoded in 'DoubleQuoted' style.
--
-- * Booleans are encoded using 'encodeBool'.
--
-- * Double values are encoded using 'encodeDouble'.
--
-- * Integral values are encoded using 'encodeInt'.
--
-- @since 0.2.0
defaultSchemaEncoder :: SchemaEncoder
defaultSchemaEncoder = coreSchemaEncoder

-- | Set the 'Scalar' style in the encoded YAML. This is a function that decides
-- for each 'Scalar' the type of YAML string to output.
--
-- __WARNING__: You must ensure that special strings (like @"true"@\/@"false"@\/@"null"@\/@"1234"@) are not encoded with the 'Plain' style, because
-- they would then be decoded as boolean, null or numeric values. You can use 'isAmbiguous' to detect them.
--
-- __NOTE__: For different 'SchemaResolver's, different strings are ambiguous. For example, @"true"@ is not ambiguous for 'failsafeSchemaResolver'.
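The string encoders above all apply the same quoting policy: fall back to double quotes whenever a plain scalar would be empty, padded with spaces, or ambiguous under the active resolver. Here is a simplified, self-contained sketch of that policy; `needsQuoting` and `toyAmbiguous` are illustrative names, not part of HsYAML, and the toy ambiguity predicate only stands in for a real `SchemaResolver` check.

```haskell
-- Simplified sketch (illustrative only) of the quoting decision shared by
-- jsonEncodeStr and coreEncodeStr: quote when the plain style would lose
-- information or be re-read as a non-string scalar.
needsQuoting :: (String -> Bool) -> String -> Bool
needsQuoting ambiguous s =
     null s            -- empty strings must be quoted
  || head s == ' '     -- a leading space would be lost in plain style
  || last s == ' '     -- a trailing space likewise
  || ambiguous s       -- would otherwise decode as bool/null/number

-- Stand-in for a SchemaResolver-based ambiguity check.
toyAmbiguous :: String -> Bool
toyAmbiguous s = s `elem` ["true", "false", "null"]
              || (not (null s) && all (`elem` "0123456789") s)
```

With these definitions, `needsQuoting toyAmbiguous "true"` holds while `needsQuoting toyAmbiguous "hello"` does not, mirroring why `coreEncodeStr` double-quotes `"true"` but emits `"hello"` plain.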
-- -- @since 0.2.0 setScalarStyle :: (Scalar -> Either String (Tag, ScalarStyle, T.Text)) -> SchemaEncoder -> SchemaEncoder setScalarStyle customScalarEncoder encoder = encoder { schemaEncoderScalar = customScalarEncoder } HsYAML-0.2.1.4/src/Data/YAML/Token.hs0000644000000000000000000023623507346545000014733 0ustar0000000000000000{-# LANGUAGE DeriveGeneric #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE FunctionalDependencies #-} {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE PostfixOperators #-} {-# LANGUAGE Safe #-} {-# OPTIONS_GHC -fno-warn-missing-signatures #-} {-# OPTIONS_GHC -fno-warn-name-shadowing #-} -- | -- Copyright: © Oren Ben-Kiki 2007, -- © Herbert Valerio Riedel 2015-2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- -- Tokenizer for the YAML 1.2 syntax as defined in . -- module Data.YAML.Token ( tokenize , Token(..) , Code(..) , Encoding(..) ) where import qualified Data.ByteString.Lazy.Char8 as BLC import qualified Data.DList as D import Prelude hiding ((*), (+), (-), (/), (^)) import qualified Prelude import Data.YAML.Token.Encoding (Encoding (..), decode) import Util hiding (empty) import qualified Util -- * Generic operators -- -- ** Numeric operators -- -- We rename the four numerical operators @+@ @-@ @*@ @\/@ to start with @.@ -- (@.+@, @.-@, @.*@, @.\/@). This allows us to use the originals for BNF -- notation (we also hijack the @^@ operator). This is not a generally -- recommended practice. It is justified in this case since we have very little -- arithmetic operations, and a lot of BNF rules which this makes extremely -- readable. infixl 6 .+ -- | \".+\" is the numeric addition (we use \"+\" for postfix \"one or more\"). (.+) :: Int -> Int -> Int (.+) = (Prelude.+) infixl 6 .- -- | \".-\" is the numeric subtraction (we use \"-\" for infix \"and not\"). (.-) :: Int -> Int -> Int (.-) = (Prelude.-) {- infixl 7 .* -- | \".*\" is the numeric multiplication (we use \"*\" for postfix \"zero or -- more\"). 
(.*) :: Int -> Int -> Int (.*) = (Prelude.*) -} -- ** Record field access -- -- We also define @^.@ for record access for increased readability. infixl 8 ^. -- | @record ^. field@ is the same as @field record@, but is more readable. -- -- NB: This trivially emulates the @lens@ operator (^.) :: record -> (record -> value) -> value record ^. field = field record -- * Result tokens -- -- The parsing result is a stream of tokens rather than a parse tree. The idea -- is to convert the YAML input into \"byte codes\". These byte codes are -- intended to be written into a byte codes file (or more likely a UNIX pipe) -- for further processing. -- | 'Token' codes. data Code = Bom -- ^ BOM, contains \"@TF8@\", \"@TF16LE@\", \"@TF32BE@\", etc. | Text -- ^ Content text characters. | Meta -- ^ Non-content (meta) text characters. | Break -- ^ Separation line break. | LineFeed -- ^ Line break normalized to content line feed. | LineFold -- ^ Line break folded to content space. | Indicator -- ^ Character indicating structure. | White -- ^ Separation white space. | Indent -- ^ Indentation spaces. | DirectivesEnd -- ^ Document start marker. | DocumentEnd -- ^ Document end marker. | BeginEscape -- ^ Begins escape sequence. | EndEscape -- ^ Ends escape sequence. | BeginComment -- ^ Begins comment. | EndComment -- ^ Ends comment. | BeginDirective -- ^ Begins directive. | EndDirective -- ^ Ends directive. | BeginTag -- ^ Begins tag. | EndTag -- ^ Ends tag. | BeginHandle -- ^ Begins tag handle. | EndHandle -- ^ Ends tag handle. | BeginAnchor -- ^ Begins anchor. | EndAnchor -- ^ Ends anchor. | BeginProperties -- ^ Begins node properties. | EndProperties -- ^ Ends node properties. | BeginAlias -- ^ Begins alias. | EndAlias -- ^ Ends alias. | BeginScalar -- ^ Begins scalar content. | EndScalar -- ^ Ends scalar content. | BeginSequence -- ^ Begins sequence content. | EndSequence -- ^ Ends sequence content. | BeginMapping -- ^ Begins mapping content. | EndMapping -- ^ Ends mapping content. 
| BeginPair -- ^ Begins mapping key:value pair. | EndPair -- ^ Ends mapping key:value pair. | BeginNode -- ^ Begins complete node. | EndNode -- ^ Ends complete node. | BeginDocument -- ^ Begins document. | EndDocument -- ^ Ends document. | BeginStream -- ^ Begins YAML stream. | EndStream -- ^ Ends YAML stream. | Error -- ^ Parsing error at this point. | Unparsed -- ^ Unparsed due to errors (or at end of test). | Detected -- ^ Detected parameter (for testing). deriving (Show,Eq,Generic) -- | @since 0.2.0 instance NFData Code where rnf x = seq x () {- -- | @show code@ converts a 'Code' to the one-character YEAST token code char. -- The list of byte codes is also documented in the @yaml2yeast@ program. instance Show Code where show code = case code of Bom -> "U" Text -> "T" Meta -> "t" Break -> "b" LineFeed -> "L" LineFold -> "l" Indicator -> "I" White -> "w" Indent -> "i" DirectivesEnd -> "K" DocumentEnd -> "k" BeginEscape -> "E" EndEscape -> "e" BeginComment -> "C" EndComment -> "c" BeginDirective -> "D" EndDirective -> "d" BeginTag -> "G" EndTag -> "g" BeginHandle -> "H" EndHandle -> "h" BeginAnchor -> "A" EndAnchor -> "a" BeginProperties -> "P" EndProperties -> "p" BeginAlias -> "R" EndAlias -> "r" BeginScalar -> "S" EndScalar -> "s" BeginSequence -> "Q" EndSequence -> "q" BeginMapping -> "M" EndMapping -> "m" BeginNode -> "N" EndNode -> "n" BeginPair -> "X" EndPair -> "x" BeginDocument -> "O" EndDocument -> "o" Error -> "!" Unparsed -> "-" Detected -> "$" -} -- | Parsed token. data Token = Token { tByteOffset :: !Int, -- ^ 0-base byte offset in stream. tCharOffset :: !Int, -- ^ 0-base character offset in stream. tLine :: !Int, -- ^ 1-based line number. tLineChar :: !Int, -- ^ 0-based character in line. tCode :: !Code, -- ^ Specific token 'Code'. tText :: !String -- ^ Contained input chars, if any. 
} deriving (Show,Generic) -- | @since 0.2.0 instance NFData Token where rnf Token { tText = txt } = rnf txt -- * Parsing framework -- -- Haskell has no shortage of parsing frameworks. We use our own because: -- -- * Most available frameworks are inappropriate because of their focus on -- building a parse tree, and completing all of it before any of it is -- accessible to the caller. We return a stream of tokens, and would like -- its head to be accessible as soon as possible to allow for streaming. To -- do this with bounded memory usage we use a combination of continuation -- passing style and difference lists for the collected tokens. -- -- * Haskell makes it so easy to roll your own parsing framework. We need some -- specialized machinery (limited lookahead, forbidden patterns). It is -- possible to build these on top of existing frameworks but the end result -- isn't much shorter than rolling our own. -- -- Since we roll our own framework we don't bother with making it generalized, -- so we maintain a single 'State' type rather than having a generic one that -- contains a polymorphic \"UserState\" field etc. -- | A 'Data.YAML.Token.Parser' is basically a function computing a 'Reply'. newtype Parser result = Parser (State -> Reply result) applyParser :: Parser result -> State -> Reply result applyParser (Parser p) s = p s -- | The 'Result' of each invocation is either an error, the actual result, or -- a continuation for computing the actual result. data Result result = Failed String -- ^ Parsing aborted with a failure. | Result result -- ^ Parsing completed with a result. | More (Parser result) -- ^ Parsing is ongoing with a continuation. {- -- Showing a 'Result' is only used in debugging. instance (Show result) => Show (Result result) where show result = case result of Failed message -> "Failed " ++ message Result result -> "Result " ++ (show result) More _ -> "More" -} -- | Each invocation of a 'Data.YAML.Token.Parser' yields a 'Reply'. 
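The three-way `Result` type above (fail / finish / continue) is what makes streaming possible: the caller can consume tokens as soon as a continuation is returned. A toy model of that idea, using illustrative names (`Outcome`, `runIncremental`, `expectAB`) rather than the library's actual types:

```haskell
-- Toy model of the Failed/Result/More idea: a step either fails,
-- completes with a value, or hands back a continuation that wants
-- the next input chunk.
data Outcome r = Failure String | Done r | Continue (String -> Outcome r)

-- Feed input chunks one at a time until the computation settles.
runIncremental :: Outcome r -> [String] -> Either String r
runIncremental (Failure e)  _      = Left e
runIncremental (Done r)     _      = Right r
runIncremental (Continue _) []     = Left "unexpected end of input"
runIncremental (Continue k) (c:cs) = runIncremental (k c) cs

-- A tiny consumer that wants exactly the chunks "a" then "b".
expectAB :: Outcome ()
expectAB = Continue $ \c1 ->
  if c1 /= "a"
    then Failure "expected a"
    else Continue $ \c2 ->
           if c2 /= "b" then Failure "expected b" else Done ()
```

Because each `Continue` is observable before the whole input is consumed, a driver can emit whatever tokens were produced so far, which is exactly how bounded-memory streaming falls out of this shape.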
The 'Result' is only one -- part of the 'Reply'. data Reply result = Reply { rResult :: !(Result result), -- ^ Parsing result. rTokens :: !(D.DList Token), -- ^ Tokens generated by the parser. rCommit :: !(Maybe Decision), -- ^ Commitment to a decision point. rState :: !State -- ^ The updated parser state. } {- -- Showing a 'State' is only used in debugging. instance (Show result) => Show (Reply result) where show reply = "Result: " ++ (show $ reply^.rResult) ++ ", Tokens: " ++ (show $ D.toList $ reply^.rTokens) ++ ", Commit: " ++ (show $ reply^.rCommit) ++ ", State: { " ++ (show $ reply^.rState) ++ "}" -} -- A 'Pattern' is a parser that doesn't have an (interesting) result. type Pattern = Parser () -- ** Parsing state -- | The internal parser state. We don't bother with parameterising it with a -- \"UserState\", we just bundle the generic and specific fields together (not -- that it is that easy to draw the line - is @sLine@ generic or specific?). data State = State { sEncoding :: !Encoding, -- ^ The input UTF encoding. sDecision :: !Decision, -- ^ Current decision name. sLimit :: !Int, -- ^ Lookahead characters limit. sForbidden :: !(Maybe Pattern), -- ^ Pattern we must not enter into. sIsPeek :: !Bool, -- ^ Disables token generation. sIsSol :: !Bool, -- ^ Is at start of line? sChars :: ![Char], -- ^ (Reversed) characters collected for a token. sCharsByteOffset :: !Int, -- ^ Byte offset of first collected character. sCharsCharOffset :: !Int, -- ^ Char offset of first collected character. sCharsLine :: !Int, -- ^ Line of first collected character. sCharsLineChar :: !Int, -- ^ Character in line of first collected character. sByteOffset :: !Int, -- ^ Offset in bytes in the input. sCharOffset :: !Int, -- ^ Offset in characters in the input. sLine :: !Int, -- ^ Builds on YAML's line break definition. sLineChar :: !Int, -- ^ Character number in line. sCode :: !Code, -- ^ Of token we are collecting chars for. sLast :: !Char, -- ^ Last matched character. 
sInput :: ![(Int, Char)] -- ^ The decoded input characters. } {- -- Showing a 'State' is only used in debugging. Note that forcing dump of -- @sInput@ will disable streaming it. instance Show State where show state = "Encoding: " ++ (show $ state^.sEncoding) ++ ", Decision: " ++ (show $ state^.sDecision) ++ ", Limit: " ++ (show $ state^.sLimit) ++ ", IsPeek: " ++ (show $ state^.sIsPeek) ++ ", IsSol: " ++ (show $ state^.sIsSol) ++ ", Chars: >>>" ++ (reverse $ state^.sChars) ++ "<<<" ++ ", CharsByteOffset: " ++ (show $ state^.sCharsByteOffset) ++ ", CharsCharOffset: " ++ (show $ state^.sCharsCharOffset) ++ ", CharsLine: " ++ (show $ state^.sCharsLine) ++ ", CharsLineChar: " ++ (show $ state^.sCharsLineChar) ++ ", ByteOffset: " ++ (show $ state^.sByteOffset) ++ ", CharOffset: " ++ (show $ state^.sCharOffset) ++ ", Line: " ++ (show $ state^.sLine) ++ ", LineChar: " ++ (show $ state^.sLineChar) ++ ", Code: " ++ (show $ state^.sCode) ++ ", Last: " ++ (show $ state^.sLast) -- ++ ", Input: >>>" ++ (show $ state^.sInput) ++ "<<<" -} -- | @initialState name input@ returns an initial 'State' for parsing the -- /input/ (with /name/ for error messages). initialState :: BLC.ByteString -> State initialState input = State { sEncoding = encoding , sDecision = DeNone , sLimit = -1 , sForbidden = Nothing , sIsPeek = False , sIsSol = True , sChars = [] , sCharsByteOffset = -1 , sCharsCharOffset = -1 , sCharsLine = -1 , sCharsLineChar = -1 , sByteOffset = 0 , sCharOffset = 0 , sLine = 1 , sLineChar = 0 , sCode = Unparsed , sLast = ' ' , sInput = decoded } where (encoding, decoded) = decode input -- *** Setters -- -- We need four setter functions to pass them around as arguments. For some -- reason, Haskell only generates getter functions. -- | @setLimit limit state@ sets the @sLimit@ field to /limit/. setLimit :: Int -> State -> State setLimit limit state = state { sLimit = limit } {-# INLINE setLimit #-} -- | @setForbidden forbidden state@ sets the @sForbidden@ field to /forbidden/. 
setForbidden :: Maybe Pattern -> State -> State setForbidden forbidden state = state { sForbidden = forbidden } {-# INLINE setForbidden #-} -- | @setCode code state@ sets the @sCode@ field to /code/. setCode :: Code -> State -> State setCode code state = state { sCode = code } {-# INLINE setCode #-} -- ** Implicit parsers -- -- It is tedious to have to wrap each expected character (or character range) -- in an explicit 'Parse' constructor. We let Haskell do that for us using a -- 'Match' class. -- | @Match parameter result@ specifies that we can convert the /parameter/ to -- a 'Data.YAML.Token.Parser' returning the /result/. class Match parameter result | parameter -> result where match :: parameter -> Parser result -- | We don't need to convert a 'Data.YAML.Token.Parser', it already is one. instance Match (Parser result) result where match = id -- | We convert 'Char' to a parser for a character (that returns nothing). instance Match Char () where match code = nextIf (== code) -- | We convert a 'Char' tuple to a parser for a character range (that returns -- nothing). instance Match (Char, Char) () where match (low, high) = nextIf $ \ code -> low <= code && code <= high -- | We convert 'String' to a parser for a sequence of characters (that returns -- nothing). instance Match String () where match = foldr (&) empty -- ** Reply constructors -- | @returnReply state result@ prepares a 'Reply' with the specified /state/ -- and /result/. returnReply :: State -> result -> Reply result returnReply state result = Reply { rResult = Result result, rTokens = D.empty, rCommit = Nothing, rState = state } -- | @tokenReply state token@ returns a 'Reply' containing the /state/ and -- /token/. Any collected characters are cleared (either there are none, or we -- put them in this token, or we don't want them). 
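Replies accumulate their tokens in a difference list (`Data.DList`), which keeps appending cheap no matter how many replies are concatenated. A minimal sketch of the trick, using illustrative names rather than the `dlist` package's API:

```haskell
-- Sketch of the difference-list representation: a list is encoded as a
-- function [a] -> [a] (prepend-to-tail), so append is function
-- composition and costs O(1) instead of O(length).
type DList a = [a] -> [a]

emptyD :: DList a
emptyD = id

singletonD :: a -> DList a
singletonD x = (x :)

appendD :: DList a -> DList a -> DList a
appendD = (.)

-- Materialize by applying to the empty tail.
toListD :: DList a -> [a]
toListD d = d []
```

This is why `rTokens` can be threaded through deeply nested parser invocations without the quadratic cost that repeated `++` on ordinary lists would incur.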
tokenReply :: State -> Token -> Reply () tokenReply state token = Reply { rResult = Result (), rTokens = D.singleton token, rCommit = Nothing, rState = state { sCharsByteOffset = -1, sCharsCharOffset = -1, sCharsLine = -1, sCharsLineChar = -1, sChars = [] } } -- | @failReply state message@ prepares a 'Reply' with the specified /state/ -- and error /message/. failReply :: State -> String -> Reply result failReply state message = Reply { rResult = Failed message, rTokens = D.empty, rCommit = Nothing, rState = state } -- | @unexpectedReply state@ returns a @failReply@ for an unexpected character. unexpectedReply :: State -> Reply result unexpectedReply state = case state^.sInput of ((_, char):_) -> failReply state $ "Unexpected '" ++ [char] ++ "'" [] -> failReply state "Unexpected end of input" instance Functor Parser where fmap g f = Parser $ \state -> let reply = applyParser f state in case reply^.rResult of Failed message -> reply { rResult = Failed message } Result x -> reply { rResult = Result (g x) } More parser -> reply { rResult = More $ fmap g parser } instance Applicative Parser where pure result = Parser $ \state -> returnReply state result (<*>) = ap left *> right = Parser $ \state -> let reply = applyParser left state in case reply^.rResult of Failed message -> reply { rResult = Failed message } Result _ -> reply { rResult = More right } More parser -> reply { rResult = More $ parser *> right } -- | Allow using the @do@ notation for our parsers, which makes for short and -- sweet @do@ syntax when we want to examine the results (we typically don't). instance Monad Parser where -- @return result@ does just that - return a /result/. return = pure -- @left >>= right@ applies the /left/ parser, and if it didn't fail -- applies the /right/ one (well, the one /right/ returns). 
  left >>= right = Parser $ \state ->
    let reply = applyParser left state
    in case reply^.rResult of
         Failed message -> reply { rResult = Failed message }
         Result value   -> reply { rResult = More $ right value }
         More parser    -> reply { rResult = More $ parser >>= right }

  (>>) = (*>)

-- | @fail message@ does just that - fails with a /message/.
pfail :: String -> Parser a
pfail message = Parser $ \state -> failReply state message

-- ** Parsing operators
--
-- Here we reap the benefits of renaming the numerical operators. The operator
-- precedence, in decreasing strength:
--
-- @repeated % n@, @repeated <% n@, @match - rejected@, @match ! decision@,
-- @match ?! decision@, @choice ^ (first \/ second)@.
--
-- @match - first - second@ is @(match - first) - second@.
--
-- @first & second & third@ is @first & (second & third)@. Note that @first -
-- rejected & second@ is @(first - rejected) & second@, etc.
--
-- @match \/ alternative \/ otherwise@ is @match \/ (alternative \/
-- otherwise)@. Note that @first & second \/ third@ is @(first & second) \/
-- third@.
--
-- @( match *)@, @(match +)@, @(match ?)@, @(match <?)@, @(match >?)@,
-- @(match >!)@.

infix 0 >!

-- | @parser % n@ repeats /parser/ exactly /n/ times.
(%) :: (Match match result) => match -> Int -> Pattern
parser % n
  | n <= 0    = empty
  | otherwise = parser' *> (parser' % n .- 1)
  where parser' = match parser

-- | @parser <% n@ matches fewer than /n/ occurrences of /parser/.
(<%) :: (Match match result) => match -> Int -> Pattern
parser <% n = case n `compare` 1 of
  LT -> pfail "Fewer than 0 repetitions"
  EQ -> reject parser Nothing
  GT -> DeLess ^ ( ((parser ! DeLess) *> (parser <% n .- 1)) <|> empty )

data Decision = DeNone       -- ""
              | DeStar       -- "*"
              | DeLess       -- "<%"
              | DeDirective
              | DeDoc
              | DeEscape
              | DeEscaped
              | DeFold
              | DeKey
              | DeHeader
              | DeMore
              | DeNode
              | DePair
              deriving (Show,Eq)

-- | @decision ^ (option \/ option \/ ...)@ provides a /decision/ name to the
-- choice about to be made, to allow to @commit@ to it.
(^) :: (Match match result) => Decision -> match -> Parser result
decision ^ parser = choice decision $ match parser

-- | @parser ! decision@ commits to /decision/ (in an option) after
-- successfully matching the /parser/.
(!) :: (Match match result) => match -> Decision -> Pattern
parser ! decision = match parser *> commit decision

-- | @parser ?! decision@ commits to /decision/ (in an option) if the current
-- position matches /parser/, without consuming any characters.
(?!) :: (Match match result) => match -> Decision -> Pattern
parser ?! decision = peek parser *> commit decision

-- | @lookbehind <?@ matches the current point without consuming any characters
-- if the previous character matches the lookbehind parser (single-character
-- positive lookbehind).
(<?) :: (Match match result) => match -> Parser result
(<?) lookbehind = prev lookbehind

-- | @lookahead >?@ matches the current point without consuming any characters
-- if it matches the lookahead parser (positive lookahead).
(>?) :: (Match match result) => match -> Parser result
(>?) lookahead = peek lookahead

-- | @lookahead >!@ matches the current point without consuming any characters
-- if it does not match the lookahead parser (negative lookahead).
(>!) :: (Match match result) => match -> Pattern
(>!) lookahead = reject lookahead Nothing

-- | @parser - rejected@ matches /parser/, except if /rejected/ matches at this
-- point.
(-) :: (Match match1 result1, Match match2 result2) => match1 -> match2 -> Parser result1
parser - rejected = reject rejected Nothing *> match parser

-- | @before & after@ parses /before/ and, if it succeeds, parses /after/. This
-- basically invokes the monad's @>>=@ (bind) method.
(&) :: (Match match1 result1, Match match2 result2) => match1 -> match2 -> Parser result2
before & after = match before *> match after

-- | @first \/ second@ tries to parse /first/, and failing that parses
-- /second/, unless /first/ has committed, in which case it fails immediately.
(/) :: (Match match1 result, Match match2 result) => match1 -> match2 -> Parser result
first / second = Parser $ applyParser (match first <|> match second)

-- | @(optional ?)@ tries to match /parser/, otherwise does nothing.
(?)
  :: (Match match result) => match -> Pattern
(?) optional = (match optional *> empty) <|> empty

-- | @(parser *)@ matches zero or more occurrences of /parser/, as long as each
-- one actually consumes input characters.
(*) :: (Match match result) => match -> Pattern
(*) parser = DeStar ^ zomParser
  where zomParser = ((parser ! DeStar) *> match zomParser) <|> empty

-- | @(parser +)@ matches one or more occurrences of /parser/, as long as each
-- one actually consumes input characters.
(+) :: (Match match result) => match -> Pattern
(+) parser = match parser *> (parser *)

-- ** Basic parsers

-- | @first <|> second@ tries to parse /first/, and failing that parses
-- /second/, unless /first/ has committed, in which case it fails immediately.
instance Alternative Parser where
  empty = pfail "empty"

  left <|> right = Parser $ \state -> decideParser state D.empty left right state
    where
      decideParser point tokens left right state =
        let reply   = applyParser left state
            tokens' = D.append tokens $ reply^.rTokens
        in case (reply^.rResult, reply^.rCommit) of
             (Failed _, _)         -> Reply { rState  = point
                                            , rTokens = D.empty
                                            , rResult = More right
                                            , rCommit = Nothing }
             (Result _, _)         -> reply { rTokens = tokens' }
             (More _, Just _)      -> reply { rTokens = tokens' }
             (More left', Nothing) -> decideParser point tokens' left' right (reply^.rState)

-- | @choice decision parser@ provides a /decision/ name to the choice about to
-- be made in /parser/, to allow to @commit@ to it.
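The repetition operators `%` (exactly n), `*` (zero or more, each step consuming input) and `+` (one or more) can be illustrated with a far simpler matcher type than the library's `Parser`. The following sketch uses illustrative names (`Matcher`, `charM`, `repM`, `starM`) and is not HsYAML code:

```haskell
-- Toy analogues of the repetition operators: a matcher maps input to the
-- leftover input on success, Nothing on failure.
type Matcher = String -> Maybe String

charM :: Char -> Matcher
charM c (x:xs) | x == c = Just xs
charM _ _               = Nothing

-- Analogue of (%): match exactly n occurrences.
repM :: Matcher -> Int -> Matcher
repM p n s
  | n <= 0    = Just s
  | otherwise = p s >>= repM p (n - 1)

-- Analogue of (*): zero or more occurrences, stopping as soon as a step
-- fails to consume any input (preventing infinite loops on empty matches).
starM :: Matcher -> Matcher
starM p s = case p s of
  Just s' | length s' < length s -> starM p s'
  _                              -> Just s
```

The consumption check in `starM` mirrors why the real `(*)` relies on each repetition "actually consuming input characters": without it, a matcher that succeeds on empty input would loop forever.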
choice :: Decision -> Parser result -> Parser result choice decision parser = Parser $ \ state -> applyParser (choiceParser (state^.sDecision) decision parser) state { sDecision = decision } where choiceParser parentDecision makingDecision parser = Parser $ \ state -> let reply = applyParser parser state commit' = case reply^.rCommit of Nothing -> Nothing Just decision | decision == makingDecision -> Nothing | otherwise -> reply^.rCommit reply' = case reply^.rResult of More parser' -> reply { rCommit = commit', rResult = More $ choiceParser parentDecision makingDecision parser' } _ -> reply { rCommit = commit', rState = (reply^.rState) { sDecision = parentDecision } } in reply' -- | @parser ``recovery`` pattern@ parses the specified /parser/; if it fails, -- it continues to the /recovery/ parser to recover. recovery :: (Match match1 result) => match1 -> Parser result -> Parser result recovery pattern recover = Parser $ \ state -> let reply = applyParser (match pattern) state in if state^.sIsPeek then reply else case reply^.rResult of Result _ -> reply More more -> reply { rResult = More $ more `recovery` recover } Failed message -> reply { rResult = More $ fake Error message *> unparsed *> recover } where unparsed = Parser $ \ state -> applyParser (match finishToken) $ state { sCode = Unparsed } -- | @prev parser@ succeeds if /parser/ matches at the previous character. It -- does not consume any input. prev :: (Match match result) => match -> Parser result prev parser = Parser $ \ state -> prevParser state (match parser) state { sIsPeek = True, sInput = (-1, state^.sLast) : state^.sInput } where prevParser point parser state = let reply = applyParser parser state in case reply^.rResult of Failed message -> failReply point message Result value -> returnReply point value More parser' -> prevParser point parser' $ reply^.rState -- | @peek parser@ succeeds if /parser/ matches at this point, but does not -- consume any input. 
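`peek`, `prev` and `reject` all work by running a parser in a mode (`sIsPeek`) where nothing is consumed or emitted, then restoring the original state. A toy sketch of that non-consuming lookahead, with illustrative names (`Lookahead`, `itemM`, `peekM`) rather than the library's machinery:

```haskell
-- Toy sketch of non-consuming lookahead: run the inner matcher, keep its
-- verdict, but hand back the original input on success.
type Lookahead a = String -> Maybe (a, String)

-- Consume and return one character.
itemM :: Lookahead Char
itemM (c:cs) = Just (c, cs)
itemM []     = Nothing

-- Succeed exactly when the inner matcher succeeds, without consuming input.
peekM :: Lookahead a -> Lookahead a
peekM p s = case p s of
  Just (x, _) -> Just (x, s)  -- discard the consumed remainder
  Nothing     -> Nothing
```

`reject` is the mirror image (succeed when the inner parser fails), and `prev` is the same trick applied after temporarily pushing the last-matched character back onto the input.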
peek :: (Match match result) => match -> Parser result
peek parser = Parser $ \ state ->
    peekParser state (match parser) state { sIsPeek = True }
  where
    peekParser point parser state =
      let reply = applyParser parser state
      in case reply^.rResult of
           Failed message -> failReply point message
           Result value   -> returnReply point value
           More parser'   -> peekParser point parser' $ reply^.rState

-- | @reject parser name@ fails if /parser/ matches at this point, and does
-- nothing otherwise. If /name/ is provided, it is used in the error message,
-- otherwise the message uses the current character.
reject :: (Match match result) => match -> Maybe String -> Pattern
reject parser name = Parser $ \ state ->
    rejectParser state name (match parser) state { sIsPeek = True }
  where
    rejectParser point name parser state =
      let reply = applyParser parser state
      in case reply^.rResult of
           Failed _message -> returnReply point ()
           Result _value   -> case name of
                                Nothing   -> unexpectedReply point
                                Just text -> failReply point $ "Unexpected " ++ text
           More parser'    -> rejectParser point name parser' $ reply^.rState

-- | @upto parser@ consumes all the characters up to and not including the next
-- point where the specified parser is a match.
upto :: Pattern -> Pattern
upto parser = ( ( parser >!) *> nextIf (const True) *)

-- | @nonEmpty parser@ succeeds if /parser/ matches some non-empty input
-- characters at this point.
nonEmpty :: (Match match result) => match -> Parser result
nonEmpty parser = Parser $ \ state ->
    applyParser (nonEmptyParser (state^.sCharOffset) (match parser)) state
  where
    nonEmptyParser offset parser = Parser $ \ state ->
      let reply  = applyParser parser state
          state' = reply^.rState
      in case reply^.rResult of
           Failed _message -> reply
           Result _value   -> if state'^.sCharOffset > offset
                                then reply
                                else failReply state' "Matched empty pattern"
           More parser'    -> reply { rResult = More $ nonEmptyParser offset parser' }

-- | @empty@ always matches without consuming any input.
empty :: Pattern empty = return () -- | @eof@ matches the end of the input. eof :: Pattern eof = Parser $ \ state -> if null (state^.sInput) then returnReply state () else unexpectedReply state -- | @sol@ matches the start of a line. sol :: Pattern sol = Parser $ \ state -> if state^.sIsSol then returnReply state () else failReply state "Expected start of line" -- ** State manipulation pseudo-parsers -- | @commit decision@ commits the parser to all the decisions up to the most -- recent parent /decision/. This makes all tokens generated in this parsing -- path immediately available to the caller. commit :: Decision -> Pattern commit decision = Parser $ \ state -> Reply { rState = state, rTokens = D.empty, rResult = Result (), rCommit = Just decision } -- | @nextLine@ increments @sLine@ counter and resets @sLineChar@. nextLine :: Pattern nextLine = Parser $ \ state -> returnReply state { sIsSol = True, sLine = state^.sLine .+ 1, sLineChar = 0 } () -- | @with setField getField value parser@ invokes the specified /parser/ with -- the value of the specified field set to /value/ for the duration of the -- invocation, using the /setField/ and /getField/ functions to manipulate it. with :: (value -> State -> State) -> (State -> value) -> value -> Parser result -> Parser result with setField getField value parser = Parser $ \ state -> let value' = getField state Parser parser' = value' `seq` withParser value' parser in parser' $ setField value state where withParser parentValue parser = Parser $ \ state -> let reply = applyParser parser state in case reply^.rResult of Failed _ -> reply { rState = setField parentValue $ reply^.rState } Result _ -> reply { rState = setField parentValue $ reply^.rState } More parser' -> reply { rResult = More $ withParser parentValue parser' } {-# INLINE with #-} -- | @parser ``forbidding`` pattern@ parses the specified /parser/ ensuring -- that it does not contain anything matching the /forbidden/ parser. 
forbidding :: (Match match1 result1) => match1 -> Parser result1 -> Parser result1 forbidding parser forbidden = with setForbidden sForbidden (Just $ forbidden *> empty) (match parser) -- | @parser ``limitedTo`` limit@ parses the specified /parser/ -- ensuring that it does not consume more than the /limit/ input chars. limitedTo :: (Match match result) => match -> Int -> Parser result limitedTo parser limit = with setLimit sLimit limit (match parser) -- ** Consuming input characters -- | @nextIf test@ fails if the current position matches the 'State' forbidden -- pattern or if the 'State' lookahead limit is reached. Otherwise it consumes -- (and buffers) the next input char if it satisfies /test/. nextIf :: (Char -> Bool) -> Pattern nextIf test = Parser $ \ state -> case state^.sForbidden of Nothing -> limitedNextIf state Just parser -> let reply = applyParser (reject parser $ Just "forbidden pattern") state { sForbidden = Nothing } in case reply^.rResult of Failed _ -> reply Result _ -> limitedNextIf state More _ -> error "unexpected Result More _ pattern" where limitedNextIf state = case state^.sLimit of -1 -> consumeNextIf state 0 -> failReply state "Lookahead limit reached" _limit -> consumeNextIf state { sLimit = state^.sLimit .- 1 } consumeNextIf state = case state^.sInput of ((offset, char):rest) | test char -> let chars = if state^.sIsPeek then [] else char:(state^.sChars) byte_offset = charsOf sByteOffset sCharsByteOffset char_offset = charsOf sCharOffset sCharsCharOffset line = charsOf sLine sCharsLine line_char = charsOf sLineChar sCharsLineChar is_sol = char == '\xFEFF' && state^.sIsSol state' = state { sInput = rest, sLast = char, sChars = chars, sCharsByteOffset = byte_offset, sCharsCharOffset = char_offset, sCharsLine = line, sCharsLineChar = line_char, sIsSol = is_sol, sByteOffset = offset, sCharOffset = state^.sCharOffset .+ 1, sLineChar = state^.sLineChar .+ 1 } in returnReply state' () | otherwise -> unexpectedReply state [] -> unexpectedReply 
state where charsOf field charsField | state^.sIsPeek = -1 | null (state^.sChars) = state^.field | otherwise = state^.charsField -- ** Producing tokens -- | @finishToken@ places all collected text into a new token and begins a new -- one, or does nothing if there are no collected characters. finishToken :: Pattern finishToken = Parser $ \ state -> let state' = state { sChars = [], sCharsByteOffset = -1, sCharsCharOffset = -1, sCharsLine = -1, sCharsLineChar = -1 } in if state^.sIsPeek then returnReply state' () else case state^.sChars of [] -> returnReply state' () chars@(_:_) -> tokenReply state' Token { tByteOffset = state^.sCharsByteOffset, tCharOffset = state^.sCharsCharOffset, tLine = state^.sCharsLine, tLineChar = state^.sCharsLineChar, tCode = state^.sCode, tText = reverse chars } -- | @wrap parser@ invokes the /parser/, ensures any unclaimed input characters -- are wrapped into a token (only happens when testing productions), ensures no -- input is left unparsed, and returns the parser's result. wrap :: (Match match result) => match -> Parser result wrap parser = do result <- match parser finishToken eof return result -- | @token code parser@ places all text matched by /parser/ into a 'Token' with -- the specified /code/ (unless it is empty). Note it collects the text even if -- there is an error. token :: (Match match result) => Code -> match -> Pattern token code parser = finishToken & with setCode sCode code (parser & finishToken) -- | @fake code text@ creates a token with the specified /code/ and \"fake\" -- /text/ characters, instead of whatever characters are collected so far. 
fake :: Code -> String -> Pattern
fake code text = Parser $ \ state ->
  if state^.sIsPeek
     then returnReply state ()
     else tokenReply state Token { tByteOffset = value state sByteOffset sCharsByteOffset,
                                   tCharOffset = value state sCharOffset sCharsCharOffset,
                                   tLine       = value state sLine sCharsLine,
                                   tLineChar   = value state sLineChar sCharsLineChar,
                                   tCode       = code,
                                   tText       = text }
  where value state field1 field2 =
          if field2 state == -1
             then field1 state
             else field2 state

-- | @meta parser@ collects the text matched by the specified /parser/ into a
-- @Meta@ token.
meta :: (Match match result) => match -> Pattern
meta parser = token Meta parser

-- | @indicator parser@ collects the text matched by the specified /parser/ into an
-- @Indicator@ token.
indicator :: (Match match result) => match -> Pattern
indicator parser = token Indicator parser

-- | @text parser@ collects the text matched by the specified /parser/ into a
-- @Text@ token.
text :: (Match match result) => match -> Pattern
text parser = token Text parser

-- | @emptyToken code@ returns an empty token.
emptyToken :: Code -> Pattern
emptyToken code = finishToken & parser code
  where parser code = Parser $ \ state ->
          if state^.sIsPeek
             then returnReply state ()
             else tokenReply state Token { tByteOffset = state^.sByteOffset,
                                           tCharOffset = state^.sCharOffset,
                                           tLine       = state^.sLine,
                                           tLineChar   = state^.sLineChar,
                                           tCode       = code,
                                           tText       = "" }

-- | @wrapTokens beginCode endCode parser@ wraps the specified /parser/ with
-- matching /beginCode/ and /endCode/ tokens.
wrapTokens :: Code -> Code -> Pattern -> Pattern
wrapTokens beginCode endCode pattern = emptyToken beginCode
                                     & prefixErrorWith pattern (emptyToken endCode)
                                     & emptyToken endCode

-- | @prefixErrorWith pattern prefix@ will invoke the @prefix@ parser if an
-- error is detected during the @pattern@ parser, and then return the error.
prefixErrorWith :: (Match match result) => match -> Pattern -> Parser result prefixErrorWith pattern prefix = Parser $ \ state -> let reply = applyParser (match pattern) state in case reply^.rResult of Result _ -> reply More more -> reply { rResult = More $ prefixErrorWith more prefix } Failed message -> reply { rResult = More $ prefix & (pfail message :: Parser result) } -- * Production parameters -- | Production context. data Context = BlockOut -- ^ Outside block sequence. | BlockIn -- ^ Inside block sequence. | FlowOut -- ^ Outside flow collection. | FlowIn -- ^ Inside flow collection. | BlockKey -- ^ Implicit block key. | FlowKey -- ^ Implicit flow key. {- -- | @show context@ converts a 'Context' to a 'String'. instance Show Context where show context = case context of BlockOut -> "block-out" BlockIn -> "block-in" FlowOut -> "flow-out" FlowIn -> "flow-in" BlockKey -> "block-key" FlowKey -> "flow-key" -- | @read context@ converts a 'String' to a 'Context'. We trust our callers to -- convert any @-@ characters into @_@ to allow the built-in @lex@ function to -- handle the names as single identifiers. instance Read Context where readsPrec _ text = [ ((r word), tail) | (word, tail) <- lex text ] where r word = case word of "block_out" -> BlockOut "block_in" -> BlockIn "flow_out" -> FlowOut "flow_in" -> FlowIn "block_key" -> BlockKey "flow_key" -> FlowKey _ -> error $ "unknown context: " ++ word -} -- | Chomp method. data Chomp = Strip -- ^ Remove all trailing line breaks. | Clip -- ^ Keep first trailing line break. | Keep -- ^ Keep all trailing line breaks. {- -- | @show chomp@ converts a 'Chomp' to a 'String'. instance Show Chomp where show chomp = case chomp of Strip -> "strip" Clip -> "clip" Keep -> "keep" -- | @read chomp@ converts a 'String' to a 'Chomp'. 
instance Read Chomp where
  readsPrec _ text = [ ((r word), tail) | (word, tail) <- lex text ]
    where r word = case word of
                        "strip" -> Strip
                        "clip"  -> Clip
                        "keep"  -> Keep
                        _       -> error $ "unknown chomp: " ++ word
-}

-- * Tokenizers
--
-- We encapsulate the 'Data.YAML.Token.Parser' inside a 'Tokenizer'. This
-- allows us to hide the implementation details from our callers.

-- | 'Tokenizer' converts an input text into a list of 'Token'. Errors
-- are reported as tokens with the @Error@ 'Code', and the unparsed text
-- following an error may be attached as a final token (if the @Bool@ is
-- @True@). Note that tokens are available \"immediately\", allowing for
-- streaming of large YAML files with memory requirements depending only on the
-- YAML nesting level.
type Tokenizer = BLC.ByteString -> Bool -> [Token]

-- | @patternTokenizer pattern@ converts the /pattern/ to a simple 'Tokenizer'.
patternTokenizer :: Pattern -> Tokenizer
patternTokenizer pattern input withFollowing =
  D.toList $ patternParser (wrap pattern) (initialState input)
  where patternParser parser state =
          let reply  = applyParser parser state
              tokens = commitBugs reply
              state' = reply^.rState
          in case reply^.rResult of
                  Failed message -> errorTokens tokens state' message withFollowing
                  Result _       -> tokens
                  More parser'   -> D.append tokens $ patternParser parser' state'

-- | @errorTokens tokens state message withFollowing@ appends an @Error@ token
-- with the specified /message/ at the end of /tokens/, and if /withFollowing/
-- also appends the unparsed text following the error as a final @Unparsed@
-- token.
errorTokens :: D.DList Token -> State -> String -> Bool -> D.DList Token
errorTokens tokens state message withFollowing =
    let tokens' = D.append tokens $ D.singleton Token { tByteOffset = state^.sByteOffset,
                                                        tCharOffset = state^.sCharOffset,
                                                        tLine       = state^.sLine,
                                                        tLineChar   = state^.sLineChar,
                                                        tCode       = Error,
                                                        tText       = message }
    in if withFollowing && state^.sInput /= []
          then D.append tokens' $ D.singleton Token { tByteOffset = state^.sByteOffset,
                                                      tCharOffset = state^.sCharOffset,
                                                      tLine       = state^.sLine,
                                                      tLineChar   = state^.sLineChar,
                                                      tCode       = Unparsed,
                                                      tText       = map snd $ state^.sInput }
          else tokens'

-- | @commitBugs reply@ inserts an error token if a commit was made outside a
-- named choice. This should never happen outside tests.
commitBugs :: Reply result -> D.DList Token
commitBugs reply =
  let tokens = reply^.rTokens
      state  = reply^.rState
  in case reply^.rCommit of
          Nothing     -> tokens
          Just commit -> D.append tokens $ D.singleton Token { tByteOffset = state^.sByteOffset,
                                                               tCharOffset = state^.sCharOffset,
                                                               tLine       = state^.sLine,
                                                               tLineChar   = state^.sLineChar,
                                                               tCode       = Error,
                                                               tText       = "Commit to " ++ show commit ++ " was made outside it" }

-- | @'tokenize' input emit_unparsed@ converts the Unicode /input/
-- (using the UTF-8, UTF-16 (LE or BE), or UTF-32 (LE or BE) encoding)
-- to a list of 'Token' according to the YAML 1.2 specification.
--
-- Errors are reported as tokens with @'Error' :: 'Code'@, and the
-- unparsed text following an error may be attached as a final 'Unparsed' token
-- (if the /emit_unparsed/ argument is @True@). Note that tokens are available
-- \"immediately\", allowing for streaming of large YAML files with
-- memory requirements depending only on the YAML nesting level.
tokenize :: BLC.ByteString -> Bool -> [Token]
tokenize = patternTokenizer l_yaml_stream

-- * Productions

-- ** BNF compatibility helpers

-- | @bom code@ doesn't actually detect the encoding; we just call it
-- this way to make the productions compatible with the spec.
Instead it simply -- reports the encoding (which was already detected when we started parsing). bom :: Match match1 result1 => match1 -> Parser () bom code = code & (Parser $ \ state -> let text = case state^.sEncoding of UTF8 -> "TF-8" UTF16LE -> "TF-16LE" UTF16BE -> "TF-16BE" UTF32LE -> "TF-32LE" UTF32BE -> "TF-32BE" in applyParser (fake Bom text) state) -- | @na@ is the \"non-applicable\" indentation value. We use Haskell's laziness -- to verify it really is never used. na :: Int na = error "Accessing non-applicable indentation" -- | @asInteger@ returns the last consumed character, which is assumed to be a -- decimal digit, as an integer. asInteger :: Parser Int asInteger = Parser $ \ state -> returnReply state $ ord (state^.sLast) .- 48 -- | @result value@ is the same as /return value/ except that we give the -- Haskell type deduction the additional boost it needs to figure out this is -- wrapped in a 'Data.YAML.Token.Parser'. result :: result -> Parser result result = return ---------------------------------------------------------------------------- -- ** Spec productions -- -- These are copied directly from the spec, with the sprinkling of -- additional token and decision point directives. -- 5.1 Character Set c_printable {- 1 -} = '\x9' / '\xA' / '\xD' / ('\x20', '\x7E') / '\x85' / ('\xA0', '\xD7FF') / ('\xE000', '\xFFFD') / ('\x10000', '\x10FFFF') nb_json {- 2 -} = '\x9' / ('\x20', '\x10FFFF') -- 5.2 Character Encodings c_byte_order_mark {- 3 -} = bom '\xFEFF' -- 5.3 Indicator Characters c_sequence_entry {- 4 -} = indicator '-' c_mapping_key {- 5 -} = indicator '?' c_mapping_value {- 6 -} = indicator ':' c_collect_entry {- 7 -} = indicator ',' c_sequence_start {- 8 -} = indicator '[' c_sequence_end {- 9 -} = indicator ']' c_mapping_start {- 10 -} = indicator '{' c_mapping_end {- 11 -} = indicator '}' c_comment {- 12 -} = indicator '#' c_anchor {- 13 -} = indicator '&' c_alias {- 14 -} = indicator '*' c_tag {- 15 -} = indicator '!' 
c_literal {- 16 -} = indicator '|' c_folded {- 17 -} = indicator '>' c_single_quote {- 18 -} = indicator '\'' c_double_quote {- 19 -} = indicator '"' c_directive {- 20 -} = indicator '%' c_reserved {- 21 -} = indicator ( '@' / '`' ) c_indicator {- 22 -} = c_sequence_entry / c_mapping_key / c_mapping_value / c_collect_entry / c_sequence_start / c_sequence_end / c_mapping_start / c_mapping_end / c_comment / c_anchor / c_alias / c_tag / c_literal / c_folded / c_single_quote / c_double_quote / c_directive / c_reserved c_flow_indicator {- 23 -} = c_collect_entry / c_sequence_start / c_sequence_end / c_mapping_start / c_mapping_end -- 5.4 Line Break Characters b_line_feed {- 24 -} = '\xA' b_carriage_return {- 25 -} = '\xD' b_char {- 26 -} = b_line_feed / b_carriage_return nb_char {- 27 -} = c_printable - b_char - c_byte_order_mark b_break {- 28 -} = ( b_carriage_return & b_line_feed / b_carriage_return / b_line_feed ) & nextLine b_as_line_feed {- 29 -} = token LineFeed b_break b_non_content {- 30 -} = token Break b_break -- 5.5 White Space Characters s_space {- 31 -} = '\x20' s_tab {- 32 -} = '\x9' s_white {- 33 -} = s_space / s_tab ns_char {- 34 -} = nb_char - s_white -- 5.6 Miscellaneous Characters ns_dec_digit {- 35 -} = ('\x30', '\x39') ns_hex_digit {- 36 -} = ns_dec_digit / ('\x41', '\x46') / ('\x61', '\x66') ns_ascii_letter {- 37 -} = ('\x41', '\x5A') / ('\x61', '\x7A') ns_word_char {- 38 -} = ns_dec_digit / ns_ascii_letter / '-' ns_uri_char {- 39 -} = DeEscape ^ ( '%' ! DeEscape & ns_hex_digit & ns_hex_digit / ns_word_char / '#' / ';' / '/' / '?' / ':' / '@' / '&' / '=' / '+' / '$' / ',' / '_' / '.' / '!' 
/ '~' / '*' / '\'' / '(' / ')' / '[' / ']' ) ns_tag_char {- 40 -} = ns_uri_char - c_tag - c_flow_indicator -- 5.7 Escaped Characters c_escape {- 41 -} = indicator '\\' ns_esc_null {- 42 -} = meta '0' ns_esc_bell {- 43 -} = meta 'a' ns_esc_backspace {- 44 -} = meta 'b' ns_esc_horizontal_tab {- 45 -} = meta ( 't' / '\x9' ) ns_esc_line_feed {- 46 -} = meta 'n' ns_esc_vertical_tab {- 47 -} = meta 'v' ns_esc_form_feed {- 48 -} = meta 'f' ns_esc_carriage_return {- 49 -} = meta 'r' ns_esc_escape {- 50 -} = meta 'e' ns_esc_space {- 51 -} = meta '\x20' ns_esc_double_quote {- 52 -} = meta '"' ns_esc_slash {- 53 -} = meta '/' ns_esc_backslash {- 54 -} = meta '\\' ns_esc_next_line {- 55 -} = meta 'N' ns_esc_non_breaking_space {- 56 -} = meta '_' ns_esc_line_separator {- 57 -} = meta 'L' ns_esc_paragraph_separator {- 58 -} = meta 'P' ns_esc_8_bit {- 59 -} = indicator 'x' ! DeEscaped & meta ( ns_hex_digit % 2 ) ns_esc_16_bit {- 60 -} = indicator 'u' ! DeEscaped & meta ( ns_hex_digit % 4 ) ns_esc_32_bit {- 61 -} = indicator 'U' ! DeEscaped & meta ( ns_hex_digit % 8 ) c_ns_esc_char {- 62 -} = wrapTokens BeginEscape EndEscape $ c_escape ! 
DeEscape & DeEscaped ^ ( ns_esc_null / ns_esc_bell / ns_esc_backspace / ns_esc_horizontal_tab / ns_esc_line_feed / ns_esc_vertical_tab / ns_esc_form_feed / ns_esc_carriage_return / ns_esc_escape / ns_esc_space / ns_esc_double_quote / ns_esc_slash / ns_esc_backslash / ns_esc_next_line / ns_esc_non_breaking_space / ns_esc_line_separator / ns_esc_paragraph_separator / ns_esc_8_bit / ns_esc_16_bit / ns_esc_32_bit ) -- 6.1 Indentation Spaces s_indent n {- 63 -} = token Indent ( s_space % n ) s_indent_lt n {- 64 -} = token Indent ( s_space <% n ) s_indent_le n {- 65 -} = token Indent ( s_space <% (n .+ 1) ) -- 6.2 Separation Spaces s_separate_in_line {- 66 -} = token White ( s_white +) / sol -- 6.3 Line Prefixes s_line_prefix n c {- 67 -} = case c of BlockOut -> s_block_line_prefix n BlockIn -> s_block_line_prefix n FlowOut -> s_flow_line_prefix n FlowIn -> s_flow_line_prefix n _ -> error "unexpected node style pattern in s_line_prefix" s_block_line_prefix n {- 68 -} = s_indent n s_flow_line_prefix n {- 69 -} = s_indent n & ( s_separate_in_line ?) -- 6.4 Empty Lines l_empty n c {- 70 -} = ( s_line_prefix n c / s_indent_lt n ) & b_as_line_feed -- 6.5 Line Folding b_l_trimmed n c {- 71 -} = b_non_content & ( l_empty n c +) b_as_space {- 72 -} = token LineFold b_break b_l_folded n c {- 73 -} = b_l_trimmed n c / b_as_space s_flow_folded n {- 74 -} = ( s_separate_in_line ?) & b_l_folded n FlowIn & s_flow_line_prefix n -- 6.6 Comments c_nb_comment_text {- 75 -} = wrapTokens BeginComment EndComment $ c_comment & meta ( nb_char *) b_comment {- 76 -} = b_non_content / eof s_b_comment {- 77 -} = ( s_separate_in_line & ( c_nb_comment_text ?) ?) & b_comment l_comment {- 78 -} = s_separate_in_line & ( c_nb_comment_text ?) 
& b_comment s_l_comments {- 79 -} = ( s_b_comment / sol ) & ( nonEmpty l_comment *) -- 6.7 Separation Lines s_separate n c {- 80 -} = case c of BlockOut -> s_separate_lines n BlockIn -> s_separate_lines n FlowOut -> s_separate_lines n FlowIn -> s_separate_lines n BlockKey -> s_separate_in_line FlowKey -> s_separate_in_line s_separate_lines n {- 81 -} = s_l_comments & s_flow_line_prefix n / s_separate_in_line -- 6.8 Directives l_directive {- 82 -} = ( wrapTokens BeginDirective EndDirective $ c_directive ! DeDoc & DeDirective ^ ( ns_yaml_directive / ns_tag_directive / ns_reserved_directive ) ) & s_l_comments ns_reserved_directive {- 83 -} = ns_directive_name & ( s_separate_in_line & ns_directive_parameter *) ns_directive_name {- 84 -} = meta ( ns_char +) ns_directive_parameter {- 85 -} = meta ( ns_char +) -- 6.8.1 Yaml Directives ns_yaml_directive {- 86 -} = meta [ 'Y', 'A', 'M', 'L' ] ! DeDirective & s_separate_in_line & ns_yaml_version ns_yaml_version {- 87 -} = meta ( ( ns_dec_digit +) & '.' & ( ns_dec_digit +) ) -- 6.8.2 Tag Directives ns_tag_directive {- 88 -} = meta [ 'T', 'A', 'G' ] ! DeDirective & s_separate_in_line & c_tag_handle & s_separate_in_line & ns_tag_prefix -- 6.8.2.1 Tag Handles c_tag_handle {- 89 -} = c_named_tag_handle / c_secondary_tag_handle / c_primary_tag_handle c_primary_tag_handle {- 90 -} = wrapTokens BeginHandle EndHandle $ c_tag c_secondary_tag_handle {- 91 -} = wrapTokens BeginHandle EndHandle $ c_tag & c_tag c_named_tag_handle {- 92 -} = wrapTokens BeginHandle EndHandle $ c_tag & meta ( ns_word_char +) & c_tag -- 6.8.2.2 Tag Prefixes ns_tag_prefix {- 93 -} = wrapTokens BeginTag EndTag $ ( c_ns_local_tag_prefix / ns_global_tag_prefix ) c_ns_local_tag_prefix {- 94 -} = c_tag & meta ( ns_uri_char *) ns_global_tag_prefix {- 95 -} = meta ( ns_tag_char & ( ns_uri_char *) ) -- 6.9 Node Properties c_ns_properties n c {- 96 -} = wrapTokens BeginProperties EndProperties $ ( c_ns_tag_property & ( s_separate n c & c_ns_anchor_property ?) 
) / ( c_ns_anchor_property & ( s_separate n c & c_ns_tag_property ?) ) -- 6.9.1 Node Tags c_ns_tag_property {- 97 -} = wrapTokens BeginTag EndTag $ c_verbatim_tag / c_ns_shorthand_tag / c_non_specific_tag c_verbatim_tag {- 98 -} = c_tag & indicator '<' & meta ( ns_uri_char +) & indicator '>' c_ns_shorthand_tag {- 99 -} = c_tag_handle & meta ( ns_tag_char +) c_non_specific_tag {- 100 -} = c_tag -- 6.9.2 Node Anchors c_ns_anchor_property {- 101 -} = wrapTokens BeginAnchor EndAnchor $ c_anchor & ns_anchor_name ns_anchor_char {- 102 -} = ns_char - c_flow_indicator ns_anchor_name {- 103 -} = meta ( ns_anchor_char +) -- 7.1 Alias Nodes c_ns_alias_node {- 104 -} = wrapTokens BeginAlias EndAlias $ c_alias ! DeNode & ns_anchor_name -- 7.2 Empty Nodes e_scalar {- 105 -} = wrapTokens BeginScalar EndScalar empty e_node {- 106 -} = wrapTokens BeginNode EndNode e_scalar -- 7.3.1 Double Quoted Style nb_double_char {- 107 -} = DeEscape ^ ( c_ns_esc_char / ( nb_json - c_escape - c_double_quote ) ) ns_double_char {- 108 -} = nb_double_char - s_white c_double_quoted n c {- 109 -} = wrapTokens BeginScalar EndScalar $ c_double_quote ! DeNode & text ( nb_double_text n c ) & c_double_quote nb_double_text n c {- 110 -} = case c of FlowOut -> nb_double_multi_line n FlowIn -> nb_double_multi_line n BlockKey -> nb_double_one_line FlowKey -> nb_double_one_line _ -> error "unexpected node style pattern in nb_double_text" nb_double_one_line {- 111 -} = ( nb_double_char *) s_double_escaped n {- 112 -} = ( s_white *) & wrapTokens BeginEscape EndEscape ( c_escape ! DeEscape & b_non_content ) & ( l_empty n FlowIn *) & s_flow_line_prefix n s_double_break n {- 113 -} = DeEscape ^ ( s_double_escaped n / s_flow_folded n ) nb_ns_double_in_line {- 114 -} = ( ( s_white *) & ns_double_char *) s_double_next_line n {- 115 -} = s_double_break n & ( ns_double_char & nb_ns_double_in_line & ( s_double_next_line n / ( s_white *) ) ?) 
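The double-quoted productions above (107-115) admit the escape sequences defined by the @ns_esc_*@ productions (41-62). As a standalone illustration of the character values behind those productions -- @unescape@ is a hypothetical helper for this sketch, not part of this module -- a minimal decoder for YAML 1.2 double-quoted escapes could look like:

```haskell
import Data.Char (chr)
import Numeric  (readHex)

-- Sketch: map the escape sequences of YAML 1.2 section 5.7 to their
-- character values. Returns Nothing on an ill-formed escape.
unescape :: String -> Maybe String
unescape []            = Just []
unescape ('\\':c:rest) = case c of
    '0'  -> cont '\x00'
    'a'  -> cont '\a'
    'b'  -> cont '\b'
    't'  -> cont '\t'
    'n'  -> cont '\n'
    'v'  -> cont '\v'
    'f'  -> cont '\f'
    'r'  -> cont '\r'
    'e'  -> cont '\x1B'   -- escape
    ' '  -> cont ' '
    '"'  -> cont '"'
    '/'  -> cont '/'
    '\\' -> cont '\\'
    'N'  -> cont '\x85'   -- next line
    '_'  -> cont '\xA0'   -- non-breaking space
    'L'  -> cont '\x2028' -- line separator
    'P'  -> cont '\x2029' -- paragraph separator
    'x'  -> hex 2 rest    -- ns_esc_8_bit
    'u'  -> hex 4 rest    -- ns_esc_16_bit
    'U'  -> hex 8 rest    -- ns_esc_32_bit
    _    -> Nothing       -- not a legal escape
  where
    cont ch = (ch :) <$> unescape rest
    hex n s = let (ds, s') = splitAt n s
              in case readHex ds of
                      [(code, "")] | length ds == n -> (chr code :) <$> unescape s'
                      _                             -> Nothing
unescape ('\\':[])     = Nothing  -- trailing lone backslash
unescape (ch:rest)     = (ch :) <$> unescape rest
```

For example, @unescape "a\\tb"@ yields @Just "a\tb"@ and @unescape "\\q"@ yields @Nothing@; the real tokenizer instead emits @BeginEscape@/@EndEscape@ token pairs rather than substituting characters.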
nb_double_multi_line n {- 116 -} = nb_ns_double_in_line & ( s_double_next_line n / ( s_white *) ) -- 7.3.2 Single Quoted Style c_quoted_quote {- 117 -} = wrapTokens BeginEscape EndEscape $ c_single_quote ! DeEscape & meta '\'' nb_single_char {- 118 -} = DeEscape ^ ( c_quoted_quote / ( nb_json - c_single_quote ) ) ns_single_char {- 119 -} = nb_single_char - s_white c_single_quoted n c {- 120 -} = wrapTokens BeginScalar EndScalar $ c_single_quote ! DeNode & text ( nb_single_text n c ) & c_single_quote nb_single_text n c {- 121 -} = case c of FlowOut -> nb_single_multi_line n FlowIn -> nb_single_multi_line n BlockKey -> nb_single_one_line FlowKey -> nb_single_one_line _ -> error "unexpected node style pattern in nb_single_text" nb_single_one_line {- 122 -} = ( nb_single_char *) nb_ns_single_in_line {- 123 -} = ( ( s_white *) & ns_single_char *) s_single_next_line n {- 124 -} = s_flow_folded n & ( ns_single_char & nb_ns_single_in_line & ( s_single_next_line n / ( s_white *) ) ?) nb_single_multi_line n {- 125 -} = nb_ns_single_in_line & ( s_single_next_line n / ( s_white *) ) -- 7.3.3 Plain Style ns_plain_first _c {- 126 -} = ns_char - c_indicator / ( ':' / '?' / '-' ) & ( (ns_plain_safe _c) >?) ns_plain_safe c {- 127 -} = case c of FlowOut -> ns_plain_safe_out FlowIn -> ns_plain_safe_in BlockKey -> ns_plain_safe_out FlowKey -> ns_plain_safe_in _ -> error "unexpected node style pattern in ns_plain_safe" ns_plain_safe_out {- 128 -} = ns_char ns_plain_safe_in {- 129 -} = ns_char - c_flow_indicator ns_plain_char c {- 130 -} = ns_plain_safe c - ':' - '#' / ( ns_char ?) ns_plain n c {- 131 -} = wrapTokens BeginScalar EndScalar $ text (case c of FlowOut -> ns_plain_multi_line n c FlowIn -> ns_plain_multi_line n c BlockKey -> ns_plain_one_line c FlowKey -> ns_plain_one_line c _ -> error "unexpected node style pattern in ns_plain") nb_ns_plain_in_line c {- 132 -} = ( ( s_white *) & ns_plain_char c *) ns_plain_one_line c {- 133 -} = ns_plain_first c ! 
DeNode & nb_ns_plain_in_line c s_ns_plain_next_line n c {- 134 -} = s_flow_folded n & ns_plain_char c & nb_ns_plain_in_line c ns_plain_multi_line n c {- 135 -} = ns_plain_one_line c & ( s_ns_plain_next_line n c *) -- 7.4 Flow Collection Styles in_flow c {- 136 -} = case c of FlowOut -> FlowIn FlowIn -> FlowIn BlockKey -> FlowKey FlowKey -> FlowKey _ -> error "unexpected node style pattern in in_flow" -- 7.4.1 Flow Sequences c_flow_sequence n c {- 137 -} = wrapTokens BeginSequence EndSequence $ c_sequence_start ! DeNode & ( s_separate n c ?) & ( ns_s_flow_seq_entries n (in_flow c) ?) & c_sequence_end ns_s_flow_seq_entries n c {- 138 -} = ns_flow_seq_entry n c & ( s_separate n c ?) & ( c_collect_entry & ( s_separate n c ?) & ( ns_s_flow_seq_entries n c ?) ?) ns_flow_seq_entry n c {- 139 -} = DePair ^ ( ns_flow_pair n c / DeNode ^ ns_flow_node n c ) -- 7.4.2 Flow Mappings c_flow_mapping n c {- 140 -} = wrapTokens BeginMapping EndMapping $ c_mapping_start ! DeNode & ( s_separate n c ?) & ( ns_s_flow_map_entries n (in_flow c) ?) & c_mapping_end ns_s_flow_map_entries n c {- 141 -} = ns_flow_map_entry n c & ( s_separate n c ?) & ( c_collect_entry & ( s_separate n c ?) & ( ns_s_flow_map_entries n c ?) ?) ns_flow_map_entry n c {- 142 -} = wrapTokens BeginPair EndPair $ DeKey ^ ( ( c_mapping_key ! DeKey & s_separate n c & ns_flow_map_explicit_entry n c ) / ns_flow_map_implicit_entry n c ) ns_flow_map_explicit_entry n c {- 143 -} = ns_flow_map_implicit_entry n c / ( e_node & e_node ) ns_flow_map_implicit_entry n c {- 144 -} = DePair ^ ( c_ns_flow_map_json_key_entry n c / ns_flow_map_yaml_key_entry n c / c_ns_flow_map_empty_key_entry n c ) ns_flow_map_yaml_key_entry n c {- 145 -} = ( DeNode ^ ns_flow_yaml_node n c ) ! DePair & ( ( ( s_separate n c ?) & c_ns_flow_map_separate_value n c ) / e_node ) c_ns_flow_map_empty_key_entry n c {- 146 -} = e_node & c_ns_flow_map_separate_value n c c_ns_flow_map_separate_value n c {- 147 -} = c_mapping_value & ( (ns_plain_safe c) >!) ! 
DePair & ( ( s_separate n c & ns_flow_node n c ) / e_node ) c_ns_flow_map_json_key_entry n c {- 148 -} = ( DeNode ^ c_flow_json_node n c ) ! DePair & ( ( ( s_separate n c ?) & c_ns_flow_map_adjacent_value n c ) / e_node ) c_ns_flow_map_adjacent_value n c {- 149 -} = c_mapping_value ! DePair & ( ( ( s_separate n c ?) & ns_flow_node n c ) / e_node ) ns_flow_pair n c {- 150 -} = wrapTokens BeginMapping EndMapping $ wrapTokens BeginPair EndPair $ ( ( c_mapping_key ! DePair & s_separate n c & ns_flow_map_explicit_entry n c ) / ns_flow_pair_entry n c ) ns_flow_pair_entry n c {- 151 -} = ( ns_flow_pair_yaml_key_entry n c / c_ns_flow_map_empty_key_entry n c / c_ns_flow_pair_json_key_entry n c ) ns_flow_pair_yaml_key_entry n c {- 152 -} = ns_s_implicit_yaml_key FlowKey & c_ns_flow_map_separate_value n c c_ns_flow_pair_json_key_entry n c {- 153 -} = c_s_implicit_json_key FlowKey & c_ns_flow_map_adjacent_value n c ns_s_implicit_yaml_key c {- 154 -} = ( DeNode ^ ( ns_flow_yaml_node na c ) & ( s_separate_in_line ?) ) `limitedTo` 1024 c_s_implicit_json_key c {- 155 -} = ( DeNode ^ ( c_flow_json_node na c ) & ( s_separate_in_line ?) ) `limitedTo` 1024 -- 7.5 Flow Nodes ns_flow_yaml_content n c {- 156 -} = ns_plain n c c_flow_json_content n c {- 157 -} = c_flow_sequence n c / c_flow_mapping n c / c_single_quoted n c / c_double_quoted n c ns_flow_content n c {- 158 -} = ns_flow_yaml_content n c / c_flow_json_content n c ns_flow_yaml_node n c {- 159 -} = wrapTokens BeginNode EndNode $ c_ns_alias_node / ns_flow_yaml_content n c / ( c_ns_properties n c & ( ( s_separate n c & ns_flow_yaml_content n c ) / e_scalar ) ) c_flow_json_node n c {- 160 -} = wrapTokens BeginNode EndNode $ ( c_ns_properties n c & s_separate n c ?) 
& c_flow_json_content n c ns_flow_node n c {- 161 -} = wrapTokens BeginNode EndNode $ c_ns_alias_node / ns_flow_content n c / ( c_ns_properties n c & ( ( s_separate n c & ns_flow_content n c ) / e_scalar ) ) -- 8.1.1 Block Scalar Headers c_b_block_header n {- 162 -} = DeHeader ^ ( do m <- c_indentation_indicator n t <- c_chomping_indicator ( s_white / b_char ) ?! DeHeader s_b_comment result (m, t) / do t <- c_chomping_indicator m <- c_indentation_indicator n s_b_comment result (m, t) ) -- 8.1.1.1 Block Indentation Indicator c_indentation_indicator n {- 163 -} = fmap fixup (indicator ( ns_dec_digit - '0' ) & asInteger) / detect_scalar_indentation n where fixup | n == -1 = (.+ 1) -- compensate for anomaly at left-most n | otherwise = id detect_scalar_indentation n = peek $ ( nb_char *) -- originally: -- & ( b_non_content & ( l_empty n BlockIn *) ?) & ( b_break & ( (s_space *) & b_break *) ?) & count_spaces (-n) count_spaces n = (s_space & count_spaces (n .+ 1)) / result (max 1 n) -- 8.1.1.2 Chomping Indicator c_chomping_indicator {- 164 -} = indicator '-' & result Strip / indicator '+' & result Keep / result Clip end_block_scalar t = case t of Strip -> emptyToken EndScalar Clip -> emptyToken EndScalar Keep -> empty b_chomped_last t {- 165 -} = case t of Strip -> emptyToken EndScalar & b_non_content Clip -> b_as_line_feed & emptyToken EndScalar Keep -> b_as_line_feed l_chomped_empty n t {- 166 -} = case t of Strip -> l_strip_empty n Clip -> l_strip_empty n Keep -> l_keep_empty n l_strip_empty n {- 167 -} = ( s_indent_le n & b_non_content *) & ( l_trail_comments n ?) l_keep_empty n {- 168 -} = ( l_empty n BlockIn *) & emptyToken EndScalar & ( l_trail_comments n ?) l_trail_comments n {- 169 -} = s_indent_lt n & c_nb_comment_text & b_comment & ( nonEmpty l_comment *) -- 8.1.2 Literal Style c_l__literal n {- 170 -} = do emptyToken BeginScalar c_literal ! 
DeNode (m, t) <- c_b_block_header n `prefixErrorWith` emptyToken EndScalar text ( l_literal_content (n .+ m) t ) l_nb_literal_text n {- 171 -} = ( l_empty n BlockIn *) & s_indent n & ( nb_char +) b_nb_literal_next n {- 172 -} = b_as_line_feed & l_nb_literal_text n l_literal_content n t {- 173 -} = ( ( l_nb_literal_text n & ( b_nb_literal_next n *) & b_chomped_last t ) / end_block_scalar t ) & l_chomped_empty n t -- 8.1.3 Folded Style c_l__folded n {- 174 -} = do emptyToken BeginScalar c_folded ! DeNode (m, t) <- c_b_block_header n `prefixErrorWith` emptyToken EndScalar text ( l_folded_content (n .+ m) t ) s_nb_folded_text n {- 175 -} = s_indent n & ns_char ! DeFold & ( nb_char *) l_nb_folded_lines n {- 176 -} = s_nb_folded_text n & ( b_l_folded n BlockIn & s_nb_folded_text n *) s_nb_spaced_text n {- 177 -} = s_indent n & s_white ! DeFold & ( nb_char *) b_l_spaced n {- 178 -} = b_as_line_feed & ( l_empty n BlockIn *) l_nb_spaced_lines n {- 179 -} = s_nb_spaced_text n & ( b_l_spaced n & s_nb_spaced_text n *) l_nb_same_lines n {- 180 -} = ( l_empty n BlockIn *) & DeFold ^ ( l_nb_folded_lines n / l_nb_spaced_lines n ) l_nb_diff_lines n {- 181 -} = l_nb_same_lines n & ( b_as_line_feed & l_nb_same_lines n *) l_folded_content n t {- 182 -} = ( ( l_nb_diff_lines n & b_chomped_last t ) / end_block_scalar t ) & l_chomped_empty n t -- 8.2.1 Block Sequences detect_collection_indentation n = peek $ ( nonEmpty l_comment* ) & count_spaces (-n) detect_inline_indentation = peek $ count_spaces 0 l__block_sequence n {- 183 -} = do m <- detect_collection_indentation n wrapTokens BeginSequence EndSequence $ ( s_indent (n .+ m) & c_l_block_seq_entry (n .+ m) +) c_l_block_seq_entry n {- 184 -} = c_sequence_entry & ( ns_char >!) ! 
DeNode & s_l__block_indented n BlockIn s_l__block_indented n c {- 185 -} = do m <- detect_inline_indentation DeNode ^ ( ( s_indent m & ( ns_l_in_line_sequence (n .+ 1 .+ m) / ns_l_in_line_mapping (n .+ 1 .+ m) ) ) / s_l__block_node n c / ( e_node & ( s_l_comments ?) & unparsed (n .+ 1) ) ) `recovery` unparsed (n .+ 1) ns_l_in_line_sequence n {- 186 -} = wrapTokens BeginNode EndNode $ wrapTokens BeginSequence EndSequence $ c_l_block_seq_entry n & ( s_indent n & c_l_block_seq_entry n *) -- 8.2.2 Block Mappings l__block_mapping n = {- 187 -} do m <- detect_collection_indentation n wrapTokens BeginMapping EndMapping $ ( s_indent (n .+ m) & ns_l_block_map_entry (n .+ m) +) ns_l_block_map_entry n {- 188 -} = wrapTokens BeginPair EndPair $ c_l_block_map_explicit_entry n / ns_l_block_map_implicit_entry n c_l_block_map_explicit_entry n {- 189 -} = c_l_block_map_explicit_key n & ( l_block_map_explicit_value n / e_node ) c_l_block_map_explicit_key n {- 190 -} = c_mapping_key & ( ns_char >!) ! DeNode & s_l__block_indented n BlockOut l_block_map_explicit_value n {- 191 -} = s_indent n & c_mapping_value & s_l__block_indented n BlockOut ns_l_block_map_implicit_entry n {- 192 -} = ( ns_s_block_map_implicit_key / e_node ) & c_l_block_map_implicit_value n ns_s_block_map_implicit_key {- 193 -} = c_s_implicit_json_key BlockKey / ns_s_implicit_yaml_key BlockKey c_l_block_map_implicit_value n {- 194 -} = c_mapping_value ! DeNode & ( ( s_l__block_node n BlockOut / ( e_node & ( s_l_comments ?) 
& unparsed (n .+ 1) ) ) `recovery` unparsed (n .+ 1) ) ns_l_in_line_mapping n {- 195 -} = wrapTokens BeginNode EndNode $ wrapTokens BeginMapping EndMapping $ ns_l_block_map_entry n & ( s_indent n & ns_l_block_map_entry n *) -- 8.2.3 Block Nodes unparsed n = ( sol / unparsed_text & unparsed_break ) & ( nonEmpty ( unparsed_indent n & unparsed_text & unparsed_break ) *) unparsed_indent n = token Unparsed ( s_space % n ) unparsed_text = token Unparsed ( upto ( eof / c_forbidden / b_break ) ) unparsed_break = eof / peek c_forbidden / token Unparsed b_break / empty s_l__block_node n c {- 196 -} = s_l__block_in_block n c / s_l__flow_in_block n s_l__flow_in_block n {- 197 -} = s_separate (n .+ 1) FlowOut & ns_flow_node (n .+ 1) FlowOut & s_l_comments s_l__block_in_block n c {- 198 -} = wrapTokens BeginNode EndNode $ ( s_l__block_scalar n c / s_l__block_collection n c ) s_l__block_scalar n c {- 199 -} = s_separate (n .+ 1) c & ( c_ns_properties (n .+ 1) c & s_separate (n .+ 1) c ?) & ( c_l__literal n / c_l__folded n ) s_l__block_collection n c {- 200 -} = ( s_separate (n .+ 1) c & c_ns_properties (n .+ 1) c & ( s_l_comments >?) ?) & s_l_comments & ( l__block_sequence (seq_spaces n c) / l__block_mapping n ) seq_spaces n c {- 201 -} = case c of BlockOut -> n .- 1 BlockIn -> n _ -> error "unexpected node style pattern in seq_spaces" -- 9.1.1 Document Prefix l_document_prefix {- 202 -} = ( c_byte_order_mark ?) & ( nonEmpty l_comment *) -- 9.1.2 Document Markers c_directives_end {- 203 -} = token DirectivesEnd [ '-', '-', '-' ] c_document_end {- 204 -} = token DocumentEnd [ '.', '.', '.' ] l_document_suffix {- 205 -} = c_document_end & s_l_comments c_forbidden {- 206 -} = sol & ( c_directives_end / c_document_end ) & ( b_char / s_white / eof ) -- 9.1.3 Bare Documents l_bare_document {- 207 -} = DeNode ^ s_l__block_node (-1) BlockIn `forbidding` c_forbidden -- 9.1.4 Explicit Documents l_explicit_document {- 208 -} = ( c_directives_end & ( b_char / s_white / eof >?)) !
DeDoc & ( ( l_bare_document / e_node & ( s_l_comments ?) & unparsed 0 ) `recovery` unparsed 0 ) -- 9.1.5 Directives Documents l_directives_document {- 209 -} = ( l_directive +) & l_explicit_document -- 9.2 Streams: l_any_document {- 210 -} = wrapTokens BeginDocument EndDocument $ DeDoc ^ ( l_directives_document / l_explicit_document / l_bare_document ) `recovery` unparsed 0 l_yaml_stream {- 211 -} = ( nonEmpty l_document_prefix *) & ( eof / ( c_document_end & ( b_char / s_white / eof ) >?) / l_any_document ) & ( nonEmpty ( DeMore ^ ( ( l_document_suffix ! DeMore +) & ( nonEmpty l_document_prefix *) & ( eof / l_any_document ) / ( nonEmpty l_document_prefix *) & DeDoc ^ ( wrapTokens BeginDocument EndDocument l_explicit_document ?) ) ) *) HsYAML-0.2.1.4/src/Data/YAML/Token/0000755000000000000000000000000007346545000014364 5ustar0000000000000000HsYAML-0.2.1.4/src/Data/YAML/Token/Encoding.hs0000644000000000000000000002615707346545000016461 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE DeriveGeneric #-} {-# LANGUAGE Safe #-} -- | -- Copyright: © Oren Ben-Kiki 2007, -- © Herbert Valerio Riedel 2015-2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- -- UTF decoding -- -- This really should be factored out to the standard libraries. Since it isn't -- there, we get to tailor it exactly to our needs. We use lazy byte strings as -- input, which should give reasonable I\/O performance when reading large -- files. The output is a normal 'Char' list which is easy to work with and -- should be efficient enough as long as the 'Data.YAML.Token.Parser' does its job right. -- module Data.YAML.Token.Encoding ( decode , Encoding(..) 
) where import qualified Data.ByteString.Lazy as BL import qualified Data.ByteString.Lazy.Char8 as BLC import Util -- | Denotes the /Unicode Transformation Format/ (UTF) used for serializing the YAML document data Encoding = UTF8 -- ^ UTF-8 encoding (or ASCII) | UTF16LE -- ^ UTF-16 little endian | UTF16BE -- ^ UTF-16 big endian | UTF32LE -- ^ UTF-32 little endian | UTF32BE -- ^ UTF-32 big endian deriving (Eq,Generic) -- | @show encoding@ converts an 'Encoding' to the standard encoding name (with a "-") -- as used by most programs. instance Show Encoding where show UTF8 = "UTF-8" show UTF16LE = "UTF-16LE" show UTF16BE = "UTF-16BE" show UTF32LE = "UTF-32LE" show UTF32BE = "UTF-32BE" -- | @since 0.2.0 instance NFData Encoding where rnf !_ = () -- | @'decode' bytes@ automatically detects the 'Encoding' used and converts the -- /bytes/ to Unicode characters, with byte offsets. Note that each offset points -- past the end of its character, not at its beginning. decode :: BLC.ByteString -> (Encoding, [(Int, Char)]) decode text = (encoding, undoEncoding encoding text) where encoding = detectEncoding $ BL.unpack $ BL.take 4 text -- | @'detectEncoding' text@ examines the first few bytes of the /text/ -- to deduce the Unicode encoding used, according to the YAML spec. detectEncoding :: [Word8] -> Encoding detectEncoding text = case text of 0x00 : 0x00 : 0xFE : 0xFF : _ -> UTF32BE 0x00 : 0x00 : 0x00 : _ : _ -> UTF32BE 0xFF : 0xFE : 0x00 : 0x00 : _ -> UTF32LE _ : 0x00 : 0x00 : 0x00 : _ -> UTF32LE 0xFE : 0xFF : _ -> UTF16BE 0x00 : _ : _ -> UTF16BE 0xFF : 0xFE : _ -> UTF16LE _ : 0x00 : _ -> UTF16LE 0xEF : 0xBB : 0xBF : _ -> UTF8 _ -> UTF8 -- | @undoEncoding encoding bytes@ converts a /bytes/ stream to Unicode -- characters according to the /encoding/.
undoEncoding :: Encoding -> BLC.ByteString -> [(Int, Char)] undoEncoding encoding bytes = case encoding of UTF8 -> undoUTF8 bytes 0 UTF16LE -> combinePairs $ undoUTF16LE bytes 0 UTF16BE -> combinePairs $ undoUTF16BE bytes 0 UTF32LE -> validateScalars $ undoUTF32LE bytes 0 UTF32BE -> validateScalars $ undoUTF32BE bytes 0 where validateScalars [] = [] validateScalars (x@(_,c):rest) | '\xD800' <= c, c <= '\xDFFF' = error "UTF-32 stream contains invalid surrogate code-point" | otherwise = x : validateScalars rest -- ** UTF-32 decoding -- | @hasFewerThan n bytes@ checks whether there are fewer than /n/ /bytes/ -- left to read. hasFewerThan :: Int -> BLC.ByteString -> Bool hasFewerThan n bytes | n == 1 = BLC.null bytes | n > 1 = BLC.null bytes || hasFewerThan (n - 1) (BLC.tail bytes) | otherwise = False -- | @undoUTF32LE bytes offset@ decodes a UTF-32LE /bytes/ stream to Unicode -- chars. undoUTF32LE :: BLC.ByteString -> Int -> [(Int, Char)] undoUTF32LE bytes offset | BLC.null bytes = [] | hasFewerThan 4 bytes = error "UTF-32LE input contains invalid number of bytes" | otherwise = let first = BLC.head bytes bytes' = BLC.tail bytes second = BLC.head bytes' bytes'' = BLC.tail bytes' third = BLC.head bytes'' bytes''' = BLC.tail bytes'' fourth = BLC.head bytes''' rest = BLC.tail bytes''' in (offset + 4, chr $ ord first + 256 * (ord second + 256 * (ord third + 256 * ord fourth))):(undoUTF32LE rest $ offset + 4) -- | @undoUTF32BE bytes offset@ decodes a UTF-32BE /bytes/ stream to Unicode -- chars.
undoUTF32BE :: BLC.ByteString -> Int -> [(Int, Char)] undoUTF32BE bytes offset | BLC.null bytes = [] | hasFewerThan 4 bytes = error "UTF-32BE input contains invalid number of bytes" | otherwise = let first = BLC.head bytes bytes' = BLC.tail bytes second = BLC.head bytes' bytes'' = BLC.tail bytes' third = BLC.head bytes'' bytes''' = BLC.tail bytes'' fourth = BLC.head bytes''' rest = BLC.tail bytes''' in (offset + 4, chr $ ord fourth + 256 * (ord third + 256 * (ord second + 256 * ord first))):(undoUTF32BE rest $ offset + 4) -- ** UTF-16 decoding -- | @combinePairs chars@ converts each pair of UTF-16 surrogate characters to a -- single Unicode character. combinePairs :: [(Int, Char)] -> [(Int, Char)] combinePairs [] = [] combinePairs (head'@(_, head_char):tail') | '\xD800' <= head_char && head_char <= '\xDBFF' = combineLead head' tail' | '\xDC00' <= head_char && head_char <= '\xDFFF' = error "UTF-16 contains trail surrogate without lead surrogate" | otherwise = head':combinePairs tail' -- | @combineLead lead rest@ combines the /lead/ surrogate with the head of the -- /rest/ of the input chars, assumed to be a /trail/ surrogate, and continues -- combining surrogate pairs. combineLead :: (Int, Char) -> [(Int, Char)] -> [(Int, Char)] combineLead _lead [] = error "UTF-16 contains lead surrogate as final character" combineLead (_, lead_char) ((trail_offset, trail_char):rest) | '\xDC00' <= trail_char && trail_char <= '\xDFFF' = (trail_offset, combineSurrogates lead_char trail_char):combinePairs rest | otherwise = error "UTF-16 contains lead surrogate without trail surrogate" -- | @surrogateOffset@ is copied from the Unicode FAQs. surrogateOffset :: Int surrogateOffset = 0x10000 - (0xD800 * 1024) - 0xDC00 -- | @combineSurrogates lead trail@ combines two UTF-16 surrogates into a single -- Unicode character. 
combineSurrogates :: Char -> Char -> Char combineSurrogates lead trail = chr $ ord lead * 1024 + ord trail + surrogateOffset -- | @undoUTF16LE bytes offset@ decodes a UTF-16LE /bytes/ stream to Unicode -- chars. undoUTF16LE :: BLC.ByteString -> Int -> [(Int, Char)] undoUTF16LE bytes offset | BLC.null bytes = [] | hasFewerThan 2 bytes = error "UTF-16LE input contains odd number of bytes" | otherwise = let low = BLC.head bytes bytes' = BLC.tail bytes high = BLC.head bytes' rest = BLC.tail bytes' in (offset + 2, chr $ ord low + ord high * 256):(undoUTF16LE rest $ offset + 2) -- | @undoUTF16BE bytes offset@ decodes a UTF-16BE /bytes/ stream to Unicode -- chars. undoUTF16BE :: BLC.ByteString -> Int -> [(Int, Char)] undoUTF16BE bytes offset | BLC.null bytes = [] | hasFewerThan 2 bytes = error "UTF-16BE input contains odd number of bytes" | otherwise = let high = BLC.head bytes bytes' = BLC.tail bytes low = BLC.head bytes' rest = BLC.tail bytes' in (offset + 2, chr $ ord low + ord high * 256):(undoUTF16BE rest $ offset + 2) -- ** UTF-8 decoding -- | @undoUTF8 bytes offset@ decodes a UTF-8 /bytes/ stream to Unicode chars. undoUTF8 :: BLC.ByteString -> Int -> [(Int, Char)] undoUTF8 bytes = undoUTF8' (BL.unpack bytes) w2c :: Word8 -> Char w2c = chr .
fromIntegral w2i :: Word8 -> Int w2i = fromIntegral undoUTF8' :: [Word8] -> Int -> [(Int, Char)] undoUTF8' [] _ = [] undoUTF8' (first:rest) !offset | first < 0x80 = (offset', c) : undoUTF8' rest offset' where !offset' = offset + 1 !c = w2c first undoUTF8' (first:rest) !offset | first < 0xC0 = error "UTF-8 input contains invalid first byte" | first < 0xE0 = decodeTwoUTF8 first offset rest | first < 0xF0 = decodeThreeUTF8 first offset rest | first < 0xF8 = decodeFourUTF8 first offset rest | otherwise = error "UTF-8 input contains invalid first byte" -- | @decodeTwoUTF8 first offset bytes@ decodes a two-byte UTF-8 character, -- where the /first/ byte is already available and the second is the head of -- the /bytes/, and then continues to undo the UTF-8 encoding. decodeTwoUTF8 :: Word8 -> Int -> [Word8] -> [(Int, Char)] decodeTwoUTF8 first offset (second:rest) | second < 0x80 || 0xBF < second = error "UTF-8 double byte char has invalid second byte" | otherwise = (offset', c) : undoUTF8' rest offset' where !offset' = offset + 2 !c = chr ((w2i first - 0xc0) * 0x40 + (w2i second - 0x80)) decodeTwoUTF8 _ _ [] = error "UTF-8 double byte char is missing second byte at eof" -- | @decodeThreeUTF8 first offset bytes@ decodes a three-byte UTF-8 character, -- where the /first/ byte is already available and the second and third are the -- head of the /bytes/, and then continues to undo the UTF-8 encoding. 
decodeThreeUTF8 :: Word8 -> Int -> [Word8] -> [(Int, Char)] decodeThreeUTF8 first offset (second:third:rest) | second < 0x80 || 0xBF < second = error "UTF-8 triple byte char has invalid second byte" | third < 0x80 || 0xBF < third = error "UTF-8 triple byte char has invalid third byte" | otherwise = (offset', c): undoUTF8' rest offset' where !offset' = offset + 3 !c = chr((w2i first - 0xE0) * 0x1000 + (w2i second - 0x80) * 0x40 + (w2i third - 0x80)) decodeThreeUTF8 _ _ _ = error "UTF-8 triple byte char is missing bytes at eof" -- | @decodeFourUTF8 first offset bytes@ decodes a four-byte UTF-8 character, -- where the /first/ byte is already available and the second, third and fourth -- are the head of the /bytes/, and then continues to undo the UTF-8 encoding. decodeFourUTF8 :: Word8 -> Int -> [Word8] -> [(Int, Char)] decodeFourUTF8 first offset (second:third:fourth:rest) | second < 0x80 || 0xBF < second = error "UTF-8 quad byte char has invalid second byte" | third < 0x80 || 0xBF < third = error "UTF-8 quad byte char has invalid third byte" | fourth < 0x80 || 0xBF < fourth = error "UTF-8 quad byte char has invalid fourth byte" | otherwise = (offset', c) : undoUTF8' rest offset' where !offset' = offset + 4 !c = chr((w2i first - 0xF0) * 0x40000 + (w2i second - 0x80) * 0x1000 + (w2i third - 0x80) * 0x40 + (w2i fourth - 0x80)) decodeFourUTF8 _ _ _ = error "UTF-8 quad byte char is missing bytes at eof" HsYAML-0.2.1.4/src/0000755000000000000000000000000007346545000011671 5ustar0000000000000000HsYAML-0.2.1.4/src/Util.hs0000644000000000000000000001045407346545000013146 0ustar0000000000000000{-# LANGUAGE BangPatterns #-} {-# LANGUAGE CPP #-} {-# LANGUAGE Safe #-} {-# LANGUAGE ScopedTypeVariables #-} {-# OPTIONS_GHC -fno-warn-unused-imports #-} -- | -- Copyright: © Herbert Valerio Riedel 2015-2018 -- SPDX-License-Identifier: GPL-2.0-or-later -- module Util ( liftEither' , readMaybe , readEither , fromIntegerMaybe , (<>) , mapFromListNoDupes , mapInsertNoDupe , bsToStrict ,
module X ) where import Control.Applicative as X import Control.DeepSeq as X (NFData (rnf)) import Control.Monad as X import Data.Functor as X import Data.Int as X import Data.Word as X import GHC.Generics as X (Generic) import Numeric.Natural as X (Natural) import Control.Monad.Fix as X (MonadFix) import Control.Monad.Except as X (MonadError (..)) import Control.Monad.Identity as X import Control.Monad.Trans.Except as X (ExceptT (..), runExceptT) import Data.Char as X (chr, ord) import Data.Map as X (Map) import qualified Data.Map as Map import Data.Monoid as X (Monoid (mappend, mempty)) #if MIN_VERSION_base(4,9,0) import Data.Semigroup ((<>)) #else import Data.Monoid ((<>)) #endif import qualified Data.ByteString as BS import qualified Data.ByteString.Lazy as BS.L import Data.Set as X (Set) import Data.Text as X (Text) import Text.ParserCombinators.ReadP as P import Text.Read -- GHC 8.4.1 shipped with a phony `mtl-2.2.2` and so we have no -- bulletproof way to know whether `Control.Monad.Except` exports liftEither; -- after NixOS managed to break an otherwise effective workaround, -- I'll just throw my hands up in the air and consider -- `Control.Monad.Except.liftEither` scorched earth for now. liftEither' :: MonadError e m => Either e a -> m a liftEither' = either throwError return #if !MIN_VERSION_base(4,6,0) -- | Parse a string using the 'Read' instance. Succeeds if there is -- exactly one valid result. readMaybe :: Read a => String -> Maybe a readMaybe = either (const Nothing) Just . readEither -- | Parse a string using the 'Read' instance. Succeeds if there is -- exactly one valid result. A 'Left' value indicates a parse error.
readEither :: Read a => String -> Either String a readEither s = case [ x | (x,"") <- readPrec_to_S read' minPrec s ] of [x] -> Right x [] -> Left "Prelude.read: no parse" _ -> Left "Prelude.read: ambiguous parse" where read' = do x <- readPrec Text.Read.lift P.skipSpaces return x #endif -- | Succeeds if the 'Integral' value is within the bounds of the given data type; -- 'Nothing' indicates that the value is outside the bounds. fromIntegerMaybe :: forall n . (Integral n, Bounded n) => Integer -> Maybe n fromIntegerMaybe j | l <= j, j <= u = Just (fromInteger j) | otherwise = Nothing where u = toInteger (maxBound :: n) l = toInteger (minBound :: n) -- | A convenience wrapper over 'mapInsertNoDupe' mapFromListNoDupes :: Ord k => [(k,a)] -> Either (k,a) (Map k a) mapFromListNoDupes = go mempty where go !m [] = Right m go !m ((k,!v):rest) = case mapInsertNoDupe k v m of Nothing -> Left (k,v) Just m' -> go m' rest -- | A convenience wrapper over 'Data.Map.insertLookupWithKey' mapInsertNoDupe :: Ord k => k -> a -> Map k a -> Maybe (Map k a) mapInsertNoDupe kx x t = case Map.insertLookupWithKey (\_ a _ -> a) kx x t of (Nothing, m) -> Just m (Just _, _) -> Nothing -- | Equivalent to the function 'Data.ByteString.toStrict'. -- O(n) Convert a lazy 'BS.L.ByteString' into a strict 'BS.ByteString'. {-# INLINE bsToStrict #-} bsToStrict :: BS.L.ByteString -> BS.ByteString #if MIN_VERSION_bytestring(0,10,0) bsToStrict = BS.L.toStrict #else bsToStrict = BS.concat .
BS.L.toChunks #endif HsYAML-0.2.1.4/tests/0000755000000000000000000000000007346545000012244 5ustar0000000000000000HsYAML-0.2.1.4/tests/Tests.hs0000644000000000000000000000623507346545000013710 0ustar0000000000000000{-# LANGUAGE OverloadedStrings #-} {-# LANGUAGE RecordWildCards #-} import Control.Monad import Control.Applicative import Data.YAML as Y import qualified Data.Text as T import qualified Data.Map as Map import qualified Data.ByteString.Lazy.Char8 as BS.L import Test.Tasty (defaultMain, TestTree, testGroup) import Test.Tasty.QuickCheck (testProperty,Arbitrary(..)) outputStr :: ToYAML a => a -> BS.L.ByteString outputStr a = BS.L.init (encode1 a) -- TODO: remove trailing newline from Writer.hs roundTripInt :: Int -> Bool roundTripInt i = BS.L.pack (show i) == outputStr i roundTripBool :: Bool -> Bool roundTripBool b | b = "true" == outputStr b | otherwise = "false" == outputStr b roundTripDouble :: Double -> Double -> Bool roundTripDouble num denom | d /= d = ".nan" == outputStr d | d == (1/0) = ".inf" == outputStr d | d == (-1/0) = "-.inf" == outputStr d | otherwise = BS.L.pack (show d) == outputStr d where d = num / denom roundTrip :: (Eq a, FromYAML a, ToYAML a) => (a -> a -> Bool) -> a -> a -> Bool roundTrip eq _ v = case decode1 (encode1 v) :: (FromYAML a) => (Either (Pos, String) a) of Left _ -> False Right ans -> ans `eq` v approxEq :: Double -> Double -> Bool approxEq a b = a == b || d < maxAbsoluteError || d / max (abs b) (abs a) <= maxRelativeError where d = abs (a - b) maxAbsoluteError = 1e-15 maxRelativeError = 1e-15 roundTripEq :: (Eq a, FromYAML a, ToYAML a) => a -> a -> Bool roundTripEq x y = roundTrip (==) x y main :: IO () main = defaultMain (testGroup "tests" tests) tests :: [TestTree] tests = [ testGroup "encode" [ testProperty "encodeInt" roundTripInt , testProperty "encodeBool" roundTripBool , testProperty "encodeDouble" roundTripDouble ] , testGroup "roundTrip" [ testProperty "Bool" $ roundTripEq True , testProperty "Double" $ 
roundTrip approxEq (1::Double) , testProperty "Int" $ roundTripEq (1::Int) , testProperty "Integer" $ roundTripEq (1::Integer) , testProperty "Text" $ roundTripEq T.empty , testProperty "Seq" $ roundTripEq ([""]:: [T.Text]) , testProperty "Map" $ roundTripEq (undefined :: Map.Map T.Text T.Text) , testProperty "Foo" $ roundTripEq (undefined :: Foo) ] ] instance Arbitrary T.Text where arbitrary = T.pack <$> arbitrary data Foo = Foo { fooBool :: Bool , fooInt :: Int , fooTuple :: (T.Text, Int) , fooSeq :: [T.Text] , fooMap :: Map.Map T.Text T.Text } deriving (Show,Eq) instance ToYAML Foo where toYAML Foo{..} = mapping [ "fooBool" .= fooBool , "fooInt" .= fooInt , "fooTuple" .= fooTuple , "fooSeq" .= fooSeq , "fooMap" .= fooMap ] instance FromYAML Foo where parseYAML = withMap "Foo" $ \m -> Foo <$> m .: "fooBool" <*> m .: "fooInt" <*> m .: "fooTuple" <*> m .: "fooSeq" <*> m .: "fooMap" instance Arbitrary Foo where arbitrary = liftM5 Foo arbitrary arbitrary arbitrary arbitrary arbitrary
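The surrogate-pair arithmetic used by `combinePairs`/`combineSurrogates` in `Data.YAML.Token.Encoding` is compact enough to verify in isolation. Below is a minimal, self-contained sketch that copies the two definitions and decodes the surrogate pair U+D83D/U+DE00 into U+1F600; the `main` driver is illustrative only and not part of the package:

```haskell
import Data.Char (chr, ord)

-- Offset constant from the Unicode FAQ: adding it undoes the bias that the
-- surrogate encoding builds into the lead (U+D800..U+DBFF) and trail
-- (U+DC00..U+DFFF) ranges.
surrogateOffset :: Int
surrogateOffset = 0x10000 - (0xD800 * 1024) - 0xDC00

-- Combine a lead surrogate and a trail surrogate into the supplementary-plane
-- code point they encode: lead contributes its low 10 bits shifted left by 10
-- (the * 1024), trail its low 10 bits, plus the 0x10000 base.
combineSurrogates :: Char -> Char -> Char
combineSurrogates lead trail = chr (ord lead * 1024 + ord trail + surrogateOffset)

main :: IO ()
main = print (fromEnum (combineSurrogates '\xD83D' '\xDE00'))
```

Working through the example: (0xD83D − 0xD800) · 1024 + (0xDE00 − 0xDC00) + 0x10000 = 62464 + 512 + 65536 = 128512 = 0x1F600, the emoji U+1F600 that the pair encodes in UTF-16.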