pax_global_header00006660000000000000000000000064144737156760014536gustar00rootroot0000000000000052 comment=f46a8f2cd89070d043e01b20e6b34d155854ab6f pprint-20230830/000077500000000000000000000000001447371567600133155ustar00rootroot00000000000000pprint-20230830/.exclude000066400000000000000000000000471447371567600147500ustar00rootroot00000000000000OldPPrintEngine.* size.* PPrintBench.* pprint-20230830/.gitignore000066400000000000000000000001121447371567600152770ustar00rootroot00000000000000*~ _build blog/billet.html .merlin pprint.install dune-workspace.versions pprint-20230830/AUTHORS.md000066400000000000000000000002341447371567600147630ustar00rootroot00000000000000PPrint was written by François Pottier and Nicolas Pouillard, with contributions by Yann Régis-Gianas, Gabriel Scherer, Jonathan Protzenko, Thomas Refis. pprint-20230830/CHANGES.md000066400000000000000000000071071447371567600147140ustar00rootroot00000000000000# Changes ## 2023/08/30 * The new function `is_empty` allows testing (in constant time) whether a document is empty. * Documentation: add a warning about the time and space complexity of a naive use of `ifflat`. * The library now requires OCaml 4.03 or newer. * Add a new micro-benchmark, which uses `core_bench` and involves randomly-generated arithmetic expressions. ## 2022/01/03 * Improved documentation. (Initial proposal by Thomas Refis, extended by François Pottier.) * The internal modules `PPrintEngine`, `PPrintCombinators`, `PPrintRenderer`, and `PPrintOCaml` have been removed. (Their existence was an implementation detail.) Please refer to `PPrint`, `PPrint`, `PPrint`, and `PPrint.OCaml` instead. ## 2021/11/29 * Trailing blank characters at the end of a line are now suppressed. This includes indentation characters (whose production is implicit) as well as blank characters that are explicitly produced by the combinators `space` and `blank`. Trailing blank characters are suppressed in both rendering modes (pretty and compact). (Contributed by Thomas Refis, reviewed and polished by François Pottier.) * New function `PPrint.OCaml.unit`. ## 2020/04/10 * New function `PPrint.utf8format`. ## 2020/03/16 * New functions `PPrint.OCaml.flowing_list` and `PPrint.OCaml.flowing_array`. ## 2020/02/26 * Change the behavior of `PPrint.ToFormatter` to use `Format.pp_print_text` internally. This means that a newline character causes a call to `Format.pp_force_newline`; a space character causes a call to `Format.pp_print_space`; and every other character is printed using `Format.pp_print_char`. * Switch to `dune`. * Avoid a few compilation warnings. ## 2018/05/23 * Add a `line` field to the `state` record, which can be read by the code that implements custom documents. Add a `range` combinator that allows retrieving the start and end points of a (sub)document in the output. (Suggested by Victor Gomes.) ## 2017/10/03 * Update the code and build options to use `-safe-string`. This means that the library now requires OCaml 4.02 or later, and is compatible with 4.06. ## 2015/03/16 * Moved to github and changed the license to LGPL with an exception. ## 2014/04/25 * Minor changes in the implementation of `string` and `substring`. Initially committed on 2014/03/24, but left out of the 20140424 release due to a goof-up. ## 2014/04/11 * Changed the behavior of `align`, which was not consistent with its documentation. `align` now sets the indentation level to the current column. In particular, this means that `align (align d)` is equivalent to `align d`, which was not the case previously. Thanks to Dmitry Grebeniuk for reporting this issue. ## 2014/04/03 * The library is now extensible (in principle). A `custom` document constructor allows the user to define her own documents, as long as they fit the manner in which the current rendering engine works. * The `compact` rendering engine is now tail-recursive too. ## 2014/03/21 * Minor optimisation in the smart constructor `group`. ## 2014/03/13 * New (simpler) pretty-printing engine. The representation of documents in memory is slightly larger; document construction is perhaps slightly slower, while rendering is significantly faster. (Construction dominates rendering.) The rendering speed is now guaranteed to be independent of the width parameter. The price to pay for this simplification is that the primitive document constructors `column` and `nesting` are no longer supported. The API is otherwise unchanged. ## 2013/01/31 * First official release of PPrint. pprint-20230830/LICENSE000066400000000000000000000636661447371567600143430ustar00rootroot00000000000000In the following, "the Library" refers to the OCaml source files that form the PPrint library. The names of these files match the pattern PPrint*.{ml,mli}. The Library is distributed under the terms of the GNU Library General Public License version 2 (included below). As a special exception to the GNU Library General Public License, you may link, statically or dynamically, a "work that uses the Library" with a publicly distributed version of the Library to produce an executable file containing portions of the Library, and distribute that executable file under terms of your choice, without any of the additional requirements listed in clause 6 of the GNU Library General Public License. By "a publicly distributed version of the Library", we mean either the unmodified Library as distributed by INRIA, or a modified version of the Library that is distributed under the conditions defined in clause 2 of the GNU Library General Public License. This exception does not however invalidate any other reasons why the executable file might be covered by the GNU Library General Public License. ---------------------------------------------------------------------- GNU LIBRARY GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1991 Free Software Foundation, Inc. 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. [This is the first released version of the library GPL. It is numbered 2 because it goes with version 2 of the ordinary GPL.] Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public Licenses are intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This license, the Library General Public License, applies to some specially designated Free Software Foundation software, and to any other libraries whose authors decide to use it. You can use it for your libraries, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the library, or if you modify it. For example, if you distribute copies of the library, whether gratis or for a fee, you must give the recipients all the rights that we gave you. You must make sure that they, too, receive or can get the source code. If you link a program with the library, you must provide complete object files to the recipients so that they can relink them with the library, after making changes to the library and recompiling it. And you must show them these terms so they know their rights. Our method of protecting your rights has two steps: (1) copyright the library, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the library. Also, for each distributor's protection, we want to make certain that everyone understands that there is no warranty for this free library. If the library is modified by someone else and passed on, we want its recipients to know that what they have is not the original version, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that companies distributing free software will individually obtain patent licenses, thus in effect transforming the program into proprietary software. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. Most GNU software, including some libraries, is covered by the ordinary GNU General Public License, which was designed for utility programs. This license, the GNU Library General Public License, applies to certain designated libraries. This license is quite different from the ordinary one; be sure to read it in full, and don't assume that anything in it is the same as in the ordinary license. The reason we have a separate public license for some libraries is that they blur the distinction we usually make between modifying or adding to a program and simply using it. Linking a program with a library, without changing the library, is in some sense simply using the library, and is analogous to running a utility program or application program. However, in a textual and legal sense, the linked executable is a combined work, a derivative of the original library, and the ordinary General Public License treats it as such. Because of this blurred distinction, using the ordinary General Public License for libraries did not effectively promote software sharing, because most developers did not use the libraries. We concluded that weaker conditions might promote sharing better. However, unrestricted linking of non-free programs would deprive the users of those programs of all benefit from the free status of the libraries themselves. This Library General Public License is intended to permit developers of non-free programs to use free libraries, while preserving your freedom as a user of such programs to change the free libraries that are incorporated in them. (We have not seen how to achieve this as regards changes in header files, but we have achieved it as regards changes in the actual functions of the Library.) The hope is that this will lead to faster development of free libraries. The precise terms and conditions for copying, distribution and modification follow. Pay close attention to the difference between a "work based on the library" and a "work that uses the library". The former contains code derived from the library, while the latter only works together with the library. Note that it is possible for a library to be covered by the ordinary General Public License rather than by this special one. GNU LIBRARY GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License Agreement applies to any software library which contains a notice placed by the copyright holder or other authorized party saying it may be distributed under the terms of this Library General Public License (also called "this License"). Each licensee is addressed as "you". A "library" means a collection of software functions and/or data prepared so as to be conveniently linked with application programs (which use some of those functions and data) to form executables. The "Library", below, refers to any such software library or work which has been distributed under these terms. A "work based on the Library" means either the Library or any derivative work under copyright law: that is to say, a work containing the Library or a portion of it, either verbatim or with modifications and/or translated straightforwardly into another language. (Hereinafter, translation is included without limitation in the term "modification".) "Source code" for a work means the preferred form of the work for making modifications to it. For a library, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the library. Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running a program using the Library is not restricted, and output from such a program is covered only if its contents constitute a work based on the Library (independent of the use of the Library in a tool for writing it). Whether that is true depends on what the Library does and what the program that uses the Library does. 1. You may copy and distribute verbatim copies of the Library's complete source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and distribute a copy of this License along with the Library. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Library or any portion of it, thus forming a work based on the Library, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) The modified work must itself be a software library. b) You must cause the files modified to carry prominent notices stating that you changed the files and the date of any change. c) You must cause the whole of the work to be licensed at no charge to all third parties under the terms of this License. d) If a facility in the modified Library refers to a function or a table of data to be supplied by an application program that uses the facility, other than as an argument passed when the facility is invoked, then you must make a good faith effort to ensure that, in the event an application does not supply such function or table, the facility still operates, and performs whatever part of its purpose remains meaningful. (For example, a function in a library to compute square roots has a purpose that is entirely well-defined independent of the application. Therefore, Subsection 2d requires that any application-supplied function or table used by this function must be optional: if the application does not supply it, the square root function must still compute square roots.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Library, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Library, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Library. In addition, mere aggregation of another work not based on the Library with the Library (or with a work based on the Library) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. You may opt to apply the terms of the ordinary GNU General Public License instead of this License to a given copy of the Library. To do this, you must alter all the notices that refer to this License, so that they refer to the ordinary GNU General Public License, version 2, instead of to this License. (If a newer version than version 2 of the ordinary GNU General Public License has appeared, then you can specify that version instead if you wish.) Do not make any other change in these notices. Once this change is made in a given copy, it is irreversible for that copy, so the ordinary GNU General Public License applies to all subsequent copies and derivative works made from that copy. This option is useful when you wish to copy part of the code of the Library into a program that is not a library. 4. You may copy and distribute the Library (or a portion or derivative of it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange. If distribution of object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place satisfies the requirement to distribute the source code, even though third parties are not compelled to copy the source along with the object code. 5. A program that contains no derivative of any portion of the Library, but is designed to work with the Library by being compiled or linked with it, is called a "work that uses the Library". Such a work, in isolation, is not a derivative work of the Library, and therefore falls outside the scope of this License. However, linking a "work that uses the Library" with the Library creates an executable that is a derivative of the Library (because it contains portions of the Library), rather than a "work that uses the library". The executable is therefore covered by this License. Section 6 states terms for distribution of such executables. When a "work that uses the Library" uses material from a header file that is part of the Library, the object code for the work may be a derivative work of the Library even though the source code is not. Whether this is true is especially significant if the work can be linked without the Library, or if the work is itself a library. The threshold for this to be true is not precisely defined by law. If such an object file uses only numerical parameters, data structure layouts and accessors, and small macros and small inline functions (ten lines or less in length), then the use of the object file is unrestricted, regardless of whether it is legally a derivative work. (Executables containing this object code plus portions of the Library will still fall under Section 6.) Otherwise, if the work is a derivative of the Library, you may distribute the object code for the work under the terms of Section 6. Any executables containing that work also fall under Section 6, whether or not they are linked directly with the Library itself. 6. As an exception to the Sections above, you may also compile or link a "work that uses the Library" with the Library to produce a work containing portions of the Library, and distribute that work under terms of your choice, provided that the terms permit modification of the work for the customer's own use and reverse engineering for debugging such modifications. You must give prominent notice with each copy of the work that the Library is used in it and that the Library and its use are covered by this License. You must supply a copy of this License. If the work during execution displays copyright notices, you must include the copyright notice for the Library among them, as well as a reference directing the user to the copy of this License. Also, you must do one of these things: a) Accompany the work with the complete corresponding machine-readable source code for the Library including whatever changes were used in the work (which must be distributed under Sections 1 and 2 above); and, if the work is an executable linked with the Library, with the complete machine-readable "work that uses the Library", as object code and/or source code, so that the user can modify the Library and then relink to produce a modified executable containing the modified Library. (It is understood that the user who changes the contents of definitions files in the Library will not necessarily be able to recompile the application to use the modified definitions.) b) Accompany the work with a written offer, valid for at least three years, to give the same user the materials specified in Subsection 6a, above, for a charge no more than the cost of performing this distribution. c) If distribution of the work is made by offering access to copy from a designated place, offer equivalent access to copy the above specified materials from the same place. d) Verify that the user has already received a copy of these materials or that you have already sent this user a copy. For an executable, the required form of the "work that uses the Library" must include any data and utility programs needed for reproducing the executable from it. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. It may happen that this requirement contradicts the license restrictions of other proprietary libraries that do not normally accompany the operating system. Such a contradiction means you cannot use both them and the Library together in an executable that you distribute. 7. You may place library facilities that are a work based on the Library side-by-side in a single library together with other library facilities not covered by this License, and distribute such a combined library, provided that the separate distribution of the work based on the Library and of the other library facilities is otherwise permitted, and provided that you do these two things: a) Accompany the combined library with a copy of the same work based on the Library, uncombined with any other library facilities. This must be distributed under the terms of the Sections above. b) Give prominent notice with the combined library of the fact that part of it is a work based on the Library, and explaining where to find the accompanying uncombined form of the same work. 8. You may not copy, modify, sublicense, link with, or distribute the Library except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense, link with, or distribute the Library is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 9. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Library or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Library (or any work based on the Library), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Library or works based on it. 10. Each time you redistribute the Library (or any work based on the Library), the recipient automatically receives a license from the original licensor to copy, distribute, link with or modify the Library subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 11. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Library at all. For example, if a patent license would not permit royalty-free redistribution of the Library by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Library. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply, and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 12. If the distribution and/or use of the Library is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Library under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 13. The Free Software Foundation may publish revised and/or new versions of the Library General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Library specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Library does not specify a license version number, you may choose any version ever published by the Free Software Foundation. 14. If you wish to incorporate parts of the Library into other free programs whose distribution conditions are incompatible with these, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS Appendix: How to Apply These Terms to Your New Libraries If you develop a new library, and you want it to be of the greatest possible use to the public, we recommend making it free software that everyone can redistribute and change. You can do so by permitting redistribution under these terms (or, alternatively, under the terms of the ordinary General Public License). To apply these terms, attach the following notices to the library. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This library is free software; you can redistribute it and/or modify it under the terms of the GNU Library General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Library General Public License for more details. You should have received a copy of the GNU Library General Public License along with this library; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA Also add information on how to contact you by electronic and paper mail. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the library, if necessary. Here is a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the library `Frob' (a library for tweaking knobs) written by James Random Hacker. , 1 April 1990 Ty Coon, President of Vice That's all there is to it! pprint-20230830/Makefile000066400000000000000000000105411447371567600147560ustar00rootroot00000000000000# ------------------------------------------------------------------------------ # The version number is automatically set to the current date, # unless DATE is defined on the command line. DATE := $(shell /bin/date +%Y%m%d) # The project's name. THIS := pprint # The archive's URL (https). ARCHIVE := https://github.com/fpottier/$(THIS)/archive/$(DATE).tar.gz # ------------------------------------------------------------------------------ .PHONY: all all: @ dune build @all .PHONY: clean clean: @ git clean -fdX .PHONY: test test: @ dune exec test/PPrintTest.exe .PHONY: bench bench: @ make -C benchmark_old $@ @ make -C benchmark_new $@ .PHONY: install install: all @ dune install -p $(THIS) .PHONY: uninstall uninstall: @ ocamlfind remove $(THIS) || true .PHONY: reinstall reinstall: uninstall @ make install .PHONY: show show: reinstall @ echo "#require \"pprint\";;\n#show PPrint;;" | ocaml .PHONY: pin pin: @ opam pin add $(THIS) . --yes .PHONY: unpin unpin: @ opam pin remove $(THIS) --yes # ------------------------------------------------------------------------------ # Documentation. DOCDIR = _build/default/_doc/_html DOC = $(DOCDIR)/index.html .PHONY: doc doc: @ rm -rf _build/default/_doc @ dune clean @ dune build @doc @ echo "You can view the documentation by typing 'make view'". .PHONY: view view: doc @ echo Attempting to open $(DOC)... @ if command -v firefox > /dev/null ; then \ firefox $(DOC) ; \ else \ open -a /Applications/Firefox.app/ $(DOC) ; \ fi .PHONY: export export: doc ssh yquem.inria.fr rm -rf public_html/$(THIS)/doc scp -r $(DOCDIR) yquem.inria.fr:public_html/$(THIS)/doc # ------------------------------------------------------------------------------ # [make versions] compiles the package under many versions of OCaml, # whose list is specified below. # This requires appropriate opam switches to exist. A missing switch # can be created like this: # opam switch create 4.03.0 VERSIONS := \ 4.03.0 \ 4.04.2 \ 4.05.0 \ 4.06.1 \ 4.07.1 \ 4.08.1 \ 4.09.1 \ 4.09.0+bytecode-only \ 4.10.0 \ 4.11.1 \ 4.12.0 \ 4.13.0 \ 4.14.1 \ 5.0.0 \ .PHONY: versions versions: @(echo "(lang dune 2.0)" && \ for v in $(VERSIONS) ; do \ echo "(context (opam (switch $$v)))" ; \ done) > dune-workspace.versions @ dune build --workspace dune-workspace.versions # ------------------------------------------------------------------------------ # [make headache] updates the headers. HEADACHE := headache HEADER := header .PHONY: headache headache: @ for f in {src,benchmark_old}/*.{ml,mli} benchmark_new/*.ml ; do \ $(HEADACHE) -h $(HEADER) $$f ; \ done # ------------------------------------------------------------------------- # Publishing a release. .PHONY: release release: # Make sure the current version can be compiled and installed. @ make uninstall @ make clean @ make install # Check the current package description. @ opam lint # Check if everything has been committed. @ if [ -n "$$(git status --porcelain)" ] ; then \ echo "Error: there remain uncommitted changes." ; \ git status ; \ exit 1 ; \ else \ echo "Now making a release..." ; \ fi # Create a git tag. @ git tag -a $(DATE) -m "Release $(DATE)." # Upload. (This automatically makes a .tar.gz archive available on gitlab.) @ git push @ git push --tags # Done. @ echo "Done." @ echo "If happy, please type:" @ echo " \"make publish\" to publish a new opam package" @ echo " \"make export\" to upload the documentation to yquem.inria.fr" .PHONY: publish publish: # Publish an opam description. @ opam publish -v $(DATE) $(THIS) $(ARCHIVE) . .PHONY: undo undo: # Undo the last release (assuming it was done on the same date). @ git tag -d $(DATE) @ git push -u origin :$(DATE) # ------------------------------------------------------------------------- # Copying pprint into Menhir's working directory. MENHIR_WORKING_COPY=$(HOME)/dev/menhir PPRINT_COPY=$(MENHIR_WORKING_COPY)/pprint UNNECESSARY= \ .git \ .gitignore \ Makefile \ README.md \ TODO.md \ benchmark_old \ benchmark_new \ blog \ header \ test \ src/Makefile \ .PHONY: menhir menhir: clean # Copy our source files to the Menhir repository. @ rm -rf $(PPRINT_COPY) @ cp -r $(shell pwd) $(PPRINT_COPY) # Remove a number of unneeded files and subdirectories. @ (cd $(PPRINT_COPY) && rm -rf $(UNNECESSARY)) pprint-20230830/README.md000066400000000000000000000006071447371567600145770ustar00rootroot00000000000000# PPrint: a Pretty-Printing Toolbox `PPrint` is an OCaml library for **pretty-printing textual documents**. It takes care of **indentation and line breaks**, and is typically used to **pretty-print code**. To install the latest released version, type `opam install pprint`. Here is [the documentation of the latest released version](http://cambium.inria.fr/~fpottier/pprint/doc/pprint/). pprint-20230830/TODO.md000066400000000000000000000011251447371567600144030ustar00rootroot00000000000000* Test the interaction of `range` with the automatic removal of trailing blank characters. Do we obtain the desired behavior? * Set up a real test suite. * Write a real documentation and tutorial. * Fix the warnings produced by `make doc`. Review its output. * Update the private `Makefile` so as to publish the package documentation on yquem (or gitlab?). * Try to speed up the random generator. `choose`, applied to a list, is too slow: use an array? avoid building n suspensions when only one will be forced? * Extend `PPrintBench` to also try non-random documents of large size. pprint-20230830/benchmark_new/000077500000000000000000000000001447371567600161205ustar00rootroot00000000000000pprint-20230830/benchmark_new/AST.ml000066400000000000000000000023101447371567600170750ustar00rootroot00000000000000(******************************************************************************) (* *) (* PPrint *) (* *) (* François Pottier, Inria Paris *) (* Nicolas Pouillard *) (* *) (* Copyright 2007-2022 Inria. All rights reserved. This file is *) (* distributed under the terms of the GNU Library General Public *) (* License, with an exception, as described in the file LICENSE. *) (* *) (******************************************************************************) (* Unary operators. *) type unop = | UNeg (* Binary operators. *) type binop = | BAdd | BSub | BMul | BDiv (* Expressions. *) type expr = | EConst of int | EUnOp of unop * expr | EBinOp of expr * binop * expr type main = expr pprint-20230830/benchmark_new/AST2Document.ml000066400000000000000000000040231447371567600206610ustar00rootroot00000000000000(******************************************************************************) (* *) (* PPrint *) (* *) (* François Pottier, Inria Paris *) (* Nicolas Pouillard *) (* *) (* Copyright 2007-2022 Inria. All rights reserved. This file is *) (* distributed under the terms of the GNU Library General Public *) (* License, with an exception, as described in the file LICENSE. *) (* *) (******************************************************************************) (* Converting an AST to a PPrint document. *) open PPrint open AST let lparen = lparen ^^ ifflat empty space let rparen = ifflat empty hardline ^^ rparen let add = space ^^ plus ^^ break 1 let sub = space ^^ minus ^^ break 1 let mul = space ^^ star ^^ break 1 let div = space ^^ slash ^^ break 1 let[@inline] const i = utf8format "%d" i let[@inline] paren d = nest 2 (lparen ^^ d) ^^ rparen let rec factor e = group begin match e with | EConst i -> const i | EUnOp (UNeg, e) -> minus ^^ break 0 ^^ factor e | _ -> paren (expr e) end and term e = group begin match e with | EBinOp (e1, BMul, e2) -> term e1 ^^ mul ^^ factor e2 | EBinOp (e1, BDiv, e2) -> term e1 ^^ div ^^ factor e2 | _ -> factor e end and expr e = group begin match e with | EBinOp (e1, BAdd, e2) -> expr e1 ^^ add ^^ term e2 | EBinOp (e1, BSub, e2) -> expr e1 ^^ sub ^^ term e2 | _ -> term e end and main : AST.main -> document = function | e -> expr e ^^ hardline pprint-20230830/benchmark_new/Makefile000066400000000000000000000003271447371567600175620ustar00rootroot00000000000000.PHONY: all all: dune build @all .PHONY: clean clean: git clean -fX . .PHONY: bench bench: dune exec ./main.exe -- time alloc samples -quota 10 .PHONY: once once: dune exec ./main.exe -- time alloc -quota 1x pprint-20230830/benchmark_new/README.md000066400000000000000000000007041447371567600174000ustar00rootroot00000000000000This benchmark measures the performance of PPrint using an artificial benchmark that involves randomly generating arithmetic expressions, converting them to PPrint documents, and rendering these documents. Caveat: very few of the PPrint combinators are exercised by this benchmark. Use `make once` in this directory to run the benchmark just once. Use `make bench` in this directory to run it several times and obtain somewhat more reliable results. pprint-20230830/benchmark_new/benchmark.txt000066400000000000000000000107611447371567600206200ustar00rootroot00000000000000┌───────────────────────────────┬────────────────┬────────────────────┬─────────────────┬────────────────┬────────────────┬────────────┐ │ Name │ Runs @ Samples │ Time/Run │ mWd/Run │ mjWd/Run │ Prom/Run │ Percentage │ ├───────────────────────────────┼────────────────┼────────────────────┼─────────────────┼────────────────┼────────────────┼────────────┤ │ Generating AST:10 │ 216776 @ 929 │ 464.33ns │ 62.00w │ │ │ │ │ Generating AST:100 │ 21385 @ 696 │ 4_723.28ns │ 619.99w │ 0.43w │ 0.43w │ │ │ Generating AST:1000 │ 1989 @ 455 │ 48_160.03ns │ 6_199.93w │ 45.26w │ 45.26w │ │ │ Generating AST:10000 │ 197 @ 197 │ 517_285.53ns │ 62_003.34w │ 4_764.14w │ 4_764.14w │ 0.03% │ │ Generating AST:100000 │ 53 @ 53 │ 7_264_986.66ns │ 619_989.24w │ 306_696.78w │ 306_696.78w │ 0.41% │ │ Generating AST:1000000 │ 16 @ 16 │ 90_710_369.38ns │ 6_199_925.16w │ 3_730_766.82w │ 3_730_766.82w │ 5.13% │ │ Generating AST:3000000 │ 9 @ 9 │ 304_697_926.02ns │ 18_598_998.73w │ 11_330_178.05w │ 11_330_178.05w │ 17.24% │ │ Constructing document:10 │ 53868 @ 789 │ 1_856.55ns │ 691.00w │ 0.46w │ 0.46w │ │ │ Constructing document:100 │ 6327 @ 573 │ 15_400.23ns │ 5_459.00w │ 30.40w │ 30.40w │ │ │ Constructing document:1000 │ 478 @ 303 │ 189_899.66ns │ 56_479.00w │ 3_359.74w │ 3_359.74w │ 0.01% │ │ Constructing document:10000 │ 76 @ 76 │ 3_533_742.14ns │ 566_160.00w │ 259_387.04w │ 259_387.04w │ 0.20% │ │ Constructing document:100000 │ 22 @ 22 │ 47_337_705.54ns │ 5_625_674.00w │ 2_912_119.35w │ 2_912_119.35w │ 2.68% │ │ Constructing document:1000000 │ 7 @ 7 │ 569_203_287.56ns │ 56_192_506.00w │ 29_708_800.82w │ 29_708_800.82w │ 32.21% │ │ Constructing document:3000000 │ 4 @ 4 │ 1_767_411_640.50ns │ 168_560_532.00w │ 89_297_060.40w │ 89_297_060.40w │ 100.00% │ │ Rendering document:10 │ 59499 @ 799 │ 1_498.73ns │ 1_064.00w │ 0.62w │ 0.62w │ │ │ Rendering document:100 │ 7123 @ 585 │ 13_146.30ns │ 7_916.00w │ 5.82w │ 5.82w │ │ │ Rendering document:1000 │ 512 @ 311 │ 166_991.10ns │ 81_939.00w │ 2_597.18w │ 68.18w │ │ │ Rendering document:10000 │ 102 @ 102 │ 1_826_896.93ns │ 821_528.00w │ 42_269.89w │ 415.89w │ 0.10% │ │ Rendering document:100000 │ 29 @ 29 │ 24_206_210.55ns │ 8_214_669.00w │ 392_796.61w │ 2_931.61w │ 1.37% │ │ Rendering document:1000000 │ 9 @ 9 │ 258_933_670.28ns │ 82_557_024.00w │ 5_793_610.02w │ 27_404.02w │ 14.65% │ │ Rendering document:3000000 │ 3 @ 3 │ 874_072_360.80ns │ 248_119_196.00w │ 21_866_036.50w │ 84_359.50w │ 49.45% │ └───────────────────────────────┴────────────────┴────────────────────┴─────────────────┴────────────────┴────────────────┴────────────┘ pprint-20230830/benchmark_new/dune000066400000000000000000000001271447371567600167760ustar00rootroot00000000000000(executable (name main) (libraries pprint fix core_unix.command_unix core_bench) ) pprint-20230830/benchmark_new/main.ml000066400000000000000000000074141447371567600174040ustar00rootroot00000000000000(******************************************************************************) (* *) (* PPrint *) (* *) (* François Pottier, Inria Paris *) (* Nicolas Pouillard *) (* *) (* Copyright 2007-2022 Inria. All rights reserved. This file is *) (* distributed under the terms of the GNU Library General Public *) (* License, with an exception, as described in the file LICENSE. *) (* *) (******************************************************************************) open Core_bench let memoize = Fix.Memoize.Int.memoize (* -------------------------------------------------------------------------- *) (* Random generation of abstract syntax trees. *) module Generate = struct open AST let uneg e = EUnOp (UNeg, e) let ebinop op (e1, e2) = EBinOp (e1, op, e2) let pay s = assert (s > 0); s - 1 let split s = assert (s >= 0); let s1 = Random.int (s + 1) in let s2 = s - s1 in s1, s2 let rec expr (s : int) : expr = if s = 0 then EConst 0 else let s = pay s in let i = Random.int 5 in if i = 4 then EUnOp (UNeg, expr s) else let s1, s2 = split s in let op = List.nth [BAdd; BSub; BMul; BDiv] i in EBinOp (expr s1, op, expr s2) let main (s : int) : main = (* We want reproducible results, and placing a call to [Random.init] in the main program does not seem to work (not sure why). *) Random.init 128; expr s end (* -------------------------------------------------------------------------- *) (* Each benchmark is run at the following tree sizes. *) let args = [10; 100; 1_000; 10_000; 100_000; 1_000_000; 3_000_000] (* -------------------------------------------------------------------------- *) (* Generating ASTs. *) let generation = let name = "Generating AST" in Bench.Test.create_indexed ~name ~args @@ fun s -> Core.Staged.stage (fun () -> ignore (Generate.main s)) (* After [Generate.main] has been benchmarked, a memoized version of it can be used, so we spend less time preparing data for the next benchmarks. *) let make_ast = memoize Generate.main (* -------------------------------------------------------------------------- *) (* Converting ASTs to PPrint documents. *) let conversion = let name = "Constructing document" in Bench.Test.create_indexed ~name ~args @@ fun s -> let ast = make_ast s in Core.Staged.stage (fun () -> ignore (AST2Document.main ast)) let make_doc = memoize @@ fun s -> make_ast s |> AST2Document.main (* -------------------------------------------------------------------------- *) (* Rendering PPrint documents (in memory). *) let format document : string = let b = Buffer.create 1024 in PPrint.ToBuffer.pretty 0.8 80 b document; Buffer.contents b let formatting = let name = "Rendering document" in Bench.Test.create_indexed ~name ~args @@ fun s -> let document = make_doc s in Core.Staged.stage (fun () -> ignore (format document)) (* -------------------------------------------------------------------------- *) (* Running the benchmarks. *) let run_all_benchmarks () = Command_unix.run (Bench.make_command [ generation; conversion; formatting; ]) (* -------------------------------------------------------------------------- *) (* Main. *) let () = run_all_benchmarks() pprint-20230830/benchmark_old/000077500000000000000000000000001447371567600161055ustar00rootroot00000000000000pprint-20230830/benchmark_old/Makefile000066400000000000000000000001741447371567600175470ustar00rootroot00000000000000.PHONY: all all: dune build @all .PHONY: clean clean: git clean -fX . .PHONY: bench bench: dune exec ./PPrintBench.exe pprint-20230830/benchmark_old/OldPPrintEngine.ml000066400000000000000000000536151447371567600214520ustar00rootroot00000000000000(******************************************************************************) (* *) (* PPrint *) (* *) (* François Pottier, Inria Paris *) (* Nicolas Pouillard *) (* *) (* Copyright 2007-2022 Inria. All rights reserved. This file is *) (* distributed under the terms of the GNU Library General Public *) (* License, with an exception, as described in the file LICENSE. *) (* *) (******************************************************************************) (* ------------------------------------------------------------------------- *) (* The last element of a non-empty list. *) let rec last x xs = match xs with | [] -> x | x :: xs -> last x xs let last = function | [] -> assert false | x :: xs -> last x xs (* ------------------------------------------------------------------------- *) (* ------------------------------------------------------------------------- *) (* A uniform interface for output channels. *) module type OUTPUT = sig type channel val char: channel -> char -> unit val substring: channel -> string -> int (* offset *) -> int (* length *) -> unit end (* ------------------------------------------------------------------------- *) (* Three implementations of the above interface, respectively based on output channels, memory buffers, and formatters. This compensates for the fact that ocaml's standard library does not allow creating an output channel that feeds into a memory buffer (a regrettable omission). *) module ChannelOutput : OUTPUT with type channel = out_channel = struct type channel = out_channel let char = output_char let substring = output_substring (* requires OCaml >= 4.02 *) end module BufferOutput : OUTPUT with type channel = Buffer.t = struct type channel = Buffer.t let char = Buffer.add_char let substring = Buffer.add_substring end module FormatterOutput : OUTPUT with type channel = Format.formatter = struct type channel = Format.formatter let char = Format.pp_print_char let substring fmt = fst (Format.pp_get_formatter_output_functions fmt ()) end (* ------------------------------------------------------------------------- *) (* ------------------------------------------------------------------------- *) (* Here is the algebraic data type of documents. It is analogous to Daan Leijen's version, but the binary constructor [Union] is replaced with the unary constructor [Group], and the constant [Line] is replaced with more general constructions, namely [IfFlat], which provides alternative forms depending on the current flattening mode, and [HardLine], which represents a newline character, and causes a failure in flattening mode. *) type document = (* [Empty] is the empty document. *) | Empty (* [Char c] is a document that consists of the single character [c]. We enforce the invariant that [c] is not a newline character. *) | Char of char (* [String (s, ofs, len)] is a document that consists of the portion of the string [s] delimited by the offset [ofs] and the length [len]. We assume, but do not check, that this portion does not contain a newline character. *) | String of string * int * int (* [FancyString (s, ofs, len, apparent_length)] is a (portion of a) string that may contain fancy characters: color escape characters, UTF-8 or multi-byte characters, etc. Thus, the apparent length (which corresponds to what will be visible on screen) differs from the length (which is a number of bytes, and is reported by [String.length]). We assume, but do not check, that fancystrings do not contain a newline character. *) | FancyString of string * int * int * int (* [Blank n] is a document that consists of [n] blank characters. *) | Blank of int (* When in flattening mode, [IfFlat (d1, d2)] turns into the document [d1]. When not in flattening mode, it turns into the document [d2]. *) | IfFlat of document * document (* When in flattening mode, [HardLine] causes a failure, which requires backtracking all the way until the stack is empty. When not in flattening mode, it represents a newline character, followed with an appropriate number of indentation. A common way of using [HardLine] is to only use it directly within the right branch of an [IfFlat] construct. *) | HardLine (* [Cat doc1 doc2] is the concatenation of the documents [doc1] and [doc2]. *) | Cat of document * document (* [Nest (j, doc)] is the document [doc], in which the indentation level has been increased by [j], that is, in which [j] blanks have been inserted after every newline character. *) | Nest of int * document (* [Group doc] represents an alternative: it is either a flattened form of [doc], in which occurrences of [Group] disappear and occurrences of [IfFlat] resolve to their left branch, or [doc] itself. *) | Group of document (* [Column f] is the document obtained by applying [f] to the current column number. *) | Column of (int -> document) (* [Nesting f] is the document obtained by applying [f] to the current indentation level, that is, the number of blanks that were printed at the beginning of the current line. *) | Nesting of (int -> document) (* ------------------------------------------------------------------------- *) (* The above algebraic data type is not exposed to the user. Instead, we expose the following functions. *) let empty = Empty let char c = assert (c <> '\n'); Char c let substring s ofs len = if len = 0 then Empty else String (s, ofs, len) let string s = substring s 0 (String.length s) let fancysubstring s ofs len apparent_length = if len = 0 then Empty else FancyString (s, ofs, len, apparent_length) let fancystring s apparent_length = fancysubstring s 0 (String.length s) apparent_length (* The following function was stolen from [Batteries]. *) let utf8_length s = let rec length_aux s c i = if i >= String.length s then c else let n = Char.code (String.unsafe_get s i) in let k = if n < 0x80 then 1 else if n < 0xe0 then 2 else if n < 0xf0 then 3 else 4 in length_aux s (c + 1) (i + k) in length_aux s 0 0 let utf8string s = fancystring s (utf8_length s) let hardline = HardLine let blank n = match n with | 0 -> Empty | 1 -> Blank 1 | _ -> Blank n let internal_break i = IfFlat (blank i, HardLine) let break0 = internal_break 0 let break1 = internal_break 1 let break i = match i with | 0 -> break0 | 1 -> break1 | _ -> internal_break i let (^^) x y = match x, y with | Empty, x | x, Empty -> x | _, _ -> Cat (x, y) let nest i x = assert (i >= 0); Nest (i, x) let group x = Group x let column f = Column f let nesting f = Nesting f let ifflat x y = IfFlat (x, y) (* ------------------------------------------------------------------------- *) (* The pretty rendering algorithm: preliminary declarations. *) (* The renderer is supposed to behave exactly like Daan Leijen's, although its implementation is quite radically different. Instead of relying on Haskell's lazy evaluation mechanism, we implement an abstract machine with mutable current state, forking, backtracking (via an explicit stack of choice points), and cut (disposal of earlier choice points). *) (* The renderer's input consists of an ordered sequence of documents. Each document carries an extra indentation level, akin to an implicit [Nest] constructor, and a ``flattening'' flag, which, if set, means that this document should be printed in flattening mode. *) (* An alternative coding style would be to avoid decorating each input document with an indentation level and a flattening mode, and allow the input sequence to contain instructions that set the current nesting level or reset the flattening mode. That would perhaps be slightly more readable, and slightly less efficient. *) type input = | INil | ICons of int * bool * document * input (* When possible (that is, when the stack is empty), the renderer writes directly to the output channel. Otherwise, output is buffered until either a failure point is reached (then, the buffered output is discarded) or a cut is reached (then, all buffered output is committed to the output channel). At all times, the length of the buffered output is at most one line. *) (* The buffered output consists of a list of characters and strings. It is stored in reverse order (the head of the list should be printed last). *) type output = | OEmpty | OChar of char * output | OString of string * int * int * output | OBlank of int * output (* The renderer maintains the following state record. For efficiency, the record is mutable; it is copied when the renderer forks, that is, at choice points. *) type 'channel state = { (* The line width and ribbon width. *) width: int; ribbon: int; (* The output channel. *) channel: 'channel; (* The current indentation level. This is the number of blanks that were printed at the beginning of the current line. *) mutable indentation: int; (* The current column. *) mutable column: int; (* The renderer's input. For efficiency, the input is assumed to never be empty, and the leading [ICons] constructor is inlined within the state record. In other words, the fields [nest1], [flatten1], and [input1] concern the first input document, and the field [input] contains the rest of the input sequence. *) mutable indent1: int; mutable flatten1: bool; mutable input1: document; mutable input: input; (* The renderer's buffered output. *) mutable output: output; } (* The renderer maintains a stack of resumptions, that is, states in which execution should be resumed if the current thread of execution fails by lack of space on the current line. *) (* It is not difficult to prove that the stack is empty if and only if flattening mode is off. Furthermore, when flattening mode is on, all groups are ignored, so no new choice points are pushed onto the stack. As a result, the stack has height one at most at all times, so that the stack height is zero when flattening mode is off and one when flattening mode is on. *) type 'channel stack = 'channel state list (* ------------------------------------------------------------------------- *) (* The pretty rendering algorithm: code. *) (* The renderer is parameterized over an implementation of output channels. *) module Renderer (Output : OUTPUT) = struct type channel = Output.channel type dummy = document type document = dummy (* Printing blank space (indentation characters). *) let blank_length = 80 let blank_buffer = String.make blank_length ' ' let rec blanks channel n = if n <= 0 then () else if n <= blank_length then Output.substring channel blank_buffer 0 n else begin Output.substring channel blank_buffer 0 blank_length; blanks channel (n - blank_length) end (* Committing buffered output to the output channel. The list is printed in reverse order. The code is not tail recursive, but there is no risk of stack overflow, since the length of the buffered output cannot exceed one line. *) let rec commit channel = function | OEmpty -> () | OChar (c, output) -> commit channel output; Output.char channel c | OString (s, ofs, len, output) -> commit channel output; Output.substring channel s ofs len | OBlank (n, output) -> commit channel output; blanks channel n (* The renderer's abstract machine. *) (* The procedures [run], [shift], [emit_char], [emit_string], and [emit_blanks] are mutually recursive, and are tail recursive. They maintain a stack and a current state. The states in the stack, and the current state, are pairwise distinct, so that the current state can be mutated without affecting the contents of the stack. *) (* An invariant is: the buffered output is nonempty only when the stack is nonempty. The contrapositive is: if the stack is empty, then the buffered output is empty. Indeed, the fact that the stack is empty means that no choices were made, so we are not in a speculative mode of execution: as a result, all output can be sent directly to the output channel. On the contrary, when the stack is nonempty, there is a possibility that we might backtrack in the future, so all output should be held in a buffer. *) (* [run] is allowed to call itself recursively only when no material is printed. In that case, the check for failure is skipped -- indeed, this test is performed only within [shift]. *) let rec run (stack : channel stack) (state : channel state) : unit = (* Examine the first piece of input, as well as (in some cases) the current flattening mode. *) match state.input1, state.flatten1 with (* The first piece of input is an empty document. Discard it and continue. *) | Empty, _ -> shift stack state (* The first piece of input is a character. Emit it and continue. *) | Char c, _ -> emit_char stack state c (* The first piece of input is a string. Emit it and continue. *) | String (s, ofs, len), _ -> emit_string stack state s ofs len len | FancyString (s, ofs, len, apparent_length), _ -> emit_string stack state s ofs len apparent_length | Blank n, _ -> emit_blanks stack state n (* The first piece of input is a hard newline instruction. *) (* If flattening mode is off, then we behave as follows. We emit a newline character, followed by the prescribed amount of indentation. We update the current state to record how many indentation characters were printed and to to reflect the new column number. Then, we discard the current piece of input and continue. *) | HardLine, false -> assert (stack = []); (* since flattening mode is off, the stack must be empty. *) Output.char state.channel '\n'; let i = state.indent1 in blanks state.channel i; state.column <- i; state.indentation <- i; shift stack state (* If flattening mode is on, then [HardLine] causes an immediate failure. We backtrack all the way to the state found at the bottom of the stack. (Indeed, if we were to backtrack to the state found at the top of the stack, then we would come back to this point in flattening mode, and fail again.) This will take us back to non-flattening mode, so that, when we come back to this [HardLine], we will be able to honor it. *) | HardLine, true -> assert (stack <> []); (* since flattening mode is on, the stack must be non-empty. *) run [] (last stack) (* The first piece of input is an [IfFlat] conditional instruction. *) | IfFlat (doc, _), true | IfFlat (_, doc), false -> state.input1 <- doc; run stack state (* The first piece of input is a concatenation operator. We take it apart and queue both documents in the input sequence. *) | Cat (doc1, doc2), _ -> state.input1 <- doc1; state.input <- ICons (state.indent1, state.flatten1, doc2, state.input); run stack state (* The first piece of input is a [Nest] operator. We increase the amount of indentation to be applied to the first input document. *) | Nest (j, doc), _ -> state.indent1 <- state.indent1 + j; state.input1 <- doc; run stack state (* The first piece of input is a [Group] operator, and flattening mode is currently off. This introduces a choice point: either we flatten this whole group, or we don't. We try the former possibility first: this is done by enabling flattening mode. Should this avenue fail, we push the current state, in which flattening mode is disabled, onto the stack. *) (* Note that the current state is copied before continuing, so that the state that is pushed on the stack is not affected by future modifications. This is a fork. *) | Group doc, false -> state.input1 <- doc; run (state :: stack) { state with flatten1 = true } (* The first piece of input is a [Group] operator, and flattening mode is currently on. The operator is ignored. *) | Group doc, true -> state.input1 <- doc; run stack state (* The first piece of input is a [Column] operator. The current column is fed into it, so as to produce a document, with which we continue. *) | Column f, _ -> state.input1 <- f state.column; run stack state (* The first piece of input is a [Nesting] operator. The current indentation level is fed into it, so as to produce a document, with which we continue. *) | Nesting f, _ -> state.input1 <- f state.indentation; run stack state (* [shift] discards the first document in the input sequence, so that the second input document, if there is one, becomes first. The renderer stops if there is none. *) and shift stack state = assert (state.output = OEmpty || stack <> []); assert (state.flatten1 = (stack <> [])); (* If the stack is nonempty and we have exceeded either the width or the ribbon width parameters, then fail. Backtracking is implemented by discarding the current state, popping a state off the stack, and making it the current state. *) match stack with | resumption :: stack when state.column > state.width || state.column - state.indentation > state.ribbon -> run stack resumption | _ -> match state.input with | INil -> (* End of input. Commit any buffered output and stop. *) commit state.channel state.output | ICons (indent, flatten, head, tail) -> (* There is an input document. Move it one slot ahead and check if we are leaving flattening mode. *) state.indent1 <- indent; state.input1 <- head; state.input <- tail; if state.flatten1 && not flatten then begin (* Leaving flattening mode means success: we have flattened a certain group, and fitted it all on a line, without reaching a failure point. We would now like to commit our decision to flatten this group. This is a Prolog cut. We discard the stack of choice points, replacing it with an empty stack, and commit all buffered output. *) state.flatten1 <- flatten; (* false *) commit state.channel state.output; state.output <- OEmpty; run [] state end else run stack state (* [emit_char] prints a character (either to the output channel or to the output buffer), increments the current column, discards the first piece of input, and continues. *) and emit_char stack state c = begin match stack with | [] -> Output.char state.channel c | _ -> state.output <- OChar (c, state.output) end; state.column <- state.column + 1; shift stack state (* [emit_string] prints a string (either to the output channel or to the output buffer), updates the current column, discards the first piece of input, and continues. *) and emit_string stack state s ofs len apparent_length = begin match stack with | [] -> Output.substring state.channel s ofs len | _ -> state.output <- OString (s, ofs, len, state.output) end; state.column <- state.column + apparent_length; shift stack state (* [emit_blanks] prints a blank string (either to the output channel or to the output buffer), updates the current column, discards the first piece of input, and continues. *) and emit_blanks stack state n = begin match stack with | [] -> blanks state.channel n | _ -> state.output <- OBlank (n, state.output) end; state.column <- state.column + n; shift stack state (* This is the renderer's main entry point. *) let pretty rfrac width channel document = run [] { width = width; ribbon = max 0 (min width (truncate (float_of_int width *. rfrac))); channel = channel; indentation = 0; column = 0; indent1 = 0; flatten1 = false; input1 = document; input = INil; output = OEmpty; } (* ------------------------------------------------------------------------- *) (* The compact rendering algorithm. *) let compact channel document = let column = ref 0 in let rec scan = function | Empty -> () | Char c -> Output.char channel c; column := !column + 1 | String (s, ofs, len) -> Output.substring channel s ofs len; column := !column + len | FancyString (s, ofs, len, apparent_length) -> Output.substring channel s ofs len; column := !column + apparent_length | Blank n -> blanks channel n; column := !column + n | HardLine -> Output.char channel '\n'; column := 0 | Cat (doc1, doc2) -> scan doc1; scan doc2 | IfFlat (doc, _) | Nest (_, doc) | Group doc -> scan doc | Column f -> scan (f !column) | Nesting f -> scan (f 0) in scan document end (* ------------------------------------------------------------------------- *) (* Instantiating the renderers for the three kinds of output channels. *) module type RENDERER = sig type channel type document val pretty: float -> int -> channel -> document -> unit val compact: channel -> document -> unit end module ToChannel = Renderer(ChannelOutput) module ToBuffer = Renderer(BufferOutput) module ToFormatter = Renderer(FormatterOutput) pprint-20230830/benchmark_old/OldPPrintEngine.mli000066400000000000000000000151771447371567600216240ustar00rootroot00000000000000(******************************************************************************) (* *) (* PPrint *) (* *) (* François Pottier, Inria Paris *) (* Nicolas Pouillard *) (* *) (* Copyright 2007-2022 Inria. All rights reserved. This file is *) (* distributed under the terms of the GNU Library General Public *) (* License, with an exception, as described in the file LICENSE. *) (* *) (******************************************************************************) (** A pretty-printing engine and a set of basic document combinators. *) (** {1 Building documents} *) (** Documents must be built in memory before they are rendered. This may seem costly, but it is a simple approach, and works well. *) (** The following operations form a set of basic (low-level) combinators for building documents. On top of these combinators, higher-level combinators can be defined: see {!PPrintCombinators}. *) (** This is the abstract type of documents. *) type document (** The following basic (low-level) combinators allow constructing documents. *) (** [empty] is the empty document. *) val empty: document (** [char c] is a document that consists of the single character [c]. This character must not be a newline. *) val char: char -> document (** [string s] is a document that consists of the string [s]. This string must not contain a newline. *) val string: string -> document (** [substring s ofs len] is a document that consists of the portion of the string [s] delimited by the offset [ofs] and the length [len]. This portion must contain a newline. *) val substring: string -> int -> int -> document (** [fancystring s apparent_length] is a document that consists of the string [s]. This string must not contain a newline. The string may contain fancy characters: color escape characters, UTF-8 or multi-byte characters, etc. Thus, its apparent length (which measures how many columns the text will take up on screen) differs from its length in bytes. *) val fancystring: string -> int -> document (** [fancysubstring s ofs len apparent_length] is a document that consists of the portion of the string [s] delimited by the offset [ofs] and the length [len]. This portion must contain a newline. The string may contain fancy characters. *) val fancysubstring : string -> int -> int -> int -> document (** [utf8string s] is a document that consists of the UTF-8-encoded string [s]. This string must not contain a newline. *) val utf8string: string -> document (** [hardline] is a forced newline document. This document forces all enclosing groups to be printed in non-flattening mode. In other words, any enclosing groups are dissolved. *) val hardline: document (** [blank n] is a document that consists of [n] blank characters. *) val blank: int -> document (** [break n] is a document which consists of either [n] blank characters, when forced to display on a single line, or a single newline character, otherwise. Note that there is no choice at this point: choices are encoded by the [group] combinator. *) val break: int -> document (** [doc1 ^^ doc2] is the concatenation of the documents [doc1] and [doc2]. *) val (^^): document -> document -> document (** [nest j doc] is the document [doc], in which the indentation level has been increased by [j], that is, in which [j] blanks have been inserted after every newline character. Read this again: indentation is inserted after every newline character. No indentation is inserted at the beginning of the document. *) val nest: int -> document -> document (** [group doc] encodes a choice. If possible, then the entire document [group doc] is rendered on a single line. Otherwise, the group is dissolved, and [doc] is rendered. There might be further groups within [doc], whose presence will lead to further choices being explored. *) val group: document -> document (** [column f] is the document obtained by applying the function [f] to the current column number. This combinator allows making the construction of a document dependent on the current column number. *) val column: (int -> document) -> document (** [nesting f] is the document obtained by applying the function [f] to the current indentation level, that is, the number of indentation (blank) characters that were inserted at the beginning of the current line. *) val nesting: (int -> document) -> document (** [ifflat doc1 doc2] is rendered as [doc1] if part of a group that can be successfully flattened, and is rendered as [doc2] otherwise. Use this operation with caution. Because the pretty-printer is free to choose between [doc1] and [doc2], these documents should be semantically equivalent. *) val ifflat: document -> document -> document (** {1 Rendering documents} *) (**This signature describes the document renderers in a manner that is independent of the type of the output channel. *) module type RENDERER = sig (**The type of the output channel. *) type channel (**The type of documents. *) type document (** [pretty rfrac width channel document] pretty-prints the document [document] into the output channel [channel]. The parameter [width] is the maximum number of characters per line. The parameter [rfrac] is the ribbon width, a fraction relative to [width]. The ribbon width is the maximum number of non-indentation characters per line. *) val pretty: float -> int -> channel -> document -> unit (** [compact channel document] prints the document [document] to the output channel [channel]. No indentation is used. All newline instructions are respected, that is, no groups are flattened. *) val compact: channel -> document -> unit end (** This renderer sends its output into an output channel. *) module ToChannel : RENDERER with type channel = out_channel and type document = document (** This renderer sends its output into a memory buffer. *) module ToBuffer : RENDERER with type channel = Buffer.t and type document = document (** This renderer sends its output into a formatter channel. *) module ToFormatter : RENDERER with type channel = Format.formatter and type document = document pprint-20230830/benchmark_old/PPrintBench.ml000066400000000000000000000234151447371567600206200ustar00rootroot00000000000000(******************************************************************************) (* *) (* PPrint *) (* *) (* François Pottier, Inria Paris *) (* Nicolas Pouillard *) (* *) (* Copyright 2007-2022 Inria. All rights reserved. This file is *) (* distributed under the terms of the GNU Library General Public *) (* License, with an exception, as described in the file LICENSE. *) (* *) (******************************************************************************) (* ------------------------------------------------------------------------- *) (* The following signature is common to the old and new engines. *) module type ENGINE = sig type document val empty: document val char: char -> document val string: string -> document val substring: string -> int -> int -> document val fancystring: string -> int -> document val fancysubstring : string -> int -> int -> int -> document val utf8string: string -> document val hardline: document val blank: int -> document val break: int -> document val (^^): document -> document -> document val nest: int -> document -> document val group: document -> document val ifflat: document -> document -> document module ToBuffer : PPrint.RENDERER with type channel = Buffer.t and type document = document end (* ------------------------------------------------------------------------- *) (* We use our own abstract syntax of documents. We produce random documents in this syntax first, then (as part of the timed test) translate them to the engine's syntax. This allows timing the engine's document construction code too. *) type mydoc = | MyEmpty | MyChar of char | MyString of string | MySubString of string * int * int | MyUtf8String of string | MyHardLine | MyBlank of int | MyBreak of int | MyCat of mydoc * mydoc | MyNest of int * mydoc | MyGroup of mydoc | MyIfFlat of mydoc * mydoc (* ------------------------------------------------------------------------- *) (* [measure v] measures the size of an OCaml value [v] in bytes. *) let measure v = (* String.length (Marshal.to_string v []) *) Size.size_b v (* ------------------------------------------------------------------------- *) (* [split n] produces two numbers [n1] and [n2] comprised between [0] and [n] (inclusive) whose sum is [n]. *) let split n = let n1 = Random.int (n + 1) in let n2 = n - n1 in n1, n2 (* [choose xs] randomly and uniformly chooses between the elements of the array [xs]. *) let choose xs = Array.unsafe_get xs (Random.int (Array.length xs)) (* [pick] is analogous, but each element comes with a relative integer weight. *) let pick wxs = (* Compute the total weight. *) let weight = List.fold_left (fun weight (w, _) -> weight + w) 0 wxs in assert (weight > 0); (* Pick a random integer between 0 and the total weight. *) let i = Random.int weight in (* Find the corresponding element. *) let rec loop i wxs = match wxs with | [] -> assert false | (w, x) :: wxs -> if i < w then x else loop (i - w) wxs in loop i wxs (* ------------------------------------------------------------------------- *) (* A random document generator. *) let leaf = [| MyChar 'c'; MyString "hello"; MySubString ("the cat", 4, 3); MyUtf8String "étoile"; MyHardLine; MyBlank 2; MyBreak 2 |] let rec random (n : int) : mydoc = (* If the budget is 0, produce an empty document. *) if n = 0 then MyEmpty (* If the budget is 1, produce a leaf. *) else if n = 1 then choose leaf (* Otherwise, decrement the budget, and produce a node of nonzero arity, spending the rest of the budget on the children. *) else let n = n - 1 in Lazy.force (pick [ 10, lazy (let n1, n2 = split n in MyCat (random n1, random n2)); 2, lazy (MyNest (2, random n)); 10, lazy (MyGroup (random n)); 2, lazy (let n1, n2 = split n in MyIfFlat (random n1, random n2)) ]) (* ------------------------------------------------------------------------- *) (* Building documents for a particular engine. *) module Build (E : ENGINE) = struct open E let rec build (doc : mydoc) : document = match doc with | MyEmpty -> empty | MyChar c -> char c | MyString s -> string s | MySubString (s, ofs, len) -> substring s ofs len | MyUtf8String s -> utf8string s | MyHardLine -> hardline | MyBlank b -> blank b | MyBreak b -> break b | MyCat (doc1, doc2) -> build doc1 ^^ build doc2 | MyNest (i, doc) -> nest i (build doc) | MyGroup doc -> group (build doc) | MyIfFlat (doc1, doc2) -> ifflat (build doc1) (build doc2) end (* ------------------------------------------------------------------------- *) (* The rendering parameters. *) let rfrac = 0.8 let width = 80 (* ------------------------------------------------------------------------- *) (* Testing an engine, alone. *) module Test1 (E : ENGINE) = struct open E (* The size of the randomly generated documents. *) let n = 1000 (* The number of runs. *) let runs = 10000 let () = let module B = Build(E) in let s = ref 0 in for _r = 1 to runs do let document = B.build (random n) in s := !s + measure document; let buffer = Buffer.create 32768 in ToBuffer.pretty rfrac width buffer document; let buffer = Buffer.create 32768 in ToBuffer.compact buffer document done; Printf.printf "Test 1: success.\n%!"; let average = float_of_int !s /. float_of_int runs in Printf.printf "Average document size: %d bytes.\n%!" (truncate average) end (* ------------------------------------------------------------------------- *) (* Testing two engines and comparing their output. *) module Test2 (E1 : ENGINE) (E2 : ENGINE) = struct (* The size of the randomly generated documents. *) let n = 1000 (* The number of runs. *) let runs = 10000 let () = let module B1 = Build(E1) in let module B2 = Build(E2) in for _r = 1 to runs do let document = random n in let document1 = B1.build document in let document2 = B2.build document in let buffer1 = Buffer.create 32768 in E1.ToBuffer.pretty rfrac width buffer1 document1; let buffer2 = Buffer.create 32768 in E2.ToBuffer.pretty rfrac width buffer2 document2; assert (Buffer.contents buffer1 = Buffer.contents buffer2) done; Printf.printf "Test 2: success.\n%!" end (* ------------------------------------------------------------------------- *) (* Timing an engine, alone. *) module Time1 (E : ENGINE) (D : sig val n: int val runs: int val docs : mydoc array end) = struct open E open D let gc = false let time f x = if gc then Gc.major(); let start = Unix.gettimeofday() in let y = f x in let finish = Unix.gettimeofday() in y, finish -. start let () = let module B = Build(E) in Printf.printf "Time: building documents...\n%!"; let docs, duration = time (fun () -> Array.map B.build docs) () in Printf.printf "Time: built %d documents of size %d in %.2f seconds.\n%!" runs n duration; let size = Array.fold_left (fun accu doc -> accu + measure doc) 0 docs in let average = float_of_int size /. float_of_int runs in Printf.printf "Average document size: %d bytes.\n%!" (truncate average); let buffer = Buffer.create 32768 in Printf.printf "Time: rendering documents...\n%!"; let (), duration = time (fun () -> Array.iter (fun document -> ToBuffer.pretty rfrac width buffer document; Buffer.clear buffer ) docs ) () in Printf.printf "Time: rendered %d documents of size %d in %.2f seconds.\n%!" runs n duration end (* ------------------------------------------------------------------------- *) (* Main. *) let test1 () = (* Testing both engines on the same set of documents. *) Printf.printf "Testing old engine...\n"; let state = Random.get_state() in let module T = Test1(OldPPrintEngine) in Random.set_state state; Printf.printf "Testing new engine...\n"; let module T = Test1(PPrintEngine) in () let test2 () = (* Comparing the two engines. *) Printf.printf "Comparing old and new engines...\n"; let module T = Test2(OldPPrintEngine)(PPrintEngine) in () type engine = Old | New let test3 engine = (* The timing test. Best to run it separately on each engine (in two different processes), as there are otherwise GC effects. If a major GC is triggered, the timing test is severely affected. *) let module D = struct (* The size of the randomly generated documents. *) let n = 10000 (* The number of runs. *) let runs = 1000 let () = Printf.printf "Generating %d documents of size %d...\n%!" runs n let docs = Array.init runs (fun _ -> random n) end in match engine with | Old -> Printf.printf "Timing old engine...\n"; let module T = Time1(OldPPrintEngine)(D) in () | New -> Printf.printf "Timing new engine...\n"; let module T = Time1(PPrintEngine)(D) in () let () = (* The comparison between the old and new engines is now disabled, because the new engine removes trailing blank characters on every line, whereas the old engine does not. *) if false then test2(); test3 New pprint-20230830/benchmark_old/Size.ml000066400000000000000000000050551447371567600173560ustar00rootroot00000000000000(******************************************************************************) (* *) (* PPrint *) (* *) (* François Pottier, Inria Paris *) (* Nicolas Pouillard *) (* *) (* Copyright 2007-2022 Inria. All rights reserved. This file is *) (* distributed under the terms of the GNU Library General Public *) (* License, with an exception, as described in the file LICENSE. *) (* *) (******************************************************************************) (*i $Id: size.ml,v 1.7 2008-07-21 14:53:06 filliatr Exp $ i*) (*i*) open Obj (*i*) (*s Pointers already visited are stored in a hash-table, where comparisons are done using physical equality. *) module H = Hashtbl.Make( struct type t = Obj.t let equal = (==) let hash o = Hashtbl.hash (magic o : int) end) let node_table = (H.create 257 : unit H.t) let in_table o = try H.find node_table o; true with Not_found -> false let add_in_table o = H.add node_table o () let reset_table () = H.clear node_table (*s Objects are traversed recursively, as soon as their tags are less than [no_scan_tag]. [count] records the numbers of words already visited. *) let size_of_double = size (repr 1.0) let count = ref 0 let rec traverse t = if not (in_table t) then begin add_in_table t; if is_block t then begin let n = size t in let tag = tag t in if tag < no_scan_tag then begin count := !count + 1 + n; for i = 0 to n - 1 do let f = field t i in if is_block f then traverse f done end else if tag = string_tag then count := !count + 1 + n else if tag = double_tag then count := !count + size_of_double else if tag = double_array_tag then count := !count + 1 + size_of_double * n else incr count end end (*s Sizes of objects in words and in bytes. The size in bytes is computed system-independently according to [Sys.word_size]. *) let size_w o = reset_table (); count := 0; traverse (repr o); !count let size_b o = (size_w o) * (Sys.word_size / 8) let size_kb o = (size_w o) / (8192 / Sys.word_size) pprint-20230830/benchmark_old/Size.mli000066400000000000000000000024111447371567600175200ustar00rootroot00000000000000(******************************************************************************) (* *) (* PPrint *) (* *) (* François Pottier, Inria Paris *) (* Nicolas Pouillard *) (* *) (* Copyright 2007-2022 Inria. All rights reserved. This file is *) (* distributed under the terms of the GNU Library General Public *) (* License, with an exception, as described in the file LICENSE. *) (* *) (******************************************************************************) (*i $Id: size.mli,v 1.5 2008-07-21 14:53:06 filliatr Exp $ i*) (* Sizes of ocaml values (in their memory representation). Sizes are given in words ([size_w]), bytes ([size_b]) or kilobytes ([size_kb]), in a system-independent way. *) val size_w : 'a -> int val size_b : 'a -> int val size_kb : 'a -> int pprint-20230830/benchmark_old/dune000066400000000000000000000000751447371567600167650ustar00rootroot00000000000000(executable (name PPrintBench) (libraries unix pprint) ) pprint-20230830/blog/000077500000000000000000000000001447371567600142405ustar00rootroot00000000000000pprint-20230830/blog/Makefile000066400000000000000000000003661447371567600157050ustar00rootroot00000000000000# -------------------------------------------------------------------------------- # [make billet] creates the blog entry. .PHONY: billet clean billet: billet.html clean: rm -f billet.html %.html: %.markdown pandoc -s $< -c style.css > $@ pprint-20230830/blog/billet.markdown000066400000000000000000000112461447371567600172630ustar00rootroot00000000000000 I am pleased to announce the first official release of _**PPrint**_, an OCaml library for pretty-printing textual documents. ## A taste of the layout language At the heart of _**PPrint**_ is a little domain-specific language of documents. This language has a well-defined semantics, which the printing engine implements. This language rests upon a small number of fundamental concepts. There are combinators for creating atomic documents. For instance, ```ocaml string "hello" ``` is a simple, unbreakable document. There is also a concatenation operator, which joins two documents. For instance, ```ocaml string "hello" ^^ string "world" ``` is a composite document. It is in fact equivalent to `string "helloworld"`. So far, nothing very exciting. The next two combinators are more original and interesting. The first of these combinators, `break 1`, is a breakable space. If printed in flat mode, it produces an ordinary space character; if printed in normal mode, it produces a newline character. Yes, there are two printing modes, namely flat mode and normal mode. The printing engine goes back and forth between these two modes. Exactly where and how the engine switches from one mode to the other is controlled by the next combinator. The second of these combinators, `group`, introduces a choice between flat mode and normal mode. It is a document transformer: if `d` is a document, then `group d` is a document. When the printing engine encounters `group d`, two possibilities arise. The first possibility is to print all of `d` on a single line. This is known as flat mode. The engine tries this first (ignoring any `group` combinators inside `d`). If it succeeds, great. If it fails, by lack of space on the current line, then the engine backtracks and reverts to the second possibility, which is to simply ignore the `group` combinator, and just print `d`. This has subtle consequences: there might be further groups inside `d`, and each of these groups will give rise to further choices. This gives rise to an interesting language, where `group` is used to indicate a choice point, and the appearance of `break` is dependent upon the choice point(s) that appear higher up in the hierarchical structure of the document. For instance, the document: ```ocaml group (string "This" ^^ break 1 ^^ string "is" ^^ break 1 ^^ string "pretty.") ``` will be printed either on a single line, if it fits, or on three lines. It will not be printed on two lines: there is just one choice point, so either the two breakable spaces will be broken, or none of them will. By the way, this document can be abbreviated as follows: ```ocaml group (string "This" ^/^ string "is" ^/^ string "pretty.") ``` On the other hand, the document: ```ocaml string "This" ^^ group (break 1 ^^ string "is") ^^ group (break 1 ^^ string "pretty.") ``` could be printed on one, two, or three lines. There are two choice points, each of which influences one of the two breakable spaces. The two choices are independent of one another. Each of the words in the sentence `This is pretty.` will be printed on the current line if it fits, and on a new line otherwise. By the way, this document can be abbreviated as follows: ```ocaml flow (break 1) [ string "This" ; string "is" ; string "pretty." ] ``` There are more combinators, such as `nest`, which controls indentation, and it is relatively easy to roll your own combinators on top of those that are provided. One limitation of the library is that the document must be entirely built in memory before it is printed. So far, we have used the library in small- to medium-scale applications, and this has not been a problem. In principle, one could work around this limitation by adding a new document constructor whose argument is a suspended document computation. ## Acknowledgements The document language and the printing engine are inspired by Daan Leijen's [PPrint](http://www.cs.uu.nl/~daan/pprint.html) library, which itself is based on the ideas developed by Philip Wadler in the paper [A Prettier Printer](http://homepages.inf.ed.ac.uk/wadler/papers/prettier/prettier.pdf). _**PPrint**_ was written by François Pottier and Nicolas Pouillard, with contributions by Yann Régis-Gianas, Gabriel Scherer, and Jonathan Protzenko. ## Installation The library is available online ([source code](http://gallium.inria.fr/~fpottier/pprint/pprint.tar.gz), [documentation](http://gallium.inria.fr/~fpottier/pprint/doc/)), and can also be installed via OPAM: just type `opam install pprint` if you already have a working OPAM installation. Have fun! Feel free to make comments, suggestions, and to let me know if and how you are using this library. pprint-20230830/blog/style.css000066400000000000000000000017561447371567600161230ustar00rootroot00000000000000body { width: 800px; font-family: sans-serif; margin: 0 auto; } pre.sourceCode, code { background-color: #eee; } .controls { text-align: center; } a { text-decoration: none; color: #888; display: inline-block; } a:hover { text-decoration: underline; color: #666; } .boxes { width: 100%; margin-bottom: 1em; text-align: center; } .box { text-align: left; width: 395px; display: inline-block; height: 8em; border: 1px solid black; vertical-align: top; background-color: white; border-radius: 5px; } .box ul { margin: 0; } .title { padding: 5px; font-weight: bold; } .controls a { display: inline-block; padding: 2px; border: 1px solid #888; border-radius: 2px; background-color: #fffef7; color: #353535; } .controls a:hover { text-decoration: none; border: 1px solid #353535; background-color: #fffceb; } .caption { text-align: center; font-style: italic; margin: 0; } .figure { border: 1px solid black; padding: 2em; } pprint-20230830/dune-project000066400000000000000000000000201447371567600156270ustar00rootroot00000000000000(lang dune 1.3) pprint-20230830/header000077500000000000000000000005531447371567600144760ustar00rootroot00000000000000 PPrint François Pottier, Inria Paris Nicolas Pouillard Copyright 2007-2022 Inria. All rights reserved. This file is distributed under the terms of the GNU Library General Public License, with an exception, as described in the file LICENSE. pprint-20230830/pprint.opam000066400000000000000000000015571447371567600155170ustar00rootroot00000000000000opam-version: "2.0" maintainer: "francois.pottier@inria.fr" authors: [ "François Pottier " "Nicolas Pouillard " ] license: "LGPL-2.0-only WITH OCaml-LGPL-linking-exception" homepage: "https://github.com/fpottier/pprint" dev-repo: "git+ssh://git@github.com/fpottier/pprint.git" bug-reports: "francois.pottier@inria.fr" build: [ ["dune" "build" "-p" name "-j" jobs] ] depends: [ "ocaml" {>= "4.03"} "dune" {>= "1.3"} ] synopsis: "A pretty-printing combinator library and rendering engine" description: "This library offers a set of combinators for building so-called documents as well as an efficient engine for converting documents to a textual, fixed-width format. The engine takes care of indentation and line breaks, while respecting the constraints imposed by the structure of the document and by the text width." pprint-20230830/src/000077500000000000000000000000001447371567600141045ustar00rootroot00000000000000pprint-20230830/src/Makefile000066400000000000000000000001161447371567600155420ustar00rootroot00000000000000.PHONY: all clean doc test bench all clean doc test bench: $(MAKE) -C .. $@ pprint-20230830/src/PPrint.ml000066400000000000000000000303411447371567600156530ustar00rootroot00000000000000(******************************************************************************) (* *) (* PPrint *) (* *) (* François Pottier, Inria Paris *) (* Nicolas Pouillard *) (* *) (* Copyright 2007-2022 Inria. All rights reserved. This file is *) (* distributed under the terms of the GNU Library General Public *) (* License, with an exception, as described in the file LICENSE. *) (* *) (******************************************************************************) include PPrintEngine (* -------------------------------------------------------------------------- *) (* Predefined single-character documents. *) let lparen = char '(' let rparen = char ')' let langle = char '<' let rangle = char '>' let lbrace = char '{' let rbrace = char '}' let lbracket = char '[' let rbracket = char ']' let squote = char '\'' let dquote = char '"' let bquote = char '`' let semi = char ';' let colon = char ':' let comma = char ',' let dot = char '.' let sharp = char '#' let slash = char '/' let backslash = char '\\' let equals = char '=' let qmark = char '?' let tilde = char '~' let at = char '@' let percent = char '%' let dollar = char '$' let caret = char '^' let ampersand = char '&' let star = char '*' let plus = char '+' let minus = char '-' let underscore = char '_' let bang = char '!' let bar = char '|' (* -------------------------------------------------------------------------- *) (* Repetition. *) let[@inline] twice doc = doc ^^ doc let repeat n doc = let rec loop n doc accu = if n = 0 then accu else loop (n - 1) doc (doc ^^ accu) in loop n doc empty (* -------------------------------------------------------------------------- *) (* Delimiters. *) let[@inline] precede l x = l ^^ x let[@inline] terminate r x = x ^^ r let[@inline] enclose l r x = l ^^ x ^^ r let[@inline] squotes x = enclose squote squote x let[@inline] dquotes x = enclose dquote dquote x let[@inline] bquotes x = enclose bquote bquote x let[@inline] braces x = enclose lbrace rbrace x let[@inline] parens x = enclose lparen rparen x let[@inline] angles x = enclose langle rangle x let[@inline] brackets x = enclose lbracket rbracket x (* -------------------------------------------------------------------------- *) (* Some functions on lists. *) (* A variant of [fold_left] that keeps track of the element index. *) let foldli (f : int -> 'b -> 'a -> 'b) (accu : 'b) (xs : 'a list) : 'b = let r = ref 0 in List.fold_left (fun accu x -> let i = !r in r := i + 1; f i accu x ) accu xs (* -------------------------------------------------------------------------- *) (* Working with lists of documents. *) let concat docs = (* We take advantage of the fact that [^^] operates in constant time, regardless of the size of its arguments. The document that is constructed is essentially a reversed list (i.e., a tree that is biased towards the left). This is not a problem; when pretty-printing this document, the engine will descend along the left branch, pushing the nodes onto its stack as it goes down, effectively reversing the list again. *) List.fold_left (^^) empty docs let separate sep docs = foldli (fun i accu doc -> if i = 0 then doc else accu ^^ sep ^^ doc ) empty docs let concat_map f xs = List.fold_left (fun accu x -> accu ^^ f x ) empty xs let separate_map sep f xs = foldli (fun i accu x -> if i = 0 then f x else accu ^^ sep ^^ f x ) empty xs let separate2 sep last_sep docs = let n = List.length docs in foldli (fun i accu doc -> if i = 0 then doc else accu ^^ (if i < n - 1 then sep else last_sep) ^^ doc ) empty docs let optional f = function | None -> empty | Some x -> f x (* -------------------------------------------------------------------------- *) (* Text. *) (* This variant of [String.index_from] returns an option. *) let index_from s i c = try Some (String.index_from s i c) with Not_found -> None (* [lines s] chops the string [s] into a list of lines, which are turned into documents. *) let lines s = let rec chop accu i = match index_from s i '\n' with | Some j -> let accu = substring s i (j - i) :: accu in chop accu (j + 1) | None -> substring s i (String.length s - i) :: accu in List.rev (chop [] 0) let arbitrary_string s = separate (break 1) (lines s) (* [split ok s] splits the string [s] at every occurrence of a character that satisfies the predicate [ok]. The substrings thus obtained are turned into documents, and a list of documents is returned. No information is lost: the concatenation of the documents yields the original string. This code is not UTF-8 aware. *) let split ok s = let n = String.length s in let rec index_from i = if i = n then None else if ok s.[i] then Some i else index_from (i + 1) in let rec chop accu i = match index_from i with | Some j -> let accu = substring s i (j - i) :: accu in let accu = char s.[j] :: accu in chop accu (j + 1) | None -> substring s i (String.length s - i) :: accu in List.rev (chop [] 0) (* [words s] chops the string [s] into a list of words, which are turned into documents. *) let words s = let n = String.length s in (* A two-state finite automaton. *) (* In this state, we have skipped at least one blank character. *) let rec skipping accu i = if i = n then (* There was whitespace at the end. Drop it. *) accu else match s.[i] with | ' ' | '\t' | '\n' | '\r' -> (* Skip more whitespace. *) skipping accu (i + 1) | _ -> (* Begin a new word. *) word accu i (i + 1) (* In this state, we have skipped at least one non-blank character. *) and word accu i j = if j = n then (* Final word. *) substring s i (j - i) :: accu else match s.[j] with | ' ' | '\t' | '\n' | '\r' -> (* A new word has been identified. *) let accu = substring s i (j - i) :: accu in skipping accu (j + 1) | _ -> (* Continue inside the current word. *) word accu i (j + 1) in List.rev (skipping [] 0) let flow_map sep f docs = foldli (fun i accu doc -> if i = 0 then f doc else accu ^^ (* This idiom allows beginning a new line if [doc] does not fit on the current line. *) group (sep ^^ f doc) ) empty docs let flow sep docs = flow_map sep (fun x -> x) docs let url s = flow (break 0) (split (function '/' | '.' -> true | _ -> false) s) (* -------------------------------------------------------------------------- *) (* Alignment and indentation. *) let hang i d = align (nest i d) let ( !^ ) = string let[@inline] ( ^/^ ) x y = x ^^ break 1 ^^ y let prefix n b x y = group (x ^^ nest n (break b ^^ y)) let[@inline] (^//^) x y = prefix 2 1 x y let jump n b y = group (nest n (break b ^^ y)) let infix n b op x y = prefix n b (x ^^ blank b ^^ op) y let surround n b opening contents closing = group (opening ^^ nest n ( break b ^^ contents) ^^ break b ^^ closing ) let soft_surround n b opening contents closing = group (opening ^^ nest n (group (break b) ^^ contents) ^^ group (break b ^^ closing)) let surround_separate n b void opening sep closing docs = match docs with | [] -> void | _ :: _ -> surround n b opening (separate sep docs) closing let surround_separate_map n b void opening sep closing f xs = match xs with | [] -> void | _ :: _ -> surround n b opening (separate_map sep f xs) closing (* -------------------------------------------------------------------------- *) (* Printing OCaml values. *) module OCaml = struct open Printf type constructor = string type type_name = string type record_field = string type tag = int (* -------------------------------------------------------------------------- *) (* This internal [sprintf]-like function produces a document. We use [string], as opposed to [arbitrary_string], because the strings that we produce will never contain a newline character. *) let[@inline] dsprintf format = ksprintf string format (* -------------------------------------------------------------------------- *) (* Nicolas prefers using this code as opposed to just [sprintf "%g"] or [sprintf "%f"]. The latter print [inf] and [-inf], whereas OCaml understands [infinity] and [neg_infinity]. [sprintf "%g"] does not add a trailing dot when the number happens to be an integral number. [sprintf "%F"] seems to lose precision and ignores the precision modifier. *) let valid_float_lexeme (s : string) : string = let l = String.length s in let rec loop i = if i >= l then (* If we reach the end of the string and have found only characters in the set '0' .. '9' and '-', then this string will be considered as an integer literal by OCaml. Adding a trailing dot makes it a float literal. *) s ^ "." else match s.[i] with | '0' .. '9' | '-' -> loop (i + 1) | _ -> s in loop 0 (* This function constructs a string representation of a floating point number. This representation is supposed to be accepted by OCaml as a valid floating point literal. *) let float_representation (f : float) : string = match classify_float f with | FP_nan -> "nan" | FP_infinite -> if f < 0.0 then "neg_infinity" else "infinity" | _ -> (* Try increasing precisions and validate. *) let s = sprintf "%.12g" f in if f = float_of_string s then valid_float_lexeme s else let s = sprintf "%.15g" f in if f = float_of_string s then valid_float_lexeme s else sprintf "%.18g" f (* -------------------------------------------------------------------------- *) (* A few constants and combinators, used below. *) let some = string "Some" let none = string "None" let lbracketbar = string "[|" let rbracketbar = string "|]" let seq1 opening separator closing = surround_separate 2 0 (opening ^^ closing) opening (separator ^^ break 1) closing let seq2 opening separator closing = surround_separate_map 2 1 (opening ^^ closing) opening (separator ^^ break 1) closing (* -------------------------------------------------------------------------- *) (* The following functions are printers for many types of OCaml values. *) (* There is no protection against cyclic values. *) let tuple = seq1 lparen comma rparen let variant _ cons _ args = match args with | [] -> !^cons | _ :: _ -> !^cons ^^ tuple args let record _ fields = seq2 lbrace semi rbrace (fun (k, v) -> infix 2 1 equals !^k v) fields let option f = function | None -> none | Some x -> some ^^ tuple [f x] let list f xs = seq2 lbracket semi rbracket f xs let flowing_list f xs = group (lbracket ^^ space ^^ nest 2 ( flow_map (semi ^^ break 1) f xs ) ^^ space ^^ rbracket) let array f xs = seq2 lbracketbar semi rbracketbar f (Array.to_list xs) let flowing_array f xs = group (lbracketbar ^^ space ^^ nest 2 ( flow_map (semi ^^ break 1) f (Array.to_list xs) ) ^^ space ^^ rbracketbar) let ref f x = record "ref" ["contents", f !x] let float f = string (float_representation f) let int = dsprintf "%d" let int32 = dsprintf "%ld" let int64 = dsprintf "%Ld" let nativeint = dsprintf "%nd" let char = dsprintf "%C" let bool = dsprintf "%B" let unit = dsprintf "()" let string = dsprintf "%S" let unknown tyname _ = dsprintf "" tyname type representation = document end (* OCaml *) pprint-20230830/src/PPrint.mli000066400000000000000000000313541447371567600160310ustar00rootroot00000000000000(******************************************************************************) (* *) (* PPrint *) (* *) (* François Pottier, Inria Paris *) (* Nicolas Pouillard *) (* *) (* Copyright 2007-2022 Inria. All rights reserved. This file is *) (* distributed under the terms of the GNU Library General Public *) (* License, with an exception, as described in the file LICENSE. *) (* *) (******************************************************************************) include module type of PPrintEngine (** @inline *) (** {1:combinators High-Level Combinators} *) (** {2 Single Characters} *) (**The following atomic documents consist of a single character. Each of them is a synonym for the application of {!char} to some constant character. For instance, {!lparen} is a synonym for [char '(']. *) val lparen: document val rparen: document val langle: document val rangle: document val lbrace: document val rbrace: document val lbracket: document val rbracket: document val squote: document val dquote: document val bquote: document val semi: document val colon: document val comma: document val dot: document val sharp: document val slash: document val backslash: document val equals: document val qmark: document val tilde: document val at: document val percent: document val dollar: document val caret: document val ampersand: document val star: document val plus: document val minus: document val underscore: document val bang: document val bar: document (** {2 Delimiters} *) (**[precede l x] is [l ^^ x]. *) val precede: document -> document -> document (**[terminate r x] is [x ^^ r]. *) val terminate: document -> document -> document (**[enclose l r x] is [l ^^ x ^^ r]. *) val enclose: document -> document -> document -> document (**The following combinators enclose a document within a pair of delimiters. They are partial applications of [enclose]. No whitespace or line break is introduced. *) val squotes: document -> document val dquotes: document -> document val bquotes: document -> document val braces: document -> document val parens: document -> document val angles: document -> document val brackets: document -> document (** {2 Repetition} *) (**[twice doc] is the document obtained by concatenating two copies of the document [doc]. *) val twice: document -> document (**[repeat n doc] is the document obtained by concatenating [n] copies of the document [doc]. *) val repeat: int -> document -> document (** {2 Lists and Options} *) (**[concat docs] is the concatenation of the documents in the list [docs]. *) val concat: document list -> document (**[separate sep docs] is the concatenation of the documents in the list [docs]. The separator [sep] is inserted between every two adjacent documents. *) val separate: document -> document list -> document (**[concat_map f xs] is equivalent to [concat (List.map f xs)]. *) val concat_map: ('a -> document) -> 'a list -> document (**[separate_map sep f xs] is equivalent to [separate sep (List.map f xs)]. *) val separate_map: document -> ('a -> document) -> 'a list -> document (**[separate2 sep last_sep docs] is the concatenation of the documents in the list [docs]. The separator [sep] is inserted between every two adjacent documents, except between the last two documents, where the separator [last_sep] is used instead. *) val separate2: document -> document -> document list -> document (**[optional f None] is the empty document. [optional f (Some x)] is the document [f x]. *) val optional: ('a -> document) -> 'a option -> document (** {2 Text} *) (**[lines s] is the list of documents obtained by splitting [s] at newline characters, and turning each line into a document via [substring]. This code is not UTF-8 aware. *) val lines: string -> document list (**[arbitrary_string s] is equivalent to [separate (break 1) (lines s)]. It is analogous to [string s], but is valid even if the string [s] contains newline characters. *) val arbitrary_string: string -> document (**[words s] is the list of documents obtained by splitting [s] at whitespace characters, and turning each word into a document via [substring]. All whitespace is discarded. This code is not UTF-8 aware. *) val words: string -> document list (**[split ok s] splits the string [s] before and after every occurrence of a character that satisfies the predicate [ok]. The substrings thus obtained are turned into documents, and a list of documents is returned. No information is lost: the concatenation of the documents yields the original string. This code is not UTF-8 aware. *) val split: (char -> bool) -> string -> document list (**[flow sep docs] separates the documents in the list [docs] with the separator [sep] and arranges for a new line to begin whenever a document does not fit on the current line. This is useful for typesetting free-flowing, ragged-right text. A typical choice of [sep] is [break b], where [b] is the number of spaces that must be inserted between two consecutive words (when displayed on the same line). *) val flow: document -> document list -> document (**[flow_map sep f docs] is equivalent to [flow sep (List.map f docs)]. *) val flow_map: document -> ('a -> document) -> 'a list -> document (**[url s] is a possible way of displaying the URL [s]. A potential line break is inserted immediately before and immediately after every slash and dot character. *) val url: string -> document (** {2 Alignment and Indentation} *) (**[hang n doc] is analogous to [align], but additionally indents all lines, except the first one, by [n]. Thus, the text in the box forms a hanging indent. *) val hang: int -> document -> document (**[prefix n b left right] has the following flat layout: {[ left right ]} and the following non-flat layout: {[ left right ]} The parameter [n] controls the nesting of [right] (when not flat). The parameter [b] controls the number of spaces between [left] and [right] (when flat). *) val prefix: int -> int -> document -> document -> document (**[jump n b right] is equivalent to [prefix n b empty right]. *) val jump: int -> int -> document -> document (**[infix n b middle left right] has the following flat layout: {[ left middle right ]} and the following non-flat layout: {[ left middle right ]} The parameter [n] controls the nesting of [right] (when not flat). The parameter [b] controls the number of spaces between [left] and [middle] (always) and between [middle] and [right] (when flat). *) val infix: int -> int -> document -> document -> document -> document (**[surround n b opening contents closing] has the following flat layout: {[ opening contents closing ]} and the following non-flat layout: {[ opening contents closing ]} The parameter [n] controls the nesting of [contents] (when not flat). The parameter [b] controls the number of spaces between [opening] and [contents] and between [contents] and [closing] (when flat). *) val surround: int -> int -> document -> document -> document -> document (**[soft_surround] is analogous to [surround], but involves more than one group, so it offers possibilities other than the completely flat layout (where [opening], [contents], and [closing] appear on a single line) and the completely developed layout (where [opening], [contents], and [closing] appear on separate lines). It tries to place the beginning of [contents] on the same line as [opening], and to place [closing] on the same line as the end of [contents], if possible. *) val soft_surround: int -> int -> document -> document -> document -> document (**[surround_separate n b void opening sep closing docs] is equivalent to [surround n b opening (separate sep docs) closing], except when the list [docs] is empty, in which case it reduces to [void]. *) val surround_separate: int -> int -> document -> document -> document -> document -> document list -> document (**[surround_separate_map n b void opening sep closing f xs] is equivalent to [surround_separate n b void opening sep closing (List.map f xs)]. *) val surround_separate_map: int -> int -> document -> document -> document -> document -> ('a -> document) -> 'a list -> document (** {2 Short-Hands} *) (**[!^s] is a short-hand for [string s]. *) val ( !^ ) : string -> document (**[x ^/^ y] separates [x] and [y] with a breakable space. It is a short-hand for [x ^^ break 1 ^^ y]. *) val ( ^/^ ) : document -> document -> document (**[x ^//^ y] is a short-hand for [prefix 2 1 x y]. *) val ( ^//^ ) : document -> document -> document (** {1:ocaml Printing OCaml Values} *) (**This module offers document combinators that help print OCaml values. The strings produced by rendering these documents are supposed to be accepted by the OCaml parser as valid values. These functions do {i not} distinguish between mutable and immutable values. They do {i not} recognize sharing, and do {i not} incorporate a protection against cyclic values. *) module OCaml : sig (* The signature of this module is compatible with that expected by the [camlp4] generator [Camlp4RepresentationGenerator]. This explains why some functions have unused parameters. This is also the reason why there is a type [representation]. *) type constructor = string type type_name = string type record_field = string type tag = int (**[variant _ dc _ args] represents a constructed value whose data constructor is [dc] and whose arguments are [args]. The other two parameters are presently unused. *) val variant : type_name -> constructor -> tag -> document list -> document (**[record _ fields] represents a record value whose fields are [fields]. The other parameter is presently unused. *) val record : type_name -> (record_field * document) list -> document (**[tuple args] represents a tuple value whose components are [args]. *) val tuple : document list -> document (**[string s] represents the literal string [s]. *) val string : string -> document (**[int i] represents the literal integer [i]. *) val int : int -> document (**[int32 i] represents the literal 32-bit integer [i]. *) val int32 : int32 -> document (**[int64 i] represents the literal 64-bit integer [i]. *) val int64 : int64 -> document (**[nativeint i] represents the literal native integer [i]. *) val nativeint : nativeint -> document (**[float f] represents the literal floating-point number [f]. *) val float : float -> document (**[char c] represents the literal character [c]. *) val char : char -> document (**[bool b] represents the Boolean value [b]. *) val bool : bool -> document (**[unit] represents the unit constant [()]. *) val unit : document (**[option f o] represents the option [o]. The representation of the element, if present, is computed by the function [f]. *) val option : ('a -> document) -> 'a option -> document (**[list f xs] represents the list [xs]. The representation of each element is computed by the function [f]. If the whole list fits on a single line, then it is printed on a single line; otherwise each element is printed on a separate line. *) val list : ('a -> document) -> 'a list -> document (**[flowing_list f xs] represents the list [xs]. The representation of each element is computed by the function [f]. As many elements are possible are printed on each line. *) val flowing_list : ('a -> document) -> 'a list -> document (**[array f xs] represents the array [xs]. The representation of each element is computed by the function [f]. If the whole array fits on a single line, then it is printed on a single line; otherwise each element is printed on a separate line. *) val array : ('a -> document) -> 'a array -> document (**[flowing_array f xs] represents the array [xs]. The representation of each element is computed by the function [f]. As many elements are possible are printed on each line. *) val flowing_array : ('a -> document) -> 'a array -> document (**[ref r] represents the reference [r]. The representation of the content is computed by the function [f]. *) val ref : ('a -> document) -> 'a ref -> document (** [unknown t _] represents an unknown value of type [t]. It is rendered as a string of the form []. *) val unknown : type_name -> 'a -> document (**/**) type representation = document end (* OCaml *) pprint-20230830/src/PPrintEngine.ml000066400000000000000000000632031447371567600170040ustar00rootroot00000000000000(******************************************************************************) (* *) (* PPrint *) (* *) (* François Pottier, Inria Paris *) (* Nicolas Pouillard *) (* *) (* Copyright 2007-2022 Inria. All rights reserved. This file is *) (* distributed under the terms of the GNU Library General Public *) (* License, with an exception, as described in the file LICENSE. *) (* *) (******************************************************************************) (** A point is a pair of a line number and a column number. *) type point = int * int (** A range is a pair of points. *) type range = point * point (* ------------------------------------------------------------------------- *) (* A type of integers with infinity. *) type requirement = int (* with infinity *) (* Infinity is encoded as [max_int]. *) let infinity : requirement = max_int (* Addition of integers with infinity. *) let (++) (x : requirement) (y : requirement) : requirement = if x = infinity || y = infinity then infinity else x + y (* Comparison between an integer with infinity and a normal integer. *) let (<==) (x : requirement) (y : int) = x <= y (* ------------------------------------------------------------------------- *) (* A uniform interface for output channels. *) class type output = object (** [char c] sends the character [c] to the output channel. *) method char: char -> unit (** [substring s ofs len] sends the substring of [s] delimited by the offset [ofs] and the length [len] to the output channel. *) method substring: string -> int (* offset *) -> int (* length *) -> unit end (* ------------------------------------------------------------------------- *) (* Printing blank space. This is used both internally (to emit indentation characters) and via the public combinator [blank]. *) let blank_length = 80 let blank_buffer = String.make blank_length ' ' let rec blanks (output : output) n = if n <= 0 then () else if n <= blank_length then output#substring blank_buffer 0 n else begin output#substring blank_buffer 0 blank_length; blanks output (n - blank_length) end (* ------------------------------------------------------------------------- *) (* The class [buffering] implements a wrapper that delays the printing of blank characters. This includes indentation characters and characters produced by the combinator [blank]. The printing of these characters is delayed until it is known that they are followed by something on the same line; if they are not followed with anything, then it is canceled. The actual printing task is delegated to the object [delegate], whose type is [output]; the new object has type [output] as well. *) class buffering (delegate : output) : output = object (self) (* The number of blank characters that are withholding. *) val mutable buffered = 0 (* [flush] sends out the blank characters that have been withheld. *) method private flush = blanks delegate buffered; buffered <- 0 method char c : unit = begin match c with | '\n' -> (* The current line ends here. Any blank characters that were withheld are destroyed. This is where we avoid printing blank characters if nothing follows them. *) buffered <- 0 | _ -> (* The current line is nonempty. Any blank characters that were withheld can now be flushed. *) self#flush end; (* Print this character as usual. *) delegate#char c method substring s pos len = (* If this is a string of length zero, then there is nothing to do. *) if len = 0 then () (* If this is a blank string (which we recognize by its address), then its content is withheld. *) else if s == blank_buffer then buffered <- buffered + len (* If this is not a blank string, then the blank characters that were withheld up to this point can now be flushed. *) else begin self#flush; delegate#substring s pos len end end (* ------------------------------------------------------------------------- *) (* Three kinds of output channels are wrapped so as to satisfy the above interface: OCaml output channels, OCaml memory buffers, and OCaml formatters. *) class channel_output channel = object method char = output_char channel method substring = output_substring channel (* We used to use [output], but, as of OCaml 4.02 and with -safe-string enabled, the type of [output] has changed: this function now expects an argument of type [bytes]. The new function [output_substring] must be used instead. Furthermore, as of OCaml 4.06, -safe-string is enabled by default. In summary, we require OCaml 4.02, use [output_substring], and enable -safe-string. *) end class buffer_output buffer = object method char = Buffer.add_char buffer method substring = Buffer.add_substring buffer end class formatter_output fmt = object method char = function | '\n' -> Format.pp_force_newline fmt () | ' ' -> Format.pp_print_space fmt () | c -> Format.pp_print_char fmt c method substring str ofs len = Format.pp_print_text fmt ( if ofs = 0 && len = String.length str then str else String.sub str ofs len ) end (* ------------------------------------------------------------------------- *) (** The rendering engine maintains the following internal state. Its structure is subject to change in future versions of the library. Nevertheless, it is exposed to the user who wishes to define custom documents. *) type state = { width: int; (** The line width. This parameter is fixed throughout the execution of the renderer. *) ribbon: int; (** The ribbon width. This parameter is fixed throughout the execution of the renderer. *) mutable last_indent: int; (** The number of blanks that were printed at the beginning of the current line. This field is updated (only) when a hardline is emitted. It is used (only) to determine whether the ribbon width constraint is respected. *) mutable line: int; (** The current line. This field is updated (only) when a hardline is emitted. It is not used by the pretty-printing engine itself. *) mutable column: int; (** The current column. This field must be updated whenever something is sent to the output channel. It is used (only) to determine whether the width constraint is respected. *) } (* ------------------------------------------------------------------------- *) (* [initial rfrac width] creates a fresh initial state. *) let initial rfrac width = { width = width; ribbon = max 0 (min width (truncate (float_of_int width *. rfrac))); last_indent = 0; line = 0; column = 0 } (* ------------------------------------------------------------------------- *) (** A custom document is defined by implementing the following methods. *) class type custom = object (** A custom document must publish the width (i.e., the number of columns) that it would like to occupy if it is printed on a single line (that is, in flattening mode). The special value [infinity] means that this document cannot be printed on a single line; this value causes any groups that contain this document to be dissolved. This method should in principle work in constant time. *) method requirement: requirement (** The method [pretty] is used by the main rendering algorithm. It has access to the output channel and to the algorithm's internal state, as described above. In addition, it receives the current indentation level and the current flattening mode (on or off). If flattening mode is on, then the document must be printed on a single line, in a manner that is consistent with the requirement that was published ahead of time. If flattening mode is off, then there is no such obligation. The state must be updated in a manner that is consistent with what is sent to the output channel. *) method pretty: output -> state -> int -> bool -> unit (** The method [compact] is used by the compact rendering algorithm. It has access to the output channel only. *) method compact: output -> unit end (* ------------------------------------------------------------------------- *) (* Here is the algebraic data type of documents. It is analogous to Daan Leijen's version, but the binary constructor [Union] is replaced with the unary constructor [Group], and the constant [Line] is replaced with more general constructions, namely [IfFlat], which provides alternative forms depending on the current flattening mode, and [HardLine], which represents a newline character, and causes a failure in flattening mode. *) type document = (* [Empty] is the empty document. *) | Empty (* [Char c] is a document that consists of the single character [c]. We enforce the invariant that [c] is not a newline character. *) | Char of char (* [String s] is a document that consists of just the string [s]. We assume, but do not check, that this string does not contain a newline character. [String] is a special case of [FancyString], which takes up less space in memory. *) | String of string (* [FancyString (s, ofs, len, apparent_length)] is a (portion of a) string that may contain fancy characters: color escape characters, UTF-8 or multi-byte characters, etc. Thus, the apparent length (which corresponds to what will be visible on screen) differs from the length (which is a number of bytes, and is reported by [String.length]). We assume, but do not check, that fancystrings do not contain a newline character. *) | FancyString of string * int * int * int (* [Blank n] is a document that consists of [n] blank characters. *) | Blank of int (* When in flattening mode, [IfFlat (d1, d2)] turns into the document [d1]. When not in flattening mode, it turns into the document [d2]. *) | IfFlat of document * document (* When in flattening mode, [HardLine] causes a failure, which requires backtracking all the way until the stack is empty. When not in flattening mode, it represents a newline character, followed with an appropriate number of indentation. A common way of using [HardLine] is to only use it directly within the right branch of an [IfFlat] construct. *) | HardLine (* The following constructors store their space requirement. This is the document's apparent length, if printed in flattening mode. This information is computed in a bottom-up manner when the document is constructed. *) (* In other words, the space requirement is the number of columns that the document needs in order to fit on a single line. We express this value in the set of `integers extended with infinity', and use the value [infinity] to indicate that the document cannot be printed on a single line. *) (* Storing this information at [Group] nodes is crucial, as it allows us to avoid backtracking and buffering. *) (* Storing this information at other nodes allows the function [requirement] to operate in constant time. This means that the bottom-up computation of requirements takes linear time. *) (* [Cat (req, doc1, doc2)] is the concatenation of the documents [doc1] and [doc2]. The space requirement [req] is the sum of the requirements of [doc1] and [doc2]. *) | Cat of requirement * document * document (* [Nest (req, j, doc)] is the document [doc], in which the indentation level has been increased by [j], that is, in which [j] blanks have been inserted after every newline character. The space requirement [req] is the same as the requirement of [doc]. *) | Nest of requirement * int * document (* [Group (req, doc)] represents an alternative: it is either a flattened form of [doc], in which occurrences of [Group] disappear and occurrences of [IfFlat] resolve to their left branch, or [doc] itself. The space requirement [req] is the same as the requirement of [doc]. *) | Group of requirement * document (* [Align (req, doc)] increases the indentation level to reach the current column. Thus, the document [doc] is rendered within a box whose upper left corner is the current position. The space requirement [req] is the same as the requirement of [doc]. *) | Align of requirement * document (* [Range (req, hook, doc)] is printed like [doc]. After it is printed, the function [hook] is applied to the range that is occupied by [doc] in the output. *) | Range of requirement * (range -> unit) * document (* [Custom (req, f)] is a document whose appearance is user-defined. *) | Custom of custom (* ------------------------------------------------------------------------- *) (* Retrieving or computing the space requirement of a document. *) let rec requirement = function | Empty -> 0 | Char _ -> 1 | String s -> String.length s | FancyString (_, _, _, len) | Blank len -> len | IfFlat (doc1, _) -> (* The requirement of a document is the space that it needs when it is printed in flattening mode. So, the requirement of [ifflat x y] is just the requirement of its flat version, [x]. *) (* The smart constructor [ifflat] ensures that [IfFlat] is never nested in the left-hand side of [IfFlat], so this recursive call is not a problem; the function [requirement] has constant time complexity. *) requirement doc1 | HardLine -> (* A hard line cannot be printed in flattening mode. *) infinity | Cat (req, _, _) | Nest (req, _, _) | Group (req, _) | Align (req, _) | Range (req, _, _) -> (* These nodes store their requirement -- which is computed when the node is constructed -- so as to allow us to answer in constant time here. *) req | Custom c -> c#requirement (* ------------------------------------------------------------------------- *) (* The above algebraic data type is not exposed to the user. Instead, we expose the following smart constructors. These functions construct a raw document and compute its requirement, so as to obtain a document. *) (* The smart constructors ensure that [Empty] is the only empty document; that is, there is no other way of constructing a document that behaves (in all contexts) as an empty document. (This claim could be violated by constructing [range hook empty] where [hook] has no effect, or by constructing a [custom] document that behaves like an empty document. These violations seem benign.) *) let empty = Empty let char c = assert (c <> '\n'); Char c let space = Blank 1 let string s = if String.length s = 0 then empty else String s let fancysubstring s ofs len apparent_length = if len = 0 then empty else FancyString (s, ofs, len, apparent_length) let[@inline] substring s ofs len = fancysubstring s ofs len len let[@inline] fancystring s apparent_length = fancysubstring s 0 (String.length s) apparent_length (* The following function was stolen from [Batteries]. *) let utf8_length s = let rec length_aux s c i = if i >= String.length s then c else let n = Char.code (String.unsafe_get s i) in let k = if n < 0x80 then 1 else if n < 0xe0 then 2 else if n < 0xf0 then 3 else 4 in length_aux s (c + 1) (i + k) in length_aux s 0 0 let[@inline] utf8string s = fancystring s (utf8_length s) let[@inline] utf8format f = Printf.ksprintf utf8string f let hardline = HardLine let blank n = match n with | 0 -> empty | _ -> Blank n let ifflat doc1 doc2 = match doc1, doc2 with (* If both documents are empty then the result is empty. *) | Empty, Empty -> empty (* We avoid nesting [IfFlat] inside the left-hand side of [IfFlat]. That would be redundant; and the function [requirement] relies on the fact that the left child of [IfFlat] cannot be [IfFlat]. On the right-hand side, a symmetric optimization would be valid as well, but is not useful. *) | IfFlat (doc1, _), doc2 | doc1, doc2 -> IfFlat (doc1, doc2) let[@inline] internal_break i = IfFlat (blank i, hardline) let break0 = IfFlat (Empty, HardLine) (* this is [internal_break 0] *) let break1 = IfFlat (Blank 1, HardLine) (* this is [internal_break 1] *) let break i = match i with | 0 -> break0 | 1 -> break1 | _ -> internal_break i let (^^) x y = match x, y with | Empty, _ -> y | _, Empty -> x | _, _ -> Cat (requirement x ++ requirement y, x, y) let nest i x = assert (i >= 0); match x with | Empty -> Empty | _ -> Nest (requirement x, i, x) let group x = match x with | Empty -> Empty | _ -> let req = requirement x in (* Minor optimisation: an infinite requirement dissolves a group. *) if req = infinity then x else Group (req, x) let align x = match x with | Empty -> Empty | _ -> Align (requirement x, x) let[@inline] range hook x = Range (requirement x, hook, x) let custom c = (* Sanity check. *) assert (c#requirement >= 0); Custom c (* ------------------------------------------------------------------------- *) (* Because the smart constructors ensure that [Empty] is the only empty document, [is_empty] can be implemented in a simple and efficient way. *) let is_empty x = match x with Empty -> true | _ -> false (* ------------------------------------------------------------------------- *) (* This function expresses the following invariant: if we are in flattening mode, then we must be within bounds, i.e. the width and ribbon width constraints must be respected. *) let ok state flatten : bool = not flatten || state.column <= state.width && state.column <= state.last_indent + state.ribbon (* ------------------------------------------------------------------------- *) (* The pretty rendering engine. *) (* The renderer is supposed to behave exactly like Daan Leijen's, although its implementation is quite radically different, and simpler. Our documents are constructed eagerly, as opposed to lazily. This means that we pay a large space overhead, but in return, we get the ability of computing information bottom-up, as described above, which allows to render documents without backtracking or buffering. *) (* The [state] record is never copied; it is just threaded through. In addition to it, the parameters [indent] and [flatten] influence the manner in which the document is rendered. *) (* The code is written in tail-recursive style, so as to avoid running out of stack space if the document is very deep. Each [KCons] cell in a continuation represents a pending call to [pretty]. Each [KRange] cell represents a pending call to a user-provided range hook. *) type cont = | KNil | KCons of int * bool * document * cont | KRange of (range -> unit) * point * cont let rec pretty (output : output) (state : state) (indent : int) (flatten : bool) (doc : document) (cont : cont) : unit = match doc with | Empty -> continue output state cont | Char c -> output#char c; state.column <- state.column + 1; (* assert (ok state flatten); *) continue output state cont | String s -> let len = String.length s in output#substring s 0 len; state.column <- state.column + len; (* assert (ok state flatten); *) continue output state cont | FancyString (s, ofs, len, apparent_length) -> output#substring s ofs len; state.column <- state.column + apparent_length; (* assert (ok state flatten); *) continue output state cont | Blank n -> blanks output n; state.column <- state.column + n; (* assert (ok state flatten); *) continue output state cont | HardLine -> (* We cannot be in flattening mode, because a hard line has an [infinity] requirement, and we attempt to render a group in flattening mode only if this group's requirement is met. *) assert (not flatten); (* Emit a hardline. *) output#char '\n'; blanks output indent; state.line <- state.line + 1; state.column <- indent; state.last_indent <- indent; (* Continue. *) continue output state cont | IfFlat (doc1, doc2) -> (* Pick an appropriate sub-document, based on the current flattening mode. *) pretty output state indent flatten (if flatten then doc1 else doc2) cont | Cat (_, doc1, doc2) -> (* Push the second document onto the continuation. *) pretty output state indent flatten doc1 (KCons (indent, flatten, doc2, cont)) | Nest (_, j, doc) -> pretty output state (indent + j) flatten doc cont | Group (req, doc) -> (* If we already are in flattening mode, stay in flattening mode; we are committed to it. If we are not already in flattening mode, we have a choice of entering flattening mode. We enter this mode only if we know that this group fits on this line without violating the width or ribbon width constraints. Thus, we never backtrack. *) let flatten = flatten || let column = state.column ++ req in column <== state.width && column <== state.last_indent + state.ribbon in pretty output state indent flatten doc cont | Align (_, doc) -> (* The effect of this combinator is to set [indent] to [state.column]. Usually [indent] is equal to [state.last_indent], hence setting it to [state.column] increases it. However, if [nest] has been used since the current line began, then this could cause [indent] to decrease. *) (* assert (state.column > state.last_indent); *) pretty output state state.column flatten doc cont | Range (_, hook, doc) -> let start : point = (state.line, state.column) in pretty output state indent flatten doc (KRange (hook, start, cont)) | Custom c -> (* Invoke the document's custom rendering function. *) c#pretty output state indent flatten; (* Sanity check. *) assert (ok state flatten); (* Continue. *) continue output state cont and continue output state = function | KNil -> () | KCons (indent, flatten, doc, cont) -> pretty output state indent flatten doc cont | KRange (hook, start, cont) -> let finish : point = (state.line, state.column) in hook (start, finish); continue output state cont (* Publish a version of [pretty] that does not take an explicit continuation. This function may be used by authors of custom documents. We do not expose the internal [pretty] -- the one that takes a continuation -- because we wish to simplify the user's life. The price to pay is that calls that go through a custom document cannot be tail calls. *) let pretty output state indent flatten doc = pretty output state indent flatten doc KNil (* ------------------------------------------------------------------------- *) (* The compact rendering algorithm. *) let rec compact output doc cont = match doc with | Empty -> continue output cont | Char c -> output#char c; continue output cont | String s -> let len = String.length s in output#substring s 0 len; continue output cont | FancyString (s, ofs, len, _apparent_length) -> output#substring s ofs len; continue output cont | Blank n -> blanks output n; continue output cont | HardLine -> output#char '\n'; continue output cont | Cat (_, doc1, doc2) -> compact output doc1 (doc2 :: cont) | IfFlat (doc, _) | Nest (_, _, doc) | Group (_, doc) | Align (_, doc) | Range (_, _, doc) -> compact output doc cont | Custom c -> (* Invoke the document's custom rendering function. *) c#compact output; continue output cont and continue output cont = match cont with | [] -> () | doc :: cont -> compact output doc cont let compact output doc = compact output doc [] (* ------------------------------------------------------------------------- *) (* We now instantiate the renderers for the three kinds of output channels. *) (* This is just boilerplate. *) module type RENDERER = sig type channel type document val pretty: float -> int -> channel -> document -> unit val compact: channel -> document -> unit end module MakeRenderer (X : sig type channel val output: channel -> output end) : RENDERER with type channel = X.channel and type document = document = struct type channel = X.channel type nonrec document = document let pretty rfrac width channel doc = pretty (X.output channel) (initial rfrac width) 0 false doc let compact channel doc = compact (X.output channel) doc end module ToChannel = MakeRenderer(struct type channel = out_channel let output channel = new buffering (new channel_output channel) end) module ToBuffer = MakeRenderer(struct type channel = Buffer.t let output buffer = new buffering (new buffer_output buffer) end) module ToFormatter = MakeRenderer(struct type channel = Format.formatter let output fmt = new buffering (new formatter_output fmt) end) pprint-20230830/src/PPrintEngine.mli000066400000000000000000000341521447371567600171560ustar00rootroot00000000000000(******************************************************************************) (* *) (* PPrint *) (* *) (* François Pottier, Inria Paris *) (* Nicolas Pouillard *) (* *) (* Copyright 2007-2022 Inria. All rights reserved. This file is *) (* distributed under the terms of the GNU Library General Public *) (* License, with an exception, as described in the file LICENSE. *) (* *) (******************************************************************************) (**[PPrint] is an OCaml library for {b pretty-printing textual documents}. It takes care of {b indentation and line breaks}, and is typically used to {b pretty-print code}. *) (** {1:building Building Documents} *) (**The abstract type of documents. *) type document (** {2 Atomic Documents} *) (**[empty] is the empty document. *) val empty: document (**[char c] is an atomic document that consists of the single character [c]. This character must not be a newline character. *) val char: char -> document (**[string s] is an atomic document that consists of the string [s]. This string must not contain a newline. The printing engine assumes that the ideal width of this string is [String.length s]. This assumption is safe if this is an ASCII string. Otherwise, {!fancystring} or {!utf8string} should be preferred. *) val string: string -> document (**[substring s ofs len] is an atomic document that consists of the portion of the string [s] delimited by the offset [ofs] and the length [len]. This portion must not contain a newline. [substring s ofs len] is equivalent to [string (String.sub s ofs len)], but is expected to be more efficient, as the substring is not actually extracted. *) val substring: string -> int -> int -> document (**[fancystring s alen] is an atomic document that consists of the string [s]. This string must not contain a newline. The string may contain fancy characters: color escape characters, UTF-8 characters, etc. Thus, its apparent length (which measures how many columns the text will take up on screen) differs from its length in bytes. The printing engine assumes that its apparent length is [alen]. *) val fancystring: string -> int -> document (**[fancysubstring s ofs len alen] is equivalent to [fancystring (String.sub s ofs len) alen]. *) val fancysubstring : string -> int -> int -> int -> document (**[utf8string s] is an atomic document that consists of the UTF-8-encoded string [s]. This string must not contain a newline. [utf8string s] is equivalent to [fancystring s (utf8_length s)], where [utf8_length s] is the apparent length of the UTF-8-encoded string [s]. *) val utf8string: string -> document (** [utf8format format ...] is equivalent to [utf8string (Printf.sprintf format ...)]. *) val utf8format: ('a, unit, string, document) format4 -> 'a (** {2 Blanks and Newlines} *) (**The atomic document [hardline] represents a forced newline. This document has infinite ideal width: thus, if there is a choice between printing it in flat mode and printing it in normal mode, normal mode is preferred. In other words, when [hardline] is placed directly inside a group, this group is dissolved: [group hardline] is equivalent to [hardline]. This combinator should be seldom used; consider using {!break} instead. *) val hardline: document (**The atomic document [blank n] consists of [n] blank characters. A blank character is like an ordinary ASCII space character [char ' '], except that blank characters that appear at the end of a line are automatically suppressed. *) val blank: int -> document (**[space] is a synonym for [blank 1]. It consists of one blank character. It is therefore not equivalent to [char ' ']. *) val space: document (**The document [break n] is a breakable blank of width [n]. It produces [n] blank characters if the printing engine is in flat mode, and a single newline character if the printing engine is in normal mode. [break 1] is equivalent to [ifflat (blank 1) hardline]. *) val break: int -> document (** {2 Composite Documents} *) (**[doc1 ^^ doc2] is the concatenation of the documents [doc1] and [doc2]. *) val (^^): document -> document -> document (**[group doc] encodes a choice. If the document [doc] fits on the current line, then it is rendered on a single line, in flat mode. (All [group] combinators inside it are then ignored.) Otherwise, this group is dissolved, and [doc] is rendered in normal mode. There might be more groups within [doc], whose presence leads to further choices being explored. *) val group: document -> document (**[ifflat doc1 doc2] is rendered as [doc1] if the printing engine is in flat mode, that is, if the printing engine has determined that some enclosing group fits on the current line. Otherwise, it is rendered as [doc2]. Use this combinator with caution! Because the printing engine is free to choose between [doc1] and [doc2], these documents must be semantically equivalent. It is up to the user to enforce this property. *) val ifflat: document -> document -> document (**To render the document [nest j doc], the printing engine temporarily increases the current indentation level by [j], then renders [doc]. The effect of the current indentation level is as follows: every time a newline character is emitted, it is immediately followed by [n] blank characters, where [n] is the current indentation level. Thus, one may think of [nest j doc] roughly as the document [doc] in which [j] blank characters have been inserted after every newline character. *) val nest: int -> document -> document (**To render [align doc], the printing engine sets the current indentation level to the current column, then renders [doc]. In other words, the document [doc] is rendered within a box whose upper left corner is the current position of the printing engine. *) val align: document -> document (**A point is a pair of a line number and a column number. *) type point = int * int (**A range is a pair of points. *) type range = point * point (**The document [range hook doc] is printed like the document [doc], but allows the caller to register a hook that is applied, when the document is printed, to the range occupied by this document in the output text. This offers a way of mapping positions in the output text back to (sub)documents. *) val range: (range -> unit) -> document -> document (** {1:inspecting Inspecting Documents} *) (**Documents are abstract, and cannot be inspected. Nevertheless, it is possible to test whether a document is empty. *) (**[is_empty doc] determines whether the document [doc] is empty. Most ways of constructing empty documents, such as [empty], [empty ^^ empty], [nest j empty], and so on, are recognized as such. However, a document constructed by {!val-custom} or {!val-range} is never considered empty. *) val is_empty: document -> bool (** {1:rendering Rendering Documents} *) (**Three renderers are available. They offer the same API, described by the signature {!RENDERER}, and differ only in the nature of the output channel that they use. *) (**This signature describes the document renderers in a manner that is independent of the type of the output channel. *) module type RENDERER = sig (**The type of the output channel. *) type channel (**The type of documents. *) type document (** [pretty rfrac width channel document] pretty-prints the document [document] into the output channel [channel]. The parameter [width] is the maximum number of characters per line. The parameter [rfrac] is the ribbon width, a fraction relative to [width]. The ribbon width is the maximum number of non-indentation characters per line. *) val pretty: float -> int -> channel -> document -> unit (** [compact channel document] prints the document [document] to the output channel [channel]. No indentation is used. All newline instructions are respected, that is, no groups are flattened. *) val compact: channel -> document -> unit end (**This renderer sends its output into an output channel. *) module ToChannel : RENDERER with type channel = out_channel and type document = document (**This renderer sends its output into a memory buffer. *) module ToBuffer : RENDERER with type channel = Buffer.t and type document = document (**This renderer sends its output into a formatter channel. *) module ToFormatter : RENDERER with type channel = Format.formatter and type document = document (** {1:defining Defining Custom Documents} *) (**It is possible to define custom document constructors, provided they meet the expectations of the printing engine. In short, the custom document combinator {!val-custom} expects an object of class {!class-type-custom}. This object must provide three methods. The method [requirement] must compute the ideal width of the custom document. The methods [pretty] and [compact] must render the custom document. For this purpose, they have access to the {{!output}output channel} and to the {{!state}state} of the printing engine. *) (** A width requirement is expressed as an integer. The value [max_int] is reserved and represents infinity. *) type requirement = int (**[infinity] represents an infinite width requirement. *) val infinity : requirement (**An output channel is abstractly represented as an object equipped with methods for displaying one character and for displaying a substring. *) class type output = object (**[char c] sends the character [c] to the output channel. *) method char: char -> unit (**[substring s ofs len] sends the substring of [s] delimited by the offset [ofs] and the length [len] to the output channel. *) method substring: string -> int (* offset *) -> int (* length *) -> unit end (**The internal state of the rendering engine is exposed to the user who wishes to define custom documents. However, its structure is subject to change in future versions of the library. *) type state = { width: int; (** The line width. This parameter is fixed throughout the execution of the renderer. *) ribbon: int; (** The ribbon width. This parameter is fixed throughout the execution of the renderer. *) mutable last_indent: int; (** The number of blanks that were printed at the beginning of the current line. This field is updated (only) when a hardline is emitted. It is used (only) to determine whether the ribbon width constraint is respected. *) mutable line: int; (** The current line. This field is updated (only) when a hardline is emitted. It is not used by the pretty-printing engine itself. *) mutable column: int; (** The current column. This field must be updated whenever something is sent to the output channel. It is used (only) to determine whether the width constraint is respected. *) } (**A custom document is defined by implementing an object of class {!class-type-custom}. *) class type custom = object (**A custom document must publish the width (i.e., the number of columns) that it would like to occupy if printed on a single line (in flat mode). The special value [infinity] means that this document cannot be printed on a single line; this value causes any groups that contain this document to be dissolved. This method should in principle work in constant time. *) method requirement: requirement (**The method [pretty] is used by the main rendering algorithm. It has access to the output channel and to the printing engine's internal state. In addition, it receives the current indentation level and a Boolean flag that tells whether the engine is currently in flat mode. If the engine is in flat mode, then the document must be printed on a single line, in a manner that is consistent with the width requirement that was published ahead of time. If the engine is in normal mode, then there is no such obligation. The state must be updated in a manner that is consistent with what is sent to the output channel. *) method pretty: output -> state -> int -> bool -> unit (**The method [compact] is used by the compact rendering algorithm. It has access to the output channel only. *) method compact: output -> unit end (**[custom] constructs a custom document out an object of type {!class-type-custom}. *) val custom: custom -> document (**Some of the key functions of the library are exposed, in the hope that they may be useful to authors of custom (leaf and composite) documents. In the case of a leaf document, they can help perform certain basic functions; for instance, applying the function {!pretty} to the document {!hardline} is a simple way of printing a hardline, while respecting the indentation parameters and updating the state in a correct manner. Similarly, applying {!pretty} to the document [blank n] is a simple way of printing [n] blank characters. In the case of a composite document (one that contains subdocuments), these functions are essential: they allow computing the width requirement of a subdocument and displaying a subdocument. *) (**[requirement doc] computes the width requirement of the document [doc]. It runs in constant time. *) val requirement: document -> requirement (**[pretty output state indent flatten doc] prints the document [doc]. See the documentation of the method [pretty] in the class {!class-type-custom}. *) val pretty: output -> state -> int -> bool -> document -> unit (**[compact output doc] prints the document [doc]. See the documentation of the method [compact] in the class {!class-type-custom}. *) val compact: output -> document -> unit pprint-20230830/src/PPrintMini.ml000066400000000000000000000140751447371567600164760ustar00rootroot00000000000000(******************************************************************************) (* *) (* PPrint *) (* *) (* François Pottier, Inria Paris *) (* Nicolas Pouillard *) (* *) (* Copyright 2007-2022 Inria. All rights reserved. This file is *) (* distributed under the terms of the GNU Library General Public *) (* License, with an exception, as described in the file LICENSE. *) (* *) (******************************************************************************) (* -------------------------------------------------------------------------- *) (* A type of integers with infinity. *) type requirement = int (* with infinity *) (* Infinity is encoded as [max_int]. *) let infinity : requirement = max_int (* Addition of integers with infinity. *) let (++) (x : requirement) (y : requirement) : requirement = if x = infinity || y = infinity then infinity else x + y (* Comparison of requirements is just ordinary comparison. *) (* -------------------------------------------------------------------------- *) (* The type of documents. See [PPrintEngine] for documentation. *) type document = | Empty | FancyString of string * int * int * int | Blank of int | IfFlat of document * document | HardLine | Cat of requirement * document * document | Nest of requirement * int * document | Group of requirement * document (* -------------------------------------------------------------------------- *) (* Retrieving or computing the space requirement of a document. *) let rec requirement = function | Empty -> 0 | FancyString (_, _, _, len) | Blank len -> len | IfFlat (doc1, _) -> requirement doc1 | HardLine -> infinity | Cat (req, _, _) | Nest (req, _, _) | Group (req, _) -> req (* -------------------------------------------------------------------------- *) (* Document constructors. *) let empty = Empty let fancysubstring s ofs len apparent_length = if len = 0 then empty else FancyString (s, ofs, len, apparent_length) let fancystring s apparent_length = fancysubstring s 0 (String.length s) apparent_length let utf8_length s = let rec length_aux s c i = if i >= String.length s then c else let n = Char.code (String.unsafe_get s i) in let k = if n < 0x80 then 1 else if n < 0xe0 then 2 else if n < 0xf0 then 3 else 4 in length_aux s (c + 1) (i + k) in length_aux s 0 0 let utf8string s = fancystring s (utf8_length s) let char c = assert (c <> '\n'); fancystring (String.make 1 c) 1 let space = char ' ' let hardline = HardLine let blank n = match n with | 0 -> empty | 1 -> space | _ -> Blank n let ifflat doc1 doc2 = match doc1 with | IfFlat (doc1, _) | doc1 -> IfFlat (doc1, doc2) let internal_break i = ifflat (blank i) hardline let break0 = internal_break 0 let break1 = internal_break 1 let break i = match i with | 0 -> break0 | 1 -> break1 | _ -> internal_break i let (^^) x y = match x, y with | Empty, _ -> y | _, Empty -> x | _, _ -> Cat (requirement x ++ requirement y, x, y) let nest i x = assert (i >= 0); Nest (requirement x, i, x) let group x = let req = requirement x in if req = infinity then x else Group (req, x) (* -------------------------------------------------------------------------- *) (* Printing blank space (indentation characters). *) let blank_length = 80 let blank_buffer = String.make blank_length ' ' let rec blanks output n = if n <= 0 then () else if n <= blank_length then Buffer.add_substring output blank_buffer 0 n else begin Buffer.add_substring output blank_buffer 0 blank_length; blanks output (n - blank_length) end (* -------------------------------------------------------------------------- *) (* The rendering engine maintains the following internal state. *) (* For simplicity, the ribbon width is considered equal to the line width; in other words, there is no ribbon width constraint. *) (* For simplicity, the output channel is required to be an OCaml buffer. It is stored within the [state] record. *) type state = { (* The line width. *) width: int; (* The current column. *) mutable column: int; (* The output buffer. *) output: Buffer.t; } (* -------------------------------------------------------------------------- *) (* For simplicity, the rendering engine is *not* in tail-recursive style. *) let rec pretty state (indent : int) (flatten : bool) doc = match doc with | Empty -> () | FancyString (s, ofs, len, apparent_length) -> Buffer.add_substring state.output s ofs len; state.column <- state.column + apparent_length | Blank n -> blanks state.output n; state.column <- state.column + n | HardLine -> assert (not flatten); Buffer.add_char state.output '\n'; blanks state.output indent; state.column <- indent | IfFlat (doc1, doc2) -> pretty state indent flatten (if flatten then doc1 else doc2) | Cat (_, doc1, doc2) -> pretty state indent flatten doc1; pretty state indent flatten doc2 | Nest (_, j, doc) -> pretty state (indent + j) flatten doc | Group (req, doc) -> let flatten = flatten || state.column ++ req <= state.width in pretty state indent flatten doc (* -------------------------------------------------------------------------- *) (* The engine's entry point. *) let pretty width doc = let output = Buffer.create 512 in let state = { width; column = 0; output } in pretty state 0 false doc; Buffer.contents output pprint-20230830/src/dune000066400000000000000000000003041447371567600147570ustar00rootroot00000000000000(env (dev (flags :standard -w @A-4)) (release (flags :standard)) ) (library (name pprint) (public_name pprint) (modules :standard \ PPrintMini) (wrapped false) ) (documentation) pprint-20230830/src/index.mld000066400000000000000000000301651447371567600157160ustar00rootroot00000000000000{0 PPrint} [PPrint] is an OCaml library for {b pretty-printing textual documents}. It takes care of {b indentation and line breaks}, and is typically used to {b pretty-print code}. {1 API Reference} An experienced user may wish to jump directly to a section of the API documentation: - {{!PPrint.building}building documents,} - {{!PPrint.inspecting}inspecting documents,} - {{!PPrint.rendering}rendering documents,} - {{!PPrint.defining}defining custom documents,} - {{!PPrint.combinators}high-level combinators,} - combinators for {{!PPrint.OCaml}printing OCaml values}; these reside in the submodule {!PPrint.OCaml}. {1 Core Combinators} At the heart of [PPrint] is a little {b domain-specific language of documents}. This language has a well-defined semantics, which the printing engine implements. The language and its semantics rest upon a small number of fundamental concepts. There are combinators for creating {b atomic documents}. For instance, the {{!PPrint.string}string} combinator turns an OCaml string (which must not contain any newline character) into a document. Thus, {[ string "hello" ]} is a simple, unbreakable document. The {{!PPrint.utf8string}utf8string} combinator is analogous, and should be preferred when working with non-ASCII strings. The {{!PPrint.utf8format}utf8format} combinator provides a convenient [sprintf]-style API for constructing a complex string and turning it into an atomic document. There is a {b concatenation} operator {{!PPrint.(^^)}(^^)}, which joins two documents. For instance, {[ string "hello" ^^ string "world" ]} is a composite document. It is in fact equivalent to [string "helloworld"]. A somewhat more interesting combinator is the {b breakable blank} combinator {{!PPrint.break}break}. This combinator expects a nonnegative integer argument, the width of the desired breakable blank. If [break n] is printed in flat mode, it produces [n] blank characters; if it is printed in normal mode, it produces one newline character. As suggested by the previous sentence, there are {b two printing modes}, namely {b flat mode} and {b normal mode}. The printing engine goes back and forth between these two modes. Exactly where and how the printing engine switches from one mode to the other is controlled by the next combinator. The {b grouping combinator}, {{!PPrint.group}group}, introduces {b a choice between flat mode and normal mode}. It is a document transformer: if [d] is a document, then [group d] is a document. When the printing engine encounters [group d], two possibilities arise. The first possibility is to print all of [d] on a single line. This is known as flat mode. The engine tries this first (ignoring all {{!PPrint.group}group} combinators inside [d]). If it succeeds, great. If it fails, by lack of space on the current line, then the engine reverts to the second possibility, which is to dissolve the group and print the bare document [d] in normal mode. This has subtle consequences: there might be further groups inside [d], and each of these groups gives rise to further choices. At each group, the choice is resolved in an efficient way. No backtracking is required. The ideal width of every document is computed (in a bottom-up manner) when documents are constructed. This allows every choice to be resolved in constant time. The time complexity of building and rendering documents is linear in the size of the document. {1 Examples} The interplay of {{!PPrint.break}break} and {{!PPrint.group}group} gives rise to an interesting language, where {{!PPrint.group}group} is used to indicate a choice point, and the appearance of {{!PPrint.break}break} is dependent upon the choice points that appear higher up in the hierarchical structure of the document. For instance, the document: {[ group (string "This" ^^ break 1 ^^ string "is" ^^ break 1 ^^ string "pretty.") ]} is printed either on a single line, if it fits, or on three lines. It cannot be printed on two lines: there is just one choice point, so either the two breakable blanks are broken, or none of them is. By the way, this document can be abbreviated as follows: {[ group (string "This" ^/^ string "is" ^/^ string "pretty.") ]} On the other hand, the document: {[ string "This" ^^ group (break 1 ^^ string "is") ^^ group (break 1 ^^ string "pretty.") ]} can be printed on one, two, or three lines. There are two choice points, each of which influences one of the two breakable blanks. The two choices are independent of one another. Each of the words in the sentence [This is pretty.] is printed on the current line if it fits, and on a new line otherwise. By the way, this document can be abbreviated as follows: {[ flow (break 1) [ string "This"; string "is"; string "pretty." ] ]} or as follows: {[ flow_map (break 1) string [ "This"; "is"; "pretty." ] ]} {1 More Core Combinators} As noted earlier, the string that is supplied to {{!PPrint.string}string}, {{!PPrint.utf8string}utf8string}, or {{!PPrint.utf8format}utf8format} must not contain any newline characters. If one wishes to impose a line break, one must use the {b forced newline} combinator {{!PPrint.hardline}hardline}. Whereas {{!PPrint.group}group} introduces a choice between flat mode and normal mode, the {b conditional construct} {{!PPrint.ifflat}ifflat} allows testing whether the printing engine is currently in flat mode or in normal mode. The document [ifflat doc1 doc2] is rendered as [doc1] if the engine is currently in flat mode, and as [doc2] if the engine is currently in normal mode. This is a powerful combinator; however, one must be aware that {i both branches are evaluated and constructed in memory} during the document construction phase. So, if used naively, {{!PPrint.ifflat}ifflat} can cause exponential time and space usage. To avoid this danger, {{!PPrint.ifflat}ifflat} should typically be applied to documents of constant size. The {b blank combinator} {{!PPrint.blank}blank} is analogous to {{!PPrint.break}break}, but produces non-breakable blank characters. A blank character is like an ordinary ASCII space character [string " "], except that blank characters at the end of a line are automatically suppressed. Thus, the printing engine guarantees that no trailing blank characters are ever produced. To illustrate the power of these combinators, let us reveal that {{!PPrint.break}break} is in reality not a primitive combinator: it is defined in terms of {{!PPrint.hardline}hardline}, {{!PPrint.blank}blank}, and {{!PPrint.ifflat}ifflat}. A possible definition of [break 1] is [ifflat (blank 1) hardline]. The {b nesting} combinator {{!PPrint.nest}nest} deals with indentation. At every time, the printing engine maintains a {b current indentation level}, which is a nonnegative integer. The current indentation level is initially zero. To render the document [nest 2 d], the printing engine temporarily increases the current indentation level by 2, renders the document [d], then restores the previous indentation level. The effect of the current indentation level is as follows: {b every time a newline character is emitted, it is immediately followed by [n] blank characters}, where [n] is the current indentation level. To illustrate the use of indentation, let us look at this document: {[ group ( string "begin" ^^ nest 2 (break 1 ^^ string "work") ^^ break 1 ^^ string "end" ) ]} Although this document looks somewhat complicated, understanding its behavior is relatively easy, because there is only one {{!PPrint.group}group} combinator in it. This document can be printed in one of two ways. If it fits on the current line, then the content of the group is rendered in flat mode: [break 1] becomes equivalent to [blank 1], and (because no newline characters are emitted) [nest 2] has no effect. The document is then rendered as follows: {[ begin work end ]} If the document does {i not} fit on the current line, then the group is dissolved, and [break 1] becomes equivalent to [hardline]. Thus, the document becomes equivalent to: {[ string "begin" ^^ nest 2 (hardline ^^ string "work") ^^ hardline ^^ string "end" ]} Thanks to the {{!PPrint.nest}nest} combinator, the first {{!PPrint.hardline}hardline} is immediately followed with two blank characters, whereas the second {{!PPrint.hardline}hardline} is not. The document is then rendered as follows: {[ begin work end ]} The {b alignment} combinator {{!PPrint.align}align} can be used to change the current indentation level in a more subtle way. The effect of this combinator is to set the current indentation level to the current column. To understand what this means, let us look at this document: {[ string "please" ^/^ align (group (string "align" ^/^ string "here")) ]} If this document fits on the current line, then neither {{!PPrint.align}align} nor {{!PPrint.group}group} have any effect, so the document is rendered as follows: {[ please align here ]} If the document does {i not} fit on the current line, then the group is dissolved. The second concatenation operator {{!PPrint.(^/^)}(^/^)} inserts a breakable blank [break 1], which is in this case is equivalent to [hardline]. Because the current indentation level is set by {{!PPrint.align}align} to the column that follows "[please ]", the document is rendered as follows: {[ please align here ]} This concludes our review of [PPrint]'s core combinators. Not every combinator has been mentioned here; for further details, please consult {{!PPrint.building}the complete list} of the core combinators for building documents. On top of the core combinators, it is up to the user of the library to define higher-level combinators that are more convenient or better suited to a particular use case. [PPrint] itself comes with {{!PPrint.combinators}a collection of high-level combinators}, and the submodule {!PPrint.OCaml} offers a collection of combinators for {{!PPrint.OCaml}printing OCaml values}. These collections are not as complete and thoughtfully designed as they could be. They are subject to change in the future. {1 Rendering Documents} The submodules {!PPrint.ToChannel}, {!PPrint.ToBuffer}, and {!PPrint.ToFormatter} give access to the printing engine, and send their output respectively to an output channel of type [out_channel], to a buffer of type [Buffer.t], and to a formatter channel of type [Format.formatter]. Each of these submodules offers a choice between two printing engines. The {{!PPrint.ToChannel.pretty}pretty} printing engine should be preferred in most situations; it attempts to respects the maximum line width and ribbon width specified by the user. The {{!PPrint.ToChannel.compact}compact} printing engine can be used when the readability of the output does not matter: it assumes a maximum line width of zero (so it never flattens a group) and does not emit any indentation characters. {1 Defining Custom Documents} It is possible to extend [PPrint] with custom document constructors, provided they meet the expectations of the printing engine. In short, the {b custom document} combinator {{!PPrint.val-custom}custom} expects an object of class {{!PPrint.class-type-custom}custom}. This object must provide three methods. The method [requirement] must compute the ideal width of the custom document. The methods [pretty] and [compact] must render the custom document. For this purpose, they have access to the {{!PPrint.output}output channel} and to the {{!PPrint.state}state} of the printing engine. For more details, see {{!PPrint.defining}Defining Custom Documents}. {1 History and Acknowledgements} The document language and the printing engine are inspired by Daan Leijen's {{:https://hackage.haskell.org/package/wl-pprint}wl-pprint} library, which itself is based on the ideas developed by Philip Wadler in the paper {{:http://homepages.inf.ed.ac.uk/wadler/papers/prettier/prettier.pdf}A Prettier Printer}. This Haskell library exploits laziness to achieve a very low memory requirement: the entire document never needs to reside in memory. [PPrint] achieves greater simplicity and possibly higher throughput by requiring the entire document to be built in memory before it is printed. [PPrint] was written by {{:http://cambium.inria.fr/~fpottier/}François Pottier} and Nicolas Pouillard, with contributions by Yann Régis-Gianas, Gabriel Scherer, Jonathan Protzenko, and Thomas Refis. pprint-20230830/test/000077500000000000000000000000001447371567600142745ustar00rootroot00000000000000pprint-20230830/test/PPrintTest.ml000066400000000000000000000037531447371567600167120ustar00rootroot00000000000000(**************************************************************************) (* *) (* PPrint *) (* *) (* François Pottier, Inria Paris *) (* Nicolas Pouillard *) (* *) (* Copyright 2007-2019 Inria. All rights reserved. This file is *) (* distributed under the terms of the GNU Library General Public *) (* License, with an exception, as described in the file LICENSE. *) (**************************************************************************) open PPrint (* This is a test file. It is not, strictly speaking, part of the library. *) let paragraph (s : string) = flow (break 1) (words s) let document = prefix 2 1 (string "TITLE:") (string "PPrint") ^^ hardline ^^ prefix 2 1 (string "AUTHORS:") (utf8string "François Pottier and Nicolas Pouillard") ^^ hardline ^^ prefix 2 1 (string "ABSTRACT:") ( paragraph "This is an adaptation of Daan Leijen's \"PPrint\" library, which itself is based on the ideas developed by Philip Wadler in \"A Prettier Printer\". For more information about Wadler's and Leijen's work, please consult the following reference:" ^^ nest 2 ( twice (break 1) ^^ separate_map (break 1) (fun s -> nest 2 (url s)) [ "http://homepages.inf.ed.ac.uk/wadler/papers/prettier/prettier.pdf"; ] ) ^^ twice (break 1) ^^ paragraph "To install PPrint, type \"opam install pprint\"." ^^ twice (break 1) ^^ paragraph "The documentation for PPrint is built by \"make doc\"." ) ^^ hardline let () = ToChannel.pretty 0.5 80 stdout document; flush stdout pprint-20230830/test/dune000066400000000000000000000000741447371567600151530ustar00rootroot00000000000000(executable (name PPrintTest) (libraries unix pprint) )