pax_global_header00006660000000000000000000000064125337705700014523gustar00rootroot0000000000000052 comment=c64c5df345d81735e0c7b1823c6ca03d0b9723dd madness-0.10/000077500000000000000000000000001253377057000130755ustar00rootroot00000000000000madness-0.10/.gitignore000066400000000000000000000007461253377057000150740ustar00rootroot00000000000000# Compiled Object files *.slo *.lo *.o *.obj # Compiled Dynamic libraries *.so *.dylib *.dll # Compiled Static libraries *.lai *.la *.a *.lib # Executables *.exe *.out *.app .cproject .project Makefile.in config.h.in aclocal.m4 autom4te.cache configure test-driver log.00* org.eclipse.cdt.core.prefs org.eclipse.cdt.ui.prefs .autotools .externalToolBuilders/org.eclipse.cdt.autotools.core.genmakebuilderV2.launch compile config.guess config.sub depcomp install-sh missing /Default/ madness-0.10/.travis.yml000066400000000000000000000023701253377057000152100ustar00rootroot00000000000000language: cpp os: linux compiler: - gcc - clang env: - GCC_VERSION=4.7 RUN_TEST=buildonly - GCC_VERSION=4.7 RUN_TEST=world - GCC_VERSION=4.7 RUN_TEST=tensor - GCC_VERSION=4.7 RUN_TEST=mra - GCC_VERSION=4.8 RUN_TEST=buildonly - GCC_VERSION=4.8 RUN_TEST=world - GCC_VERSION=4.8 RUN_TEST=tensor - GCC_VERSION=4.8 RUN_TEST=mra - GCC_VERSION=4.9 RUN_TEST=buildonly - GCC_VERSION=4.9 RUN_TEST=world - GCC_VERSION=4.9 RUN_TEST=tensor - GCC_VERSION=4.9 RUN_TEST=mra matrix: exclude: - compiler: clang env: GCC_VERSION=4.7 RUN_TEST=buildonly - compiler: clang env: GCC_VERSION=4.7 RUN_TEST=world - compiler: clang env: GCC_VERSION=4.7 RUN_TEST=tensor - compiler: clang env: GCC_VERSION=4.7 RUN_TEST=mra - compiler: clang env: GCC_VERSION=4.9 RUN_TEST=buildonly - compiler: clang env: GCC_VERSION=4.9 RUN_TEST=world - compiler: clang env: GCC_VERSION=4.9 RUN_TEST=tensor - compiler: clang env: GCC_VERSION=4.9 RUN_TEST=mra #notifications: # email: # recipients: # - madness-developers@googlegroups.com # on_success: change # on_failure: always before_install: ./ci/dep-$TRAVIS_OS_NAME.sh script: ./ci/build-$TRAVIS_OS_NAME.sh after_failure: cat ./config.logmadness-0.10/INSTALL000066400000000000000000000161621253377057000141340ustar00rootroot00000000000000 CONFIGURING =========== We are using standard GNU configure scripts but since we rely upon MPI, BLAS, LAPACK and other non-standard libraries most machines require options passed to configure and build. More information on the MADNESS software environment is available on the wiki at https://github.com/m-a-d-n-e-s-s/madness Hardware requirements: o A recent x86 cpu (32 or 64 bit) with SSE3 instructions (if you do not have SSE3, see below), or o IBM BG/P Software requirements: o MPI-2 with mpicc and mpicxx (tested with mpich2 and openmpi) o BLAS o LAPACK o C and C++ compilers: gcc, g++ (most versions 4.7.x, 4.8.x, 4.9.x should work). Your compiler must also support C++11. o If you are building from the respository you will also need autoconf and automake. Optional: o Google performance tools (the malloc library is fast and it is the only free multithreaded performance analysis tool) o GTest for google unit test If you checked the source out of subversion: o You will need to run autoreconf (no options necessary) in the trunk directory to build the configure scripts Configure options: Run the configure script with option --help to obtain a full list of options. The most commonly used are: o --with-libunwind=dir ... directory containing libunwind.a for google profiler o --with-google-perf=dir ... 
installation directory for google performance tools (that specified with --prefix=dir when configuring the tools) o --with-google-test=dir ... installation directory for google unit test library (that specified with --prefix=dir when configuring the library) o --enable-debugging[=yes|no|OPTION] ... Enable debugging C and C++ compilers [default=no] o --enable-optimization[=yes|no|OPTION] ... Enable optimization for C and C++ [default=yes] o --enable-warning[=yes|no|GNU|Pathscale|Portland|Intel|IBM] ... Automatically set warnings for compiler.[default=yes] o --enable-optimal[=yes|no|GNU|Pathscale|Portland|Intel|IBM] ... Auto detect optimal CXXFLAGS for compiler, or specify compiler vendor.[default=yes] o --enable-spinlocks ... forces use of spin locks instead of mutex (enabled by default on Cray XT and IBM BG/P) o --enable-never-spin ... completely disables use of spinlocks (necessary for using VMs) CONFIGURE COMMAND EXAMPLES ========================== o Linux workstation or cluster using GNU compilers (some systems might require additional libraries to resolve Fortran symbols used by LAPACK/BLAS) - 64-bit with Intel MKL installed ./configure LDFLAGS="-L/opt/intel/mkl/10.0.1.014/lib/em64t" LIBS="-lmkl -lguide -lpthread -lm" CXXFLAGS="-std=c++0x" - 32-bit with Intel MKL installed ./configure LDFLAGS="-L/opt/intel/mkl/10.0.1.014/lib/32" LIBS="-lmkl -lguide -lpthread -lm" CXXFLAGS="-std=c++0x" - 32-bit with Intel MKL installed and also including MPFR and GSL ./configure LDFLAGS="-I/usr/include/gsl -L/opt/intel/mkl/10.0.1.014/lib/32" LIBS="-lgsl -lgslcblas -lmkl -lguide -lpthread -lm -lmpfr -lgmp" CXXFLAGS="-std=c++0x" - 64-bit with MKL, google's fast malloc and profiler library ./configure LDFLAGS="-L/opt/intel/mkl/10.0.1.014/lib/em64t" LIBS="-lmkl -lguide -lpthread -lm" -with-google-perf=/usr/local/gperftools CXXFLAGS="-std=c++0x" - 64-bit with AMD ACML ./configure LDFLAGS="-L/usr/local/acml4.2.0/gfortran64_int64/lib" LIBS="-lacml -lacml_mv -lgfortran -lstdc++" CXXFLAGS="-std=c++0x" - 64-bit with AMD ACML, google's super-fast malloc and google's profiler library ./configure LDFLAGS="-L/usr/local/acml4.2.0/gfortran64_int64/lib -L/usr/local/gperftools-20090306/lib -L/usr/local/libunwind-20090306" LIBS="-lacml -lacml_mv -lprofiler -ltcmalloc -lunwind-x86_64 -lgfortran -lstdc++" CXXFLAGS="-std=c++0x" o Cray-XT (e.g., jaguar@ornl and jaguarpf@ornl, kraken at NICS and franklin at NERSC) [Module commands assume you start from the default which is PGI and scilib] - with GNU compilers and ACML - RECOMMENDED This is recommended since the GNU compiler is about 10x faster at compiling and has comparable execution speed to PGI. ACML is also faster than the Goto BLAS in scilib for the small matrices primarily used by MADNESS and does not have a huge piece of static memory reserved. module swap PrgEnv-pgi PrgEnv-gnu module swap gcc/4.3.2 gcc/4.4.2 module load acml/4.3.0 ./configure CXXFLAGS=-std=c++11 - with GNU compilers and scilib module swap PrgEnv-pgi PrgEnv-gnu module swap gcc/4.3.2 gcc/4.4.2 ./configure CXXFLAGS=-std=c++11 o IBM BlueGene-Q @ ANL For other BGQ boxes you will need to find these compilers and libraries. They seem to be site dependent. 
#!/bin/bash export LIBS="" export LIBS="${LIBS} -L/home/robert/install/lib -llapack_bgp" export LIBS="${LIBS} -L/soft/apps/ESSL-4.4.1-0/lib -lesslbg" export LIBS="${LIBS} -L/soft/apps/ibmcmp-aug2009/xlf/bg/11.1/bglib -lxlf90_r -lxlfmath" export LIBS="${LIBS} -L/soft/apps/ibmcmp-aug2009/xlsmp/bg/1.7/bglib -lxlsmp -lxl" export LIBS="${LIBS} -L/bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/lib -lrt" export LIBS="${LIBS} -lpthread" export CPPFLAGS="-I/bgsys/drivers/ppcfloor/arch/include" ./configure \ --build=powerpc-bgp-linux-gnu \ --with-google-test=/home/robert/install \ LIBS="${LIBS}" \ CPPFLAGS="${CPPFLAGS}" o Macintosh Darwin 10.6, snow leopard Tested with Apple GCC 4.2.1, Apple clang 2.0, Intel 11.1 & 12.0, GNU GCC 4.4.x & 4.5.x, and LLVM clang 2.8 - Apple GCC 4.2.1 using vecLib Set the following system variables export OMPI_CC=/usr/bin/gcc-4.2 export OMPI_CXX=/usr/bin/g++-4.2 ./configure MPICC=/usr/bin/mpicc MPICXX=/usr/bin/mpicxx \ LIBS="/System/Library/Frameworks/vecLib.framework/vecLib" \ CPPFLAGS="-I/System/Library/Frameworks/vecLib.framework/Headers" - Apple clang 2.0 using vecLib Set the following system variables export OMPI_CC=/usr/bin/clang export OMPI_CXX=/usr/bin/clang++ ./configure MPICC=/usr/bin/mpicc MPICXX=/usr/bin/mpicxx \ LIBS="/System/Library/Frameworks/vecLib.framework/vecLib" \ CPPFLAGS="-I/System/Library/Frameworks/vecLib.framework/Headers" \ CXXFLAGS="-std=c++0x" - GNU GCC 4.4.x or later using vecLib Set the following system variables export OMPI_CC=/path/to/gnu_gcc/gcc export OMPI_CXX=/path/to/gun_gcc/g++ ./configure MPICC=/usr/bin/mpicc MPICXX=/usr/bin/mpicxx \ LIBS="/System/Library/Frameworks/vecLib.framework/vecLib" \ CPPFLAGS="-I/System/Library/Frameworks/vecLib.framework/Headers" \ CXXFLAGS="-std=c++0x" - Intel 11.x and 2011 (12.x) with Apple GCC Set the following system variables export OMPI_CC=icc export OMPI_CXX=icpc ./configure MPICC=/usr/bin/mpicc MPICXX=/usr/bin/mpicxx o On x86 without SSE3 -- configure with the option CPPFLAGS="-DDISABLE_SSE3", requires autoconf-2.65+, automake-1.11+ madness-0.10/LICENSE000066400000000000000000000351021253377057000141030ustar00rootroot00000000000000Copyright (C) 1989, 1991 Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. 
These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. 
b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. 
However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. 
Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 9. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 
END OF TERMS AND CONDITIONS madness-0.10/Makefile.am000066400000000000000000000025051253377057000151330ustar00rootroot00000000000000# This must match definition of AC_CONFIG_MACRO_DIR in configure.ac ACLOCAL_AMFLAGS = -I ./config SUBDIRS = src config doc: cd doc; $(MAKE) .PHONY: libraries: $(MAKE) -C src libraries world: $(MAKE) -C src/madness/world libraries tinyxml: $(MAKE) -C src/madness/external/tinyxml libraries muparser: $(MAKE) -C src/madness/external/muParser libraries misc: world $(MAKE) -C src/madness/misc libraries tensor: misc $(MAKE) -C src/madness/tensor libraries mra: $(MAKE) -C src libraries chem: mra $(MAKE) -C src/apps/chem libraries install-libraries: $(MAKE) -C src install-libraries $(MAKE) -C config install install-madinclude: $(MAKE) -C src/madness install-thisincludeHEADERS install-world: install-madinclude $(MAKE) -C src/madness/world install-libraries install-tinyxml: install-madinclude $(MAKE) -C src/madness/external/tinyxml install-libraries install-muparser: install-madinclude $(MAKE) -C src/madness/external/muParser install-libraries install-misc: install-world $(MAKE) -C src/madness/misc install-libraries install-tensor: install-misc $(MAKE) -C src/madness/tensor install-libraries install-mra: install-tensor install-muparser install-tinyxml $(MAKE) -C src install-thisincludeHEADERS $(MAKE) -C src/madness/mra install-libraries $(MAKE) -C config install install-chem: install-libraries madness-0.10/README.md000066400000000000000000000047101253377057000143560ustar00rootroot00000000000000madness ======= Multiresolution Adaptive Numerical Environment for Scientific Simulation # Summary MADNESS provides a high-level environment for the solution of integral and differential equations in many dimensions using adaptive, fast methods with guaranteed precision based on multi-resolution analysis and novel separated representations. There are three main components to MADNESS. At the lowest level is a new petascale parallel programming environment that increases programmer productivity and code performance/scalability while maintaining backward compatibility with current programming tools such as MPI and Global Arrays. The numerical capabilities built upon the parallel tools provide a high-level environment for composing and solving numerical problems in many (1-6+) dimensions. Finally, built upon the numerical tools are new applications with initial focus upon chemistry, atomic and molecular physics, material science, and nuclear structure. Please look in the [wiki](https://github.com/m-a-d-n-e-s-s/madness/wiki) for more information and project activity. Here's a [video](http://www.youtube.com/watch?v=dBwWjmf5Tic) about MADNESS. # Funding The developers gratefully acknowledge the support of the Department of Energy, Office of Science, Office of Basic Energy Sciences and Office of Advanced Scientific Computing Research, under contract DE-AC05-00OR22725 with Oak Ridge National Laboratory. The developers gratefully acknowledge the support of the National Science Foundation under grant 0509410 to the University of Tennessee in collaboration with The Ohio State University (P. Sadayappan). The MADNESS parallel runtime and parallel tree-algorithms include concepts and software developed under this project. The developers gratefully acknowledge the support of the National Science Foundation under grant NSF OCI-0904972 to the University of Tennessee. The solid state physics and multiconfiguration SCF capabilities are being developed by this project. 
The developers gratefully acknowledge the support of the National Science Foundation under grant NSF CHE-0625598 to the University of Tennessee, in collaboration with UIUC/NCSA. Some of the multi-threading and preliminary GPGPU ports were developed by this project. The developers gratefully acknowledge the support of the Defense Advanced Research Projects Agency (DARPA) under subcontract from Argonne National Laboratory as part of the High-Productivity Computer Systems (HPCS) language evaluation project. madness-0.10/autogen.sh000077500000000000000000000001231253377057000150720ustar00rootroot00000000000000#! /bin/sh set -e aclocal -I ./config autoconf autoheader automake --add-missing madness-0.10/bin/000077500000000000000000000000001253377057000136455ustar00rootroot00000000000000madness-0.10/bin/latex2oo000077500000000000000000000035261253377057000153360ustar00rootroot00000000000000#!/bin/csh # This sed script partially automates the conversion # from latex to openoffice focussing on the equations. sed \ -e 's,\\begin{array}, matrix { ,g' \ -e 's,\\begin{eqnarray}, matrix { ,g' \ -e 's,\\begin{eqnarray\*}, matrix { ,g' \ -e 's,\\end{array}, } ,g' \ -e 's,\\end{eqnarray}, } ,g' \ -e 's,\\end{eqnarray\*}, } ,g' \ -e 's,\\left(, left ( ,g' \ -e 's,\\right), right ) ,g' \ -e 's,\\left\\lfloor, left lfloor ,g' \ -e 's,\\right\\rfloor, right rfloor,g' \ -e 's,\\left\\|, left ldline ,g' \ -e 's,\\right\\|, right rdline ,g' \ -e 's,\\left|, left lline ,g' \ -e 's,\\right|, right rline ,g' \ -e 's,\\left\., left none ,g' \ -e 's,\\right\., right none ,g' \ -e 's,& = &, # "=" # alignl ,g' \ -e 's,&=&, # "=" # alignl ,g' \ -e 's,\\asterisk, "*" ,g' \ -e 's,\\dagger, %dagger ,g' \ -e 's,\\ldots, dotslow ,g' \ -e 's,\\alpha, %alpha ,g' \ -e 's,\\beta, %beta ,g' \ -e 's,\\gamma, %gamma ,g' \ -e 's,\\delta, %delta ,g' \ -e 's,\\epsilon, %epsilon ,g' \ -e 's,\\mu, %mu ,g' \ -e 's,\\nu, %nu ,g' \ -e 's,\\omega, %omega ,g' \ -e 's,\\tau, %tau ,g' \ -e 's,\\phi, %phi ,g' \ -e 's,\\psi, %psi ,g' \ -e 's,\\sigma, %sigma ,g' \ -e 's,\\lambda, %lambda ,g' \ -e 's,\\approx, simeq ,g' \ -e 's,\\tmop, nitalic,g' \ -e 's,\\text, nitalic,g' \ -e 's,\\mathd, d,g' \ -e 's,\\sum_, sum from ,g' \ -e 's,\\int_, int from ,g' \ -e 's,\\infty, infinity ,g' \ -e 's,\\oplus, %oplus ,g' \ -e 's,\\\\, ## ,g' \ -e 's,\\nonumber,,g' \ -e 's,\\begin{equation},,g' \ -e 's,\\end{equation},,g' \ -e 's,\\\[,,g' \ -e 's,\\\],,g' \ -e 's,\\sqrt,sqrt,g' \ -e 's,\\leqslant,<=,g' \ -e 's,\\frac,frac,g' \ -e 's,\$,,g' \ -e 's,&, # ,g' \ -e 's,\\, ,g' madness-0.10/bin/taskprofile.pl000066400000000000000000000301101253377057000165200ustar00rootroot00000000000000#!/usr/bin/perl # # This file is part of MADNESS. # # Copyright (C) 2012 Jinmei Zhang, Edward Valeev # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA # # For more information please contact: # # Robert J. 
Harrison # Oak Ridge National Laboratory # One Bethel Valley Road # P.O. Box 2008, MS-6367 # # email: harrisonrj@ornl.gov # tel: 865-241-3937 # fax: 865-572-0680 # # $Id$ # &usage() if $#ARGV == -1; my %func_counts = (); my %func_tottime = (); my %func_lattime = (); my @funcs; my $run_time = 0; my $tot_tasktime = 0; my %have_thread = {}; # to produce time-histograms set this to 1 and adjust the number of timeslices, if needed my $make_histo = 1; my $nslices = 100; my $skip_shorter_than = 0.00; # skip if task shorter than this fraction of the timeslice # use this search pattern to only include particular tasks in the running histogram my %runninghisto_includes = ( "all" => ".*" # account all tasks # add additional patterns below, e.g. for ta_dgemm these are useful , "contract" => "contract" , "bcasth" => "bcast_.*_handler" , "bcastt" => "BcastTask" ); foreach $file (@ARGV) { # Open file open( INFILE, "<$file" ) || die "cannot open file $file"; # Collect data for each task in the file while () { s/\n//g; my @line = split("\t"); # get task data my $thread = $line[0]; # Thread that the task ran on my $func_address = $line[1]; # Address of the task function my $func_name = $line[2]; # Name of task function my $func_nthreads = $line[3]; # Number of threads used by the task my $func_submit = $line[4]; # Task submit time my $func_start = $line[5]; # Task start time my $func_finish = $line[6]; # Task finish time # mark presence of this thread $have_thread{$thread} = 1; # Calculate task run time and delay time my $func_time = ( $func_finish - $func_start ) * $func_nthreads; my $func_delaytime = $func_start - $func_submit; # Accumulate function statistics if ( $func_counts{$func_name} == 0 ) { push( @funcs, $func_name ); $func_counts{$func_name} = 1; $func_tottime{$func_name} = $func_time; $func_lattime{$func_name} = $func_delaytime; } else { ++$func_counts{$func_name}; $func_tottime{$func_name} += $func_time; $func_lattime{$func_name} += $func_delaytime; } # Get the run time if ( $run_time < $func_finish ) { $run_time = $func_finish; } # Accumulate total function run time $tot_functime += $func_time; } close INFILE; } printf "\nWall time: %10.6f (s)\n", $run_time; printf "Total task time: %10.6f (s)\n\n", $tot_functime; printf "%40s\n", "Time (s)"; printf "%-5s %7s %7s %10s %5s %10s\n", "name", "counts", "total", "average", "%", "latency"; foreach $func ( sort by_time ( keys(%func_tottime) ) ) { my $counts = $func_counts{$func}; my $tottime = $func_tottime{$func}; my $avgtime = $tottime / $counts; my $lattime = $func_lattime{$func} / $counts; my $perctime = $tottime / $run_time * 100; printf "%s\n %10d %10.7f %10.7f %5d %10.7f\n", $func, $counts, $tottime, $avgtime, $perctime, $lattime; } print "\n"; # # time histograms # if ($make_histo) { use POSIX; my $timestep = $run_time / $nslices; # time-weighted my @pendinghisto = (); foreach $ts ( 0 .. $nslices - 1 ) { push @pendinghisto, 0.0; } my @inprogresshisto = (); foreach $ts ( 0 .. $nslices - 1 ) { push @inprogresshisto, 0.0; } my %runninghistos = {}; foreach my $key (keys %runninghisto_includes) { my $values = []; foreach $ts ( 0 .. $nslices - 1 ) { push @$values, 0.0; } $runninghistos{$key} = $values; } # not time-weighted my @spawnedhisto = (); foreach $ts ( 0 .. $nslices - 1 ) { push @spawnedhisto, 0.0; } my @startedhisto = (); foreach $ts ( 0 .. $nslices - 1 ) { push @startedhisto, 0.0; } my @retiredhisto = (); foreach $ts ( 0 .. 
$nslices - 1 ) { push @retiredhisto, 0.0; } # this is the list of tasks active for each time slice # each entry is (ref to) an array of tasks active (not necessarily actively running, i.e. could be waiting for # subtasks to finish) my @inprogresstasks = (); foreach my $slice (0 .. $nslices-1) { $inprogresstasks[$slice] = []; } foreach $file (@ARGV) { # Open file open( INFILE, "<$file" ) || die "cannot open file $file"; # this will keep running tasks for each thread for deeper processing my %runningtasks = {}; foreach my $threadid (keys %have_thread) { $runningtasks{$threadid} = []; } # Collect data for each task in the file while () { s/\n//g; my @line = split("\t"); my $thread = $line[0]; # Thread that the task ran on my $func_name = $line[2]; # Name of task function my $func_nthreads = $line[3]; # Number of threads used by the task # die("don't know how to handle a multithreaded task") # if ( $func_nthreads != 1 ); my $time_submit = $line[4]; # Task submit time my $time_start = $line[5]; # Task start time my $time_finish = $line[6]; # Task finish time my $tasklist = $runningtasks{$thread}; my $task = { "start" => $time_start, "finish" => $time_finish, "name" => $func_name, "thread" => $thread }; push @$tasklist, $task; if ( $time_start - $time_submit > $timestep * $skip_shorter_than ) { my $submit_ts_0 = floor( $time_submit / $timestep ); my $submit_ts_1 = floor( $time_start / $timestep ); if ( $submit_ts_0 != $submit_ts_1 ) { $pendinghisto[$submit_ts_0] += ( $submit_ts_0 + 1 ) - ( $time_submit / $timestep ); foreach my $ts ( $submit_ts_0 + 1 .. $submit_ts_1 - 1 ) { $pendinghisto[$ts] += 1; } $pendinghisto[$submit_ts_1] += ( $time_start / $timestep ) - $submit_ts_1; } else { $pendinghisto[$submit_ts_0] += ( $time_start - $time_submit ) / $timestep; } } if ( $time_finish - $time_start > $timestep * $skip_shorter_than ) { my $start_ts_0 = floor( $time_start / $timestep ); my $start_ts_1 = floor( $time_finish / $timestep ); if ( $start_ts_0 != $start_ts_1 ) { $inprogresshisto[$start_ts_0] += ( $start_ts_0 + 1 ) - ( $time_start / $timestep ); push @{ $$inprogresstasks[$start_ts_0] }, $task; foreach my $ts ( $start_ts_0 + 1 .. 
$start_ts_1 - 1 ) { $inprogresshisto[$ts] += 1; push @{ $$inprogresstasks[$ts] }, $task; } $inprogresshisto[$start_ts_1] += ( $time_finish / $timestep ) - $start_ts_1; push @{ $$inprogresstasks[$start_ts_1] }, $task; } else { $inprogresshisto[$start_ts_0] += ( $time_finish - $time_start ) / $timestep; push @{ $$inprogresstasks[$start_ts_0] }, $task; } } my $spawned_ts = floor( $time_submit / $timestep ); $spawnedhisto[$spawned_ts] += 1; my $started_ts = floor( $time_start / $timestep ); $startedhisto[$started_ts] += 1; my $stop_ts = floor( $time_finish / $timestep ); $retiredhisto[$stop_ts] += 1; } close INFILE; # compute the time-averaged # of running foreach my $threadid (keys %runningtasks) { # printf STDOUT "threadid = %d\n", $threadid; my $tasklistref = $runningtasks{$threadid}; my @tasklist = sort { $$a{"start"} <=> $$b{"start"} } @$tasklistref; my $time_finish_last = 0.0; foreach my $taskptr (@tasklist) { my %task = %$taskptr; my $time_start = $task{"start"}; my $time_finish = $task{"finish"}; my $func_name = $task{"name"}; # printf STDOUT "%lf %lf\n", $task{"start"}, $task{"finish"}; if ($time_finish > $time_finish_last) { if ($time_start < $time_finish_last) { $time_start = $time_finish_last; } my $start_ts_0 = floor( $time_start / $timestep ); my $start_ts_1 = floor( $time_finish / $timestep ); if ( $start_ts_0 != $start_ts_1 ) { foreach my $key (keys %runninghisto_includes) { my $func_name_pattern = $runninghisto_includes{$key}; if ($func_name =~ /$func_name_pattern/) { ${ $runninghistos{$key} }[$start_ts_0] += ( $start_ts_0 + 1 ) - ( $time_start / $timestep ); foreach my $ts ( $start_ts_0 + 1 .. $start_ts_1 - 1 ) { ${ $runninghistos{$key} }[$ts] += 1; } ${ $runninghistos{$key} }[$start_ts_1] += ( $time_finish / $timestep ) - $start_ts_1; } } } else { foreach my $key (keys %runninghisto_includes) { my $func_name_pattern = $runninghisto_includes{$key}; if ($func_name =~ /$func_name_pattern/) { ${ $runninghistos{$key} }[$start_ts_0] += ( $time_finish - $time_start ) / $timestep; } } } $time_finish_last = $time_finish; } } } # print out the histograms for this trace printf STDOUT "file = %s\n", $file; printf STDOUT "timestep = %lf\n", $timestep; printf STDOUT "# ts nspawned nstarted nretired\n"; foreach my $ts ( 0 .. $nslices - 1 ) { printf STDOUT "%d\t%lf\t%lf\t%lf\n", $ts, $spawnedhisto[$ts], $startedhisto[$ts], $retiredhisto[$ts]; } # printf STDOUT "time-weighted # of waiting\n"; # foreach my $ts ( 0 .. $nslices - 1 ) { # printf STDOUT "%d %lf\n", $ts, $pendinghisto[$ts]; # } # printf STDOUT "time-weighted # of in-progress\n"; # foreach my $ts ( 0 .. $nslices - 1 ) { # printf STDOUT "%d %lf\n", $ts, $inprogresshisto[$ts]; # } # list keys in a reasonable order (keys returns keys in a weird order) # "all" first, then the rest my @running_keys = ("all"); foreach my $key (keys %runninghisto_includes) { push @running_keys, $key if $key ne "all"; } printf STDOUT "# ts, pend"; foreach my $key (@running_keys) { printf STDOUT ", run_%s", $key; } printf STDOUT "\n"; foreach my $ts ( 0 .. $nslices - 1 ) { printf STDOUT "%d", $ts; printf STDOUT ", %lf", $pendinghisto[$ts]; foreach my $key (@running_keys) { printf STDOUT ", %lf", ${ $runninghistos{$key} }[$ts]; } printf STDOUT "\n"; } my $active_tasks_fname = "$file.active_tasks.log"; open ATFILE, ">$active_tasks_fname" || die "could not open $active_tasks_fname"; foreach my $ts (0 .. 
$nslices-1) { printf ATFILE "============================ ts %d: [%lf,%lf] =============================\n", $ts, $ts*$timestep, ($ts+1)*$timestep; foreach my $taskref (sort { $$a{"thread"} <=> $$b{"thread"} } @{ $$inprogresstasks[$ts] }) { printf ATFILE "%d %s %lf %lf\n", $$taskref{"thread"}, $$taskref{"name"}, $$taskref{"start"}, $$taskref{"finish"}; } } close ATFILE; } } exit 0; sub usage { printf STDERR "taskprofile.pl converts trace files info a summary profile. Usage:\n"; printf STDERR " taskprofile.pl [ ... ]\n"; exit 0; } sub by_time { $func_tottime{$b} <=> $func_tottime{$a}; } madness-0.10/ci/000077500000000000000000000000001253377057000134705ustar00rootroot00000000000000madness-0.10/ci/build-linux.sh000077500000000000000000000016751253377057000162740ustar00rootroot00000000000000#! /bin/sh # Exit on error set -ev # Environment variables export CXXFLAGS="-std=c++11 -mno-avx" export CPPFLAGS=-DDISABLE_SSE3 if [ "$CXX" = "g++" ]; then export CC=/usr/bin/gcc-$GCC_VERSION export CXX=/usr/bin/g++-$GCC_VERSION fi export F77=/usr/bin/gfortran-$GCC_VERSION export MPICH_CC=$CC export MPICH_CXX=$CXX export MPICC=/usr/bin/mpicc.mpich2 export MPICXX=/usr/bin/mpicxx.mpich2 export LD_LIBRARY_PATH=/usr/lib/lapack:/usr/lib/openblas-base:$LD_LIBRARY_PATH # Configure and build MADNESS ./autogen.sh ./configure \ --enable-debugging --disable-optimization --enable-warning --disable-optimal \ --with-google-test \ --enable-never-spin \ LIBS="-L/usr/lib/lapack -L/usr/lib/libblas -llapack -lblas -lpthread" if [ "$RUN_TEST" = "buildonly" ]; then # Build all libraries, examples, and applications make -j2 all else # Run unit tests export MAD_NUM_THREADS=2 make -C src/madness/$RUN_TEST -j2 check fi madness-0.10/ci/build-osx.sh000077500000000000000000000002071253377057000157340ustar00rootroot00000000000000#! /bin/sh # Exit on error set -e # Configure MADNESS ./autogen.sh ./configure --enable-never-spin make # Run unit tests make checkmadness-0.10/ci/dep-linux.sh000077500000000000000000000017061253377057000157400ustar00rootroot00000000000000#! /bin/sh # Exit on error set -ev # Add repository for libxc #sudo add-apt-repository ppa:hogliux/misstep -y # Add repository for a newer version GCC sudo add-apt-repository ppa:ubuntu-toolchain-r/test -y # Update package list sudo apt-get update -qq # Install packages sudo apt-get install -qq -y gcc-$GCC_VERSION g++-$GCC_VERSION gfortran-$GCC_VERSION if [ "$CXX" = "g++" ]; then export CC=/usr/bin/gcc-$GCC_VERSION export CXX=/usr/bin/g++-$GCC_VERSION fi export FC=/usr/bin/gfortran-$GCC_VERSION # Print compiler information $CC --version $CXX --version $FC --version # Install libxc wget -O libxc-2.2.1.tar.gz "http://www.tddft.org/programs/octopus/down.php?file=libxc/libxc-2.2.1.tar.gz" tar -xzf libxc-2.2.1.tar.gz cd libxc-2.2.1 autoreconf -i ./configure --prefix=/usr/local CFLAGS="-mno-avx" FCFLAGS="-mno-avx" make -j2 sudo make install sudo apt-get install -qq -y cmake libblas-dev liblapack-dev libgoogle-perftools-dev mpich2 libtbb-dev madness-0.10/ci/dep-osx.sh000077500000000000000000000001511253377057000154030ustar00rootroot00000000000000#! 
/bin/sh # Exit on error set -e brew update brew install autoconf automake libtool cmake libxc mpich2madness-0.10/config/000077500000000000000000000000001253377057000143425ustar00rootroot00000000000000madness-0.10/config/MADNESS.pc.in000066400000000000000000000007531253377057000163720ustar00rootroot00000000000000prefix=@prefix@ exec_prefix=@exec_prefix@ libdir=@libdir@ includedir=@includedir@ cc=@CC@ cxx=@CXX@ cppflags=@CPPFLAGS@ cxxflags=@CXXFLAGS@ ldflags=@LDFLAGS@ libs=@LIBS@ fortran_integer_width=@MADNESS_FORTRAN_DEFAULT_INTEGER_SIZE@ Name: MADNESS Description: Multiresolution Adaptive Numerical Environment for Scientific Simulation Version: @PACKAGE_VERSION@ Libs: -L${libdir} -lMADmra -lMADlinalg -lMADtensor -lMADmisc -lMADmuparser -lMADtinyxml -lMADworld ${libs} Cflags: -I${includedir} madness-0.10/config/MakeGlobal.am000066400000000000000000000036711253377057000166660ustar00rootroot00000000000000# Define paths for includes (note convention #include ) AM_CPPFLAGS = -I$(top_srcdir)/src -I$(top_builddir)/src -I$(top_srcdir)/src/apps # Define directories holding libraries and variables for corresponding libraries LIBGTESTDIR=$(top_builddir)/src/madness/external/gtest LIBWORLDDIR=$(top_builddir)/src/madness/world LIBTENSORDIR=$(top_builddir)/src/madness/tensor LIBMISCDIR=$(top_builddir)/src/madness/misc LIBMRADIR=$(top_builddir)/src/madness/mra LIBCHEMDIR=$(top_builddir)/src/apps/chem LIBTINYXMLDIR=$(top_builddir)/src/madness/external/tinyxml LIBMUPARSERDIR=$(top_builddir)/src/madness/external/muParser LIBGTEST=$(LIBGTESTDIR)/libMADgtest.a LIBWORLD=$(LIBWORLDDIR)/libMADworld.a LIBTENSOR=$(LIBTENSORDIR)/libMADtensor.a LIBLINALG=$(LIBTENSORDIR)/libMADlinalg.a LIBMISC=$(LIBMISCDIR)/libMADmisc.a LIBMRA=$(LIBMRADIR)/libMADmra.a LIBCHEM=$(LIBCHEMDIR)/libMADchem.a LIBTINYXML=$(LIBTINYXMLDIR)/libMADtinyxml.a LIBMUPARSER=$(LIBMUPARSERDIR)/libMADmuparser.a # Most scientific/numeric applications will link against these libraries MRALIBS=$(LIBMRA) $(LIBLINALG) $(LIBTENSOR) $(LIBMISC) $(LIBMUPARSER) $(LIBTINYXML) $(LIBWORLD) LIBGTEST_CPPFLAGS = $(GTEST_CPPFLAGS) -DGTEST_HAS_PTHREAD=1 -isystem $(top_srcdir)/src/madness/external/gtest/include $(AM_CPPFLAGS) LIBGTEST_CXXFLAGS = $(GTEST_CXXFLAGS) $(AM_CXXFLAGS) LIBGTEST_LIBS = $(GTEST_LDFLAGS) $(GTEST_LIBS) # External library targets $(LIBGTEST): make -C $(LIBGTESTDIR) libMADgtest.a $(LIBTINYXML): make -C $(LIBTINYXMLDIR) libMADtinyxml.a $(LIBMUPARSER): make -C $(LIBMUPARSERDIR) libMADmuparser.a # MADNESS library targets $(LIBWORLD): make -C $(LIBWORLDDIR) libMADworld.a $(LIBMISC): $(LIBWORLD) make -C $(LIBMISCDIR) libMADmisc.a $(LIBTENSOR): $(LIBMISC) make -C $(LIBTENSORDIR) libMADtensor.a $(LIBLINALG): $(LIBTENSOR) make -C $(LIBTENSORDIR) libMADlinalg.a $(LIBMRA): $(LIBLINALG) make -C $(LIBMRADIR) libMADmra.a $(LIBCHEM): $(LIBMRA) make -C $(LIBCHEMDIR) libMADchem.a madness-0.10/config/Makefile.am000066400000000000000000000003761253377057000164040ustar00rootroot00000000000000# MADNESS configuration can be queried via (hopefully) standard pkg-config pkgconfigdir = $(libdir)/pkgconfig pkgconfig_DATA = MADNESS.pc # MADNESS CMake configuration file cmakeconfigdir = $(libdir)/CMake/Madness cmakeconfig_DATA = madness-config.cmakemadness-0.10/config/Makefile.sample.in000066400000000000000000000023151253377057000176700ustar00rootroot00000000000000 # These variables substituted by configure TRUNK = @abs_top_srcdir@ CXX = @CXX@ CXXFLAGS = @CXXFLAGS@ CPPFLAGS = @CPPFLAGS@ -I$(TRUNK)/include -I$(TRUNK)/src -I$(TRUNK)/src/apps LDFLAGS = @LDFLAGS@ LIBS = @LIBS@ # 
Directories holding libraries LIBWORLDDIR=$(TRUNK)/src/madness/world LIBTENSORDIR=$(TRUNK)/src/madness/tensor LIBMISCDIR=$(TRUNK)/src/madness/misc LIBMRADIR=$(TRUNK)/src/madness/mra LIBTINYXMLDIR=$(TRUNK)/src/madness/external/tinyxml LIBMUPARSERDIR=$(TRUNK)/src/madness/external/muParser # Individual libraries LIBWORLD=$(LIBWORLDDIR)/libMADworld.a LIBTENSOR=$(LIBTENSORDIR)/libMADtensor.a LIBLINALG=$(LIBTENSORDIR)/libMADlinalg.a LIBMISC=$(LIBMISCDIR)/libMADmisc.a LIBMRA=$(LIBMRADIR)/libMADmra.a LIBTINYXML=$(LIBTINYXMLDIR)/libMADtinyxml.a LIBMUPARSER=$(LIBMUPARSERDIR)/libMADmuparser.a # Most scientific/numeric applications will link against these libraries MRALIBS=$(LIBMRA) $(LIBLINALG) $(LIBTENSOR) $(LIBMISC) $(LIBMUPARSER) \ $(LIBTINYXML) $(LIBWORLD) # This to enable implicit Gnumake rule for linking from single source LDLIBS := $(MRALIBS) $(LIBS) # Define your targets below here ... this is just an example OBJ = a.o b.o c.o myprog: $(OBJ) $(CXX) -o $@ $^ $(LDLIBS) madness-0.10/config/acx_check_compiler_flags.m4000066400000000000000000000010551253377057000215630ustar00rootroot00000000000000#ACX_CHECK_COMPILER_FLAGS(compiler, flag_var, flag, success, fail) AC_DEFUN([ACX_CHECK_COMPILER_FLAG], [ AC_LANG_SAVE AC_LANG([$1]) acx_check_compiler_flags="no" acx_check_compiler_flags_save=$[$2] [$2]="$3" AC_MSG_CHECKING([whether $1 compiler accepts $3]) AC_COMPILE_IFELSE([AC_LANG_PROGRAM([],[])], [ acx_check_compiler_flags="yes" AC_MSG_RESULT([yes]) ], [ AC_MSG_RESULT([no]) ]) [$2]=$acx_check_compiler_flags_save AC_LANG_RESTORE AS_IF([test $acx_check_compiler_flags != no], [$4], [$5]) ])madness-0.10/config/acx_check_tls.m4000066400000000000000000000025731253377057000174050ustar00rootroot00000000000000AC_DEFUN([ACX_CHECK_TLS],[ # Check for thread local storage support. # thread_local, __thread_local, __declspec(thread) AC_LANG_PUSH([C++]) # Check for shared_ptr in std namespace AC_MSG_CHECKING([for C++ thread_local keyword]) acx_check_tls=no # Check for the key word thread_local (C++11) AC_COMPILE_IFELSE( [ AC_LANG_PROGRAM( [[thread_local int i = 0;]], [[i = 1;]] ) ], [acx_check_tls="thread_local"] ) # Check for the key word __thread if test "$acx_check_tls" = no; then AC_COMPILE_IFELSE( [ AC_LANG_PROGRAM( [[__thread int i = 0;]], [[i = 1;]] ) ], [ acx_check_tls="__thread" AC_DEFINE([thread_local],[__thread],[Define the thread_local key word.]) ] ) fi # Check for the key word __declspec(thread) # if test "$acx_check_tls" = no; then # AC_COMPILE_IFELSE( # [ # AC_LANG_PROGRAM( # [[__declspec(thread) int i = 0;]], # [[i = 1;]] # ) # ], # [ # acx_check_tls="__declspec(thread)" # AC_DEFINE([thread_local],[__declspec(thread)],[Define the thread_local key word.]) # ] # ) # fi if test "$acx_check_tls" = no; then AC_DEFINE([thread_local],[],[Define the thread_local key word.]) fi AC_MSG_RESULT([$acx_check_tls]) AC_LANG_POP ]) madness-0.10/config/acx_crayxe.m4000066400000000000000000000037051253377057000167370ustar00rootroot00000000000000AC_DEFUN([ACX_CRAYXE], [ # If on a Cray XE # - defines HAVE_CRAYXE=1 in headers # - defines HAVE_CRAYXE=yes in the script # - sets MPICXX=CC and MPICC=cc if the user has not already set them # - sets thread binding to "1 0 2" TODO: this has to be wrong on AMD Magny Cours # - enables spinlocks echo "int main()" > __crayxe.cc echo "{" >> __crayxe.cc echo "#ifdef __CRAYXE" >> __crayxe.cc echo "return 0;" >> __crayxe.cc echo "#else" >> __crayxe.cc echo "choke me" >> __crayxe.cc echo "#endif" >> __crayxe.cc echo "}" >> __crayxe.cc CC __crayxe.cc >& /dev/null if test $? 
= 0; then AC_MSG_NOTICE([Cray XE detected]) HAVE_CRAYXE=yes AC_DEFINE(HAVE_CRAYXE,[1],[Defined if we are running on an Cray XE]) fi /bin/rm __crayxe.cc if test "x$HAVE_CRAYXE" = xyes; then AC_DEFINE(AMD_QUADCORE_TUNE,[1],[Target for tuning mtxmq kernels]) if test "x$MPICC" = x; then AC_MSG_NOTICE([Choosing MPICC=cc for Cray XE]) MPICC=cc; fi if test "x$MPICXX" = x; then AC_MSG_NOTICE([Choosing MPICXX=CC for Cray XE]) MPICXX=CC; fi echo "int main(){return 0;}" > __acml.cc CC __acml.cc -lacml >& /dev/null if test $? = 0; then AC_MSG_NOTICE([AMD ACML library detected]) LIBS="$LIBS -lacml" AC_DEFINE(HAVE_ACML,[1],[Define if AMD math library available - ACML]) fi /bin/rm __acml.cc BIND="1 0 2" AC_DEFINE(USE_SPINLOCKS, [1], [Define if should use spinlocks]) fi ]) madness-0.10/config/acx_crayxt.m4000066400000000000000000000036271253377057000167610ustar00rootroot00000000000000AC_DEFUN([ACX_CRAYXT], [ # If on a Cray XT # - defines HAVE_CRAYXT=1 in headers # - defines HAVE_CRAYXT=yes in the script # - sets MPICXX=CC and MPICC=cc if the user has not already set them # - sets thread binding to "1 0 2" # - enables spinlocks echo "int main()" > __crayxt.cc echo "{" >> __crayxt.cc echo "#ifdef __CRAYXT" >> __crayxt.cc echo "return 0;" >> __crayxt.cc echo "#else" >> __crayxt.cc echo "choke me" >> __crayxt.cc echo "#endif" >> __crayxt.cc echo "}" >> __crayxt.cc CC __crayxt.cc >& /dev/null if test $? = 0; then AC_MSG_NOTICE([Cray XT detected]) HAVE_CRAYXT=yes AC_DEFINE(HAVE_CRAYXT,[1],[Defined if we are running on an Cray XT]) fi /bin/rm __crayxt.cc if test "x$HAVE_CRAYXT" = xyes; then AC_DEFINE(AMD_QUADCORE_TUNE,[1],[Target for tuning mtxmq kernels]) if test "x$MPICC" = x; then AC_MSG_NOTICE([Choosing MPICC=cc for Cray XT]) MPICC=cc; fi if test "x$MPICXX" = x; then AC_MSG_NOTICE([Choosing MPICXX=CC for Cray XT]) MPICXX=CC; fi echo "int main(){return 0;}" > __acml.cc CC __acml.cc -lacml >& /dev/null if test $? = 0; then AC_MSG_NOTICE([AMD ACML library detected]) LIBS="$LIBS -lacml" AC_DEFINE(HAVE_ACML,[1],[Define if AMD math library available - ACML]) fi /bin/rm __acml.cc BIND="1 0 2" AC_DEFINE(USE_SPINLOCKS, [1], [Define if should use spinlocks]) fi ]) madness-0.10/config/acx_detect_cxx.m4000066400000000000000000000026211253377057000175720ustar00rootroot00000000000000AC_DEFUN([ACX_DETECT_CXX], [ # Sets environment variable CXXVENDOR to one of # [GNU,Intel,Portland,Pathscale,IBM,unknown] AC_CACHE_CHECK([compiler vendor], [acx_cv_detect_cxx], [ acx_cv_detect_cxx=unknown if test $acx_cv_detect_cxx = unknown; then $CXX --version 2>&1 | egrep -q "clang" if test $? = 0; then acx_cv_detect_cxx=clang fi fi if test $acx_cv_detect_cxx = unknown; then $CXX --version 2>&1 | egrep -q "GCC|GNU|gcc|gnu|g\+\+|Free S" if test $? = 0; then acx_cv_detect_cxx=GNU fi fi if test $acx_cv_detect_cxx = unknown; then $CXX --version 2>&1 | grep -q "Intel" if test $? = 0; then acx_cv_detect_cxx=Intel fi fi if test $acx_cv_detect_cxx = unknown; then $CXX --version 2>&1 | grep -q "Portland" if test $? = 0; then acx_cv_detect_cxx=Portland fi fi if test $acx_cv_detect_cxx = unknown; then $CXX -v 2>&1 | grep -q "Pathscale" if test $? = 0; then acx_cv_detect_cxx=Pathscale fi fi if test $acx_cv_detect_cxx = unknown; then $CXX -qversion 2>&1 | grep -q "IBM" if test $? = 0; then acx_cv_detect_cxx=IBM fi fi ]) CXXVENDOR="$acx_cv_detect_cxx" ]) madness-0.10/config/acx_enable_debugging.m4000066400000000000000000000026721253377057000207070ustar00rootroot00000000000000# This function is used to add debug flags (e.g. 
-g) to CFLAGS and CXXFLAGS # environment variables. Users are expected to specify their own debug flags for # special use cases by adding appropriate values to CFLAGS and CXXFLAGS. AC_DEFUN([ACX_ENABLE_DEBUGGING], [ acx_enable_debugging="no" acx_enable_debugging_flags="" # Allow the user to enable or disable debugging flag AC_ARG_ENABLE([debugging], [AC_HELP_STRING([--enable-debugging@<:@=yes|no|LEVEL@:>@], [Enable debugging C and C++ compilers. You can also specify debug level (e.g. 3). @<:@default=no@:>@]) ], [ case $enableval in yes) acx_enable_debugging="yes" acx_enable_debugging_flags="-g" ;; no) ;; *) acx_enable_debugging="yes" acx_enable_debugging_flags="-g$enableval" ;; esac ]) # Test the flags and add them to flag variables if successful. if test $acx_enable_debugging != no; then ACX_CHECK_COMPILER_FLAG([C], [CFLAGS], [$acx_enable_debugging_flags], [CFLAGS="$CFLAGS $acx_enable_debugging_flags"], [AC_MSG_WARN([$CC does not accept $acx_enable_debugging_flags, no debugging flags will be used.])]) ACX_CHECK_COMPILER_FLAG([C++], [CXXFLAGS], [$acx_enable_debugging_flags], [CXXFLAGS="$CXXFLAGS $acx_enable_debugging_flags"], [AC_MSG_WARN([$CXX does not accept $acx_enable_debugging_flags, no debugging flags will be used.])]) fi ])madness-0.10/config/acx_enable_gentensor.m4000066400000000000000000000024571253377057000207610ustar00rootroot00000000000000# This function is used to add the gentensor flags and CXXFLAGS. These flags are # necessary for the correlated quantum chemistry (e.g. mp2), but gentensor # can't handle complex tensors. In this case disable gentensor (default behavior) AC_DEFUN([ACX_ENABLE_GENTENSOR], [ acx_enable_gentensor="no" acx_enable_gentensor_flags="" # Allow the user to enable or disable gentensor flag AC_ARG_ENABLE([gentensor], [AC_HELP_STRING([--enable-gentensor@<:@=yes|no], [Enable gentensor C and C++ compilers.]) ], [ case $enableval in yes) acx_enable_gentensor="yes" acx_enable_gentensor_flags="-DUSE_GENTENSOR" ;; no) ;; *) ;; esac ]) # Test the flags and add them to flag variables if successful. if test $acx_enable_gentensor != no; then ACX_CHECK_COMPILER_FLAG([C], [CFLAGS], [$acx_enable_gentensor_flags], [CFLAGS="$CFLAGS $acx_enable_gentensor_flags"], [AC_MSG_WARN([$CC does not accept $acx_enable_gentensor_flags, no gentensor flags will be used.])]) ACX_CHECK_COMPILER_FLAG([C++], [CXXFLAGS], [$acx_enable_gentensor_flags], [CXXFLAGS="$CXXFLAGS $acx_enable_gentensor_flags"], [AC_MSG_WARN([$CXX does not accept $acx_enable_gentensor_flags, no gentensor flags will be used.])]) fi ]) madness-0.10/config/acx_enable_optimal.m4000066400000000000000000000113411253377057000204120ustar00rootroot00000000000000# This function is used to add performace or system specific compiler flags to # to CFLAGS and CXXFLAGS environment variables. Users are expected to specify # their own optimization flags for unknown compilers, unknown systems, or # special use cases by adding appropriate values to CFLAGS and CXXFLAGS. It # should not be used for warning (e.g. -Wall), optimization (e.g. -O3), or debug # (e.g. -g) flags. Those handled in acx_enable_warn.m4, acx_enable_optimization.m4, # and acx_enable_debugging.m4 respectively. 
AC_DEFUN([ACX_ENABLE_OPTIMAL], [ acx_enable_optimal="" acx_enable_optimal_save_cxxflags="$CXXFLAGS" acx_enable_optimal_flags="" acx_enable_optimal_compiler="$CXXVENDOR" # Allow the user to enable or disable optimal flag AC_ARG_ENABLE([optimal], [AC_HELP_STRING([--enable-optimal@<:@=yes|no|GNU|clang|Pathscale|Portland|Intel|IBM@:>@], [Auto detect optimal CXXFLAGS for compiler and known systems.@<:@default=yes@:>@])], [ case $enableval in yes) acx_enable_optimal="yes" ;; no) acx_enable_optimal="no" ;; *) acx_enable_optimal="yes" acx_enable_optimal_compiler="$enableval" esac ], [acx_enable_optimal="yes"] ) # Set the flags for the specific compilers and systems if test $acx_enable_optimal != "no"; then AC_LANG_SAVE AC_LANG([C++]) case $acx_enable_optimal_compiler in GNU) # Delete trailing -stuff from X.X.X-stuff then parse CXXVERSION=[`$CXX -dumpversion | sed -e 's/-.*//'`] CXXMAJOR=[`echo $CXXVERSION | sed -e 's/\.[.0-9a-zA-Z\-_]*//'`] CXXMINOR=[`echo $CXXVERSION | sed -e 's/[0-9]*\.//' -e 's/\.[0-9]*//'`] CXXMICRO=[`echo $CXXVERSION | sed -e 's/[0-9]*\.[0-9]*\.//'`] AC_MSG_NOTICE([Setting compiler flags for GNU C++ major=$CXXMAJOR minor=$CXXMINOR micro=$CXXMICRO]) # Flags for all GCC variants acx_enable_optimal_flags="$acx_enable_optimal_flags -ffast-math" if test $enable_cpp0x = "yes"; then acx_enable_optimal_flags="$acx_enable_optimal_flags -std=c++0x" fi # Add GCC system specific flags if test "x$HAVE_CRAYXT" = xyes; then ACX_CHECK_COMPILER_FLAG([C++], [CXXFLAGS], [-march=barcelona], [acx_enable_optimal_flags="$acx_enable_optimal_flags -march=barcelona"]) elif test "x$HAVE_IBMBGP" = xyes; then acx_enable_optimal_flags="" else ACX_CHECK_COMPILER_FLAG([C++], [CXXFLAGS], [-march=native], [acx_enable_optimal_flags="$acx_enable_optimal_flags -march=native"]) fi # Add flags for Intel x86 architectures. case $host_cpu in ??86*) acx_enable_optimal_flags="$acx_enable_optimal_flags -mfpmath=sse -msse -mpc64" ;; esac ;; clang) acx_enable_optimal_flags="$acx_enable_optimal_flags" ;; Pathscale) acx_enable_optimal_flags="$acx_enable_optimal_flags" if test "x$HAVE_CRAYXT" = xyes; then ACX_CHECK_COMPILER_FLAG([C++], [CXXFLAGS], [-march=barcelona], [acx_enable_optimal_flags="$acx_enable_optimal_flags -march=barcelona"]) fi ;; Portland) acx_enable_optimal_flags="$acx_enable_optimal_flags -fastsse -Mflushz -Mcache_align -Drestrict=__restrict" AC_MSG_NOTICE([Appending -pgf90libs to LIBS so can link against Fortran BLAS/linalg]) LIBS="$LIBS -pgf90libs" if test "x$HAVE_CRAYXT" = xyes; then ACX_CHECK_COMPILER_FLAG([C++], [CXXFLAGS], [-tp barcelona-64], [acx_enable_optimal_flags="$acx_enable_optimal_flags -tp barcelona-64"]) fi ;; Intel) acx_enable_optimal_flags="$acx_enable_optimal_flags -ip -no-prec-div -mkl=sequential -ansi -xHOST" if test $enable_cpp0x = "yes"; then acx_enable_optimal_flags="$acx_enable_optimal_flags -std=c++0x" fi #-use-intel-optimized-headers -fp-model fast=2 -inline-level=2 ;; IBM) acx_enable_optimal_flags="$acx_enable_optimal_flags" if test "x$HAVE_IBMBGP" = xyes; then ACX_CHECK_COMPILER_FLAG([C++], [CXXFLAGS], [ -qtune=450 -qarch=450d -qlanglvl=extended], [acx_enable_optimal_flags="$acx_enable_optimal_flags -qtune=450 -qarch=450d -qlanglvl=extended "]) fi ;; *) AC_MSG_WARN([Optimal flags not set for $acx_enable_optimal_compile compiler]) ;; esac # Test the flags and add them to flag variables if successful. 
ACX_CHECK_COMPILER_FLAG([C++], [CXXFLAGS], [$acx_enable_optimal_flags], [CXXFLAGS="$CXXFLAGS $acx_enable_optimal_flags"], [AC_MSG_WARN([$CXX does not accept $acx_enable_optimal_flags, no optimal flags will be used.])]) fi ]) madness-0.10/config/acx_enable_optimization.m4000066400000000000000000000045131253377057000214760ustar00rootroot00000000000000# This function is used to add compiler specific optimization flags (e.g. -O3) # to CFLAGS and CXXFLAGS environment variables. Users are expected to specify # their own optimization flags for unknown compilers or special use cases by # adding appropriate values to CFLAGS and CXXFLAGS. AC_DEFUN([ACX_ENABLE_OPTIMIZATION], [ # Specify the default optimization level for a given compiler. This value is # appended to "-O" in the flag variables. default_optimization="" case $CXXVENDOR in GNU) default_optimization="3" ;; clang) default_optimization="3" ;; Pathscale) default_optimization="fast" ;; Portland) default_optimization="3" ;; Intel) default_optimization="3" ;; IBM) default_optimization="3" ;; *) default_optimization="2" ;; esac acx_enable_optimization="yes" acx_enable_optimization_flags="" # Allow the user to enable or disable optimization flag AC_ARG_ENABLE([optimization], [AC_HELP_STRING([--enable-optimization@<:@=yes|no|OPTION@:>@], [Enable optimization for C and C++ (e.g. -O2) @<:@default=yes@:>@]) ], [ case $enableval in yes) acx_enable_optimization_flags="-O$default_optimization" ;; no) acx_enable_optimization="no" ;; *) acx_enable_optimization_flags="-O$enableval" ;; esac ], [ if test "x$acx_enable_debugging" == xno; then acx_enable_optimization_flags="-O$default_optimization" else acx_enable_optimization_flags="-O0" AC_MSG_WARN([Optimizations is disabled, because debugging is enabled. Add --enable-optimization to overide this behavior.]) fi ] ) # Test the flags and add them to flag variables if successful. if test $acx_enable_optimization != no; then ACX_CHECK_COMPILER_FLAG([C], [CFLAGS], [$acx_enable_optimization_flags], [CFLAGS="$CFLAGS $acx_enable_optimization_flags"], [AC_MSG_WARN([$CC does not accept $acx_enable_optimization_flags, no optimization flags will be used.])]) ACX_CHECK_COMPILER_FLAG([C++], [CXXFLAGS], [$acx_enable_optimization_flags], [CXXFLAGS="$CXXFLAGS $acx_enable_optimization_flags"], [AC_MSG_WARN([$CXX does not accept $acx_enable_optimization_flags, no optimization flags will be used.])]) fi ])madness-0.10/config/acx_enable_task_profiler.m4000066400000000000000000000025541253377057000216170ustar00rootroot00000000000000AC_DEFUN([ACX_ENABLE_TASK_PROFILER], [ AC_LANG_SAVE AC_LANG([C++]) acx_enable_task_profiler="" # Allow the user to enable or disable warnings AC_ARG_ENABLE([task-profiler], [AC_HELP_STRING([--enable-task-profiler], [Enable task profiler that collects per-task start and stop times.])], [ case $enableval in yes) acx_enable_task_profiler="yes" ;; no) acx_enable_task_profiler="no" ;; *) acx_enable_task_profiler="yes" esac ], [acx_enable_task_profiler="no"] ) # Automatically specify the warning flags for known compilers. 
if test $acx_enable_task_profiler != "no"; then AC_CHECK_HEADER([execinfo.h], [], [AC_MSG_ERROR([execinfo.h is required by the task profiler.])]) AC_CHECK_HEADER([cxxabi.h], [], [AC_MSG_ERROR([cxxabi.h is required by the task profiler.])]) AC_DEFINE([MADNESS_TASK_PROFILING],[1],[Define to enable task profiler.]) if test $ON_A_MAC = "no"; then case $acx_enable_optimal_compiler in GNU) LDFLAGS="$LDFLAGS -rdynamic" ;; clang) LDFLAGS="$LDFLAGS -rdynamic" ;; Pathscale) ;; Portland) ;; Intel) ;; IBM) ;; *) ;; esac fi fi AC_LANG_RESTORE ]) madness-0.10/config/acx_enable_warn.m4000066400000000000000000000041301253377057000177120ustar00rootroot00000000000000# This function is used to add compiler specific warning flags to CFLAGS and # CXXFLAGS environment variables. Users are expected to specify their own warning # flags for unknown compilers or special use cases by adding appropriate values # to CFLAGS and CXXFLAGS. AC_DEFUN([ACX_ENABLE_WARN], [ acx_enable_warn="" acx_enable_warn_flags="" acx_enable_warn_compiler="$CXXVENDOR" # Allow the user to enable or disable warnings AC_ARG_ENABLE([warning], [AC_HELP_STRING([--enable-warning@<:@=yes|no|GNU|clang|Pathscale|Portland|Intel|IBM@:>@], [Automatically set warnings for compiler.@<:@default=yes@:>@])], [ case $enableval in yes) acx_enable_warn="yes" ;; no) acx_enable_warn="no" ;; *) acx_enable_warn="yes" acx_enable_warn_compiler="$enableval" esac ], [acx_enable_warn="yes"] ) # Automatically specify the warning flags for known compilers. if test $acx_enable_warn != "no"; then case $acx_enable_warn_compiler in GNU) acx_enable_warn_flags="-Wall -Wno-strict-aliasing -Wno-deprecated -Wno-unused-local-typedefs" ;; clang) acx_enable_warn_flags="-Wall" ;; Pathscale) acx_enable_warn_flags="-Wall" ;; Portland) acx_enable_warn_flags="" ;; Intel) acx_enable_warn_flags="-Wall -diag-disable remark,279,654,1125" ;; IBM) acx_enable_warn_flags="-qflag=w:w" ;; *) AC_MSG_WARN([Warning flags not set for $acx_enable_optimal_compile compiler]) ;; esac # Test the flags and add them to flag variables if successful. ACX_CHECK_COMPILER_FLAG([C], [CFLAGS], [$acx_enable_warn_flags], [CFLAGS="$CFLAGS $acx_enable_warn_flags"], [AC_MSG_WARN([$CC does not accept $acx_enable_warn_flags, no warning flags will be used.])]) ACX_CHECK_COMPILER_FLAG([C++], [CXXFLAGS], [$acx_enable_warn_flags], [CXXFLAGS="$CXXFLAGS $acx_enable_warn_flags"], [AC_MSG_WARN([$CXX does not accept $acx_enable_warn_flags, no warning flags will be used.])]) fi ]) madness-0.10/config/acx_fortran_symbols.m4000066400000000000000000000113001253377057000206550ustar00rootroot00000000000000AC_DEFUN([ACX_FORTRAN_SYMBOLS], [ # Dubiously checks for Fortran linking conventions and BLAS+LAPACK at the same time # mostly to avoid the need for having a fortran compiler installed # Check for no underscore first since IBM BG ESSL seems to define dgemm with/without underscore # but dsyev only without underscore ... but AMD ACML also defines both but with different # interfaces (fortran and c) ... ugh. Hardwire linking for bgp and restore to original order. 
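#
# Illustrative aside (not part of this macro): once configure has settled on a
# convention, the FORTRAN_LINKAGE_* symbols AC_DEFINEd below let C++ code spell
# the Fortran BLAS entry points correctly.  The sketch that follows is hedged:
# the fortran_dgemm alias, file name and driver are invented for illustration,
# and only the FORTRAN_LINKAGE_* names plus the standard BLAS dgemm signature
# are assumed.  Link against a BLAS (e.g. -lblas) and use the integer width
# that BLAS was built with (see the Fortran integer options in configure.ac).
#
#   // dgemm_probe.cc -- hypothetical stand-alone example
#   #if defined(FORTRAN_LINKAGE_LC)
#   #  define fortran_dgemm dgemm
#   #elif defined(FORTRAN_LINKAGE_LCUU)
#   #  define fortran_dgemm dgemm__
#   #elif defined(FORTRAN_LINKAGE_UC)
#   #  define fortran_dgemm DGEMM
#   #elif defined(FORTRAN_LINKAGE_UCU)
#   #  define fortran_dgemm DGEMM_
#   #else                     // lower case, single underscore: the common case
#   #  define fortran_dgemm dgemm_
#   #endif
#
#   extern "C" void fortran_dgemm(const char* transa, const char* transb,
#                                 const int* m, const int* n, const int* k,
#                                 const double* alpha, const double* a, const int* lda,
#                                 const double* b, const int* ldb,
#                                 const double* beta, double* c, const int* ldc);
#
#   int main() {
#       const int n = 2;
#       const double alpha = 1.0, beta = 0.0;
#       double a[4] = {1, 2, 3, 4}, b[4] = {5, 6, 7, 8}, c[4];
#       fortran_dgemm("n", "n", &n, &n, &n, &alpha, a, &n, b, &n, &beta, c, &n);
#       return 0;
#   }
#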
AC_MSG_NOTICE([Checking Fortran-C linking conventions (dgemm -> ?)]) fsym=no if test $host = "powerpc-bgp-linux-gnu"; then fsym="lc" AC_MSG_NOTICE([Fortran linking convention is $fsym]) AC_DEFINE([FORTRAN_LINKAGE_LC],[1],[Fortran-C linking convention lower case (no underscore)]) fi if test $host = "powerpc64-bgq-linux-gnu"; then fsym="lc" AC_MSG_NOTICE([Fortran linking convention is $fsym]) AC_DEFINE([FORTRAN_LINKAGE_LC],[1],[Fortran-C linking convention lower case (no underscore)]) fi if test $fsym = no; then AC_CHECK_FUNC([dgemm_],[fsym="lcu"]) fi if test $fsym = no; then AC_CHECK_FUNC([dgemm],[fsym="lc"]) fi if test $fsym = no; then AC_CHECK_FUNC([dgemm__],[fsym="lcuu"]) fi if test $fsym = no; then AC_CHECK_FUNC([DGEMM],[fsym="uc"]) fi if test $fsym = no; then AC_CHECK_FUNC([DGEMM_],[fsym="ucu"]) fi # Well there is nothing in the existing path that gives us what we are # looking for so try looking for some standard examples so that common # Linux, Apple and configurations work automatically. We save the # BLAS library name instead of adding it directly to LIBS since it # will need to be appened after any LAPACK library that is yet to # be found. # OS X if test $fsym$ON_A_MAC = noyes; then LDFLAGS="$LDFLAGS -framework Accelerate" fsym="lcu" AC_MSG_NOTICE([Using Accelerate framework for BLAS support]) fi # Linux BLASLIB="" if test $fsym = no; then AC_LANG_SAVE AC_LANG([C++]) for blaslib in openblas blas; do AC_CHECK_LIB([$blaslib], [dgemm_], [fsym="lcu"; BLASLIB="-l$blaslib"; AC_MSG_NOTICE([Found dgemm_ in $blaslib]); break], [AC_MSG_NOTICE([Unable to find dgemm_ in $blaslib])], [-lpthread]) done AC_LANG_RESTORE fi # others ... insert here or extend above for loop if correct symbol is dgemm_ if test $fsym = no; then AC_MSG_ERROR([Could not find dgemm with any known linking conventions]) fi AC_MSG_NOTICE([Fortran linking convention is $fsym]) # Now verify that we have at least one of the required lapack routines and again attempt to search for candidate libraries if nothing is found if test $fsym = lc; then AC_DEFINE([FORTRAN_LINKAGE_LC],[1],[Fortran-C linking convention lower case (no underscore)]) lapacksym=dsyev fi if test $fsym = lcu; then AC_DEFINE([FORTRAN_LINKAGE_LCU],[1],[Fortran-C linking convention lower case with single underscore]) lapacksym=dsyev_ fi if test $fsym = lcuu; then AC_DEFINE([FORTRAN_LINKAGE_LCUU],[1],[Fortran-C linking convention lower case with double underscore]) lapacksym=dsyev__ fi if test $fsym = uc; then AC_DEFINE([FORTRAN_LINKAGE_UC],[1],[Fortran-C linking convention upper case]) lapacksym=DSYEV fi if test $fsym = ucu; then AC_DEFINE([FORTRAN_LINKAGE_UCU],[1],[Fortran-C linking convention upper case with single underscore]) lapacksym=DSYEV_ fi LAPACKLIB="" foundlapack=no AC_CHECK_FUNC([$lapacksym],[foundlapack=yes],AC_MSG_NOTICE([Could not find dsyev with selected linking convention in default library path])) LAPACKLIB="" if test $foundlapack = no; then AC_LANG_SAVE AC_LANG([C++]) for lapacklib in lapack; do AC_CHECK_LIB([$lapacklib], [$lapacksym], [foundlapack=yes; LAPACKLIB="-l$lapacklib"; AC_MSG_NOTICE([Found $lapacksym in $lapacklib]); break], [AC_MSG_NOTICE([Unable to find $lapacksym in $lapackib])], [$BLASLIB -lpthread]) done AC_LANG_RESTORE fi if test $foundlapack = no; then AC_MSG_NOTICE([Could not find $lapacksym in any known library ... 
specify LAPACK library via LIBS]) fi LIBS="$LIBS $LAPACKLIB $BLASLIB" ]) madness-0.10/config/acx_gnu_hashmap.m4000066400000000000000000000022551253377057000177350ustar00rootroot00000000000000AC_DEFUN([ACX_GNU_HASHMAP], [ AC_LANG_PUSH([C++]) gotgnuhashmap=no AC_MSG_CHECKING([for GNU hashmap in '' and namespace __gnu_cxx]) AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[ #include using __gnu_cxx::hash_map; ]])], [ AC_DEFINE(GNU_HASHMAP_NAMESPACE, __gnu_cxx, [GNU hashmap namespace]) AC_DEFINE(INCLUDE_EXT_HASH_MAP,[1],[If defined header is in ext directory]) AC_MSG_RESULT([yes]) gotgnuhashmap=yes ], [AC_MSG_RESULT([no])]) if test $gotgnuhashmap = no; then AC_MSG_CHECKING([for GNU hashmap in '' and namespace _SLTP_STD]) AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[ #include using _STLP_STD::hash_map; ]])], [ AC_DEFINE(GNU_HASHMAP_NAMESPACE, _STLP_STD, [GNU hashmap namespace]) AC_MSG_RESULT([yes]) gotgnuhashmap=yes ], [AC_MSG_RESULT([no])]) fi AC_LANG_POP([C++]) if test $gotgnuhashmap = no; then AC_MSG_ERROR([Could not find GNU hashmap with any known combination of header+namespace]) fi AC_DEFINE(HAVE_GNU_HASHMAP,[1],[Enable if have GNU hashmap]) ]) madness-0.10/config/acx_ibmbgp.m4000066400000000000000000000030151253377057000166760ustar00rootroot00000000000000AC_DEFUN([ACX_IBMBGP],[ # If on an IBMBGP # - defines HAVE_IBMBGP=1 in headers # - defines HAVE_IBMBGP=yes in the script # - sets thread binding to "1 0 2" # - enables spinlocks # - sets the host architecture to powerpc-bgp-linux-gnu # #AC_CHECK_FILE([/bgsys], [HAVE_IBMBGP=yes AC_DEFINE([HAVE_IBMBGP],[1],[Defined if we are running on an IBM Blue Gene/P])],[]) echo "int main()" > __bgp__.cc echo "{" >> __bgp__.cc echo "#ifdef __bgp__" >> __bgp__.cc echo "return 0;" >> __bgp__.cc echo "#else" >> __bgp__.cc echo "choke me" >> __bgp__.cc echo "#endif" >> __bgp__.cc echo "}" >> __bgp__.cc mpicxx __bgp__.cc >& /dev/null if test $? = 0; then HAVE_IBMBGP=yes fi /bin/rm __bgp__.cc if test "x$HAVE_IBMBGP" = x; then mpicxx --version -c | grep -q bgp if test $? = 0; then HAVE_IBMBGP=yes fi fi if test "x$HAVE_IBMBGP" = xyes; then AC_MSG_NOTICE([IBM Blue Gene/P detected]) AC_DEFINE(HAVE_IBMBGP,[1],[Defined if we are running on an IBM Blue Gene/P]) host="powerpc-bgp-linux" host_triplet="powerpc-bgp-linux" BIND="-1 -1 -1" AC_DEFINE(USE_SPINLOCKS, [1], [Define if should use spinlocks]) fi ]) madness-0.10/config/acx_ibmbgq.m4000066400000000000000000000025111253377057000166770ustar00rootroot00000000000000AC_DEFUN([ACX_IBMBGQ],[ # If on an IBMBGQ # - defines HAVE_IBMBGQ=1 in headers # - defines HAVE_IBMBGQ=yes in the script # - sets thread binding to "1 0 2" # - enables spinlocks # - sets the host architecture to powerpc-bgq-linux-gnu # #AC_CHECK_FILE([/bgsys], [HAVE_IBMBGQ=yes AC_DEFINE([HAVE_IBMBGQ],[1],[Defined if we are running on an IBM Blue Gene/Q])],[]) echo "int main()" > __bgq__.cc echo "{" >> __bgq__.cc echo "#ifdef __bgq__" >> __bgq__.cc echo "return 0;" >> __bgq__.cc echo "#else" >> __bgq__.cc echo "choke me" >> __bgq__.cc echo "#endif" >> __bgq__.cc echo "}" >> __bgq__.cc mpicxx __bgq__.cc >& /dev/null if test $? 
= 0; then AC_MSG_NOTICE([IBM Blue Gene/Q detected]) HAVE_IBMBGQ=yes AC_DEFINE(HAVE_IBMBGQ,[1],[Defined if we are running on an IBM Blue Gene/Q]) fi /bin/rm __bgq__.cc if test "x$HAVE_IBMBGQ" = xyes; then host="powerpc64-bgq-linux" host_triplet="powerpc64-bgq-linux" BIND="-1 -1 -1" AC_DEFINE(USE_SPINLOCKS, [1], [Define if should use spinlocks]) fi ]) madness-0.10/config/acx_mac.m4000066400000000000000000000007531253377057000162040ustar00rootroot00000000000000AC_DEFUN([ACX_MAC], [ # If on a MAC # - kiss steve jobs' ass # - get an iphone # - get another mac ON_A_MAC="no" uname -a | grep -iq Darwin if test $? = 0; then ITS_A_MAC="yes" ON_A_MAC="yes" LDFLAGS="-Wl,-no_pie $LDFLAGS" AC_MSG_NOTICE([You are building on a mac ... now tell ten of your friends.]) AC_DEFINE(ON_A_MAC,[1],[Set if building on a mac]) fi ]) madness-0.10/config/acx_mpi.m4000066400000000000000000000020641253377057000162260ustar00rootroot00000000000000AC_DEFUN([ACX_MPI], [ # We were using the full macro from ACX but this forced AC_PROG_C or AC_PROG_CXX to # be run before we had overridden the compilers which meant that some confdef.h # entries were incorrect (specifically std::exit problem with PGI) # Disable MPI C++ bindings. CPPFLAGS="$CPPFLAGS -DMPICH_SKIP_MPICXX=1 -DOMPI_SKIP_MPICXX=1" AC_ARG_VAR(MPICC,[MPI C compiler command]) AC_CHECK_PROGS(MPICC, mpicc hcc mpcc mpcc_r mpxlc cmpicc, $CC) acx_mpi_save_CC="$CC" CC="$MPICC" AC_SUBST(MPICC) AC_ARG_VAR(MPICXX,[MPI C++ compiler command]) AC_CHECK_PROGS(MPICXX, mpicxx mpic++ mpiCC mpCC hcp mpxlC mpxlC_r cmpic++, $CXX) acx_mpi_save_CXX="$CXX" CXX="$MPICXX" AC_SUBST(MPICXX) #AC_ARG_VAR(MPIF77,[MPI Fortran compiler command]) #AC_CHECK_PROGS(MPIF77, mpif77 hf77 mpxlf mpf77 mpif90 mpf90 mpxlf90 mpxlf95 mpxlf_r cmpifc cmpif90c, $F77) #acx_mpi_save_F77="$F77" #F77="$MPIF77" #AC_SUBST(MPIF77) ]) madness-0.10/config/acx_posix_memalign.m4000066400000000000000000000030671253377057000204600ustar00rootroot00000000000000AC_DEFUN([ACX_POSIX_MEMALIGN], [ AC_CHECK_FUNC([posix_memalign],[gotpm=1], [gotpm=0]) if test $gotpm = 1; then AC_DEFINE([HAVE_POSIX_MEMALIGN], [1], [Set if have posix_memalign]) elif test "$ON_A_MAC" = "yes"; then AC_DEFINE([HAVE_POSIX_MEMALIGN], [0], [Set if have posix_memalign]) else AC_MSG_WARN([[ posix_memalign NOT FOUND ... enabling override of new/delete ... 
THIS WILL BE SLOW ]]) AC_DEFINE([WORLD_GATHER_MEM_STATS], [1], [Set if MADNESS gathers memory statistics]) fi # look for both version of posix_memalign, with and without throw() gotpmh=0 if test $gotpm = 1; then AC_MSG_CHECKING([if missing declaration of posix_memalign in stdlib.h]) AC_LANG_SAVE AC_LANG([C++]) AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[ #include #include extern "C" int posix_memalign(void **memptr, size_t alignment, size_t size) throw();]], [[void *m; posix_memalign(&m, 16, 1024);]])], [AC_MSG_RESULT([no]) gotpmh=1 ] ) if test $gotpmh = 0; then AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[ #include #include extern "C" int posix_memalign(void **memptr, size_t alignment, size_t size);]], [[void *m; posix_memalign(&m, 16, 1024);]])], [AC_MSG_RESULT([no]) gotpmh=1 ], [ AC_DEFINE(MISSING_POSIX_MEMALIGN_PROTO, [1], [Set if the posix_memalign prototype is missing]) AC_MSG_RESULT([yes]) ] ) fi AC_LANG_RESTORE fi ]) madness-0.10/config/acx_std_abs.m4000066400000000000000000000015311253377057000170560ustar00rootroot00000000000000AC_DEFUN([ACX_STD_ABS], [ AC_MSG_CHECKING([std::abs(long)]) AC_LANG_PUSH([C++]) AC_LINK_IFELSE([AC_LANG_PROGRAM([[ #include #include long (*absptr)(long) = &std::abs; long a = -1; long b = std::abs(a); ]])], [AC_DEFINE(HAVE_STD_ABS_LONG,[1],[Define if have std::abs(long)]) have_abs_long=yes], [have_abs_long=no]) AC_MSG_RESULT([$have_abs_long]) if test X"$have_abs_long" = Xno; then AC_MSG_CHECKING([std::labs(long)]) AC_LINK_IFELSE([AC_LANG_PROGRAM([[ #include #include long (*labsptr)(long) = &std::labs; long a = -1l; long b = labs(a); ]])], [AC_DEFINE(HAVE_LABS,[1],[Define if have std::labs(long)]) have_std_labs=yes], [have_std_labs=no]) AC_MSG_RESULT([$have_std_labs]) fi ]) madness-0.10/config/acx_unqal_stat_decl.m4000066400000000000000000000010471253377057000206030ustar00rootroot00000000000000AC_DEFUN([ACX_UNQUALIFIED_STATIC_DECL], [ AC_LANG_PUSH([C++]) AC_MSG_CHECKING([if unqualified static declarations are considered]) AC_COMPILE_IFELSE( [AC_LANG_PROGRAM( [[template static inline void f(T* a) {}; template void g(T* a) { f(a); } template void g(int* a);]], [[]])], [AC_DEFINE(HAVE_UNQUALIFIED_STATIC_DECL, [1], [Set if compiler will instantiate static templates]) AC_MSG_RESULT([yes])], [AC_MSG_RESULT([no])]) AC_LANG_POP([C++]) ]) madness-0.10/config/acx_with_elemental.m4000066400000000000000000000020771253377057000204460ustar00rootroot00000000000000AC_DEFUN([ACX_WITH_ELEMENTAL], [ acx_with_elemental=no AC_ARG_WITH([elemental], [AS_HELP_STRING([--with-elemental@<:@=DIR@:>@], [Build with Elemental headers.])], [ case $withval in yes) acx_with_elemental=yes ;; no) acx_with_elemental=no ;; *) acx_with_elemental=yes CPPFLAGS="-I$withval/include $CPPFLAGS" #ELEMENTAL_LIBS="-L$withval/lib -lelemental -lpmrrr -lelem-dummy-lib" #for 0.81 ELEMENTAL_LIBS="-L$withval/lib -lelemental -lpmrrr " if test -f "$withval/lib/liblapack-addons.a"; then ELEMENTAL_LIBS="$ELEMENTAL_LIBS -llapack-addons" fi LIBS="$ELEMENTAL_LIBS $LIBS" ;; esac ] ) if test "$acx_with_elemental" != no; then # Check for the pressence of Elemental header files. 
AC_CHECK_HEADER([elemental.hpp], [], [AC_MSG_ERROR([Unable to find the elemental.hpp header file.])]) AC_DEFINE([MADNESS_HAS_ELEMENTAL], [1], [Madness will use Elemental for parallel linear algebra operations]) MADNESS_HAS_ELEMENTAL=1 fi ]) madness-0.10/config/acx_with_google_perf.m4000066400000000000000000000030441253377057000207630ustar00rootroot00000000000000AC_DEFUN([ACX_WITH_GOOGLE_PERF], [ acx_with_google_perf="" AC_ARG_WITH([google-perf], [AS_HELP_STRING([--without-google-perf], [Disables use of Google fast malloc.profiler/heapchecker; --with-google-perf=directory overrides install path])], [ case $withval in yes) acx_with_google_perf="yes" ;; no) acx_with_google_perf="no" ;; *) LIBS="$LIBS -L$withval/lib" CPPFLAGS="-I$withval/include $CPPFLAGS" acx_with_google_perf="$withval" esac ], [acx_with_google_perf="yes"] ) if test $acx_with_google_perf != "no"; then AC_LANG_SAVE AC_LANG([C++]) AC_CHECK_LIB([tcmalloc_and_profiler], [ProfilerStart], [LIBS="-lprofiler -ltcmalloc $LIBS" have_profiler=yes AC_MSG_NOTICE([Google profiler and fast malloc found]) AC_DEFINE([MADNESS_HAS_GOOGLE_PERF], [1], [Define if using Google PerformanceTools])], [have_profiler=no AC_MSG_NOTICE(["Unable to link with libprofiler+libtcmalloc])]) if test $have_profiler = "no" ; then AC_CHECK_LIB([tcmalloc_minimal], [malloc], [LIBS="-ltcmalloc_minimal $LIBS" AC_MSG_NOTICE([Google minimal fast malloc found]) AC_DEFINE([MADNESS_HAS_GOOGLE_PERF_MINIMAL], [1], [Define if using Google PerformanceTools without libunwind])], [AC_MSG_NOTICE(["Unable to link with libtcmalloc_minimal])]) fi AC_LANG_RESTORE fi ]) madness-0.10/config/acx_with_google_test.m4000066400000000000000000000023041253377057000210040ustar00rootroot00000000000000AC_DEFUN([ACX_WITH_GOOGLE_TEST], [ acx_with_google_test="" AC_ARG_WITH([google-test], [AS_HELP_STRING([--with-google-test@<:@=yes|no@:>@], [Enables use of Google unit test @<:@default=no@:>@])], [ case $withval in yes) acx_with_google_test="yes" ;; *) acx_with_google_test="no" esac ], [ acx_with_google_test="no" ] ) AC_ARG_VAR([GTEST_CPPFLAGS], [C-like preprocessor flags for Google Test.]) AC_ARG_VAR([GTEST_CXXFLAGS], [C++ compile flags for Google Test.]) AC_ARG_VAR([GTEST_LDFLAGS], [Linker path and option flags for Google Test.]) AC_ARG_VAR([GTEST_LIBS], [Library linking flags for Google Test.]) if test $acx_with_google_test == "no"; then AC_MSG_NOTICE([Configuring with Google Unit Test --- no]) else AC_MSG_NOTICE([Configuring with Google Unit Test --- yes]) # Set preprocessor and build variables AC_DEFINE([MADNESS_HAS_GOOGLE_TEST], [1], [Define if should use Google unit testing]) AC_SUBST([GTEST_CPPFLAGS]) AC_SUBST([GTEST_CXXFLAGS]) AC_SUBST([GTEST_LDFLAGS]) AC_SUBST([GTEST_LIBS]) fi AM_CONDITIONAL([MADNESS_HAS_GOOGLE_TEST], [test $acx_with_google_test != no]) ]) madness-0.10/config/acx_with_libunwind.m4000066400000000000000000000017351253377057000204730ustar00rootroot00000000000000AC_DEFUN([ACX_WITH_LIBUNWIND], [ acx_with_libunwind="" AC_ARG_WITH([libunwind], [AS_HELP_STRING([--with-libunwind@<:@=Install DIR@:>@], [Enables use of libunwind, required for Google profiler and heap checker])], [ case $withval in yes) acx_with_libunwind="yes" ;; no) acx_with_libunwind="no" ;; *) CPPFLAGS="$CPPFLAGS -I$with_libunwind/include" LIBS="$LIBS -L$with_libunwind/lib" acx_with_libunwind="$withval" esac ], [acx_with_libunwind="no"] ) if test $acx_with_libunwind != "no"; then AC_LANG_SAVE AC_LANG_C AC_CHECK_HEADER([libunwind.h], [], [AC_MSG_ERROR([Unable to compile with the libunwind.])]) AC_CHECK_LIB([unwind], 
[_Unwind_GetRegionStart], [LIBS="$LIBS -lunwind"], [AC_MSG_ERROR(["Unable to link with libunwind])]) AC_DEFINE(MADNESS_HAS_LIBUNWIND, [1], [Define if should use libunwind for Google performance tools]) AC_LANG_RESTORE fi ]) madness-0.10/config/acx_with_libxc.m4000066400000000000000000000021371253377057000175760ustar00rootroot00000000000000AC_DEFUN([ACX_WITH_LIBXC], [ acx_with_libxc="" AC_ARG_WITH([libxc], [AS_HELP_STRING([--with-libxc@<:@=Install DIR@:>@], [Enables use of the libxc library of density functionals])], [ case $withval in yes) acx_with_libxc="yes" ;; no) acx_with_libxc="no" ;; *) CPPFLAGS="-I$withval/include $CPPFLAGS" LIBS="$LIBS -L$withval/lib" acx_with_libxc="$withval" esac ], [acx_with_libxc="yes"] ) if test $acx_with_libxc != "no"; then AC_LANG_SAVE AC_LANG([C++]) AC_CHECK_HEADERS([xc.h xc_funcs.h], [], [acx_with_libxc=no AC_MSG_NOTICE([Unable to include with xc.h or xc_func.h])]) AC_CHECK_LIB([xc], [xc_func_end], [], [acx_with_libxc=no AC_MSG_NOTICE([Unable to link with libxc])]) AC_LANG_RESTORE fi if test $acx_with_libxc != "no"; then AC_DEFINE([MADNESS_HAS_LIBXC], [1], [Define if using libxc]) fi AM_CONDITIONAL([MADNESS_HAS_LIBXC], [test $acx_with_libxc != "no"]) ]) madness-0.10/config/acx_with_madxc.m4000066400000000000000000000016251253377057000175720ustar00rootroot00000000000000AC_DEFUN([ACX_WITH_MADXC], [ acx_with_madxc="" AC_ARG_WITH([madxc], [AS_HELP_STRING([--with-madxc@<:@=Install DIR@:>@], [Enables use of the madxc library of density functionals])], [ case $withval in yes) acx_with_madxc="yes" ;; no) acx_with_madxc="no" ;; *) CPPFLAGS="-I$withval $CPPFLAGS" LIBS="$LIBS -L$withval" acx_with_madxc="$withval" esac ], [acx_with_madxc="no"] ) if test $acx_with_madxc != "no"; then AC_LANG_SAVE AC_LANG([C++]) AC_CHECK_HEADERS([libMADxc.h], [], [AC_MSG_ERROR(["Unable to include with libMADxc.h])]) AC_CHECK_LIB([MADxc], [rks_c_vwn5_ ], [], [AC_MSG_ERROR(["Unable to link with libMADxc])]) AC_DEFINE([MADNESS_HAS_MADXC], [1], [Define if using libMADxc]) AC_LANG_RESTORE fi AM_CONDITIONAL([MADNESS_HAS_MADXC], [test $acx_with_madxc != "no"]) ]) madness-0.10/config/acx_with_mkl.m4000066400000000000000000000035441253377057000172630ustar00rootroot00000000000000AC_DEFUN([ACX_WITH_MKL], [ AC_ARG_VAR([MKLROOT], [Intel MKL install directory.]) AC_ARG_WITH([mkl], [AS_HELP_STRING([--with-mkl@<:@=yes|no|check|DIR@:>@], [Enables use of Intel MKL @<:@default=check@:>@])], [ case $withval in yes) acx_with_mkl="yes" ;; no) acx_with_mkl="no" ;; *) acx_with_mkl="$withval" ;; esac ], [ acx_with_mkl="check" ] ) if test $acx_with_mkl != "no"; then AC_MSG_CHECKING([Intel MKL]) case $acx_with_mkl in yes|check) if test -d "$MKLROOT"; then acx_mkl_root="$MKLROOT" elif test -d "/opt/intel/mkl"; then acx_mkl_root="/opt/intel/mkl" else acx_mkl_root="no" fi ;; *) if test -d "$acx_with_mkl"; then acx_mkl_root="$acx_with_mkl" else acx_mkl_root="no" fi ;; esac # Show results AC_MSG_RESULT([$acx_mkl_root]) if test $acx_mkl_root != "no"; then # Set the MKL library name based on the integer type if test $acx_fortran_int_size == "4"; then acx_mkl_int_type="lp64" else acx_mkl_int_type="ilp64" fi # Add MKL libraries to the LIBS variable if test $ON_A_MAC = "no"; then LIBS="$LIBS -Wl,--start-group $acx_mkl_root/lib/intel64/libmkl_intel_$acx_mkl_int_type.a $acx_mkl_root/lib/intel64/libmkl_core.a $acx_mkl_root/lib/intel64/libmkl_sequential.a -Wl,--end-group -lpthread -lm -ldl" else LIBS="$LIBS $acx_mkl_root/lib/libmkl_intel_$acx_mkl_int_type.a $acx_mkl_root/lib/libmkl_core.a $acx_mkl_root/lib/libmkl_sequential.a 
-lpthread -lm" fi else # Return an error if not checking if test $acx_with_mkl != "check"; then AC_MSG_ERROR([Intel MKL install directory not found or invalid.]) fi fi fi ]) madness-0.10/config/acx_with_stubmpi.m4000066400000000000000000000013451253377057000201600ustar00rootroot00000000000000AC_DEFUN([ACX_WITH_STUBMPI], [ acx_with_stubmpi=no AC_ARG_WITH([stubmpi], [AS_HELP_STRING([--with-stubmpi], [Build without MPI ... i.e., stubbing it out.])], [ case $withval in yes) acx_with_stubmpi=yes ;; no) acx_with_stubmpi=no ;; *) acx_with_stubmpi=yes ;; esac ]) if test $acx_with_stubmpi = yes; then AC_DEFINE(STUBOUTMPI,[1],[If defined header disable MPI by including stubmpi.h]) MPICC="$CC" MPICXX="$CXX" if test "x$CC" = x; then MPICC=gcc; fi if test "x$CXX" = x; then MPICXX=g++; fi AC_MSG_NOTICE([Stubbing out MPI with MPICXX=$MPICXX MPICC=$MPICC]) fi ]) madness-0.10/config/acx_with_tbb.m4000066400000000000000000000055631253377057000172520ustar00rootroot00000000000000AC_DEFUN([ACX_WITH_TBB], [ acx_with_tbb_include="no" acx_with_tbb_lib="no" acx_with_tbb="no" # Configure madness to use Intel TBB and specify the include path. AC_ARG_WITH([tbb-include], [AS_HELP_STRING([--with-tbb-include@<:@=DIR@:>@], [Enables use of Intel TBB as the task scheduler.])], [ case $withval in yes) AC_MSG_ERROR([You must specify a directory for --with-tbb-include.]) ;; no) ;; *) CPPFLAGS="$CPPFLAGS -I$withval" acx_with_tbb_include="yes" acx_with_tbb="yes" esac ] ) # Configure madness to use Intel TBB and specify the library path. AC_ARG_WITH([tbb-lib], [AS_HELP_STRING([--with-tbb-lib@<:@=DIR@:>@], [Enables use of Intel TBB as the task scheduler.])], [ case $withval in yes) AC_MSG_ERROR([You must specify a directory for --with-tbb-lib.]) ;; no) ;; *) LIBS="$LIBS -L$withval" acx_with_tbb_lib="yes" acx_with_tbb="yes" esac ] ) # Configure madness to use Intel TBB AC_ARG_WITH([tbb], [AS_HELP_STRING([--with-tbb@<:@=Install DIR@:>@], [Enables use of Intel TBB as the task scheduler.])], [ case $withval in yes) acx_with_tbb="yes" ;; no) ;; *) if test "$acx_with_tbb_include" == no; then CPPFLAGS="$CPPFLAGS -I$withval/include" fi if test "$acx_with_tbb_lib" == no; then LIBS="$LIBS -L$withval/lib" fi acx_with_tbb="yes" esac ], [acx_with_tbb="yes"] ) # Check that we can compile with Intel TBB if test $acx_with_tbb != "no"; then AC_LANG_SAVE AC_LANG([C++]) # Check for Intel TBB header. AC_CHECK_HEADER([tbb/tbb.h], [], [acx_with_tbb=no AC_MSG_NOTICE([Unable to compile with Intel TBB.])]) AC_LANG_RESTORE fi if test $acx_with_tbb != "no"; then AC_LANG_SAVE AC_LANG([C++]) # Check for Intel TBB library. 
if test "x$acx_enable_debugging" == xno; then AC_CHECK_LIB([tbb], [TBB_runtime_interface_version], [LIBS="-ltbb $LIBS"], [acx_with_tbb=no AC_MSG_NOTICE(["Unable to link with Intel TBB])]) else AC_CHECK_LIB([tbb_debug], [TBB_runtime_interface_version], [ LIBS="-ltbb_debug $LIBS" CPPFLAGS="$CPPFLAGS -DTBB_USE_DEBUG=1" AC_MSG_WARN([Linking with the debug variant of Intel TBB.]) ], [ AC_CHECK_LIB([tbb], [TBB_runtime_interface_version], [LIBS="-ltbb $LIBS"], [acx_with_tbb=no AC_MSG_NOTICE(["Unable to link with Intel TBB])]) ]) fi AC_LANG_RESTORE fi if test $acx_with_tbb != "no"; then AC_DEFINE(HAVE_INTEL_TBB, [1], [Define if Intel TBB is available.]) fi ]) madness-0.10/config/am_prog_ar.m4000066400000000000000000000000311253377057000167040ustar00rootroot00000000000000AC_DEFUN([AM_PROG_AR],[])madness-0.10/config/ax_cxx_compile_stdcxx_11.m4000066400000000000000000000125241253377057000215100ustar00rootroot00000000000000# ============================================================================ # http://www.gnu.org/software/autoconf-archive/ax_cxx_compile_stdcxx_11.html # ============================================================================ # # SYNOPSIS # # AX_CXX_COMPILE_STDCXX_11([ext|noext],[mandatory|optional]) # # DESCRIPTION # # Check for baseline language coverage in the compiler for the C++11 # standard; if necessary, add switches to CXXFLAGS to enable support. # # The first argument, if specified, indicates whether you insist on an # extended mode (e.g. -std=gnu++11) or a strict conformance mode (e.g. # -std=c++11). If neither is specified, you get whatever works, with # preference for an extended mode. # # The second argument, if specified 'mandatory' or if left unspecified, # indicates that baseline C++11 support is required and that the macro # should error out if no mode with that support is found. If specified # 'optional', then configuration proceeds regardless, after defining # HAVE_CXX11 if and only if a supporting mode is found. # # LICENSE # # Copyright (c) 2008 Benjamin Kosnik # Copyright (c) 2012 Zack Weinberg # Copyright (c) 2013 Roy Stogner # Copyright (c) 2014 Alexey Sokolov # # Copying and distribution of this file, with or without modification, are # permitted in any medium without royalty provided the copyright notice # and this notice are preserved. This file is offered as-is, without any # warranty. #serial 4 m4_define([_AX_CXX_COMPILE_STDCXX_11_testbody], [[ template struct check { static_assert(sizeof(int) <= sizeof(T), "not big enough"); }; struct Base { virtual void f() {} }; struct Child : public Base { virtual void f() override {} }; typedef check> right_angle_brackets; int a; decltype(a) b; typedef check check_type; check_type c; check_type&& cr = static_cast(c); auto d = a; auto l = [](){}; #include std::tuple t(1, 2.0, 3ul); #include std::hash h; int h1 = h(1); #include std::array aint10 = { 0,1,2,3,4,5,6,7,8,9 }; #include typedef std::is_same same_type; #include std::shared_ptr sptr = std::make_shared(1); std::unique_ptr uptr{new int(2)}; inline int sum() { return 0; } template inline int sum(const T& t, const U&... 
u) { return t + sum(u...); } const int s123 = sum(1, 2, 3); ]]) AC_DEFUN([AX_CXX_COMPILE_STDCXX_11], [dnl m4_if([$1], [], [], [$1], [ext], [], [$1], [noext], [], [m4_fatal([invalid argument `$1' to AX_CXX_COMPILE_STDCXX_11])])dnl m4_if([$2], [], [ax_cxx_compile_cxx11_required=true], [$2], [mandatory], [ax_cxx_compile_cxx11_required=true], [$2], [optional], [ax_cxx_compile_cxx11_required=false], [m4_fatal([invalid second argument `$2' to AX_CXX_COMPILE_STDCXX_11])]) AC_LANG_PUSH([C++])dnl ac_success=no AC_CACHE_CHECK(whether $CXX supports C++11 features by default, ax_cv_cxx_compile_cxx11, [AC_COMPILE_IFELSE([AC_LANG_SOURCE([_AX_CXX_COMPILE_STDCXX_11_testbody])], [ax_cv_cxx_compile_cxx11=yes], [ax_cv_cxx_compile_cxx11=no])]) if test x$ax_cv_cxx_compile_cxx11 = xyes; then ac_success=yes fi m4_if([$1], [noext], [], [dnl if test x$ac_success = xno; then for switch in -std=gnu++11 -std=gnu++0x; do cachevar=AS_TR_SH([ax_cv_cxx_compile_cxx11_$switch]) AC_CACHE_CHECK(whether $CXX supports C++11 features with $switch, $cachevar, [ac_save_CXXFLAGS="$CXXFLAGS" CXXFLAGS="$CXXFLAGS $switch" AC_COMPILE_IFELSE([AC_LANG_SOURCE([_AX_CXX_COMPILE_STDCXX_11_testbody])], [eval $cachevar=yes], [eval $cachevar=no]) CXXFLAGS="$ac_save_CXXFLAGS"]) if eval test x\$$cachevar = xyes; then CXXFLAGS="$CXXFLAGS $switch" ac_success=yes break fi done fi]) m4_if([$1], [ext], [], [dnl if test x$ac_success = xno; then for switch in -std=c++11 -std=c++0x; do cachevar=AS_TR_SH([ax_cv_cxx_compile_cxx11_$switch]) AC_CACHE_CHECK(whether $CXX supports C++11 features with $switch, $cachevar, [ac_save_CXXFLAGS="$CXXFLAGS" CXXFLAGS="$CXXFLAGS $switch" AC_COMPILE_IFELSE([AC_LANG_SOURCE([_AX_CXX_COMPILE_STDCXX_11_testbody])], [eval $cachevar=yes], [eval $cachevar=no]) CXXFLAGS="$ac_save_CXXFLAGS"]) if eval test x\$$cachevar = xyes; then CXXFLAGS="$CXXFLAGS $switch" ac_success=yes break fi done fi]) AC_LANG_POP([C++]) if test x$ax_cxx_compile_cxx11_required = xtrue; then if test x$ac_success = xno; then AC_MSG_ERROR([*** A compiler with support for C++11 language features is required.]) fi else if test x$ac_success = xno; then HAVE_CXX11=0 AC_MSG_NOTICE([No compiler with C++11 support was found]) else HAVE_CXX11=1 AC_DEFINE(HAVE_CXX11,1, [define if the compiler supports basic C++11 syntax]) fi AC_SUBST(HAVE_CXX11) fi ]) madness-0.10/config/copyright_header000066400000000000000000000020111253377057000175770ustar00rootroot00000000000000/* This file is part of MADNESS. Copyright (C) 2007,2010 Oak Ridge National Laboratory This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA For more information please contact: Robert J. Harrison Oak Ridge National Laboratory One Bethel Valley Road P.O. 
Box 2008, MS-6367 email: harrisonrj@ornl.gov tel: 865-241-3937 fax: 865-572-0680 $Id$ */ madness-0.10/config/insert_headers000077500000000000000000000011231253377057000172640ustar00rootroot00000000000000 # This script goes thru all source files (*.h and *.cc only) in the # MADNESS tree nominally owned by the project and ensures that the # copyright header is installed # It must be run from the top of the madness tree (trunk directory) for file in `find . \( -name "*.h" -o -name "*.cc" \) -exec grep -L "This file is part of MADNESS" {} ";" | grep -v muParser | grep -v tinyxml | grep -v cfft | grep -v madness_config.h | grep -v mainpage.h` do echo "Processing $file" cp $file $file.bak cat config/copyright_header > fredfred cat $file >> fredfred mv fredfred $file done madness-0.10/config/madness-config.cmake.in000066400000000000000000000074741253377057000206620ustar00rootroot00000000000000# - CMAKE Config file for the MADNESS package # The following variables are defined: # Madness_INCLUDE_DIRS - The MADNESS include directory # Madness_LIBRARIES - The MADNESS libraries and their dependencies # Madness__FOUND - System has the specified MADNESS COMPONENT # Madness__LIBRARY - The MADNESS COMPONENT library # Madness_COMPILE_FLAGS - Compile flags required to build with MADNESS # Madness_LINKER_FLAGS - Linker flags required to link with MADNESS # Madness_VERSION - MADNESS version number # Madness_F77_INTEGER_SIZE - The default F77 integer size used for BLAS calls # Compute paths get_filename_component(Madness_CONFIG_DIR "${CMAKE_CURRENT_LIST_FILE}" PATH) get_filename_component(Madness_DIR "${Madness_CONFIG_DIR}/../../.." ABSOLUTE) set(Madness_INCLUDE_DIRS ${Madness_DIR}/include) set(Madness_LIBRARY_DIR ${Madness_DIR}/lib) # Set package version set(Madness_VERSION "@PACKAGE_VERSION@") # Set compile and link flags, and remove optimization and debug flags string(REGEX REPLACE "-(O[0-9s]|g[0-9]?)([ ]+|$)" "" Madness_COMPILE_FLAGS "@CPPFLAGS@ @CXXFLAGS@") string(REGEX REPLACE "-(O[0-9s]|g[0-9]?)([ ]+|$)" "" Madness_LINKER_FLAGS "@CXXFLAGS@ @LDFLAGS@ @LIBS@") # Set MADNESS component variables set(Madness_DEFAULT_COMPONENT_LIST MADchem MADmra MADtinyxml MADmuparser MADlinalg MADtensor MADmisc MADworld) set(Madness_MADchem_DEP_LIST MADmra) set(Madness_MADmra_DEP_LIST MADtinyxml MADmuparser MADlinalg) set(Madness_MADlinalg_DEP_LIST MADtensor) set(Madness_MADtensor_DEP_LIST MADmisc) set(Madness_MADmisc_DEP_LIST MADworld) set(Madness_MADworld_DEP_LIST) # Check for valid component list foreach(_comp ${Madness_FIND_COMPONENTS}) if(NOT "${Madness_DEFAULT_COMPONENT_LIST}" MATCHES "${_comp}") message(FATAL_ERROR "Invalid MADNESS component: ${_comp}") endif() endforeach() # Set MADNESS libraries variable foreach(_comp ${Madness_DEFAULT_COMPONENT_LIST}) # Search for Madness library find_library(Madness_${_comp}_LIBRARY ${_comp} PATHS ${Madness_LIBRARY_DIR} NO_DEFAULT_PATH) # Check that the library component was found if(Madness_${_comp}_LIBRARY) set(Madness_${_comp}_FOUND TRUE) # Set MADNESS libraries variable if("${Madness_FIND_COMPONENTS}" MATCHES "${_comp}" OR NOT Madness_FIND_COMPONENTS) list(APPEND Madness_LIBRARIES ${Madness_${_comp}_LIBRARY}) endif() else() if(Madness_FIND_REQUIRED_${_comp} OR (NOT Madness_FIND_COMPONENTS AND Madness_FIND_REQUIRED)) # Fail due to missing required component. 
MESSAGE(FATAL_ERROR "!!ERROR: Madness ${_comp} library is not available.") endif() set(Madness_${_comp}_FOUND FALSE) endif() # Check for dependencies in the component list if("${Madness_FIND_COMPONENTS}" MATCHES "${_comp}") foreach(_comp_dep ${Madness_${_comp}_DEP_LIST}) # Add dependency to the component list if missing if(NOT "${Madness_FIND_COMPONENTS}" MATCHES "${_comp_dep}") list(APPEND Madness_FIND_COMPONENTS ${_comp_dep}) endif() # Set required flag for component dependencies if(Madness_FIND_REQUIRED_${_comp}) set(Madness_FIND_REQUIRED_${_comp_dep} TRUE) else() set(Madness_FIND_REQUIRED_${_comp_dep} FALSE) endif() endforeach() endif() endforeach() # Set Fortran 77 integer size used by MADNESS set(Madness_F77_INTEGER_SIZE @MADNESS_FORTRAN_DEFAULT_INTEGER_SIZE@) # Clear local variables unset(Madness_DEFAULT_COMPONENT_LIST) unset(Madness_MADchem_DEP_LIST) unset(Madness_MADmra_DEP_LIST) unset(Madness_MADlinalg_DEP_LIST) unset(Madness_MADtensor_DEP_LIST) unset(Madness_MADmisc_DEP_LIST) unset(Madness_MADworld_DEP_LIST) madness-0.10/configure.ac000066400000000000000000000317561253377057000153770ustar00rootroot00000000000000AC_PREREQ(2.59) # was 2.61 but jaguar is not up to date so backported which # involved replacing AC_PROG_SED and UINT*_T type checks. AC_INIT([MADNESS], [0.10], [https://github.com/m-a-d-n-e-s-s/madness], , [https://github.com/m-a-d-n-e-s-s/madness]) AC_CONFIG_SRCDIR([configure.ac]) # This is where autoconf automatically generated files go AC_CONFIG_AUX_DIR([config]) # This is where local macros will be stored (by us) AC_CONFIG_MACRO_DIR([config]) # Default options for automake ?? AM_INIT_AUTOMAKE([-Wall -Werror nostdinc foreign]) m4_pattern_allow([AM_PROG_AR]) AM_PROG_AR # We want #defines output into a header file AC_CONFIG_HEADERS([src/madness/config.h]) # Detect specific systems so can provide their whacky defaults ACX_CRAYXT ACX_CRAYXE ACX_MAC ACX_IBMBGP ACX_IBMBGQ #AC_AIX ACX_WITH_STUBMPI # Set default thread processor-affinity BIND=${BIND-"-1 -1 -1"} AC_DEFINE_UNQUOTED([MAD_BIND_DEFAULT], ["$BIND"], [The default binding for threads]) # This sets host, host_cpu, host_vendor, host_os AC_CANONICAL_HOST AC_MSG_NOTICE([HOST INFORMATION: $host $host_cpu $host_vendor $host_os]) AC_DEFINE_UNQUOTED([HOST_CPU], ["$host_cpu"], [Defines the host cpu (x86, x86_64, ...).]) AC_DEFINE_UNQUOTED([HOST_SYSTEM], ["$host_os"], [Defines the host os (linux-gnu, ...).]) # Pick out CPUs for which we have special source files (assembler) use_x86_64_asm=no use_x86_32_asm=no AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[ #if !(defined(__x86_64__) || defined(_M_X64)) #error Not x86_64 #endif ]])], [use_x86_64_asm=yes]) if test $use_x86_64_asm = no; then AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[ #if !(defined(__i386) || defined(_M_IX86)) #error Not x86 #endif ]])], [use_x86_32_asm=yes]) fi # Define AM conditional for use of aforementioned assembler AM_CONDITIONAL([USE_X86_64_ASM], [test $use_x86_64_asm = yes]) AM_CONDITIONAL([USE_X86_32_ASM], [test $use_x86_32_asm = yes]) # All source will be built with the MPI C++ and C compilers ACX_MPI initial_CFLAGS="$CFLAGS" initial_CXXFLAGS="$CXXFLAGS" AC_PROG_CC([$MPICC]) AC_PROG_CXX([$MPICXX]) AC_PROG_F77([$MPIF77]) # f90 is a superset of f77 AC_LANG_CPLUSPLUS AC_MSG_NOTICE([Compiler choices after 'AC LANG' ... $CXX and $CC]) CFLAGS="$initial_CFLAGS" CXXFLAGS="$initial_CXXFLAGS" # Check that the MPICC Compiler can link correctly. 
AC_LANG_PUSH([C]) AC_MSG_CHECKING([that the $MPICC linker works]) AC_LINK_IFELSE( [AC_LANG_PROGRAM([[]], [[]])], [AC_MSG_RESULT([yes])], [AC_MSG_RESULT([no]) AC_MSG_ERROR(["Unable to link with $MPICC])]) AC_LANG_POP # Check that the MPICXX Compiler can link correctly. AC_LANG_PUSH([C++]) AC_MSG_CHECKING([that the $MPICXX linker works]) AC_LINK_IFELSE( [AC_LANG_PROGRAM([[]], [[]])], [AC_MSG_RESULT([yes])], [AC_MSG_RESULT([no]) AC_MSG_ERROR(["Unable to link with $MPICXX])]) AC_LANG_POP ACX_DETECT_CXX # Check for C++11 support, and # add the -std=c++11 or -std=c++0x flag to CXXFLAGS if necessary. AX_CXX_COMPILE_STDCXX_11([noext],[mandatory]) # Set debugging ACX_ENABLE_DEBUGGING # Set optimization ACX_ENABLE_OPTIMIZATION # Set warning flags by compiler vendor ACX_ENABLE_WARN # Selelect best compilation flags ACX_ENABLE_OPTIMAL # Get optional external libraries inplace so that building will partially check them #ACX_WITH_LIBUNWIND ... no longer needed for google perf? ACX_WITH_GOOGLE_PERF ACX_WITH_GOOGLE_TEST ACX_WITH_LIBXC ACX_WITH_MADXC ACX_WITH_TBB ACX_ENABLE_GENTENSOR # Set task profiler option ACX_ENABLE_TASK_PROFILER # Check for basic build functionality AC_PROG_RANLIB AC_PROG_INSTALL AC_PROG_LN_S AC_PROG_MAKE_SET AM_PROG_AS #AC_PROG_SED AC_CHECK_PROG([HAVE_SED],[sed],[YES],[NO]) if test $HAVE_SED = YES; then AC_PATH_PROG(SED,sed) fi # We need these to build the documentation AC_CHECK_PROG([HAVE_DOXYGEN],[doxygen],[YES],[NO]) if test $HAVE_DOXYGEN = YES; then AC_PATH_PROG(DOXYGEN,doxygen) fi AC_CHECK_PROG([HAVE_DOT],[dot],[YES],[NO]) if test $HAVE_DOT = YES; then AC_PATH_PROG(DOT,dot) fi AC_CHECK_PROG([HAVE_PDFLATEX],[pdflatex],[YES],[NO]) if test $HAVE_PDFLATEX = YES; then AC_PATH_PROG(PDFLATEX,pdflatex) fi AC_CHECK_PROG([HAVE_LATEX],[latex],[YES],[NO]) if test $HAVE_LATEX = YES; then AC_PATH_PROG(LATEX,latex) fi AM_CONDITIONAL([CAN_BUILD_DOC], [test $HAVE_DOXYGEN = YES]) AC_HEADER_TIME AC_HEADER_STDC AC_CHECK_HEADERS([limits.h stddef.h stdint.h stdlib.h string.h sys/param.h sys/time.h unistd.h]) AC_CHECK_HEADERS([bits/atomicity.h ext/atomicity.h]) # Check c++11 headers AC_CHECK_HEADERS([array type_traits tuple initializer_list]) AC_HEADER_STDBOOL # Types AC_TYPE_PID_T AC_C_RESTRICT ACX_CHECK_TLS AC_TYPE_SIZE_T AC_HEADER_TIME #AC_TYPE_UINT64_T #AC_TYPE_INT64_T AC_CHECK_TYPES([uint64_t, int64_t, uint32_t, int32_t, uint16_t, uint8_t, long long]) AC_C_VOLATILE AC_C_CONST AC_C_INLINE AC_CHECK_TYPES([ptrdiff_t]) # Library functions AC_CHECK_FUNCS([sleep random execv perror gettimeofday memmove memset pow sqrt strchr strdup getenv]) AC_FUNC_FORK # EFV: when using Intel compiler on OS X malloc is found to be non-GNU-compatible, which breaks gcc's cstdlib #AC_FUNC_MALLOC AC_FUNC_ERROR_AT_LINE ACX_POSIX_MEMALIGN # Check for Elemental ACX_WITH_ELEMENTAL # Set the default fortran integer type AC_ARG_WITH([fortran-int32], [AC_HELP_STRING([--with-fortran-int32], [Set FORTRAN integer size to 32 bit. 
By default, assume 32-bit integers.])], [ case $withval in yes) acx_fortran_int_size=4 ;; no) acx_fortran_int_size=8 ;; *) AC_MSG_ERROR([Unknown value provided for --with-fortran-int32]) esac ], [acx_fortran_int_size=4]) case $acx_fortran_int_size in 4) AC_DEFINE([MADNESS_FORTRAN_DEFAULT_INTEGER_SIZE],[4],[Default Fortran integer size.]) AC_SUBST([MADNESS_FORTRAN_DEFAULT_INTEGER_SIZE],[4]) AC_MSG_NOTICE([Default Fortran integer: integer*4]) ;; 8) if test x$ac_cv_type_int64_t = xno; then AC_MSG_ERROR([The compiler does not support int64_t, which is required to link with fortran libraries that use 8 byte integers. Rerun configure with '--with-fortran-int32' option.]) fi AC_DEFINE([MADNESS_FORTRAN_DEFAULT_INTEGER_SIZE],[8],[Default Fortran integer size.]) AC_SUBST([MADNESS_FORTRAN_DEFAULT_INTEGER_SIZE],[8]) AC_MSG_NOTICE([Default Fortran integer: integer*8]) esac ACX_WITH_MKL # Checks for Fortran linking conventions and BLAS+LAPACK at the same time ACX_FORTRAN_SYMBOLS # Compiler quirks ACX_UNQUALIFIED_STATIC_DECL ACX_STD_ABS # Check for xterm AC_CHECK_PROG(XTERM,xterm,xterm,no) if test $XTERM = 'xterm'; then AC_DEFINE(HAVE_XTERM,[1],[If set indicates xterm is in path]) fi # Check for gdb AC_CHECK_PROG(GDB,gdb,gdb,no) if test $GDB = 'gdb'; then AC_DEFINE(HAVE_GDB,[1],[If set indicates gdb is in path]) fi # How to handle MADNESS exceptions. AC_ARG_ENABLE(madex, [AC_HELP_STRING([--enable-madex=arg], [Specifies behavior of MADNESS assertions: throw (default), abort, assert, disable])], [madex=$enableval], [madex=throw]) case $madex in abort) AC_MSG_NOTICE([MADNESS assertions will abort]) AC_DEFINE([MADNESS_ASSERTIONS_ABORT], [1], [Set if MADNESS assertions abort]) ;; assert) AC_MSG_NOTICE([MADNESS assertions will assert]) AC_DEFINE([MADNESS_ASSERTIONS_ASSERT], [1], [Set if MADNESS assertions assert]) ;; disable) AC_MSG_NOTICE([MADNESS assertions are disabled]) AC_DEFINE([MADNESS_ASSERTIONS_DISABLE], [1], [Set if MADNESS assertions disabled]) ;; *) AC_MSG_NOTICE([MADNESS assertions will throw]) AC_DEFINE([MADNESS_ASSERTIONS_THROW], [1], [Set if MADNESS assertions throw]) ;; esac # Determine if we should use the instrumented new/delete AC_ARG_ENABLE([memstats], [AC_HELP_STRING([--enable-memstats], [If yes, gather memory statistics (expensive)])], [memstats=$enableval], [memstats=no]) if test "$memstats" = "yes"; then AC_MSG_NOTICE([MADNESS will gather memory statistics (expensive)]) AC_DEFINE([WORLD_GATHER_MEM_STATS], [1], [Set if MADNESS gathers memory statistics]) fi AC_ARG_ENABLE([profile], [AC_HELP_STRING([--enable-profile], [Enables profiling])], [AC_MSG_NOTICE([Enabling profiling]); AC_DEFINE(WORLD_PROFILE_ENABLE, [1], [Define if should enable profiling])], []) AC_ARG_ENABLE([tensor-bounds-checking], [AC_HELP_STRING([--enable-tensor-bounds-checking], [Enable checking of bounds in tensors ... 
slow but useful for debugging])], [AC_MSG_NOTICE([Enabling tensor bounds checking]); AC_DEFINE(TENSOR_BOUNDS_CHECKING, [1], [Define if should enable bounds checking in tensors])], []) AC_ARG_ENABLE([tensor-instance-count], [AC_HELP_STRING([--enable-tensor-instance-count], [Enable counting of allocated tensors for memory leak detection])], [AC_MSG_NOTICE([Enabling tensor instance counting]); AC_DEFINE(TENSOR_INSTANCE_COUNT, [1], [Define if should enable instance counting in tensors])], []) AC_ARG_ENABLE([spinlocks], [AC_HELP_STRING([--enable-spinlocks], [Enables use of spinlocks instead of mutexs (faster unless over subscribing processors)])], [AC_MSG_NOTICE([Enabling use of spinlocks]); AC_DEFINE(USE_SPINLOCKS, [1], [Define if should use spinlocks])], []) AC_ARG_ENABLE([never-spin], [AC_HELP_STRING([--enable-never-spin], [Disables use of spinlocks (notably for use inside virtual machines)])], [AC_MSG_NOTICE([Disabling use of spinlocks]); AC_DEFINE(NEVER_SPIN, [1], [Define if should use never use spinlocks])], []) AC_ARG_WITH([papi], [AC_HELP_STRING([--with-papi], [Enables use of PAPI])], [AC_MSG_NOTICE([Enabling use of PAPI]); AC_DEFINE(HAVE_PAPI,[1], [Define if have PAPI])], []) # Capture configuration info for user by compiled code AC_DEFINE_UNQUOTED([MADNESS_CONFIGURATION_USER],["$USER"],[User that configured the code]) AC_DEFINE_UNQUOTED([MADNESS_CONFIGURATION_DATE],["`date`"],[Date of configuration]) AC_DEFINE_UNQUOTED([MADNESS_CONFIGURATION_CXX], ["$CXX"],[Configured C++ compiler]) AC_DEFINE_UNQUOTED([MADNESS_CONFIGURATION_CXXFLAGS], ["$CXXFLAGS"],[Configured C++ compiler flags]) AC_DEFINE_UNQUOTED([MADNESS_CONFIGURATION_HOST], ["$HOSTNAME"],[Configured on this machine]) # Not yet enabled the user to configure this stuff but probably should AC_ARG_ENABLE([bsend-ack], [AC_HELP_STRING([--disable-bsend-ack], [Use MPI Send instead of MPI Bsend for huge message acknowledgements.])], [ if test "x$enableval" != xno; then AC_DEFINE([MADNESS_USE_BSEND_ACKS],[1],[Define when MADNESS should use Bsend for huge message acknowledgements.]) fi ], [ AC_DEFINE([MADNESS_USE_BSEND_ACKS],[1],[Define when MADNESS should use Bsend for huge message acknowledgements.]) ]) # Set the MPI thread level used by MADNESS MADNESS_MPI_THREAD_LEVEL=MPI_THREAD_SERIALIZED if test X$MADNESS_HAS_ELEMENTAL = X1; then MADNESS_MPI_THREAD_LEVEL=MPI_THREAD_MULTIPLE fi AC_ARG_WITH([mpi-thread], [AC_HELP_STRING([--with-mpi-thread], [Thread level for MPI (multiple,serialized)])], [ case $withval in multiple) MADNESS_MPI_THREAD_LEVEL=MPI_THREAD_MULTIPLE ;; serialized) MADNESS_MPI_THREAD_LEVEL=MPI_THREAD_SERIALIZED ;; *) AC_MSG_ERROR([Invalid value for --with-mpi-thread ($withval)]) ;; esac ]) AC_DEFINE_UNQUOTED([MADNESS_MPI_THREAD_LEVEL], [$MADNESS_MPI_THREAD_LEVEL], [Thread-safety level requested from MPI by MADNESS]) AC_MSG_NOTICE([MPI thread-safety requested: $MADNESS_MPI_THREAD_LEVEL]) # Display options AC_MSG_NOTICE([CPPFLAGS = $CPPFLAGS]) AC_MSG_NOTICE([CFLAGS = $CFLAGS]) AC_MSG_NOTICE([CXXFLAGS = $CXXFLAGS]) AC_MSG_NOTICE([LDFLAGS = $LDFLAGS]) AC_MSG_NOTICE([LIBS = $LIBS]) # These are the files that must be generated/preprocessed ... as you add files # please maintain the directory tree file order and indenting. 
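#
# Illustrative aside (not part of configure.ac): the MADNESS_CONFIGURATION_*
# symbols captured above become string macros in the generated config header,
# so applications can report how the library was built.  The program below is
# invented for illustration only, and the include path of the generated header
# (AC_CONFIG_HEADERS places it at src/madness/config.h) is an assumption.
#
#   // report_config.cc -- hypothetical stand-alone example
#   #include <iostream>
#   #include <madness/config.h>   // installed location assumed
#
#   int main() {
#       std::cout << "configured by  " << MADNESS_CONFIGURATION_USER     << "\n"
#                 << "configured on  " << MADNESS_CONFIGURATION_DATE     << "\n"
#                 << "C++ compiler   " << MADNESS_CONFIGURATION_CXX      << "\n"
#                 << "compiler flags " << MADNESS_CONFIGURATION_CXXFLAGS << "\n"
#                 << "build host     " << MADNESS_CONFIGURATION_HOST     << "\n";
#       return 0;
#   }
#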
AC_CONFIG_FILES([ Makefile config/Makefile config/Makefile.sample config/MADNESS.pc config/madness-config.cmake src/Makefile src/madness/Makefile src/madness/external/Makefile src/madness/external/gtest/Makefile src/madness/external/muParser/Makefile src/madness/external/tinyxml/Makefile src/madness/world/Makefile src/madness/misc/Makefile src/madness/tensor/Makefile src/madness/mra/Makefile src/apps/Makefile src/apps/chem/Makefile src/apps/tdse/Makefile src/apps/hf/Makefile src/apps/exciting/Makefile src/apps/moldft/Makefile src/apps/polar/Makefile src/apps/nick/Makefile src/apps/interior_bc/Makefile src/examples/Makefile doc/Makefile doc/doxygen.cfg ]) AC_OUTPUT madness-0.10/doc/000077500000000000000000000000001253377057000136425ustar00rootroot00000000000000madness-0.10/doc/Latex/000077500000000000000000000000001253377057000147175ustar00rootroot00000000000000madness-0.10/doc/Latex/api.tex000066400000000000000000001060731253377057000162210ustar00rootroot00000000000000% This file was converted to LaTeX by Writer2LaTeX ver. 1.1.7 % see http://writer2latex.sourceforge.net for more info \documentclass[letterpaper]{article} \usepackage[ascii]{inputenc} \usepackage[T1]{fontenc} \usepackage[english]{babel} \usepackage{amsmath} \usepackage{amssymb,amsfonts,textcomp} \usepackage{color} \usepackage{array} \usepackage{hhline} \usepackage{hyperref} \hypersetup{pdftex, colorlinks=true, linkcolor=blue, citecolor=blue, filecolor=blue, urlcolor=blue, pdftitle=, pdfauthor=Robert Harrison, pdfsubject=, pdfkeywords=} % Outline numbering \setcounter{secnumdepth}{0} \renewcommand\thesubsection{\arabic{subsection}} \renewcommand\thesubsubsection{\arabic{subsection}.\arabic{subsubsection}} \renewcommand\theparagraph{\arabic{subsection}.\arabic{subsubsection}.\arabic{paragraph}} % List styles \newcommand\liststyleLi{% \renewcommand\labelitemi{${\bullet}$} \renewcommand\labelitemii{${\circ}$} \renewcommand\labelitemiii{${\blacksquare}$} \renewcommand\labelitemiv{${\bullet}$} } \newcommand\liststyleLii{% \renewcommand\labelitemi{${\bullet}$} \renewcommand\labelitemii{${\circ}$} \renewcommand\labelitemiii{${\blacksquare}$} \renewcommand\labelitemiv{${\bullet}$} } \newcommand\liststyleLiii{% \renewcommand\labelitemi{${\bullet}$} \renewcommand\labelitemii{${\circ}$} \renewcommand\labelitemiii{${\blacksquare}$} \renewcommand\labelitemiv{${\bullet}$} } \newcommand\liststyleLiv{% \renewcommand\labelitemi{${\bullet}$} \renewcommand\labelitemii{${\circ}$} \renewcommand\labelitemiii{${\blacksquare}$} \renewcommand\labelitemiv{${\bullet}$} } \newcommand\liststyleLv{% \renewcommand\labelitemi{${\bullet}$} \renewcommand\labelitemii{${\circ}$} \renewcommand\labelitemiii{${\blacksquare}$} \renewcommand\labelitemiv{${\bullet}$} } \newcommand\liststyleLvi{% \renewcommand\labelitemi{${\bullet}$} \renewcommand\labelitemii{${\circ}$} \renewcommand\labelitemiii{${\blacksquare}$} \renewcommand\labelitemiv{${\bullet}$} } \newcommand\liststyleLvii{% \renewcommand\labelitemi{${\bullet}$} \renewcommand\labelitemii{${\circ}$} \renewcommand\labelitemiii{${\blacksquare}$} \renewcommand\labelitemiv{${\bullet}$} } \newcommand\liststyleLviii{% \renewcommand\labelitemi{${\bullet}$} \renewcommand\labelitemii{${\circ}$} \renewcommand\labelitemiii{${\blacksquare}$} \renewcommand\labelitemiv{${\bullet}$} } \newcommand\liststyleLix{% \renewcommand\labelitemi{${\bullet}$} \renewcommand\labelitemii{${\circ}$} \renewcommand\labelitemiii{${\blacksquare}$} \renewcommand\labelitemiv{${\bullet}$} } % Page layout (geometry) \setlength\voffset{-1in} 
\setlength\hoffset{-1in} \setlength\topmargin{0.7874in} \setlength\oddsidemargin{0.7874in} \setlength\textheight{9.062033in} \setlength\textwidth{6.9251995in} \setlength\footskip{26.148pt} \setlength\headheight{0cm} \setlength\headsep{0cm} % Footnote rule \setlength{\skip\footins}{0.0469in} \renewcommand\footnoterule{\vspace*{-0.0071in}\setlength\leftskip{0pt}\setlength\rightskip{0pt plus 1fil}\noindent\textcolor{black}{\rule{0.25\columnwidth}{0.0071in}}\vspace*{0.0398in}} % Pages styles \makeatletter \newcommand\ps@Standard{ \renewcommand\@oddhead{} \renewcommand\@evenhead{} \renewcommand\@oddfoot{\hfill \thepage{}} \renewcommand\@evenfoot{\@oddfoot} \renewcommand\thepage{\arabic{page}} } \makeatother \pagestyle{Standard} \title{} \author{Robert Harrison} \date{2009-12-14} \begin{document} \section*{Summary of the MADNESS application programming interface} THIS IS SADLY OUT OF DATE ... probably best to ignore this document until it is updated. \subsection{Low-level sequential runtime} \subsubsection{Shared pointers (sharedptr.h)} Shared pointers in the spirit of Boost and the new C++ standard but with thread-safe reference counting and additional hooks to facilitate remote access, stealing, etc. Probably needs redesigning to sit on the new standard as appropriate, but basic functionality will remain the same. Basic rule of thumb is when you allocate memory to immediately wrap the pointer in a \texttt{SharedPointer} (or \texttt{SharedArray}) so that you do not have to remember to delete it and so you can freely give others the pointer without anyone worrying about premature or double freeing of the data. Fast, thread-safe reference counting is performed using the atomic operations. {\ttfamily namespace madness \{} {\ttfamily \ \ template {\textless}typename T{\textgreater} class SharedPtr \{} {\ttfamily \ \ \ \ /// Default constructor makes an null pointer} {\ttfamily \ \ \ \ SharedPtr(); } {\ttfamily \ \ \ \ /// Wrap a pointer with optional custom deleter} {\ttfamily \ \ \ \ explicit SharedPtr(T* ptr, void (*deleter)(T*)=0) ;} {\ttfamily \ \ \ \ /// Wrap a pointer with optional custom deleter \& ownership} {\ttfamily \ \ \ \ explicit SharedPtr(T* ptr, bool own, void (*deleter)(T*)=0);} {\ttfamily \ \ \ \ /// Copy constructor generates new reference to same pointer} {\ttfamily \ \ \ \ SharedPtr(const SharedPtr{\textless}T{\textgreater}\& s) ;} {\ttfamily \ \ \ \ /// Copy constructor with static type conversion} {\ttfamily \ \ \ \ template {\textless}typename Q{\textgreater} SharedPtr(const SharedPtr{\textless}Q{\textgreater}\& s);} {\ttfamily \ \ \ \ /// Destructor decrements ref. 
count freeing data if zero} {\ttfamily \ \ \ \ virtual \~{}SharedPtr();} {\ttfamily \ \ \ \ /// Assignment} {\ttfamily \ \ \ \ SharedPtr{\textless}T{\textgreater}\& operator=(const SharedPtr{\textless}T{\textgreater}\& s);} {\ttfamily \ \ \ \ /// Returns number of references} {\ttfamily \ \ \ \ int use\_count() const;} {\ttfamily \ \ \ \ /// Returns the value of the pointer} {\ttfamily \ \ \ \ T* get() const;} {\ttfamily \ \ \ \ /// Returns true if the SharedPtr owns the pointer} {\ttfamily \ \ \ \ bool owned() const; \ } {\ttfamily \ \ \ \ /// Cast of SharedPtr{\textless}T{\textgreater} to T* returns the value of the pointer} {\ttfamily \ \ \ \ operator T*() const;} {\ttfamily \ \ \ \ \ /// Return pointer+offset} {\ttfamily \ \ \ \ T* operator+(long offset) const;} {\ttfamily \ \ \ \ /// Return pointer-offset} {\ttfamily \ \ \ \ T* operator-(long offset) const;} {\ttfamily \ \ \ \ \ /// Dereferencing returns a reference to pointed value} {\ttfamily \ \ \ \ T\& operator*() const;} {\ttfamily \ \ \ \ \ \ /// Member access via pointer works as expected} {\ttfamily \ \ \ \ T* operator-{\textgreater}() const;} {\ttfamily \ \ \ \ /// Array indexing returns reference to indexed value} {\ttfamily \ \ \ \ T\& operator[](long index) const;} {\ttfamily \ \ \ \ /// Boolean value (test for null pointer)} {\ttfamily \ \ \ \ operator bool() const;} {\ttfamily \ \ \ \ /// Are two pointers equal?} {\ttfamily \ \ \ \ bool operator==(const SharedPtr{\textless}T{\textgreater}\& other) const;} {\ttfamily \ \ \ \ /// Are two pointers not equal?} {\ttfamily \ \ \ \ bool operator!=(const SharedPtr{\textless}T{\textgreater}\& other) const;} {\ttfamily \ \ \ \ /// Steal an un-owned reference to the pointer } {\ttfamily \ \ \ \ SharedPtr{\textless}T{\textgreater} steal() const;} {\ttfamily \ \ \};} {\ttfamily \ \ \ /// A SharedArray uses delete [] is used to free it} {\ttfamily \ \ \ template {\textless}class T{\textgreater} \ class SharedArray : public SharedPtr{\textless}T{\textgreater} \{} {\ttfamily \ \ \ public:} {\ttfamily \ \ \ \ SharedArray(T* ptr = 0);} {\ttfamily \ \ \ \ \ \ SharedArray(const SharedArray{\textless}T{\textgreater}\& s);} {\ttfamily \ \ \ \ \ /// Assignment} {\ttfamily \ \ \ \ SharedArray\& operator=(const SharedArray\& s);} {\ttfamily \ \ \};} {\ttfamily \}} \subsubsection{Serialization (archive.h, buffer\_archive.h, mpi\_archive.h, binary\_fstream\_archive.h, vector\_archive.h, parallel\_archive.h)} Central to MADNESS's ability to move data and computation freely around the machine is the ability to serialize all data types with full type safety. For fundamental data types and standard containers this is already provided in \texttt{archive.h} in which it is also described how to accomplish this for user types. See there for more info. Various archives (serialized streams) are provided including MPI p2p messaging, binary/text sequential files, vectors, in-memory buffers (special) and parallel archives. 
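As a concrete illustration, the sketch below stores and reloads a user-defined type through a binary file archive. It is a minimal example and not taken from the MADNESS sources: it assumes the conventions described in \texttt{archive.h} (a templated \texttt{serialize} member and an \texttt{operator\&} that both stores and loads) and guesses the installed include path, so both should be checked against \texttt{archive.h} and the installed headers.

\begin{verbatim}
// Minimal sketch: store and reload a user type via a binary file archive.
// The serialize-member convention and the include path are assumptions.
#include <vector>
#include <madness/world/binary_fstream_archive.h>

struct Point {
    double x, y;
    template <typename Archive>
    void serialize(Archive& ar) { ar & x & y; }  // same code stores and loads
};

int main() {
    std::vector<Point> v(10);
    {   // store
        madness::archive::BinaryFstreamOutputArchive out("points.dat");
        out & v;
    }
    {   // load
        madness::archive::BinaryFstreamInputArchive in("points.dat");
        in & v;
    }
    return 0;
}
\end{verbatim}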
Issues \liststyleLi \begin{itemize} \item mpi\_archive.h needs to be made independent of world so that it is more broadly useful and this discussion needs moving to the low-level parallel runtime level \item discussion of parallel\_archive.h should be up at the world level \item discussion of serialization of World and WorldObjects into buffer\_archive should be at the world level \item need to implement type-safe serialization of ConcurrentHashMap to archive.h \item need to discuss special role and overloading of buffer\_archive (basically user should use vector\_archive not buffer\_archive which is reserved for internal system use). \item thread safety of archives (there is none!) \end{itemize} \subsubsection{Exceptions and assertions (madness\_exception.h)} There is a poorly realized goal to have all exceptions generated by MADNESS be derived from \texttt{MadnessException}. The \texttt{MADNESS\_EXCEPTION} macro wraps throwing an exception and stores the line number, function name, and source file name in the object to aid in debugging. The \texttt{MADNESS\_ASSERTION} macro also captures the assertion code as a string. \subsubsection{Cpu time, wall time, processor frequency, cycle count (timers.h)} Fast and accurate timers. Still need to understand multi-threaded and multi-core behavior. \subsubsection{Profiling (worldprofile.h)} MADNESS is good at breaking many profilers due to extensive use of C++ templates, inlining, 1-sided communication, etc. For this reason we have a simple but lightweight profiling class derived from the open source SHINY project (see sourceforge). Issues: \liststyleLii \begin{itemize} \item Document the 3 or 4 useful macros \item Describe briefly how it works \item Make the darn thing thread safe \end{itemize} \subsubsection{Pseudo-random number generation (misc/ran.h)} A thread-safe pseudo-random number generator with single number and vector interfaces. It uses a lagged, Fibonacci sequence with parameters described in the header file. The quality should be very high, though there are even better generators now available and please note that although the implementation has been tested in various ways including using Marsaglia's die-hard test the generator has not yet been employed in applications especially sensitive to the quality of the numbers. You can have multiple independent instantiations of \texttt{Random} to obtain multiple independent streams. The \texttt{RandomValue} and \texttt{RandomVector} functions all map to one shared default instance. 
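A trivial usage sketch (the class and function prototypes are listed immediately below):
{\ttfamily double d = madness::RandomValue{\textless}double{\textgreater}(); // one value from the shared default stream}
{\ttfamily std::vector{\textless}double{\textgreater} v(100);}
{\ttfamily madness::RandomVector{\textless}double{\textgreater}(100, \&v[0]); // fill v from the shared default stream}
{\ttfamily madness::Random r(1234); // a private, independent stream}
{\ttfamily double x = r.get(); \ \ \ \ // uniform in [0,1)}
\bigskip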
{\ttfamily namespace madness \{} {\ttfamily \ \ struct RandomState; /// Used to set/get state of the generator} {\ttfamily \ \ class Random \{} {\ttfamily \ \ public:} {\ttfamily \ \ \ \ \ Random(unsigned int seed = 5461);} {\ttfamily \ \ \ \ \ \ virtual \~{}Random();} {\ttfamily \ \ \ \ \ double get() ; // Returns a value uniform in [0,1)} {\ttfamily \ \ \ \ /// Returns a vector uniform in [0,1)} {\ttfamily \ \ \ \ \ \ template {\textless}typename T{\textgreater} \ void getv(int n, T * restrict v); } {\ttfamily \ \ \ \ /// Returns vector of random bytes in [0,256)} {\ttfamily \ \ \ \ \ \ void getbytes(int n, unsigned char * restrict v); } {\ttfamily \ \ \ \ \ \ RandomState getstate() const; \ /// Returns state of generator} {\ttfamily \ \ \ \ \ \ void setstate(const RandomState \&s); /// Restores state} {\ttfamily \ \ \ \};} {\ttfamily \ \ /// Random value generator with specializations for float,} {\ttfamily \ \ /// double, complex{\textless}float{\textgreater}, complex{\textless}double{\textgreater}, int and long} {\ttfamily \ \ \ template {\textless}class T{\textgreater} \ \ T RandomValue();} {\ttfamily \ \ /// Random vector generator with specializations for float,} {\ttfamily \ \ /// double complex{\textless}float{\textgreater}, complex{\textless}double{\textgreater}, int and long} {\ttfamily \ \ \ template {\textless}class T{\textgreater} void RandomVector(int n, T* t);} {\ttfamily \}} \subsubsection{Miscellaneous (misc.h)} {\ttfamily namespace madness \{} {\ttfamily \ \ unsigned long checksum\_file(const char* filename);} {\ttfamily \ \ std::istream\& position\_stream(std::istream\& f, } {\ttfamily \ \ \ \ \ \ \ \ const std::string\& tag);} {\ttfamily \ \ std::string lowercase(const std::string\& s);} {\ttfamily \}} \subsection{Low level parallel runtime} This provides remote method invocation, threads, mutex, thread-safe containers, and the thread pool. At this level there is no knowledge of the World nor perhaps even of MPI. User applications will almost certainly never need these interfaces, unless extending low-level MADNESS functionality. The interfaces relating to threading will evolve to be compatible with the Intel TBB (we are already adopting some of their design elements). \subsubsection{SafeMPI -- an optionally serialized MPI wrapper (safempi.h)} All MPI communication must go through this interface. Wraps an MPI communicator and provides all necessary MPI functionality (please extend as necessary) plus a few small extensions. If the configure script determines that the local MPI implementation does not safely support entry by multiple threads, all calls into all instances of \texttt{SafeMPI} are protected by a single mutex to ensure that only one thread is ever inside the MPI library. At some point it may be necessary to implement funneling of all calls to one thread. {\ttfamily namespace SafeMPI \{} {\ttfamily \ \ class Intracomm \{} {\ttfamily \ \ public:} {\ttfamily \ \ \ \ ... standard MPI::Intracomm methods ...} {\ttfamily \ \ \ \ ... extensions to simplify p2p and tree-based communication} {\ttfamily \ \ \};} \bigskip {\ttfamily \ \ class Request \{} {\ttfamily \ \ \ \ public:} {\ttfamily \ \ ... standard MPI::Request methods ...} {\ttfamily \ \ \};} {\ttfamily \}} Rather than repeat the entire MPI interface we refer the user to the MPI manual or \texttt{safempi.h}. \subsubsection{RMI -- remote method invocation (worldrmi.h)} A single thread serves incoming active messages for all activities or Worlds on an SMP node. 
To be consistent with what they actually do, this class probably needs to be renamed to active message (\texttt{AM}) and \texttt{WorldAmInterface} renamed to \texttt{WorldRMI}. The multi-threaded processes are identified using their rank in \texttt{MPI::COMM\_WORLD}. You are responsible for translating ranks from within your communicator to that within \texttt{MPI::COMM\_WORLD} by getting the groups from both communicators using \texttt{Comm\_group} and then creating a map for the ranks using \texttt{Group\_translate\_ranks}. Instantiation of the singleton is not thread safe and must be done after initializing MPI but before it might be invoked by multiple threads. Before calling \texttt{MPI::Finalize} you must terminate the server so that it can clean up and terminate the server thread. Failure to do so will cause occasional SEGV upon program exit. Messages are a contiguous buffer of which the first \texttt{RMI::HEADER\_LEN} bytes are reserved for internal use. Upon receipt, the user payload (at \texttt{buf+HEADER\_LEN}) is guaranteed to be aligned at least on a 16-byte boundary. When sending and receiving a message the length parameter (\texttt{nbyte}) specifies the full length of the buffer including the header (i.e., the user data size is \texttt{nbyte-HEADER\_LEN}). The \texttt{RMI::Request} data structure is copyable and provides at least the \texttt{Test} and \texttt{Testsome} interfaces of MPI, but note that MPI may not be the underlying transport mechanism (on the Cray-XT we will soon be using a native Portals implementation). Unordered messages are processed in the order of receipt that due to multiple buffers and network routes may not correspond to the sending order. Ordered messages are processed in the order sent so that remote operations appear sequentially consistent to a single remote thread. Handlers should be lightweight operations and in particular should not block and to be fully safe should not send messages (though this does work in the current MPI implementation). Anything expensive should be put as a task into the thread pool (below). Note that there is just one server thread which is useful for simplifying maintenance of remote data structures, but means that if you abuse it communication can back up. {\ttfamily namespace madness \{} {\ttfamily \ \ typedef void (*rmi\_handlerT)(void* buf, size\_t nbyte);} {\ttfamily \ \ class RMI \{} {\ttfamily \ \ public:} {\ttfamily \ \ \ \ \ \ typedef Request; // Type of structure returned by isend} {\ttfamily \ \ \ \ \ \ static const size\_t HEADER\_LEN;} {\ttfamily \ \ \ \ \ \ static const size\_t MAX\_MSG\_LEN;} {\ttfamily \ \ \ \ \ \ static const unsigned int ATTR\_UNORDERED;} {\ttfamily \ \ \ \ \ \ \ static const unsigned int ATTR\_ORDERED;} {\ttfamily \ \ \ \ \ \ static void begin(); // Instantiate the server} {\ttfamily \ \ \ \ \ \ static void end(); // Terminate the server} {\ttfamily \ \ \ \ \ \ static Request isend(const void* buf, size\_t nbyte, \newline \ \ \ \ \ \ \ \ ProcessID dest, rmi\_handlerT func, \newline \ \ \ \ \ \ \ \ unsigned int attr=ATTR\_UNORDERED); // Async. send} {\ttfamily \ \ \};} {\ttfamily \}} \subsubsection{Atomic operations on integers (madatomic.h)} This minimal API is likely to change to become more compatible with Intel TBB and also as we port to more machines that might force changes. Presently all functionality is provided via macros (since the original code is old and came from C). The argument \texttt{ptr} refers to a pointer to a \texttt{MADATOMIC\_INT}. 
All operations on an atomic integer must be through the API to ensure appropriate fencing of both memory and instructions. {\ttfamily MADATOMIC\_FENCE // Presently null and unused} {\ttfamily MADATOMIC\_INT // Type of an atomic integer (platform dependent).} {\ttfamily MADATOMIC\_INT\_GET(ptr) // Read value of atomic integer} {\ttfamily MADATOMIC\_INT\_SET(ptr,value) // Write value to atomic integer} {\ttfamily MADATOMIC\_INT\_INC(ptr) // Atomic increment of atomic integer} {\ttfamily MADATOMIC\_INT\_DEC\_AND\_TEST(ptr) // Decrement and test atomic integer} {\ttfamily \ \ \ \ \ \ \ \ \ \ \ \ \ \ // Returns true if result is zero} {\ttfamily MADATOMIC\_INITIALIZE(val) // For static initialization} \bigskip Static initialization of an integer (i.e., where it is declared) must be performed as follows {\ttfamily MADATOMIC\_INT atom = MADATOMIC\_INITIALIZE(value)} \bigskip \subsubsection{Mutex, MutexReaderWriter, ScopedMutex, MutexWaiter (thread.h)} Simple locking mechanisms based upon either atomic memory operations or Pthread mutexes. The locks are spin-locks with a rudimentary back-off algorithm implemented by \texttt{MutexWaiter}. Over subscribing processors might cause performance problems without care. \texttt{Mutex} provides a simple lock. \texttt{MutexReaderWriter} provides a non-exclusive read lock and an exclusive write lock. \texttt{ScopedMutex} wraps an existing \texttt{Mutex} to protect an entire scope following the rules of the lifetime of C++ objects. Mutexes of any kind cannot be copied or assigned. The locks are not recursive -- i.e., if a thread has a lock and attempts to lock it again the results are undefined (probably deadlock). {\ttfamily namespace madness \{} {\ttfamily \ \ class Mutex \{} {\ttfamily \ \ public:\newline \ \ \ \ Mutex();} {\ttfamily \ \ \ \ void lock() const;} {\ttfamily \ \ \ \ void unlock() const;} {\ttfamily \ \ \ \ bool try\_lock() const; // Returns true if acquired} {\ttfamily \ \ \ \ virtual \~{}Mutex();} {\ttfamily \ \ \};} \bigskip {\ttfamily \ \ class MutexReaderWriter \{} {\ttfamily \ \ public:} {\ttfamily \ \ \ \ \ static const int NOLOCK;} {\ttfamily \ \ \ \ static const int READLOCK;} {\ttfamily \ \ \ \ static const int WRITELOCK;} {\ttfamily \ \ \ \ bool try\_read\_lock() const;} {\ttfamily \ \ \ \ bool try\_write\_lock() const;} {\ttfamily \ \ \ \ bool try\_lock(int lockmode) const;} {\ttfamily \ \ \ \ bool try\_convert\_read\_lock\_to\_write\_lock() const;} {\ttfamily \ \ \ \ void read\_lock() const;} {\ttfamily \ \ \ \ void write\_lock() const;} {\ttfamily \ \ \ \ void lock(int lockmode) const;} {\ttfamily \ \ \ \ void read\_unlock() const;} {\ttfamily \ \ \ \ void write\_unlock() const;} {\ttfamily \ \ \ \ void unlock(int lockmode) const;} {\ttfamily \ \ \ \ void convert\_read\_lock\_to\_write\_lock() const;} {\ttfamily \ \ \ \ void convert\_write\_lock\_to\_read\_lock() const;} {\ttfamily \ \ \ \ virtual \~{}MutexReaderWriter();} {\ttfamily \ \ \};} \bigskip {\ttfamily \ \ class ScopedMutex \{} {\ttfamily \ \ public:} {\ttfamily \ \ \ \ ScopedMutex(Mutex\&); // Constructor acquires lock} {\ttfamily \ \ \ \ ScopedMutex(Mutex*); // Constructor acquires lock\newline \ \ \ \ virtual \~{}ScopedMutex(); // Destructor releases lock} {\ttfamily \ \ \};} \bigskip {\ttfamily \ \ /// Attempt to acquire two locks without blocking while \newline \ \ /// holding either one} {\ttfamily \ \ bool try\_two\_locks(const Mutex\& m1, const Mutex\& m2);} {\ttfamily \}} \bigskip \subsubsection{Threads (thread.h)} This provides simple wrappers around POSIX 
standard Pthread calls to simplify the creation of detached, system-scheduled threads. It is probable that this interface will change either to match Intel TBB or the new C++ standard. The first wrapper provides a base class that enables object instances to be associated with threads. The second is probably not worth discussing. The derived class must call \texttt{ThreadBase::start()} to commence the thread, which executes the virtual \texttt{run()} method. To terminate the thread the \texttt{run()} method can return, or the thread can invoke \texttt{ThreadBase::exit()}. Another thread can cancel the thread by using \texttt{ThreadBase::cancel()}.
{\ttfamily namespace madness \{}
{\ttfamily \ \ class ThreadBase \{}
{\ttfamily \ \ public:}
{\ttfamily \ \ \ \ ThreadBase(); // Must invoke start() to start the thread.}
{\ttfamily \ \ \ \ virtual void run() = 0; // Implement this to do work}
{\ttfamily \ \ \ \ void start(); // Start the thread running}
{\ttfamily \ \ \ \ static void exit(); // Call for premature exit from run()}
{\ttfamily \ \ \ \ const pthread\_t\& get\_id(); // Get pthread id (if running)}
{\ttfamily \ \ \ \ int get\_pool\_thread\_index() const; // Index in ThreadPool or -1}
{\ttfamily \ \ \ \ int cancel() const; // Cancel this thread}
{\ttfamily \ \ \ \ virtual \~{}ThreadBase();}
{\ttfamily \ \ \};}
{\ttfamily \}}
\bigskip
Mmm ... I see that the ThreadBase destructor does not cancel the thread if it is still running ... probably it should? I seem to recall an issue here, but it seems bad to delete the object and leave the thread running.
\subsubsection{Tasks, task attributes and the thread pool (thread.h)}
The primary (only?) source of SMP concurrency inside MADNESS comes from inserting tasks into the task queue for execution by the pool of threads. The class is a singleton. The one pool of threads serves all activities and worlds within the SMP node so as to avoid oversubscribing the processors and to facilitate (eventually) switching between cache-aware and cache-oblivious computation. Presently only the cache-oblivious mode exists. A task is derived from \texttt{PoolTaskInterface} and is submitted to the queue by invoking \texttt{ThreadPool::add()}, which takes ownership of the task and deletes it upon completion. Tasks have attributes that presently can specify high priority (the task is inserted at the front of the queue instead of the rear), generation of additional parallelism, and stealability. The last two are presently ignored. As for the RMI class, instantiate the singleton with \texttt{ThreadPool::begin()} before adding tasks to avoid a race condition. The number of threads may be specified as an argument or via the environment variable \texttt{POOL\_NTHREAD} (probably to be renamed).
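For concreteness, a minimal sketch consistent with the prototypes listed below; synchronization with task completion (normally achieved with futures or a fence at the World level) is omitted.
{\ttfamily class HelloTask : public madness::PoolTaskInterface \{}
{\ttfamily public:}
{\ttfamily \ \ void run() \{ std::cout {\textless}{\textless} "hello from a pool thread" {\textless}{\textless} std::endl; \}}
{\ttfamily \};}
\bigskip
{\ttfamily madness::ThreadPool::begin(); \ \ \ \ \ \ \ \ \ \ \ // or begin(nthread)}
{\ttfamily madness::ThreadPool::add(new HelloTask()); // the pool now owns and will delete the task}
\bigskip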
{\ttfamily namespace madness \{}
{\ttfamily \ \ class TaskAttributes \{}
{\ttfamily \ \ public:}
{\ttfamily \ \ \ \ static const unsigned long GENERATOR;}
{\ttfamily \ \ \ \ static const unsigned long STEALABLE;}
{\ttfamily \ \ \ \ static const unsigned long HIGHPRIORITY;}
{\ttfamily \ \ \ \ TaskAttributes(unsigned long flags = 0);}
{\ttfamily \ \ \ \ bool is\_generator() const;}
{\ttfamily \ \ \ \ bool is\_stealable() const;}
{\ttfamily \ \ \ \ bool is\_high\_priority() const;}
{\ttfamily \ \ \ \ void set\_generator (bool generator\_hint);}
{\ttfamily \ \ \ \ void set\_stealable (bool stealable);}
{\ttfamily \ \ \ \ void set\_highpriority (bool hipri);}
{\ttfamily \ \ \ \ static TaskAttributes generator();}
{\ttfamily \ \ \ \ static TaskAttributes hipri();}
{\ttfamily \ \ \};}
\bigskip
{\ttfamily \ \ class PoolTaskInterface : public TaskAttributes \{}
{\ttfamily \ \ public:}
{\ttfamily \ \ \ \ PoolTaskInterface();}
{\ttfamily \ \ \ \ explicit PoolTaskInterface(const TaskAttributes\& attr);}
{\ttfamily \ \ \ \ virtual void run() = 0; // Implement to do work}
{\ttfamily \ \ \ \ virtual \~{}PoolTaskInterface() \{\}}
{\ttfamily \ \ \};}
\bigskip
{\ttfamily \ \ class ThreadPool \{}
{\ttfamily \ \ public:}
{\ttfamily \ \ \ \ static void begin(int nthread=0); // Instantiate the pool}
{\ttfamily \ \ \ \ static void add(PoolTaskInterface*); // Add task to queue}
{\ttfamily \ \ \ \ static void add(const std::vector{\textless}PoolTaskInterface*{\textgreater}\&);}
{\ttfamily \ \ \ \ \~{}ThreadPool() \{\};}
{\ttfamily \ \ \};}
{\ttfamily \}}
\subsubsection{Callbacks and tracking dependencies (worlddep.h)}
MADNESS's internal execution is at least in part software dataflow, with dependencies between tasks managed by counting them. Callbacks are used to count dependencies and to perform the action(s) taken when all dependencies are satisfied. The present rudimentary interface is thread safe and may be extended to provide greater functionality.
{\ttfamily namespace madness \{}
{\ttfamily \ \ class CallbackInterface \{}
{\ttfamily \ \ public:}
{\ttfamily \ \ \ \ virtual void notify() = 0;}
{\ttfamily \ \ \ \ virtual \~{}CallbackInterface()\{\};}
{\ttfamily \ \ \};}
\bigskip
{\ttfamily \ \ class DependencyInterface : public CallbackInterface \{}
{\ttfamily \ \ public:}
{\ttfamily \ \ \ \ DependencyInterface(int ndep = 0);}
{\ttfamily \ \ \ \ int ndep() const; // Returns no. of unsatisfied dependencies}
{\ttfamily \ \ \ \ bool probe() const; // Returns true if ndep() == 0}
{\ttfamily \ \ \ \ void notify(); // Callback for dependency being satisfied}
{\ttfamily \ \ \ \ void inc(); // Increment the number of dependencies}
\bigskip
{\ttfamily \ \ \ \ // Registers a callback for when ndep() == 0.}
{\ttfamily \ \ \ \ // If ndep() == 0, the callback is immediately invoked.}
{\ttfamily \ \ \ \ // ADDING A CALL BACK IS NOT PRESENTLY THREAD SAFE.}
{\ttfamily \ \ \ \ void register\_callback(CallbackInterface* callback);}
\bigskip
{\ttfamily \ \ \ \ virtual \~{}DependencyInterface();}
{\ttfamily \ \ \};}
{\ttfamily \}}
\subsubsection{Ranges -- specifying parallel iteration}
Ranges automate the expression and decomposition of an index space for parallel iteration. Our present capability is extremely limited and crude when compared with the TBB, but while our interface adopts some of the TBB conventions, we use futures both internally and externally to facilitate the generation of more parallelism in the user code.
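A small sketch of constructing and splitting a range (the \texttt{Range} prototype follows below; the \texttt{Split} tag type and its TBB-style splitting semantics are assumptions here):
{\ttfamily typedef std::vector{\textless}double{\textgreater}::iterator vecit;}
{\ttfamily std::vector{\textless}double{\textgreater} v(1000);}
{\ttfamily madness::Range{\textless}vecit{\textgreater} r(v.begin(), v.end(), 64); // subdivide down to 64 elements}
{\ttfamily madness::Range{\textless}vecit{\textgreater} left(r, madness::Split()); // TBB-style split: r and left}
{\ttfamily \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ // each keep about half of the range}
\bigskip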
{\ttfamily namespace madness \{}
{\ttfamily \ \ template {\textless}typename iteratorT{\textgreater} class Range \{}
{\ttfamily \ \ public:}
{\ttfamily \ \ \ \ typedef iteratorT iterator;}
{\ttfamily \ \ \ \ Range(const iterator\& start, const iterator\& finish, \newline \ \ \ \ \ \ \ \ int chunksize=-1);}
\bigskip
{\ttfamily \ \ \ \ Range(const Range\& r);}
{\ttfamily \ \ \ \ Range(Range\& r, const Split\& split);}
{\ttfamily \ \ \ \ size\_t size() const;}
{\ttfamily \ \ \ \ bool empty() const;}
{\ttfamily \ \ \ \ const iterator\& begin() const;}
{\ttfamily \ \ \ \ const iterator\& end() const;}
{\ttfamily \ \ \};}
{\ttfamily \}}
\subsubsection[Futures {}-- managing latency and dependencies (future.h and worldref.h)]{Futures -- managing latency and dependencies (future.h and worldref.h)}
The presently world-level
\liststyleLiii
\begin{itemize}
\item {\ttfamily Future}
\item {\ttfamily RemoteReference}
\end{itemize}
classes should be stripped of references to world and pushed down here so that we have a complete parallel programming environment independent of the World machinery.
\subsubsection{Concurrent hash map (worldhashmap.h)}
A thread-safe hash map that is a drop-in replacement for the GNU \texttt{hash\_map} and the new C++ \texttt{unordered\_map}, but with these additional features:
\liststyleLiv
\begin{itemize}
\item Concurrent deletions, insertions or modifications of entries do not invalidate iterators or pointers to entries (i.e., they will not make them point to junk unless someone actually deletes the entry you are pointing at).
\item Intel TBB-style accessor interfaces to provide fully thread-safe access to entries with either read or write mutexes.
\end{itemize}
Issues:
\liststyleLv
\begin{itemize}
\item Should be renamed concurhashmap.h
\item Include class prototype here
\end{itemize}
\bigskip
\subsection{World level runtime}
The \texttt{World} class encapsulates the entire global runtime environment for either all or a sub-group of processes. It wraps a single MPI communicator. The main capabilities of the global runtime are
\liststyleLvi
\begin{itemize}
\item One-sided active messaging including remote task creation
\item Local and global fencing of AM and tasks (termination detection)
\item Global names (id) for objects
\item Globally addressable data structures including hash tables and arrays (still coming)
\item Parallel serialization of data
\end{itemize}
\bigskip
\subsubsection{World (world.h)}
????
\subsubsection{World MPI interface (worldmpi.h)}
This is really a convenience to avoid having to first get the safe MPI communicator and then use it, and it is also a hook for future instrumentation.
\subsubsection[World active message interface (worldam.h)]{World active message interface (worldam.h)}
This service sits on top of the global RMI described above (note that correct translation of ranks between communicators is not yet being performed ... i.e., it only works for COMM\_WORLD). It adds to the RMI interface automatic send buffer management, eliminates user concern with the RMI header, throttles the number of outstanding send operations, and propagates state (the world, source process, message size) by encapsulation in the class \texttt{AmArg}, which also facilitates packing and unpacking of the message buffer using serialization.
A World active message handler has this type
{\ttfamily namespace madness \{}
{\ttfamily \ \ typedef void (*am\_handlerT)(const AmArg\&);}
{\ttfamily \}}
\paragraph{AmArg (worldam.h)}
This class and its helper functions greatly simplify packing and unpacking data from message buffers.
\bigskip
{\ttfamily namespace madness \{}
{\ttfamily \ \ class AmArg \{}
{\ttfamily \ \ public:}
{\ttfamily \ \ \ \ AmArg();}
{\ttfamily \ \ \ \ unsigned char* buf() const; // Returns pointer to payload}
{\ttfamily \ \ \ \ size\_t size() const; // Returns size of payload}
{\ttfamily \ \ \ \ template {\textless}typename T{\textgreater}}
{\ttfamily \ \ \ \ \ BufferInputArchive operator\&(T\& t) const; // Deserialize}
{\ttfamily \ \ \ \ template {\textless}typename T{\textgreater}}
{\ttfamily \ \ \ \ \ BufferOutputArchive operator\&(const T\& t) const; // Serialize}
{\ttfamily \ \ \ \ ProcessID get\_src() const; // Source of incoming msg.}
{\ttfamily \ \ \ \ World* get\_world() const; // World of incoming msg.}
{\ttfamily \ \ \};}
\bigskip
{\ttfamily \ \ // Allocates a new AmArg with nbytes of user data}
{\ttfamily \ \ inline AmArg* alloc\_am\_arg(std::size\_t nbyte);}
\bigskip
{\ttfamily \ \ // Duplicates an AmArg}
{\ttfamily \ \ inline AmArg* copy\_am\_arg(const AmArg\& arg);}
\bigskip
{\ttfamily \ \ // Frees an AmArg allocated with alloc\_am\_arg or copy\_am\_arg}
{\ttfamily \ \ inline void free\_am\_arg(AmArg* arg);}
\bigskip
{\ttfamily \ \ // Serialize one argument into a new AmArg}
{\ttfamily \ \ template {\textless}typename A{\textgreater}}
{\ttfamily \ \ inline AmArg* new\_am\_arg(const A\& a);}
\bigskip
{\ttfamily \ \ // Serialize two arguments into a new AmArg}
{\ttfamily \ \ template {\textless}typename A, typename B{\textgreater}}
{\ttfamily \ \ inline AmArg* new\_am\_arg(const A\& a, const B\& b);}
\bigskip
{\ttfamily \ \ // Ditto for up to 10 arguments}
{\ttfamily \}}
\bigskip
\paragraph{WorldAmInterface (worldam.h)}
This class is embedded inside World and you probably will not instantiate it separately.
{\ttfamily namespace madness \{}
{\ttfamily \ \ class WorldAmInterface \{}
{\ttfamily \ \ \ \ WorldAmInterface(World\& world);}
\bigskip
{\ttfamily \ \ \ \ // Ensures all local AM operations have completed}
{\ttfamily \ \ \ \ inline void fence();}
\bigskip
{\ttfamily \ \ \ \ // Sends non-blocking message -- user responsible for managing\newline \ \ \ \ // completion and freeing buffer}
{\ttfamily \ \ \ \ RMI::Request isend\_(ProcessID dest, am\_handlerT op, \newline \ \ \ \ \ \ \ \ const AmArg* arg, int attr=RMI::ATTR\_ORDERED);}
\bigskip
{\ttfamily \ \ \ \ // Sends a managed non-blocking active message.\newline \ \ \ \ // The buffer will be freed when the send is complete}
{\ttfamily \ \ \ \ void send(ProcessID dest, am\_handlerT op, }
{\ttfamily \ \ \ \ \ \ \ \ const AmArg* arg, int attr=RMI::ATTR\_ORDERED);}
{\ttfamily \ \ \};}
{\ttfamily \}}
\bigskip
The \texttt{fence()} method ensures local completion. After the call, all managed outgoing messages will have been sent (though not necessarily processed by the remote end) and all received incoming active messages will have been executed.
\subsubsection{World global operations (worldgop.h)}
Except for fence, these were originally needed to avoid deadlock in the single-threaded code. They are still useful if MPI is single-threaded (the most common case) or if we expose asynchronous global operations.
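As a hedged illustration (the exact set of overloads lives in \texttt{worldgop.h}), given a \texttt{World\& world} a typical reduction followed by a global fence looks like:
{\ttfamily double x = my\_partial\_sum(); // hypothetical local contribution}
{\ttfamily world.gop.sum(x); \ \ \ \ \ \ \ \ \ \ // global sum; x now holds the total on every process}
{\ttfamily world.gop.fence(); \ \ \ \ \ \ \ \ \ // wait for all tasks and messages to complete everywhere}
\bigskip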
\liststyleLvii \begin{itemize} \item Add discussion of what global fence means and its cost \end{itemize} \bigskip \subsubsection{World task interface (world\_task\_queue.h)} Provides submission of local and remote tasks using serialization to move arguments, and futures with callbacks to manage dependencies. Counts number of pending tasks to enable fencing. \subsubsection{World objects (world\_object.h)} Provides global names for single replicated objects so that can send messages to instance of the logically same object on a different processor without having to keep track of pointers etc. \subsubsection[World container (worlddc.h)]{World container (worlddc.h)} A globally addressable hash table with one-sided access. Can send messages to either the container (using inherited WorldObj methods) or to items in the container. The latter is a centrally important ability because \liststyleLviii \begin{itemize} \item it enables non-process-centric computing -- it eliminates reference to process number and hence \ virtualizes how and where computation occurs \item it automatically gives you a write lock on the item via a non-const accessor \end{itemize} Issues \liststyleLix \begin{itemize} \item Gotta be careful when acquiring other locks \item Need to modify send protocol to break call chains \bigskip \end{itemize} \end{document} madness-0.10/doc/Latex/implementation.tex000066400000000000000000001675131253377057000205030ustar00rootroot00000000000000\documentclass[letterpaper]{book} %\documentclass[letterpaper]{article} \usepackage{hyperref} \usepackage{amsmath} \usepackage{graphics} %%\hypersetup{pdftex, colorlinks=true, linkcolor=blue, citecolor=blue, filecolor=blue, urlcolor=blue, pdftitle=, pdfauthor=Robert Harrison, pdfsubject=, pdfkeywords=} %%% Outline numbering %%\setcounter{secnumdepth}{3} %%\renewcommand\thesection{\arabic{section}} %%\renewcommand\thesubsection{\arabic{section}.\arabic{subsection}} %%\renewcommand\thesubsubsection{\arabic{section}.\arabic{subsection}.\arabic{subsubsection}} %%% Page layout (geometry) %%\setlength\voffset{-1in} %%\setlength\hoffset{-1in} %%\setlength\topmargin{0.7874in} %%\setlength\oddsidemargin{0.7874in} %%\setlength\textheight{8.825199in} %%\setlength\textwidth{6.9251995in} %%\setlength\footskip{0.6in} %%\setlength\headheight{0cm} %%\setlength\headsep{0cm} %%% Footnote rule %%\setlength{\skip\footins}{0.0469in} %%\renewcommand\footnoterule{\vspace*{-0.0071in}\setlength\leftskip{0pt}\setlength\rightskip{0pt plus 1fil}\noindent\textcolor{black}{\rule{0.25\columnwidth}{0.0071in}}\vspace*{0.0398in}} % Paragraph formatting \setlength{\parindent}{0pt} \setlength{\parskip}{2ex plus 0.5ex minus 0.2ex} \begin{document} % Title Page \title{MADNESS Implementation Notes} \date{Last Modification: 12/14/2009} \maketitle % Copyright Page \pagestyle{empty} \null\vfill \noindent This file is part of MADNESS. Copyright (C) 2007, 2010 Oak Ridge National Laboratory This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or(at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. 
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
For more information please contact:
\begin{quote}
Robert J. Harrison \\
Oak Ridge National Laboratory \\
One Bethel Valley Road \\
P.O. Box 2008, MS-6367 \\
Oak Ridge, TN 37831 \\
\\
email: harrisonrj@ornl.gov \\
tel: 865-241-3937 \\
fax: 865-572-0680
\end{quote}
\newpage
% Table of Contents Pages
\clearpage
\setcounter{page}{1}
\pagenumbering{roman}
\setcounter{tocdepth}{10}
\renewcommand\contentsname{Table of Contents}
\tableofcontents
\clearpage
\setcounter{page}{1}
\pagenumbering{arabic}
\chapter{Implementation Notes}
This document provides reference information concerning the mathematics, numerics, algorithms, and design of the multiresolution capabilities of MADNESS. The information herein will be useful to both users of MADNESS and implementers of new capabilities within MADNESS.
\section{ABGV}
And references therein: that from which (nearly) all else follows.
B. Alpert, G. Beylkin, D. Gines, L. Vozovoi, \href{http://math.nist.gov/~BAlpert/mwpde.pdf}{Adaptive Solution of Partial Differential Equations in Multiwavelet Bases,} \textit{Journal of Computational Physics }\textbf{182}, 149-190 (2002).
\section{Legendre scaling functions and multiwavelets}
\subsection{Scaling functions}
The mother Legendre scaling functions $\phi _{i}$, $i=0,\ldots ,k-1$, in 1D are defined in terms of the Legendre polynomials $P_{i}$ as
\begin{equation}
\phi _{i}(x)=\left\{\begin{matrix}\sqrt{2i+1}P_{i}(2x-1)&0\le x\le 1\\0&\mathrm{\mathit{otherwise}}\end{matrix}\right.
\end{equation}
These are orthonormal on $[0,1]$. The scaling functions scaled to level $n=0,1,\ldots $ and translated to box $l=0,\ldots ,2^{n}-1$ span the space $V_{n}^{k}$ and are defined by
\begin{equation}\label{seq:refText1}
\phi _{il}^{n}(x)=2^{n/2}\phi _{i}(2^{n}x-l)
\end{equation}
These are orthonormal on $[2^{-n}l,2^{-n}(l+1)]$. The scaling functions by construction satisfy the following properties:
\begin{itemize}
\item In the limit of either large \textit{k }or large \textit{n }the closure of $V_{n}^{k}$ is a complete basis for $L_{2}[0,1]$.
\item Containment, forming a ladder of spaces $V_{0}^{k}\subset V_{1}^{k}\subset V_{2}^{k}\subset \cdots $.
\item Translation and dilation, c.f., (2).
\item Orthonormality within a scale $\int _{-\infty }^{\infty }{\phi _{il}^{n}(x)\phi _{jm}^{n}(x)\mathit{dx}}=\delta _{ij}\delta _{lm}$.
\end{itemize}
The two-scale relationship describes how to expand exactly a polynomial at level \textit{n }in terms of the polynomials at level \textit{n+1.}
\begin{equation}\label{seq:refText2}
\begin{matrix}\hfill \phi _{i}(x)&\text{=}&\sqrt{2}\sum _{j=0}^{k-1}\left(h_{ij}^{(0)}\phi _{j}(2x)+h_{ij}^{(1)}\phi _{j}(2x-1)\right)\\\phi _{il}^{n}(x)&\text{=}&\sum _{j=0}^{k-1}\left(h_{ij}^{(0)}\phi _{j2l}^{n+1}(x)+h_{ij}^{(1)}\phi _{j2l+1}^{n+1}(x)\right)\hfill\null \end{matrix}
\end{equation}
The coefficients $H^{(0)}$ and $H^{(1)}$ are straightforwardly computed by left projection of the first equation in (3) with the fine-scale polynomials.
\subsection[Telescoping series]{Telescoping series}
The main point of multiresolution analysis is to separate the behavior of functions and operators at different length scales. Central to this is the telescoping series, which \textit{exactly }represents the basis at level \textit{n }(the finest scale) in terms of the basis at level zero (the coarsest scale) plus corrections at successively finer scales.
\begin{equation}\label{seq:refText3}
V_{n}^{k}=V_{0}^{k}+\left(V_{1}^{k}-V_{0}^{k}\right)+\cdots +\left(V_{n}^{k}-V_{n-1}^{k}\right)
\end{equation}
If a function is sufficiently smooth in some region of space to be represented at the desired precision at some level, then the differences at finer scales will be negligibly small.
\subsection{Multi-wavelets}
The space of wavelets at level \textit{n}, $W_{n}^{k}$, is defined as the orthogonal complement of the scaling functions (polynomials) at level \textit{n} within those at level \textit{n+1}, i.e., $V_{n+1}^{k}=V_{n}^{k}\oplus W_{n}^{k}$. Thus, by definition the functions in $W_{n}^{k}$ are orthogonal to the functions in $V_{n}^{k}$. \ The wavelets at level \textit{n }are constructed by expanding them in the polynomials at level \textit{n+1}
\begin{equation}\label{seq:refText4}
\begin{matrix}\psi _{i}(x)&\text{=}&\sqrt{2}\sum _{j=0}^{k-1}\left(g_{ij}^{(0)}\phi _{j}(2x)+g_{ij}^{(1)}\phi _{j}(2x-1)\right)\\\psi _{il}^{n}(x)&\text{=}&\sum _{j=0}^{k-1}\left(g_{ij}^{(0)}\phi _{j2l}^{n+1}(x)+g_{ij}^{(1)}\phi _{j2l+1}^{n+1}(x)\right)\hfill\null \end{matrix}
\end{equation}
The coefficients $G^{(0)}$ and $G^{(1)}$ are formed by orthogonalizing the wavelets to the polynomials at level \textit{n}. \ This determines the wavelets to within a unitary transformation, and we follow the additional choices in Alpert's papers/thesis. The wavelets have these properties
\begin{itemize}
\item Decomposition of $V_{n}^{k}$
\end{itemize}
\begin{equation}\label{seq:refText5}
V_{n}^{k}=V_{0}^{k}\oplus W_{0}^{k}\oplus W_{1}^{k}\oplus \cdots \oplus W_{n-1}^{k}
\end{equation}
\begin{itemize}
\item Translation and dilation $\psi _{il}^{n}(x)=2^{n/2}\psi _{i}(2^{n}x-l)$
\item Orthonormality within and between scales
\end{itemize}
\begin{equation}
\begin{matrix}\hfill \int _{-\infty }^{\infty }\psi _{il}^{n}(x)\psi _{i'l'}^{n'}(x)\mathit{dx}&\text{=}&\delta _{nn'}\delta _{ii'}\delta _{ll'}\\\hfill \int _{-\infty }^{\infty }\psi _{il}^{n}(x)\phi _{i'l'}^{n'}(x)\mathit{dx}&\text{=}&0\quad (n'\le n)\hfill\null \end{matrix}
\end{equation}
\subsection[\ Function approximation in the scaling function basis]{\ Function approximation in the scaling function basis}
A function \textit{f(x)} may be approximated by expansion in the orthonormal scaling function basis at level \textit{n }with the coefficients obtained by simple projection
\begin{equation}\label{seq:refText7}
\begin{matrix}\hfill f^{n}(x)&\text{=}&\sum _{l=0}^{2^{n}-1}\sum _{i=0}^{k-1}s_{il}^{n}\phi _{il}^{n}(x)\hfill\null \\\hfill s_{il}^{n}&\text{=}&\int _{-\infty }^{\infty }f(x)\phi _{il}^{n}(x)\mathit{dx}\hfill\null \end{matrix}
\end{equation}
\bigskip
The two-scale relationships embodied in (3) and (5) may be combined to write the following matrix equation that relates the scaling function basis at one scale with the scaling function and wavelet basis at the next coarsest scale.
\begin{equation} \begin{matrix}\hfill \left(\begin{matrix}\hfill \phi (x)\\\hfill \psi (x)\end{matrix}\right)&\text{=}&\sqrt{2}\left(\begin{matrix}H^{(0)}\hfill\null &H^{(1)}\hfill\null \\G^{(0)}\hfill\null &G^{(1)}\hfill\null \end{matrix}\right)\left(\begin{matrix}\phi (2x)\hfill\null \\\phi (2x-1)\hfill\null \end{matrix}\right)\hfill\null \\\hfill \left(\begin{matrix}\hfill \phi _{l}^{n}(x)\\\hfill \psi _{l}^{n}(x)\end{matrix}\right)&\text{=}&\left(\begin{matrix}H^{(0)}\hfill\null &H^{(1)}\hfill\null \\G^{(0)}\hfill\null &G^{(1)}\hfill\null \end{matrix}\right)\left(\begin{matrix}\phi _{2l}^{n+1}(x)\hfill\null \\\phi _{2l+1}^{n+1}(x)\hfill\null \end{matrix}\right)\hfill\null \end{matrix} \end{equation} Since the transformation is unitary, we also have \begin{equation}\label{seq:refText9} \begin{matrix}\hfill \left(\begin{matrix}\hfill \phi (2x)\\\hfill \phi (2x-1)\end{matrix}\right)&\text{=}&\sqrt{2}\left(\begin{matrix}H^{(0)}\hfill\null &H^{(1)}\hfill\null \\G^{(0)}\hfill\null &G^{(1)}\hfill\null \end{matrix}\right)^{T}\left(\begin{matrix}\phi (x)\hfill\null \\\psi (x)\hfill\null \end{matrix}\right)\hfill\null \\\hfill \left(\begin{matrix}\hfill \phi _{2l}^{n+1}(x)\\\hfill \psi _{2l+1}^{n+1}(x)\end{matrix}\right)&\text{=}&\left(\begin{matrix}H^{(0)}\hfill\null &H^{(1)}\hfill\null \\G^{(0)}\hfill\null &G^{(1)}\hfill\null \end{matrix}\right)^{T}\left(\begin{matrix}\phi _{l}^{n}(x)\hfill\null \\\psi _{l}^{n}(x)\hfill\null \end{matrix}\right)\hfill\null \end{matrix} \end{equation} In table \ref{seq:refTable0} are the filter coefficients for \textit{k=}4, the only point being that these are plain-old-numbers and not anything mysterious. \begin{table}[htdp] \caption{Multi-wavelet filter coefficients for Legendre polynomials, $k=4$.} \begin{center} \begin{tabular}{c c c c|c c c c} \multicolumn{4}{c|}{$H^{(0)}$} & \multicolumn{4}{|c}{$H^{(1)}$} \\ 7.0711e-01 & 0.0000e+00 & 0.0000e+00 & 0.0000e+00 & 7.0711e-01 & 0.0000e+00 & 0.0000e+00 & 0.0000e+00 \\ -6.1237e-01 & 3.5355e-01 & 0.0000e+00 & 0.0000e+00 & 6.1237e-01 & 3.5355e-01 & 0.0000e+00 & 0.0000e+00 \\ 0.0000e+00 &-6.8465e-01 & 1.7678e-01 & 0.0000e+00 & 0.0000e+00 & 6.8465e-01 & 1.7678e-01 & 0.0000e+00 \\ 2.3385e-01 & 4.0505e-01 &-5.2291e-01 & 8.8388e-02 &-2.3385e-01 & 4.0505e-01 & 5.2291e-01 & 8.8388e-02 \\ \hline \multicolumn{4}{c|}{$G^{(0)}$} & \multicolumn{4}{|c}{$G^{(1)}$} \\ 0.0000e+00 & 1.5339e-01 & 5.9409e-01 &-3.5147e-01 & 0.0000e+00 &-1.5339e-01 & 5.9409e-01 & 3.5147e-01 \\ 1.5430e-01 & 2.6726e-01 & 1.7252e-01 &-6.1237e-01 &-1.5430e-01 & 2.6726e-01 &-1.7252e-01 &-6.1237e-01 \\ 0.0000e+00 & 8.7867e-02 & 3.4031e-01 & 6.1357e-01 & 0.0000e+00 &-8.7867e-02 & 3.4031e-01 &-6.1357e-01 \\ 2.1565e-01 & 3.7351e-01 & 4.4362e-01 & 3.4233e-01 &-2.1565e-01 & 3.7351e-01 &-4.4362e-01 & 3.4233e-01 \\ \end{tabular} \end{center} \label{default} \end{table}% \subsection{Wavelet transform} The transformation in (10) expands polynomials on level \textit{n }in terms of polynomials and wavelets on level \textit{n-1}. \ It may be inserted into the function approximation (8) that is in terms of polynomials at level \textit{n. \ }This yields an exactly equivalent approximation in terms of polynomials and wavelets on level \textit{n-1}. \ (I have omitted the multiwavelet index for clarity.). 
\begin{equation} \begin{matrix}f^{n}(x)&\text{=}&\sum _{l=0}^{2^{n}-1}{s_{l}^{n}\phi _{l}^{n}(x)}\hfill\null \\\text{}&\text{=}&\sum _{l=0}^{2^{n-1}-1}{\left(\begin{matrix}s_{2l}^{n}\hfill\null \\s_{2l+1}^{n}\hfill\null \end{matrix}\right)^{T}\left(\begin{matrix}\phi _{2l}^{n}(x)\hfill\null \\\phi _{2l+1}^{n}(x)\hfill\null \end{matrix}\right)}\hfill\null \\\text{}&\text{=}&\sum _{l=0}^{2^{n-1}-1}{\left(\left(\begin{matrix}H^{(0)}\hfill\null &H^{(1)}\hfill\null \\G^{(0)}\hfill\null &G^{(1)}\hfill\null \end{matrix}\right)\left(\begin{matrix}s_{2l}^{n}\hfill\null \\s_{2l+1}^{n}\hfill\null \end{matrix}\right)\right)^{T}\left(\begin{matrix}\phi _{l}^{n-1}(x)\hfill\null \\\psi _{l}^{n-1}(x)\hfill\null \end{matrix}\right)}\hfill\null \\\text{}&\text{=}&\sum _{l=0}^{2^{n-1}-1}{\left(\begin{matrix}s_{l}^{n-1}(x)\hfill\null \\d_{l}^{n-1}\hfill\null \end{matrix}\right)^{T}\left(\begin{matrix}\phi _{l}^{n-1}(x)\hfill\null \\\psi _{l}^{n-1}(x)\hfill\null \end{matrix}\right)}\hfill\null \end{matrix} \end{equation} The sum and difference (scaling function and wavelet) coefficients at level \textit{n-1} are therefore given by this transformation \begin{equation} \left(\begin{matrix}s_{l}^{n-1}(x)\\d_{l}^{n-1}(x)\end{matrix}\right)=\left(\begin{matrix}H^{(0)}&H^{(1)}\\G^{(0)}&G^{(1)}\end{matrix}\right)\left(\begin{matrix}s_{2l}^{n}\\s_{2l+1}^{n}\end{matrix}\right) \end{equation} The transformation may be recursively applied to obtain the following representation of a function in the wavelet basis c.f. (6) with direct analogy to the telescoping series (4). \begin{equation}\label{seq:refText12} f^{n}(x)=\sum _{i=0}^{k-1}s_{i0}^{0}\phi _{i0}^{0}(x)+\sum _{n=0,\ldots }\sum _{l=0}^{2^{n}-1}\sum _{i}^{k-1}d_{il}^{n}\psi _{il}^{n}(x) \end{equation} The wavelet transformation (13) is unitary and is therefore a very stable numerical operation. \subsection{Properties of the scaling functions} \subsubsection{Symmetry} \begin{equation} \phi _{i}(x)=(-1)^{i}\phi _{i}(1-x) \end{equation} \subsubsection{Derivatives} \begin{equation} \frac{1}{2\sqrt{2i+1}}\frac{d}{\mathit{dx}}\phi _{i}(x)=\sqrt{2i-1}\phi _{i-1}(x)+\frac{1}{2\sqrt{2i-5}}\frac{d}{\mathit{dx}}\phi _{i-2}(x) \end{equation} \subsubsection[Values at end points]{Values at end points} \bigskip \begin{equation} \begin{matrix}\hfill \phi _{i}(0)&\text{=}&(-1)^{i}\sqrt{2i+1}\hfill\null \\\hfill \phi _{i}(1)&\text{=}&\sqrt{2i+1}\hfill\null \\\hfill \frac{d\phi _{i}}{\mathit{dx}}(0)&\text{=}&(-1)^{i}i(i+1)\sqrt{2i+1}\hfill\null \\\hfill \frac{d\phi _{i}}{\mathit{dx}}(1)&\text{=}&i(i+1)\sqrt{2i+1}\hfill\null \end{matrix} \end{equation} \section{User and simulation coordinates} Internal to the MADNESS implementation, all computations occur in the unit volume in \textit{d }dimensions $[0,1]^{d}$. \ The unit cube is referred to as simulation coordinates. \ However, the user operates in coordinates that in each dimension $q=0,\ldots ,d-1$ may have different upper and lower bounds $[\mathrm{{\mathit{lo}}}_{q},\mathrm{{\mathit{hi}}}_{q}]$ that represents a diagonal linear transformation between the user and simulation coordinates. 
\bigskip \begin{equation} \begin{matrix}\hfill x_{q}^{\mathrm{{\mathit{user}}}}(x_{q}^{\mathrm{{sim}}})&\text{=}&\left(\mathrm{{\mathit{hi}}}_{q}-\mathrm{{\mathit{lo}}}_{q}\right)x_{q}^{\mathrm{{sim}}}+\mathrm{{\mathit{lo}}}_{q}\hfill\null \\\hfill x_{q}^{\mathrm{{sim}}}(x_{q}^{\mathrm{{\mathit{user}}}})&\text{=}&\frac{x_{q}^{\mathrm{{\mathit{user}}}}-\mathrm{{\mathit{lo}}}_{q}}{\mathrm{{\mathit{hi}}}_{q}-\mathrm{{\mathit{lo}}}_{q}}\hfill\null \end{matrix} \end{equation} \bigskip This is a convenience motivated by the number of errors due to users neglecting the factors arising from mapping the user space volume onto the unit cube. \ More general linear and non-linear transformations must presently be handled by the user. To clarify further the expected behavior and how/when this mapping of coordinates is performed: All coordinates, values, norms, thresholds, integrals, operators, etc., provided by/to the user are in user coordinates. \ The advantage of this is that the user does not have to worry about mapping the physical simulation space. \ E.g., if a user computes the norm of a function what is returned is precisely the value \begin{equation} \left|f\right|_{2}^{2}=\int _{\mathrm{{\mathit{lo}}}}^{\mathrm{{\mathit{hi}}}}\left|f(x^{\mathrm{{\mathit{user}}}})\right|^{2}\mathit{dx}^{\mathrm{{\mathit{user}}}} \end{equation} Similarly, when a user truncates a function with a norm-wise error $\epsilon $ this should be the error in the above norm, and coefficients should be discarded so as to maintain this accuracy independent of the user volume. All sum and difference coefficients, quadrature points and weights, operators, etc. are internally maintained in simulation coordinates. \ The advantage of this is that the operators can all be consistently formulated just once and we only have to worry about conversions at the \ user/application interface. \subsection[Normalization of scaling functions in the user coordinates]{Normalization of scaling functions in the user coordinates} The scaling functions as written in equation (2) are normalized in simulation coordinates. \ \ Normalizing the functions in user coordinates requires an additional factor of $V^{-1/2}$ where \textit{V }is the user volume (which is just hi-lo in 1D). \begin{equation} \begin{matrix}\int _{\mathrm{{\mathit{lo}}}}^{\mathrm{{\mathit{hi}}}}\left(V^{-1/2}\phi _{il}^{n}(x^{\mathrm{{sim}}}(x^{\mathrm{{\mathit{user}}}}))\right)^{2}dx^{\mathrm{{\mathit{user}}}}&\text{=}&V^{-1}\int _{\mathrm{{\mathit{lo}}}}^{\mathrm{{\mathit{hi}}}}\phi _{il}^{n}(x^{\mathrm{{sim}}}(x^{\mathrm{{\mathit{user}}}}))^{2}dx^{\mathrm{{\mathit{user}}}}=1\end{matrix} \end{equation} \section{Function approximation} The function is approximated as follows \begin{equation}\label{seq:refText19} f^{n}(x^{\mathrm{{\mathit{user}}}})=V^{-1/2}\sum _{il}s_{il}^{n}\phi _{il}^{n}(x^{\mathrm{{sim}}}(x^{\mathrm{{\mathit{user}}}})) \end{equation} Note that we have expanded the function in terms of basis functions normalized in the user coordinates. \ This has several benefits, and in particular eliminates most logic about coordinate conversion factors in truncation thresholds, norms, etc. \subsection{Evaluation} Evaluation proceeds by mapping the user coordinates into simulation coordinates, recurring down the tree to find the appropriate box of coefficients, evaluating the polynomials, contracting with the coefficients, and scaling by $V^{-1/2}$. 
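Schematically, and only as a sketch in 1D (the names \texttt{lo}, \texttt{hi}, \texttt{k}, \texttt{has\_coeffs}, \texttt{s} and \texttt{phi} are hypothetical; the real code handles \textit{d} dimensions and the compressed form as well), evaluation looks like:
{\ttfamily double eval(double xuser) const \{}
{\ttfamily \ \ double x = (xuser - lo)/(hi - lo); \ \ \ \ \ // user -{\textgreater} simulation coordinates}
{\ttfamily \ \ long n = 0, l = 0; \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ // start at the root box}
{\ttfamily \ \ while (!has\_coeffs(n, l)) \{ \ \ \ \ \ \ \ \ \ \ // recur down to the leaf containing x}
{\ttfamily \ \ \ \ n = n + 1;}
{\ttfamily \ \ \ \ l = long(x*std::ldexp(1.0, n)); \ \ \ \ // box index at level n}
{\ttfamily \ \ \}}
{\ttfamily \ \ double xi = x*std::ldexp(1.0, n) - l; \ // local coordinate in [0,1)}
{\ttfamily \ \ double sum = 0.0;}
{\ttfamily \ \ for (int i=0; i{\textless}k; i++) sum += s(n,l,i)*phi(i,xi); // contract with coefficients}
{\ttfamily \ \ return sum*std::sqrt(std::ldexp(1.0, n)/(hi - lo)); // scale by 2**(n/2) and V**(-1/2)}
{\ttfamily \}}
\bigskip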
\subsection{Projection into the scaling function basis} The user provides a function/functor that given a point in user coordinates returns the value. \ Gauss-Legendre quadrature of the same or higher order as the polynomial is used to evaluate the integral \begin{equation} \begin{matrix}s_{il}^{n}&\text{=}&V^{-1/2}\int _{\mathrm{{\mathit{lo}}}}^{\mathrm{{\mathit{hi}}}}{f(x^{\mathrm{{\mathit{user}}}})\phi _{il}^{n}(x^{\mathrm{{sim}}}(x^{\mathrm{{\mathit{user}}}}))\mathit{dx}^{\mathrm{{\mathit{user}}}}}\hfill\null \\&\text{=}&V^{1/2}\int _{0}^{1}{f(x^{\mathrm{{\mathit{user}}}}(x^{\mathrm{{sim}}}))\phi _{il}^{n}(x^{\mathrm{{sim}}})\mathit{dx}^{\mathrm{{\mathit{sim}}}}}\hfill\null \\&\text{=}&2^{dn/2}V^{1/2}\int _{l2^{-n}}^{(l+1)2^{-n}}{f(x^{\mathrm{{\mathit{user}}}}(x^{\mathrm{{sim}}}))\phi _{i}(2^{n}x^{\mathrm{{sim}}}-l)\mathit{dx}^{\mathrm{{\mathit{sim}}}}}\hfill\null \\&\text{=}&2^{-dn/2}V^{1/2}\int _{0}^{1}{f(x^{\mathrm{{\mathit{user}}}}(2^{-n}(x+l)))\phi _{i}(x)\mathit{dx}}\hfill\null \\&\simeq &2^{-dn/2}V^{1/2}\sum _{\mu =0}^{n_{\mathrm{{\mathit{pt}}}}}{\omega _{\mu }f(x^{\mathrm{{\mathit{user}}}}(2^{-n}(x_{\mu }+l)))\phi _{i}(x_{\mu })}\hfill\null \end{matrix} \end{equation} $x_{\mu }$ and $\omega _{\mu }$ are the points and weights for the Gauss-Legendre rule of order $n_{\mathrm{{\mathit{pt}}}}$ over \textit{[0, 1]}. The above can be regarded as an invertible linear transformation between the scaling function coefficients and the approximate function values at the quadrature points ( $\mu =0,\ldots ,n_{\mathrm{{\mathit{pt}}}}$). \begin{equation}\label{seq:refText21} \begin{matrix}s_{il}^{n}&\text{=}&2^{-dn/2}V^{1/2}\sum _{\mu }f_{\mu }{\bar{\phi }}_{\mu i}\\f_{\mu }&\text{=}&2^{dn/2}V^{-1/2}\sum _{i}\phi _{\mu i}s_{il}^{n}\end{matrix} \end{equation} where \begin{equation} \begin{matrix}\hfill f_{\mu }&\text{=}&f(x^{\mathrm{{\mathit{user}}}}(2^{-n}(x_{\mu }+l)))\hfill\null \\\hfill \phi _{\mu i}&\text{=}&\phi _{i}(x_{\mu })\hfill\null \\\hfill {\bar{\phi }}_{\mu i}&\text{=}&\phi _{i}(x_{\mu })\omega _{\mu }\hfill\null \\\hfill \sum _{\mu }\phi _{\mu i}{\bar{\phi }}_{\mu i}&\text{=}&\delta _{ij}\hfill\null \end{matrix} \end{equation} The last line merely restates the orthonormality of the scaling function basis that in the discrete Gauss-Legendre quadrature is exact for the scaling function basis with the choice of the quadrature order $n_{\mathrm{{\mathit{pt}}}}\ge k$. \subsection{Truncation criteria} \label{ref:sectruncmodes}Discarding small difference coefficients while maintaining precision is central to speed and drives the adaptive refinement. \ Different truncation criteria are useful in different contexts. \subsubsection[Mode 0 {}- the default]{\rmfamily Mode 0 - the default} This truncation is appropriate for most calculations and in particular those that have functions with deep levels of refinement (such as around nuclei in all-electron electronic structure calculations). \ Difference coefficients of leaf nodes are neglected according to \begin{equation}\label{seq:refText23} \left\|d_{l}^{n}\right\|_{2}=\sqrt{\sum _{i}\left|d_{il}^{n}\right|^{2}}\le \epsilon \end{equation} \subsubsection{Mode 1} This mode is appropriate when seeking to maintain an accurate representation of both the function and its derivative. \ Difference coefficients of leaf nodes are neglected according to \begin{equation}\label{seq:refText24} \left\|d_{l}^{n}\right\|_{2}\le \epsilon \operatorname{min}(1,L2^{-n}) \end{equation} where L is the minimum width of any dimension of the user coordinate volume. 
The form for the threshold is motivated by re-expressing the expansion (20) in terms of the mother scaling function and then differentiating (crudely, neglecting continuity with neighboring cells).
\begin{equation}
\begin{matrix}\hfill f^{n}(x^{\mathrm{{\mathit{user}}}})&\text{=}&V^{-1/2}2^{n/2}\sum _{il}s_{il}^{n}\phi _{i}(2^{n}x^{\mathrm{{sim}}}(x^{\mathrm{{\mathit{user}}}})-l)\hfill\null \\\hfill \frac{d}{dx^{\mathrm{{\mathit{user}}}}}f^{n}(x^{\mathrm{{\mathit{user}}}})&\simeq &V^{-1/2}2^{3n/2}(\mathrm{{\mathit{hi}}}-\mathrm{{\mathit{lo}}})^{-1}\sum _{il}s_{il}^{n}\phi '_{i}(2^{n}x^{\mathrm{{sim}}}(x^{\mathrm{{\mathit{user}}}})-l)\hfill\null \end{matrix}
\end{equation}
Thus, we see that the scale-dependent part of the derivative is an extra factor of $2^{n}$ arising from differentiating the scaling function. \ We must include the factor hi-lo in order to make the threshold volume independent. \ Finally, we use the minimum to ensure that the threshold (25) is everywhere at least as tight as (24).
\subsubsection{Mode 2}
This is appropriate only for smooth functions with a nearly uniform level of refinement in the entire volume. \ Difference coefficients are neglected according to
\begin{equation}
\left\|d_{l}^{n}\right\|_{2}\le \epsilon 2^{-nd/2}
\end{equation}
This is the truncation scheme described in ABGV. \ If this truncation mode discards all difference coefficients at level \textit{n} it preserves a strong bound on the error between the representations at levels \textit{n} and \textit{n -- 1}.
\begin{equation}
\left\|f^{n}-f^{n-1}\right\|_{2}^{2}=\sum _{l=0}^{2^{n}-1}\left\|d_{l}^{n}\right\|_{2}^{2}\le \sum _{l=0}^{2^{n}-1}\epsilon ^{2}2^{-nd}=\epsilon ^{2}
\end{equation}
However, for non-smooth functions beyond two dimensions this conservative threshold can lead to excessive (even uncontrolled) refinement and is usually not recommended.
\subsection{Adaptive refinement}
After projection has been performed in boxes \textit{2l} and \textit{2l+1} at level \textit{n}, the scaling function coefficients may be filtered using the two-scale (wavelet) transformation described above to produce the wavelet coefficients in box \textit{l} at level \textit{n-1}. \ If the desired truncation criterion (section \ref{ref:sectruncmodes}) is not satisfied, the process is repeated in the child boxes \textit{4l}, \textit{4l+1}, \textit{4l+2}, \textit{4l+3} at level \textit{n+1}. \ Otherwise, the computed coefficients are inserted at level \textit{n}.
\section[Unary operations]{\rmfamily Unary operations}
\subsection[Function norm]{\rmfamily Function norm}
Due to the chosen normalization of the scaling function coefficients in (20), both the scaling function and wavelet bases are orthonormal in user-space coordinates, thus
\begin{equation}
\begin{matrix}\hfill \left\|f^{n}\right\|_{2}^{2}&\text{=}&\left\|s_{0}^{0}\right\|_{2}^{2}+\sum _{m=0}^{n-1}\sum _{l=0}^{2^{m}-1}\left\|d_{l}^{m}\right\|_{2}^{2}\hfill\null \\&\text{=}&\sum _{l=0}^{2^{n}-1}\left\|s_{l}^{n}\right\|_{2}^{2}\hfill\null \end{matrix}
\end{equation}
\subsection[Squaring]{\rmfamily Squaring}
This is a special case of multiplication; see below.
\subsection[General unary operation]{\rmfamily General unary operation}
In-place, point-wise application of a user-provided function (\textit{q}) to an MRA function (\textit{f}), i.e., $q(f(x))$. After optional auto-refinement, the function $f(x)$ is transformed to the quadrature mesh and the function $q(f(x))$ computed at each point. \ \ The values are then transformed back to the scaling function basis.
\ \ The criterion for auto-refinement is presently the same as used for squaring, but it would be straightforward to make this user-defined. Need to add discussion of error analysis that can ultimately be used to drive rigorous auto-refinement. \subsection{Differentiation} ABGV provides a detailed description of the weak formulation of the differentiation operator with inclusion of boundary conditions. There is also a Maple worksheet that works this out in gory detail. We presently only provide a central difference with either periodic or zero-value Dirichlet boundary conditions though we can very easily add more general forms. With a constant level of refinement differentiation takes this block-tri-diagonal form \begin{equation} t_{il}=L\sum _{j}{r_{ij}^{(\text{+})}s_{jl-1}+r_{ij}^{(0)}s_{jl}+r_{ij}^{(\text{{}-})}s_{jl+1}} \end{equation} where \textit{L} is the size of the dimension being differentiated. The diagonal block of the operator is full rank whereas the off-diagonal blocks are rank one. The problems arise from adaptive refinement. We need all three blocks at the lowest common level. The algorithm starts from leaf nodes in the source tree trying to compute the corresponding node in the output. We probe for nodes to the left and right in the direction of differentiation (and enforcing the boundary conditions). There are three possibilities \begin{itemize} \item Present without coefficients -- i.e., the neighbor is more deeply refined. In this instance we loop through children of the target (central) node and invoke the differentiation operation on them, passing any coefficients that we have already found (which must include the central node and the other neighbor due to the nature of the adaptive refinement). \item Present with coefficients -- be happy. \item Not present -- i.e., the neighbor is less deeply refined. The search for the neighbor recurs up the tree to find the parent that does have coefficients. \end{itemize} Once all three sets of coefficients have been located we will be at the level corresponding to the most deeply refined block. For the other blocks we may have coefficients corresponding to parents in the tree. Thus, we need to project scaling coefficients directly from node $n,l$ to a child node $n',l'$ with $n'\ge n$ and $2^{n'-n}l\le l'<2^{n'-n}(l+1)$. Equation (44) tells us how to compute the function value at the quadrature points on the lowest level and we can project onto the lower level basis using (22). Together, these yield \begin{equation} s_{il'}^{n'}=2^{d(n-n')/2}\sum _{\mu }{{\bar{\phi }}_{\mu i}\sum _{j}{\phi _{\mu j}^{n-n',l,l'}s_{\mathit{jl}}^{n}}} \end{equation} which is most efficiently executed with the summations performed in the order written. Recurring down is also a little tricky. We always have at least the coefficients for the central box with translation \textit{l}. This yields children \textit{2l} and \textit{2l+1} which are automatically left/right neighbors of each other. \subsection{Band-limited, free-particle propagator} The unfiltered real-space kernel of the free-particle propagator for the Schr\"odinger equation is \begin{equation} G_{0}(x,t)=\frac{1}{\sqrt{2\pi it}}e^{-{\frac{x^{2}}{2it}}} \end{equation} For large $x$ this becomes highly oscillatory and impractical to apply exactly. However, when applied to a function that is known to be band limited we can neglect components in $G_{0}$ outside the band limit which causes it to decay. 
Furthermore, combining application of the propagator with application of a filter enables us to knowingly control high-frequency numerical noise introduced by truncation of the basis (essential for fast computation) and the high frequencies inherent in the multiwavelet basis (due both to their polynomial form and discontinuous support). Explicitly, consider the representation of the propagator in Fourier space
\begin{equation}
{\hat{G}}_{0}(k,t)=e^{-i\frac{k^{2}t}{2}}
\end{equation}
We multiply this by a filter $F(k/c)$ that smoothly switches near $k=c$ from unity to zero. The best form of this filter is still under investigation, but we presently use a 30\textsuperscript{th}{}-order Butterworth filter.
\begin{equation}
F(k)=\left(1+k^{30}\right)^{-1}
\end{equation}
For $k\ll 1$ this deviates from unity by about $-k^{30}$. This implies that if frequencies up to a band limit $c_{\mathit{target}}$ are desired to be accurate to a precision $\epsilon $ after $N$ applications of the operator, then we should choose the actual band limit in the filter such that $N(c_{\mathit{target}}/c)^{30}\le \epsilon $ or $c\ge c_{\mathit{target}}(N/\epsilon )^{1/30}$. For a precision of 10\textsuperscript{{}-3} in the highest frequency (lower frequencies will be much more accurate) after 10\textsuperscript{5} steps we would choose $c\ge 1.85c_{\mathit{target}}$. Similarly, for $k\gg 1$ the filter $F(k)$ differs from zero by circa $k^{-30}$ and therefore we must include in the internal numerical approximation of the operator frequencies about 2x greater than $c$ (more precisely, 2.15x for a precision of 1e-10 and 2.5x for a precision of 1e-12). Specifically, we compute the filtered real-space propagator by numerical quadrature of the Fourier expansion of the filtered kernel. The quadrature is performed over $[-c_{\mathit{top}},c_{\mathit{top}}]$ where $c_{\mathit{top}}=2.15\ast c$. The wavelength associated with a frequency $k$ is $\lambda =2\pi /k$ and therefore limiting to frequencies less than $c$ implies a smallest grid spacing of $h=\pi /c$. This is oversampled by circa 10x to permit subsequent evaluation using cubic interpolation. Finally, the real-space kernel is computed by inverse discrete Fourier transform and then cubic interpolation. Fast and accurate application of this operator is still being investigated. We can apply it either in real space directly to the scaling function coefficients or in wavelet space using the non-standard form. Presently, the real-space form is both faster and more accurate.
\subsection{Integral convolutions}
This is described in gory detail in ABGV and the first multiresolution quantum chemistry paper, but eventually all of that should be reproduced here. For now, we simply take care of the mapping from the user to simulation coordinates and other details differing from the initial approach. We start from a separated representation of the kernel $K(x)$ in user coordinates that is valid over the volume for $x_{\mathit{lo}}^{\mathit{user}}\le |x^{\mathit{user}}|\le L\sqrt{d}$ to a relative precision $\epsilon ^{\mathit{user}}$ (except where the value is exponentially small)
\begin{equation}
K(x^{\mathit{user}})=\sum _{\mu =1}^{M}{\prod _{i=1}^{d}{T_{i}^{(\mu )}(x_{i}^{\mathit{user}})}}+O(\epsilon ^{\mathit{user}})
\end{equation}
Since the error in the approximation is relative, it is the same in both user and simulation coordinates.
The most common case is that the kernel is isotropic ( $K(x)=K(|x|)$) and therefore the separated terms do not depend upon direction, i.e., $T_{i}^{(\mu )}=T^{(\mu )}$ (if it is desired to keep the terms real it may be necessary to treat a negative sign separately). In a cubic box the transformation to simulation coordinates is the same in each dimension and therefore we only need to compute and store each operator once for each dimension. However, in non-cubic boxes the transformation to simulation coordinates is different in each direction making it necessary to compute and store each operator in each direction. Doing this will permit us to treat non-isotropic operators in the same framework, the extreme example of which is a convolution that acts only in one dimension. Presently, this is not supported but it is a straightforward modification. Focusing now on just one term and direction, the central quantity is the transition matrix element that is needed in user coordinates but must be computed in simulation coordinates \begin{equation} \begin{matrix}\left[r_{ll'}^{n}\right]_{ii'}&=&L^{-1}\int {\int {T^{\mathit{user}}(x^{\mathit{user}}-y^{\mathit{user}})\phi _{il}^{n}(x^{sim}(x^{\mathit{user}}))\phi _{i'l'}^{n}(y^{sim}(y^{\mathit{user}}))\mathit{dx}^{\mathit{user}}}\mathit{dy}^{\mathit{user}}}\\&=&L\int {\int {T^{\mathit{user}}(L(x^{sim}-y^{sim}))\phi _{il}^{n}(x^{sim})\phi _{i'l'}^{n}(y^{sim})\mathit{dx}^{\mathit{sim}}}\mathit{dy}^{\mathit{sim}}}\hfill\null \end{matrix} \end{equation} where $L$ is the width of the dimension. This enables the identification \begin{equation} T^{sim}(x^{sim})=LT^{\mathit{user}}(Lx^{sim}) \end{equation} Internally, the code computes transition matrix elements for $T^{sim}$ in simulation coordinates. If the operator is represented as a sum of Gaussian functions $c^{\mathit{user}}\exp (-\alpha ^{\mathit{user}}(x^{\mathit{user}})^{2})$ then the corresponding form in simulation coordinates will be $c^{sim}=Lc^{\mathit{user}}$ and $\alpha ^{\mathit{sim}}=L^{2}\alpha ^{\mathit{user}}$. \subsection{Application of the non-standard form} Two things complicate the application of the NS-form operator. The first is specific to the separated representation -- we only have this for the actual operator ( $T$) not for the NS-form which is $T^{n+1}-T^{n}$. Thus at each level we actually apply to the sum and difference coefficients $T^{n+1}$ and subtract off the result of applying $T^{n}$ to just the sum coefficients. Note that screening must be performed using the norm of $T^{n}-T^{n-1}$ since it is sparse. Beyond 1D this approach is a negligible computational overhead and the only real concern is possible loss of precision since we are subtracting two large numbers to obtain a small difference. My current opinion is that there is no effective loss of precision since reconstructing the result will produce values of similar magnitude. This is I think a correct argument for the leaf nodes, but the interior nodes might have larger values and hence we could lose relevant digits. The second issue is what to do about scaling function coefficients at the leaf nodes. Regarding the operator as a matrix being applied to a vector of scaling function coefficients, then the operator is exactly applied by operation upon the sum and difference coefficients at the next level, therefore there is no need to apply the operator to the leaf nodes (this was my initial thinking). 
However, as pointed out by Beylkin, the operator itself can introduce finer-scale detail, which means that we must also consider application at the lowest level, where the difference coefficients are zero, since the operator can introduce difference coefficients at that level. \subsection{Screening} To screen effectively we need to estimate the norm of the blocks of the non-standard operator and also each term in its separated expansion. We could estimate the largest eigenvalue by using a power method, and this is implemented in the code for verification; however, it is too expensive to use routinely, especially for each term in a large separated representation (we would spend more time computing the operator than applying it). Thus, we need a more efficient scheme. Each term in the separated expansion is applied as \begin{equation}\label{seq:refText37} R_{x}R_{y}\cdots -T_{x}T_{y}\cdots \end{equation} where \textit{R} is the full non-standard form of the operator in a given direction, which takes the form \begin{equation} R=\left(\begin{matrix}T&B\\C&A\end{matrix}\right) \end{equation} and \textit{T} is the block of \textit{R} that connects sum-coefficients with sum-coefficients. We could compute the Frobenius norm of the operator in (38) simply as $\sqrt{\|R_{x}\|_{F}^{2}\|R_{y}\|_{F}^{2}\cdots -\|T_{x}\|_{F}^{2}\|T_{y}\|_{F}^{2}\cdots }$ but unfortunately this loses too much precision. Instead, an excellent estimate is provided by \begin{equation} \sqrt{\left(\prod _{i=1}^{d}{\|R_{i}\|_{F}^{2}}\right)\left(\sum _{i=1}^{d}{\frac{\|R_{i}-T_{i}\|_{F}^{2}}{\|R_{i}\|_{F}^{2}}}\right)} \end{equation} which seems in practice to be an effective upper bound that is made tight (within a factor less than two, at least for the Coulomb kernel) by replacing the sum with the maximum term in the sum. \subsection[Automatic refinement (aka widening the support)]{Automatic refinement (aka widening the support)} Same as for multiplication ... need to explain why this is good. \section[Binary operations]{\rmfamily Binary operations} \subsection[Inner product of two functions]{\rmfamily Inner product of two functions} This is conceptually similar to the norm, but since the two functions may have different levels of refinement we can only compute the inner product in the wavelet basis. \begin{equation} \int f(x)^{*}g(x)dx=\left.s_{f0}^{0}\right.^{\dagger }.s_{g0}^{0}+\sum _{m=0}^{n-1}\sum _{l=0}^{2^{m}-1}\left.d_{fl}^{m}\right.^{\dagger }.d_{gl}^{m} \end{equation} \subsection[Function addition and subtraction]{\rmfamily Function addition and subtraction} The most general form is the bilinear operation GAXPY (generalized form of SAXPY) $h(x)=\alpha f(x)+\beta g(x)$ that is implemented in both in-place (\textit{h} the same as \textit{f}) and out-of-place versions. The operation is implemented in the wavelet basis, in which it can be applied directly to the coefficients regardless of different levels of adaptive refinement (missing coefficients are treated as zero). \textit{Need a discussion on screening} -- basically, if the functions have the same processor map and this operation is followed by a truncation before doing anything expensive, explicit screening does not seem too critical. \ \ The need for truncation could be reduced by testing the size of one of the terms (e.g., in a Gram-Schmidt we know that one of the terms is usually small). \subsection{Point-wise multiplication} This is performed starting from the scaling function basis. 
\ \ Superficially, we transform each function to values at the quadrature points, multiply the values, and transform back. \ \ However, there are three complicating issues. \ First, the product cannot be exactly represented in the polynomial basis. \ The product of two polynomials of order \textit{k-1} produces a polynomial of order \textit{2k-2}. \ \ Beylkin makes a nice analogy to the product of two functions sampled on a grid -- the product can be exactly formed on a grid with double the resolution. \ \ While this is not exact for polynomials it does reduce the error by a factor of $2^{-k}$, where \textit{k} is the order of the wavelet. \ \ Therefore, we provide the option to automatically refine and form the product at a lower level. \ \ This is done by estimating the norm of the part of the product that cannot be exactly represented as follows \begin{equation} \begin{matrix}p_{l}^{n}&=&\sqrt{\sum _{i=0}^{\left\lfloor (k-1)/2\right\rfloor}\left\|s_{il}^{n}\right\|^{2}}\hfill\null \\q_{l}^{n}&=&\sqrt{\sum _{i=\left\lfloor (k-1)/2\right\rfloor+1}^{k-1}\left\|s_{il}^{n}\right\|^{2}}\hfill\null \\\epsilon (\mathit{fg})_{l}^{n}&\simeq &p(f)_{l}^{n}q(g)_{l}^{n}+q(f)_{l}^{n}p(g)_{l}^{n}+q(f)_{l}^{n}q(g)_{l}^{n}\end{matrix} \end{equation} \bigskip Second, the functions may have different levels of adaptive refinement. The two options are to compute the function with coefficients at a coarser scale directly on the grid required for the finer-scale function, or to refine the function down to the same level, which is what we previously chose to do. \ However, refining will leave the tree with scaling function coefficients at multiple levels that must be cleaned up after the operation. \ \ Since it is essential (for efficient parallel computation) to perform multiple operations at a time on a function, having it in an inconsistent state makes things complicated. \ \ If all we wanted to do were to perform other multiplication operations, there would be no problem; however, this seems to be an unnecessary restriction on the user. \ \ It is also faster (2-fold?) to perform the direct evaluation, so this is what we now choose to do. \ Third, the above does not use sparsity or smoothness in the functions and does not compute the result to a finite precision. \ \ For instance, if two functions do not overlap their product is zero, but the above algorithm will compute at the finest level of the two functions, doing a lot of work to evaluate small numbers that will be discarded upon truncation. \ \ Eliminating this overhead is crucial for reduced scaling in electronic structure calculations. \ \ At some scale we can write each function (\textit{f} and \textit{g}) in a volume in terms of its usual scaling function approximation at that level (\textit{s}) and the correction/differences (\textit{d}) from \textit{all }finer scales. \ \ The error in the product of two such functions is then \begin{equation}\label{seq:refText42} \epsilon (\mathit{fg})\simeq \left\|s_{f}\right\|.\left\|d_{g}\right\|+\left\|d_{f}\right\|.\left\|s_{g}\right\|+\left\|d_{f}\right\|.\left\|d_{g}\right\| \end{equation} with a hopefully obvious abuse of notation. Note again that while the scaling function coefficients are as used everywhere else in this document, the difference function (\textit{d}) in (43) is the sum of corrections at \textit{all }finer scales. 
\ \ Thus, by computing the scaling function coefficients at all levels of the tree and summing the norm of the differences up the tree we can compute with controlled precision at coarser levels of the tree. \ \ The sum of the norm of differences can also be computed by summing the norm of the scaling function coefficients from the finest level and subtracting the local value. If many products are being formed, the overheads (compute and memory) of forming the non-standard form are acceptable, but it is desirable to have a less expensive approach when computing just a few products. \ \ The above exploits both locality and smoothness in each function. The main reduction in cost in the electronic structure algorithms will come from locality (finite domain for each orbital with exponential decay outside) and with that in mind we can bound the entire product in some volume using $\left\|\mathit{fg}\right\|\le \left\|f\right\|.\left\|g\right\|$. \ \ We can compute the norm of each function in each volume by summing the norm of the scaling function coefficients up the tree, which is inexpensive but does require some communication and an implied global synchronization. \ \ If in some box the product is predicted to be negligible we can immediately set it to zero, otherwise we must recur down. Since we are not exploiting smoothness, if a product must be formed it will be formed on the finest level. Thus, we will eventually have three algorithms for point-wise multiplication. Which ones to do first? \ \ We answer this question by asking, ``what products will the DFT and HF codes be performing?'' \begin{itemize} \item Square each orbital to make the density. \item Multiply the potential against each orbital. \item HF exchange needs each orbital times every other orbital. \end{itemize} The first critical one is potential times orbital. \ \ The potential has global extent but the orbitals are localized and we want the cost of each product to be \textit{O(1)} not \textit{O(V)} (\textit{V}, the volume). \ \ Similarly, we must reduce the cost of the $O(N^{2})$ products in HF exchange to \textit{O(N)}. \ \ Therefore, we first did the exact algorithm and will very shortly do one that exploits locality. \subsubsection[Evaluating the function at the quadrature points]{\rmfamily Evaluating the function at the quadrature points} In (22) is described how to transform between function values and scaling coefficients on the same level. \ \ However, for multiplication we will need to evaluate the polynomials at a higher level in a box at a finer level. \ This is not mathematically challenging but there are enough indices involved that care is necessary. \ \ Let $(n,l)$ \ be the parent box and $(n',l')$ be one of its children, and let $x_{\mu }$ be a Gauss-Legendre quadrature point in [0,1]. \ \ We want to evaluate \begin{equation}\label{seq:refText43} \begin{matrix}f_{\mu }&\text{=}&V^{-1/2}\sum _{i}{s_{\mathit{il}}^{n}\phi _{\mathit{il}}^{n}\left(2^{-n'}\left(x_{\mu }+l'\right)\right)}\hfill\null \\&\text{=}&V^{-1/2}2^{\mathit{dn}/2}\sum _{i}{s_{\mathit{il}}^{n}\phi _{i}\left(2^{n-n'}\left(x_{\mu }+l'\right)-l\right)}\hfill\null \\&\text{=}&V^{-1/2}2^{\mathit{dn}/2}\sum _{i}{s_{\mathit{il}}^{n}\phi _{\mu i}^{n-n',l,l'}}\hfill\null \end{matrix} \end{equation} that has the same form as before but now we must use a different transformation for each dimension due to the dependence on the child box coordinates. 
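To make the index bookkeeping concrete, the following is a minimal, self-contained 1-D sketch of the evaluation above, written with plain loops and \texttt{std::vector} rather than the tensor classes used in the actual code; the volume normalization $V^{-1/2}$ is omitted and $\phi _{i}(x)=\sqrt{2i+1}\,P_{i}(2x-1)$ are the normalized Legendre scaling functions on $[0,1]$.

\begin{verbatim}
// Illustrative 1-D sketch (not the production code): evaluate the scaling
// function expansion of a parent box (n,l) at quadrature points belonging
// to a descendant box (nc,lc) with nc >= n.
#include <cmath>
#include <vector>

// Normalized, shifted Legendre scaling function phi_i on [0,1].
double phi(int i, double x) {
    const double y = 2.0*x - 1.0;                 // map [0,1] -> [-1,1]
    if (i == 0) return 1.0;
    double pjm1 = 1.0, pj = y;                    // P_0 and P_1
    for (int j = 1; j < i; ++j) {                 // three-term recurrence
        const double pjp1 = ((2*j+1)*y*pj - j*pjm1)/(j+1);
        pjm1 = pj; pj = pjp1;
    }
    return std::sqrt(2.0*i + 1.0)*pj;
}

// s: coefficients of the parent box (n,l); x: quadrature points in [0,1]
// of the child box (nc,lc).  Returns the function values at those points.
std::vector<double> eval_in_child(const std::vector<double>& s,
                                  int n, long l, int nc, long lc,
                                  const std::vector<double>& x) {
    std::vector<double> f(x.size(), 0.0);
    const double scale  = std::pow(2.0, 0.5*n);           // 2^(n/2) normalization
    const double shrink = std::pow(2.0, double(n - nc));  // 2^(n-nc)
    for (std::size_t mu = 0; mu < x.size(); ++mu) {
        const double xp = shrink*(x[mu] + lc) - l;        // coordinate in the parent box
        for (std::size_t i = 0; i < s.size(); ++i)
            f[mu] += s[i]*phi(int(i), xp);
        f[mu] *= scale;
    }
    return f;
}
\end{verbatim}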
\section{Error estimation} To estimate the error in the numerical representation relative to some known analytic form, i.e., \begin{equation} \epsilon =\left\|f-f^{n}\right\| \end{equation} we first ensure we are in the scaling function basis and then in each box with coefficients compute the contribution to $\epsilon $ using a quadrature rule with one more point than that used in the initial function projection. \ \ The reason for this is that if $f^{n}$ resulted from the initial projection then it is exactly represented at the quadrature points and will appear, incorrectly, to have zero error if sampled there. \ \ One more point permits the next two powers in the local polynomial expansion to be sampled and also ensures that all of the new sampling points interleave the original points near the center of the box. \ \section[Data structures]{Data structures} The \textit{d}{}-dimensional function is represented as a 2\textsuperscript{d}{}-tree. \ \ Nodes in the tree are labeled by an instance of \texttt{Key{\textless}d{\textgreater}} that wraps the tuple \textit{(n,l)} where \textit{n} is the level and \textit{l} is a \textit{d}{}-vector representing the translation. \ \ Nodes are represented by instances of \texttt{FunctionNode{\textless}T,d{\textgreater}} that presently contain the coefficients and an indicator of whether the node has children. \ \ Nodes without children are sometimes referred to as leaves. \ \ Nodes, indexed by keys, are stored in a distributed hash table that is an instance of \texttt{WorldContainer{\textless}Key{\textless}d{\textgreater},FunctionNode{\textless}T,d{\textgreater}{\textgreater}}. \ \ This container uses a two-level hash to first map a key to the process (referred to as the owner) in which the data resides, and then into a local instance of either a GNU \texttt{hash\_map} or a TR1 \texttt{unordered\_map}. \ \ Since it is always possible to compute the key of a parent, child, or neighbor we never actually store (global) pointers to parents or children. \ \ Indeed, a major point of the MADNESS runtime environment is to replace the cumbersome partitioned global address space (global pointer = process id + local pointer) with multiple global name spaces. \ \ Each new container provides a new name space. \ \ Using names rather than pointers permits the application to be written using domain-specific concepts rather than a rigid linear model of memory. Folks familiar with Python will immediately appreciate the value of name spaces. Data common to all instances of a function of a given dimension (\textit{d}) and data type (\texttt{T}, e.g., float, double, float\_complex, double\_complex) are gathered together into \texttt{FunctionCommonData{\textless}T,d{\textgreater}}, of which one instance is generated per wavelet order (\textit{k}). \ \ An instance of the common data is shared read-only between all instances of functions of that data type, dimension and wavelet order. \ \ Presently there are some mutable scratch arrays in the common data but these will be eliminated when we introduce multi-threading. \ \ In addition to reducing the memory footprint of the code, sharing the common data greatly speeds the instantiation of new functions, which is replicated on every processor. 
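The two-level lookup just described can be sketched as follows; this is an illustration of the idea rather than an excerpt from the MADNESS sources, and it uses \texttt{std::unordered\_map} in place of the GNU/TR1 containers mentioned above.

\begin{verbatim}
// Illustrative sketch of the two-level hash used by the distributed container:
// a hash of the key picks the owning process, and on that process the node is
// stored in an ordinary local hash map.
#include <cstdint>
#include <functional>
#include <unordered_map>

struct Key1D {                     // stand-in for Key<d>: level n and translation l
    int n; long l;
    bool operator==(const Key1D& k) const { return n == k.n && l == k.l; }
};
struct KeyHash {
    std::size_t operator()(const Key1D& k) const {
        return std::hash<std::int64_t>()((std::int64_t(k.n) << 48) ^ k.l);
    }
};
struct Node { /* coefficients, has_children flag, ... */ };

// Level 1: map a key to its owner (the default is uniform via the hash).
int owner(const Key1D& k, int nproc) { return int(KeyHash()(k) % nproc); }

// Level 2: on the owning process the node lives in a local hash map.
using LocalTree = std::unordered_map<Key1D, Node, KeyHash>;
\end{verbatim}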
In order to facilitate shallow copy/assignment semantics and to make empty functions inexpensive to instantiate, a multi-resolution function, which is an instance of \texttt{Function{\textless}T,d{\textgreater}} contains only a shared pointer to the actual implementation which is an instance of \texttt{FunctionImpl{\textless}T,d{\textgreater}}. \ \ Un-initialized functions (obtained from the default constructor) contain a zero pointer. \ Only a non-default constructor or assignment actually instantiate the implementation. The main function class forwards nearly all methods to the underlying implementation. \ \ The implementation primarily contains a reference to the common data, the distributed container storing the tree, little bits of other state (e.g., a flag indicating the compression status) and a bunch of methods. Default values for all functions of a given dimension are stored in an instance of FunctionDefaults{\textless}d{\textgreater}. \ \ These may be modified to change the default values for subsequently created functions. \ \ Functions have many options and parameters and thus we need an easy way to specify options and selectively override defaults. \ \ Since C++ does not provide named arguments (i.e., arguments with defaults that may be specified in any order rather than relying on position to identify an argument) we adopt the named-parameter idiom. The main constructor for \texttt{Function{\textless}T,d{\textgreater}} takes an instance of \texttt{FunctionFactory{\textless}T,d{\textgreater} }as its sole argument. \ \ The methods of \texttt{FunctionFactory{\textless}T,d{\textgreater}} enable setting of options and parameters and each returns a reference to the object to permit chaining of methods. \ \ A current problem with \texttt{FunctionDefaults} is that it is static data shared between all parallel worlds (MPI sub-communicators). At some point we may need to tie this to the world instance. Pretty much all memory is reference counted using Boost-like shared pointers. An instance of \texttt{SharedPointer{\textless}T{\textgreater}}, which wraps a pointer of type \texttt{T*}, is (almost) always used to wrap memory obtained from the C++ new operator. \ \ The exceptions are where management of the memory is immediately given to a low-level interface. \ \ Shared-pointers may be safely copied and used with no fear of using a stale pointer. \ \ When the last copy of a shared pointer is destroyed the underlying pointer is freed. \ \ With this mode of memory management there is never any need to use the C++ delete operator and most classes do not even need a destructor. Tensors ... STOPPED MOST CLEANUP AND WRITING HERE ... more to follow ... sigh \subsection{Maintaining consistent state of the 2d-tree} The function implementation provides a method \texttt{verify\_tree()} that attempts to check connectivity and consistency of the compression state, presence of coefficients, child flags, etc, as described below. A node in the tree is labeled by the key/tuple \textit{(n,l)} and presently stores the coefficients, if any, and a flag indicating if the node has children. \ \ In 1D, the keys of children are readily computed as \textit{(n+1,2l)} and \textit{(n+1,2l+1). }In many dimensions it is most convenient to use the \texttt{KeyChildIterator} class. \ \ The presence of coefficients is presently determined by looking at the size of the tensor storing the coefficients; size zero means no coefficients. 
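Since child keys are always computable from the parent key, iterating over children never requires stored pointers. A minimal sketch of that enumeration (illustrative only; the actual code uses the \texttt{KeyChildIterator} class mentioned above) is:

\begin{verbatim}
// Enumerate the 2^d children of a node (n,l): each child is at level n+1 with
// translation 2l+b in every dimension, where b is 0 or 1.
#include <array>
#include <vector>

template <int d>
struct KeyD {
    int n;                          // level
    std::array<long, d> l;          // translation vector
};

template <int d>
std::vector<KeyD<d>> children(const KeyD<d>& parent) {
    std::vector<KeyD<d>> kids;
    for (unsigned bits = 0; bits < (1u << d); ++bits) {   // 2^d corner choices
        KeyD<d> c;
        c.n = parent.n + 1;
        for (int i = 0; i < d; ++i)
            c.l[i] = 2*parent.l[i] + ((bits >> i) & 1);   // (n+1,2l) or (n+1,2l+1)
        kids.push_back(c);
    }
    return kids;
}
\end{verbatim}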
In the reconstructed form (scaling function basis), a tree has coefficients (a \textit{k}\textit{\textsuperscript{d}} tensor) only at the lowest level. \ \ All interior nodes will have no coefficients and will have children. \ \ All leaf nodes will have coefficients and will not have children. In the compressed form (wavelet basis), a tree has coefficients (a (\textit{2k)}\textit{\textsuperscript{d}} tensor) at all levels. \ \ The scaling function block of the coefficients is zero except at level zero. \ \ Logically, this tree is one level shallower than the reconstructed tree since the scaling function coefficients at the leaves are represented by the difference coefficients on the next coarsest level. \ However, to simplify the logic in compress and reconstruct and to maintain consistency with the non-standard compressed form (see below), we do not delete the leaf nodes from the reconstructed tree. \ \ Thus, the compressed tree has the same set of nodes as the reconstructed tree with all interior nodes having coefficients and children, and all leaf nodes having no coefficients and no children. \ In the non-standard compressed form (redundant basis), we keep the scaling function coefficients at all levels and the wavelet coefficients for all interior nodes. \ \ Thus, the compressed tree has the same set of nodes as for the other two forms but with all nodes having coefficients (a (\textit{2k)}\textit{\textsuperscript{d}} tensor for interior nodes and a \textit{k}\textit{\textsuperscript{d}} tensor for leaf nodes) and with only leaf nodes having no children. To keep complexity to a minimum we don't want to introduce special states of the tree or of nodes, thus all operations must by their completion restore the tree to a standard state. Truncation is applied to a tree in compressed form and discards small coefficients that are logically leaf nodes. \ \ Logically, because in the stored tree we still have the empty nodes that used to hold the scaling coefficients. \ For a node to be eligible for truncation it must have only empty children. \ Thus, truncation proceeds as follows. \ \ We initially recur down the tree and for each node spawn a task that takes as arguments futures indicating if each of its children have coefficients. Leaf nodes, by definition, have no children and no coefficients and immediately return their status. Once a task has all information about the children it can execute. \ \ If any children have coefficients a node cannot truncate and can immediately return its status. \ \ Otherwise, it must test the size of its own coefficients. \ \ If it decides to truncate, it must clear its own coefficients, delete all of its children, and set its \texttt{has \_children} flag to false. \ \ Finally, it can return its own status. Adding (subtracting) two functions is performed in the wavelet basis. \ \ If the trees have the same support (level of refinement) we only have to add the coefficients. \ \ If the trees differ, then in addition to adding the coefficients we must also maintain the has \_children flag of the new tree to produce the union of the two input trees. \ \ To permit functions with different processor maps to be added efficiently, we loop over local data in one function and send them to nodes of the other for addition. \ \ Sending a message to a non-existent node causes it to be created. 
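Referring back to the truncation procedure described above, the essential logic can be sketched as a plain sequential recursion (the actual implementation expresses this same dependency structure with tasks and futures, so that a node runs only once the status of all of its children is known; the toy tree below is 1-D):

\begin{verbatim}
// Sequential sketch of truncation on a toy 1-D tree keyed by (n,l).
#include <cmath>
#include <map>
#include <utility>
#include <vector>

struct Node {
    std::vector<double> d;          // wavelet coefficients (empty => none)
    bool has_children = false;
};
using Tree = std::map<std::pair<int,long>, Node>;

double norm(const std::vector<double>& v) {
    double s = 0.0; for (double x : v) s += x*x; return std::sqrt(s);
}

// Returns true if the node at (n,l) still holds coefficients after truncation
// (so that its parent knows it may not truncate).
bool truncate(Tree& tree, int n, long l, double tol) {
    Node& node = tree[{n,l}];
    bool child_has_coeff = false;
    if (node.has_children)
        for (long c = 2*l; c <= 2*l+1; ++c)        // children (n+1,2l), (n+1,2l+1)
            child_has_coeff |= truncate(tree, n+1, c, tol);
    if (child_has_coeff) return true;              // a child still has coefficients
    if (!node.d.empty() && norm(node.d) < tol) {   // small enough: discard
        node.d.clear();
        if (node.has_children) {                   // delete the (now empty) children
            tree.erase({n+1, 2*l});
            tree.erase({n+1, 2*l+1});
            node.has_children = false;
        }
    }
    return !node.d.empty();
}
\end{verbatim}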
\section[Returning new functions {}-- selection of default parameters]{Returning new functions -- selection of default parameters} When returning a new function there is the question of what parameters (thresholds, distribution, etc.) should be used. \ \ There needs to be a convention that is consistent with users' intuition as well as mechanisms for forcing different outcomes. \ We choose to not use \texttt{FunctionDefaults}. \ \ I.e., \texttt{FunctionDefaults} is only used when the user invokes the \texttt{Function} constructor to fill unspecified elements of \texttt{FunctionFactory}. \subsection{Unary operations (e.g., scaling, squaring, copying, type conversion)} The result copies all appropriate state from the input. \subsection{Binary operations (e.g., addition, multiplication)} Writing the binary operation as a C++ method invocation \texttt{f.op(g)}, there is a natural asymmetry that, for consistency with a unary operation, leads to our choice to copy all appropriate state from the leftmost function, i.e., that on which the method is being invoked. \subsection{Ternary and higher operations} There are no C++ operators of this form and therefore these will always be of the form \texttt{f.op(g,h)} and we make the same choice as made for binary operations. \subsection{C++ operator overloading and order of evaluation} The main issue with the above convention is clarifying how C++ maps statements with overloaded operators\footnote{\ \textrm{http://www.difranco.net/cop2334/Outlines/ch18.htm}} into method/function invocations, which includes understanding the order of evaluation\footnote{\ \textrm{http://msdn2.microsoft.com/en-us/library/yck2zaey(vs.80).aspx}}. \ \ Overloading does not change the precedence or associativity of an operator\footnote{\ \textrm{http://www.difranco.net/cop2334/cpp\_op\_prec.htm}}. \ \ \ Noting that * is of higher precedence than + and both are left-to-right associative, \begin{itemize} \item \texttt{f*g+h} becomes \texttt{(f*g)+h} becomes \texttt{(f.mul(g)).add(h)} and thus the result has the same parameters as \texttt{f}. \item \texttt{h+f*g} becomes \texttt{h+(f*g)} becomes \texttt{h.add(f.mul(g))} and thus the result has the same parameters as \texttt{h}. \item \texttt{f*g*h} associates left-to-right as \texttt{(f*g)*h} and thus becomes \texttt{f.mul(g).mul(h)} (the compiler is not free to assume that a user-defined multiplication is associative or commutative), which again inherits the parameters of \texttt{f}. \end{itemize} In summary, the result always has the parameters of the leftmost function in any expression. For greatest clarity, introduce parentheses or invoke the actual methods/functions rather than relying upon operator overloading. \subsection{Overriding the above behaviors} Operations that produce results dependent upon thresholds, etc., must provide additional interfaces that permit specification of all controlling parameters, which will be used in the operation and preserved in the result. \ \ For all other operations, it suffices to make thresholds, etc., settable after the completion of an operation. \section{External storage} I/O remains a huge problem on massively parallel computers and should almost never be used except for checkpoint/restart. Several constraints must be borne in mind. First, we must avoid creating many files since parallel file systems are easily overwhelmed if a few thousand processes simultaneously try to create a few tens of files each. 
Second, I/O should be performed in large transactions with a tunable number of readers/writers in order to obtain the best bandwidth. Third, we need random access to data so purely serial solutions are not acceptable. Finally, the external data representation should ideally be open and readily accessed by other codes. For the purposes of I/O, we distinguish two types of objects. First, objects that will be written by a single process with no involvement from other processes. Typically this would be just by the master process and the objects would be small enough to fit into the memory of a single processor. Second, large objects that will be transferred to/from disk in a collective manner with all processes in the world logically participating. Random read and write access must be feasible for both types of objects. \section[Viewing and editing this document]{Viewing and editing this document} Under Linux ensure you have the Microsoft true-type fonts installed -- they are free. Under Ubuntu install package \texttt{msttcorefonts}. Without these the default Linux fonts will cause pagination and other problems, at least with the title pages. Other than resorting to Latex it does not seem possible to put documents under version control and have the changes merged automatically. Subversion recognizes OpenOffice files as being of mime-type octet-stream and thus treats them as binary, meaning that it does not attempt to merge changing. You must use the OpenOffice compare-and-merge facility to manually merge changes yourself. \bigskip \end{document} madness-0.10/doc/Latex/parallel-runtime.tex000066400000000000000000001216311253377057000207220ustar00rootroot00000000000000% This file was converted to LaTeX by Writer2LaTeX ver. 1.1.7 % see http://writer2latex.sourceforge.net for more info \documentclass[letterpaper]{article} \usepackage[ascii]{inputenc} \usepackage[T1]{fontenc} \usepackage[english]{babel} \usepackage{amsmath} \usepackage{amssymb,amsfonts,textcomp} \usepackage{color} \usepackage{array} \usepackage{hhline} \usepackage{hyperref} \hypersetup{pdftex, colorlinks=true, linkcolor=blue, citecolor=blue, filecolor=blue, urlcolor=blue, pdftitle=, pdfauthor=Robert Harrison, pdfsubject=, pdfkeywords=} % footnotes configuration \makeatletter \renewcommand\thefootnote{\arabic{footnote}} \makeatother % Outline numbering \setcounter{secnumdepth}{3} \renewcommand\thesection{\arabic{section}} \renewcommand\thesubsection{\arabic{section}.\arabic{subsection}} \renewcommand\thesubsubsection{\arabic{section}.\arabic{subsection}.\arabic{subsubsection}} % List styles \newcommand\liststyleLi{% \renewcommand\theenumi{\arabic{enumi}} \renewcommand\theenumii{\arabic{enumii}} \renewcommand\theenumiii{\arabic{enumiii}} \renewcommand\theenumiv{\arabic{enumiv}} \renewcommand\labelenumi{\theenumi.} \renewcommand\labelenumii{\theenumii.} \renewcommand\labelenumiii{\theenumiii.} \renewcommand\labelenumiv{\theenumiv.} } \newcommand\liststyleLii{% \renewcommand\theenumi{\arabic{enumi}} \renewcommand\theenumii{\arabic{enumii}} \renewcommand\theenumiii{\arabic{enumiii}} \renewcommand\theenumiv{\arabic{enumiv}} \renewcommand\labelenumi{\theenumi.} \renewcommand\labelenumii{\theenumii.} \renewcommand\labelenumiii{\theenumiii.} \renewcommand\labelenumiv{\theenumiv.} } \newcommand\liststyleLiii{% \renewcommand\labelitemi{${\bullet}$} \renewcommand\labelitemii{${\circ}$} \renewcommand\labelitemiii{${\blacksquare}$} \renewcommand\labelitemiv{${\bullet}$} } \newcommand\liststyleLiv{% \renewcommand\labelitemi{${\bullet}$} 
\renewcommand\labelitemii{${\circ}$} \renewcommand\labelitemiii{${\blacksquare}$} \renewcommand\labelitemiv{${\bullet}$} } \newcommand\liststyleLv{% \renewcommand\labelitemi{${\bullet}$} \renewcommand\labelitemii{${\circ}$} \renewcommand\labelitemiii{${\blacksquare}$} \renewcommand\labelitemiv{${\bullet}$} } \newcommand\liststyleLvi{% \renewcommand\labelitemi{${\bullet}$} \renewcommand\labelitemii{${\circ}$} \renewcommand\labelitemiii{${\blacksquare}$} \renewcommand\labelitemiv{${\bullet}$} } \newcommand\liststyleLvii{% \renewcommand\labelitemi{{}--} \renewcommand\labelitemii{{}--} \renewcommand\labelitemiii{{}--} \renewcommand\labelitemiv{{}--} } \newcommand\liststyleLviii{% \renewcommand\labelitemi{${\bullet}$} \renewcommand\labelitemii{${\circ}$} \renewcommand\labelitemiii{${\blacksquare}$} \renewcommand\labelitemiv{${\bullet}$} } \newcommand\liststyleLix{% \renewcommand\labelitemi{${\bullet}$} \renewcommand\labelitemii{${\circ}$} \renewcommand\labelitemiii{${\blacksquare}$} \renewcommand\labelitemiv{${\bullet}$} } \newcommand\liststyleLx{% \renewcommand\labelitemi{${\bullet}$} \renewcommand\labelitemii{${\circ}$} \renewcommand\labelitemiii{${\blacksquare}$} \renewcommand\labelitemiv{${\bullet}$} } % Page layout (geometry) \setlength\voffset{-1in} \setlength\hoffset{-1in} \setlength\topmargin{0.7874in} \setlength\oddsidemargin{0.7874in} \setlength\textheight{8.825199in} \setlength\textwidth{6.9251995in} \setlength\footskip{0.6in} \setlength\headheight{0cm} \setlength\headsep{0cm} % Footnote rule \setlength{\skip\footins}{0.0469in} \renewcommand\footnoterule{\vspace*{-0.0071in}\setlength\leftskip{0pt}\setlength\rightskip{0pt plus 1fil}\noindent\textcolor{black}{\rule{0.25\columnwidth}{0.0071in}}\vspace*{0.0398in}} % Pages styles \makeatletter \newcommand\ps@Standard{ \renewcommand\@oddhead{} \renewcommand\@evenhead{} \renewcommand\@oddfoot{} \renewcommand\@evenfoot{} \renewcommand\thepage{\arabic{page}} } \newcommand\ps@LeftPage{ \renewcommand\@oddhead{} \renewcommand\@evenhead{} \renewcommand\@oddfoot{\thepage{}} \renewcommand\@evenfoot{\@oddfoot} \renewcommand\thepage{\arabic{page}} } \newcommand\ps@Licensepage{ \renewcommand\@oddhead{} \renewcommand\@evenhead{} \renewcommand\@oddfoot{} \renewcommand\@evenfoot{} \renewcommand\thepage{\arabic{page}} } \newcommand\ps@Firstrightnumberedpage{ \renewcommand\@oddhead{} \renewcommand\@evenhead{} \renewcommand\@oddfoot{\thepage{}} \renewcommand\@evenfoot{\@oddfoot} \renewcommand\thepage{\arabic{page}} } \newcommand\ps@Titlepage{ \renewcommand\@oddhead{} \renewcommand\@evenhead{} \renewcommand\@oddfoot{} \renewcommand\@evenfoot{} \renewcommand\thepage{\arabic{page}} } \makeatother \pagestyle{Standard} \newcounter{Figure} \renewcommand\theFigure{\arabic{Figure}} \title{} \author{Robert Harrison} \date{2009-12-14} \begin{document} \clearpage\setcounter{page}{1}\pagestyle{Licensepage} \thispagestyle{Titlepage} \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip {\centering\bfseries MADNESS \par} \begin{center} \begin{minipage}{} \liststyleLi \begin{enumerate} \item {\ttfamily \#define WORLD\_INSTANTIATE\_STATIC\_TEMPLATES} \item {\ttfamily \#include {\textless}world/world.h{\textgreater}} \item \bigskip \item {\ttfamily using namespace std;} \item {\ttfamily using namespace madness;} \item \bigskip \item {\ttfamily class Array : public WorldObject{\textless}Array{\textgreater} \{} \item {\ttfamily \ \ \ \ vector{\textless}double{\textgreater} v;} \item {\ttfamily public:} \item {\ttfamily \ \ \ \ /// Make block 
distributed array with size elements} \item {\ttfamily \ \ \ \ Array(World\& world, size\_t size) } \item {\ttfamily \ \ \ \ \ \ \ \ : WorldObject{\textless}Array{\textgreater}(world), v((size-1)/world.size()+1)} \item {\ttfamily \ \ \ \ \{} \item {\ttfamily \ \ \ \ \ \ \ \ process\_pending();} \item {\ttfamily \ \ \ \ \};} \item \bigskip \item {\ttfamily \ \ \ \ /// Return the process in which element i resides} \item {\ttfamily \ \ \ \ ProcessID owner(size\_t i) const \{return i/v.size();\};} \item \bigskip \item {\ttfamily \ \ \ \ Future{\textless}double{\textgreater} read(size\_t i) const \{} \item {\ttfamily \ \ \ \ \ \ \ \ if (owner(i) == world.rank())} \item {\ttfamily \ \ \ \ \ \ \ \ \ \ \ \ return Future{\textless}double{\textgreater}(v[i-world.rank()*v.size()]);} \item {\ttfamily \ \ \ \ \ \ \ \ else} \item {\ttfamily \ \ \ \ \ \ \ \ \ \ \ \ return send(owner(i), \&Array::read, i);} \item {\ttfamily \ \ \ \ \};} \item \bigskip \item {\ttfamily \ \ \ \ Void write(size\_t i, double value) \{} \item {\ttfamily \ \ \ \ \ \ \ \ if (owner(i) == world.rank())} \item {\ttfamily \ \ \ \ \ \ \ \ \ \ \ \ v[i-world.rank()*v.size()] = value;} \item {\ttfamily \ \ \ \ \ \ \ \ else} \item {\ttfamily \ \ \ \ \ \ \ \ \ \ \ \ send(owner(i), \&Array::write, i, value);} \item {\ttfamily \ \ \ \ \ \ \ \ return None;} \item {\ttfamily \ \ \ \ \};} \item {\ttfamily \};} \item \bigskip \item {\ttfamily int main(int argc, char** argv) \{} \item {\ttfamily \ \ \ \ initialize(argc, argv);} \item {\ttfamily \ \ \ \ madness::World world(MPI::COMM\_WORLD);} \item \bigskip \item {\ttfamily \ \ \ \ Array a(world, 10000), b(world, 10000);} \item \bigskip \item {\ttfamily \ \ \ \ // Without regard to locality, initialize a and b} \item {\ttfamily \ \ \ \ for (int i=world.rank(); i{\textless}10000; i+=world.size()) \{} \item {\ttfamily \ \ \ \ \ \ \ \ a.write(i, 10.0*i);} \item {\ttfamily \ \ \ \ \ \ \ \ b.write(i, \ 7.0*i);} \item {\ttfamily \ \ \ \ \}} \item {\ttfamily \ \ \ \ world.gop.fence();} \item \bigskip \item {\ttfamily \ \ \ \ // All processes verify 100 random values from each array} \item {\ttfamily \ \ \ \ for (int j=0; j{\textless}100; j++) \{} \item {\ttfamily \ \ \ \ \ \ \ \ size\_t i = world.rand()\%10000;} \item {\ttfamily \ \ \ \ \ \ \ \ Future{\textless}double{\textgreater} vala = a.read(i);} \item {\ttfamily \ \ \ \ \ \ \ \ Future{\textless}double{\textgreater} valb = b.read(i);} \item {\ttfamily \ \ \ \ \ \ \ \ // Could do work here until results are available} \item {\ttfamily \ \ \ \ \ \ \ \ MADNESS\_ASSERT(vala.get() == 10.0*i);} \item {\ttfamily \ \ \ \ \ \ \ \ MADNESS\_ASSERT(valb.get() == \ 7.0*i);} \item {\ttfamily \ \ \ \ \}} \item {\ttfamily \ \ \ \ world.gop.fence();} \item \bigskip \item {\ttfamily \ \ \ \ if (world.rank() == 0) print({\textquotedbl}OK!{\textquotedbl});} \item {\ttfamily \ \ \ \ finalize();} \item {\ttfamily \}} \end{enumerate} {\centering\itshape Figure {\refstepcounter{Figure}\theFigure\label{seq:refFigure0}}: Complete example program illustrating the implementation and use of a crude, block-distributed array upon the functionality of \texttt{WorldObject}. 
\par} \end{minipage} \end{center} \begin{center} \begin{minipage}{} \liststyleLii \begin{enumerate} \item {\ttfamily \#define WORLD\_INSTANTIATE\_STATIC\_TEMPLATES} \item {\ttfamily \#include {\textless}world/world.h{\textgreater}} \item {\ttfamily using namespace madness;} \item \bigskip \item {\ttfamily class Foo : public WorldObject{\textless}Foo{\textgreater} \{} \item {\ttfamily \ \ \ \ const int bar;} \item {\ttfamily public:} \item {\ttfamily \ \ \ \ Foo(World\& world, int bar) : WorldObject{\textless}Foo{\textgreater}(world), bar(bar)} \item {\ttfamily \ \ \ \ \{} \item {\ttfamily \ \ \ \ \ \ \ \ process\_pending();} \item {\ttfamily \ \ \ \ \};} \item \bigskip \item {\ttfamily \ \ \ \ int get() const \{return bar;\};} \item {\ttfamily \};} \item \bigskip \item {\ttfamily int main(int argc, char** argv) \{} \item {\ttfamily \ \ \ \ initialize(argc, argv);} \item {\ttfamily \ \ \ \ madness::World world(MPI::COMM\_WORLD);} \item \bigskip \item {\ttfamily \ \ \ \ Foo a(world,world.rank()), b(world,world.rank()*10);} \item \bigskip \item {\ttfamily \ \ \ \ for (ProcessID p=0; p{\textless}world.size(); p++) \{} \item {\ttfamily \ \ \ \ \ \ \ \ Future{\textless}int{\textgreater} futa = a.send(p,\&Foo::get);} \item {\ttfamily \ \ \ \ \ \ \ \ Future{\textless}int{\textgreater} futb = b.send(p,\&Foo::get);} \item {\ttfamily \ \ \ \ \ \ \ \ // Could work here until the results are available} \item {\ttfamily \ \ \ \ \ \ \ \ MADNESS\_ASSERT(futa.get() == p);} \item {\ttfamily \ \ \ \ \ \ \ \ MADNESS\_ASSERT(futb.get() == p*10);} \item {\ttfamily \ \ \ \ \}} \item {\ttfamily \ \ \ \ world.gop.fence();} \item {\ttfamily \ \ \ \ if (world.rank() == 0) print({\textquotedbl}OK!{\textquotedbl});} \item \bigskip \item {\ttfamily \ \ \ \ finalize();} \item {\ttfamily \}} \end{enumerate} {\centering\itshape Figure {\refstepcounter{Figure}\theFigure\label{seq:refFigure1}}: Simple client-server program implemented using \texttt{WorldObject}. \par} \end{minipage} \end{center} {\centering\bfseries Parallel Runtime \par} \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip {\centering Last modification: 12/14/09 \par} \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip \bigskip This file is part of MADNESS. Copyright (C) 2007 Oak Ridge National Laboratory This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or(at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA For more information please contact: Robert J. Harrison Oak Ridge National Laboratory One Bethel Valley Road P.O. 
Box 2008, MS-6367 Oak Ridge, TN 37831 \bigskip email: harrisonrj@ornl.gov tel: \ \ 865-241-3937 fax: \ \ 865-572-0680 \setcounter{tocdepth}{10} \renewcommand\contentsname{Table of Contents} \tableofcontents \bigskip \clearpage \bigskip \clearpage\setcounter{page}{1}\pagestyle{LeftPage} \thispagestyle{Firstrightnumberedpage} This documents provides an introduction to programming with the MADNESS parallel runtime and describes some of the implementation details. The runtime is used for the actual implementation of the MADNESS numerical capabilities and at least some understanding of the runtime is required by applications using the numerical tools. Also, the runtime may be used independently of the rest of MADNESS and will be separately distributed at some point in the future. \section{Overview} The MADNESS parallel programming environment combines several successful elements from other models and aims to provide a rich and scalable framework for massively parallel computing while seamlessly integrating with legacy applications and libraries. In particular, it is completely compatible with existing MPI and Global Array applications. All code is standard C++ tested for portability with a variety of compilers including the IBM, GNU, Intel, Portland Group, and Pathscale compilers. It includes \liststyleLiii \begin{itemize} \item Distributed sparse containers with one-sided access to items, transparent remote method invocation, an owner-computes task model, and optional user control over placement/distribution. \item Distributed objects that can be globally addressed. \item Futures (results of unevaluated expressions) for composition of latency tolerant algorithms and expression of dependencies between tasks. \item Globally accessible task queues in each process which \ can be used individually or collectively to provide a single global task queue. \item Serialization framework to facilitate transparent interprocess communication. \item Work stealing for dynamic load balancing (coming v. soon). \item Facile management of computations on processor sub-groups. \item Integration with MPI \item Active messages to items in a container, distributed objects, and processes \item Kernel-space threading for use of multi-core processors. \end{itemize} \section{Motivations and attributions} There were several motivations for developing this environment. \liststyleLiv \begin{itemize} \item The rapid evolution of machines from hundreds (pre-2000), to millions (post-2008) of processors demonstrates the need to abandon process-centric models of computation and move to paradigms that virtualize, generalize or even eliminate the concept of process. \ \item The success of applications using the Charm++ environment to scale rapidly to 30+K processes and the enormous effort required to scale most process-centric applications. \item The arrival of multi-core processes and the consequent requirement to express much more concurrency and to adopt techniques for latency hiding motivate the use of light weight work queues to capture much more concurrency and the use of futures for latency hiding. \item The complexity of composing irregular applications in partitioned, global-address space (PGAS) models using only MPI and/or one-sided memory access (GA, UPC, SHMEM, co-Array) motivates the use of an object-centric active-message or remote method invocation (RMI) model so that computation may be moved to the data with the same ease as which data can be moved. 
\ This greatly simplifies the task of maintaining and using distributed data structures. \item Interoperability with existing programming models to leverage existing functionality and to provide an evolutionary path forward. \item The main early influences for this work were \begin{itemize} \item Cilk (Kuszmaul, http://supertech.csail.mit.edu/cilk), \item Charm++ (Kale, http://charm.cs.uiuc.edu), \item ACE (Schmidt, http://www.cs.wustl.edu/\~{}schmidt/ACE.html), \item STAPL (Rauchwerger and Amato, http://parasol.tamu.edu/groups/rwergergroup/research/stapl), and \item the HPCS language projects and the very talented teams and individuals developing these \begin{itemize} \item X10, http://domino.research.ibm.com/comm/research\_projects.nsf/pages/x10.index.html \item Chapel, http://chapel.cs.washington.edu \item Fortress, http://fortress.sunsource.net \end{itemize} \end{itemize} \end{itemize} \section{Programming environment and capabilities} The entire parallel environment is encapsulated in an instance of the class \texttt{World} that is instantiated by wrapping an MPI communicator. Multiple worlds may exist, overlap, and can be dynamically created and destroyed. Each world has a unique identity and the creation and destruction of a world are a collective operations involving all processes that participate in the world. To ensure full compatibility with existing MPI programs, the concept of place/location/process/rank is defined to be the same as MPI process. The \texttt{World} class is intended to provide one-stop shopping for the parallel programming environment and, in particular, a uniform and consistent state that is always available. A local pointer to a world object may be passed to another process that is a member of the same world. During (de-)serialization the pointer is automatically translated so that in the remote process it correctly points to the local copy of the object. The class has members \liststyleLv \begin{itemize} \item \texttt{mpi} - an instance of \texttt{WorldMPIInterface} that wraps standard MPI functionality to enable instrumentation, though you can obtain the wrapped MPI communicator and call the standard MPI interface directly. \item \texttt{am} - an instance of \texttt{WorldAMInterface} that provides low-level active message services. It is not intended for direct application use but rather as the framework upon which new parallel programming tools can be implemented. \item \texttt{taskq} - an instance of \texttt{WorldTaskQueue} that provides a light-weight task queue within each process that is accessible from all other processes in the world. \item \texttt{gop} -- an instance of WorldGopInterface that provides additional global operations including collective operations that are compatible with simultaneous processing of active messages and tasks. \end{itemize} Within a world, distributed objects and containers (currently associative arrays or hash tables, with dense arrays planned) may be constructed. \subsection{Distributed or world objects} A distributed object has a distinct instance in each process of a world, but all share a common unique identifier. Thus, just like for a pointer to the World instance, a local pointer or reference to a distributed object is automatically translated during (de)serialization when sent to another process. Messages (method invocation) and tasks may be sent directly to such objects from any other process in the world. 
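Before examining the figures in detail, it may help to see the bare-bones lifecycle that every example in this document shares; the sketch below uses only calls that already appear in Figures \ref{seq:refFigure0} and \ref{seq:refFigure1}.

\begin{verbatim}
// Minimal program skeleton: start the runtime, wrap an MPI communicator in a
// World, do (parallel) work, synchronize, and shut down.
#define WORLD_INSTANTIATE_STATIC_TEMPLATES
#include <world/world.h>
using namespace madness;

int main(int argc, char** argv) {
    initialize(argc, argv);                  // start the MADNESS runtime
    madness::World world(MPI::COMM_WORLD);   // the parallel environment

    // ... create WorldObjects/containers and submit work here ...

    world.gop.fence();                       // wait for outstanding work/messages
    if (world.rank() == 0) print("OK!");
    finalize();
    return 0;
}
\end{verbatim}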
Figure \ref{seq:refFigure1} contains the complete source for a program that creates two instances (\texttt{a} and \texttt{b} on line 20) of type \texttt{Foo} that serve distinct values from each MPI process. \ These are global objects but no synchronization is required before they may be used since messages to objects that are not yet locally constructed are automatically buffered. \ Lines 23 and 24 illustrate how messages (method invocation) can be sent to corresponding instances in any other process and how results are returned via a future that will hold the result when it finally becomes available. The sample program immediately attempts to read the values (lines 26 and 27) and will wait until the value becomes available. Incoming active messages and any queued tasks are processed while waiting. The fence at line 29 ensures all data motion is globally complete before terminating the program. The comment on line 25 indicates the opportunity to do other work before forcing a future; indeed, this is their \textit{raison d'}\textit{\^e}\textit{tre}. \ They facilitate asynchronous communication/computation and the hiding of both hardware and algorithmic latency. Futures may be probed without blocking to determine status and below we will see how to register callbacks in a future to be invoked when it is assigned. Examining the implementation of \texttt{Foo} in figure \ref{seq:refFigure1}, it inherits from \texttt{WorldObject{\textless}Foo{\textgreater}} using the curiously recurring template pattern. This is because the base class needs the name of the derived class to forward remote method invocations. The \texttt{Foo} constructor initializes the base class and after finishing its own initialization (minimal here) invokes \texttt{process\_pending()} to consume any messages that arrived before it was constructed. It is not possible to invoke this from the base class constructor since the derived class would not yet be fully constructed. Note that the \texttt{get()} method does not need to be modified from the natural sequential version -- the send() template inherited from \texttt{WorldObject{\textless}Foo{\textgreater}} takes care of wrapping the return value in a future. Appropriate reference counting is used behind the scenes to ensure that locally allocated memory persists until the remote operation completes (i.e., the result of a remote operation may be safely discarded). Finally, the first line of the program requests that code for static members of \texttt{WorldObject{\textless}Foo{\textgreater}} be instantiated in the corresponding object file. This is a consequence of staying within standard C++ and not invoking any preprocessor to automate this process. In more sophisticated use, the ownership of a dynamically allocated \texttt{WorldObject} can be passed to the world (using a Boost-like shared pointer) for deferred destruction. At each global synchronization the world examines the reference count of registered objects to determine if any can be freed. Thus, actual destruction of such an object is deferred until the next global synchronization. In turn, this enables multiple such objects to be safely created and destroyed without any otherwise unnecessary global synchronization. No such sophistication is necessary in this example since we are happy with the introduction of the single global fence. Figure \ref{seq:refFigure0} provides the complete implementation and example use of a crude, block-distributed array using the \texttt{WorldObject} functionality. 
Looking first at the main program, two distinct arrays are instantiated on line 40. \ Inside a parallel loop, the elements of the arrays are initialized without regard to locality at lines 44 and 45. The fence at line 47 ensures all data motion is complete before attempting to read from the array. This is only necessary with multiple readers or writers since a single process is guaranteed a sequentially consistent view due to the world active-message layer guaranteeing in-order processing of messages. Reading an array element (lines 52 and 53) returns a \texttt{Future{\textless}double{\textgreater}} that will hold the result when it finally becomes available. Turning to the implementation of the \texttt{Array} class in Figure \ref{seq:refFigure0}, reading and writing elements are immediate if the element is local (lines 22 and 29); otherwise \texttt{WorldObject{\textless}Array{\textgreater}.send()} is used to forward the request to the \texttt{Array} object in the owning process, passing arguments as necessary. Attentive readers will have noticed that the \texttt{write()} method returns \texttt{Void} rather than \texttt{void}. This is merely to simplify the current implementation, which would otherwise require specialization of most templates to handle \texttt{void} results. Once the interface has stabilized this design choice will be reconsidered. Futures of type \texttt{void} and \texttt{Void} are minimal stubs and cause no communication. There are two main restrictions on methods that are invoked remotely. First, the arguments must be values or constant references and must be serializable (see below). Pointers to \texttt{World}, or pointers or references to \texttt{WorldObjects}, are automatically translated to refer to the appropriate remote object, but any other pointers are the responsibility of the application (though their translation via serialization may also be automated). Second, the method itself must not block, e.g., by forcing another future. This restriction can be greatly relaxed, but is presently enforced to avoid potential stack overflow and other problems due to deeply-nested invocation of handlers. {}- - - - - - - - - - - STOPPED HERE ... lots more to do ..... \subsection{Distributed or world containers} Distributed containers are distributed objects specialized to provide the services expected of a container and to pass messages directly to objects in the container. The latter enables non-process-centric parallel computation in the sense that all messaging is between objects addressed with user-defined names and with transparent association of names to processes. BLAH BLAH ... The only currently provided containers are associative arrays or maps. \ The underlying sequential container on each process is either the GNU \texttt{hash\_map} or the TR1 \texttt{unordered\_map}. \ A map generalizes the concept of an array (which maps an integer index in a dense range to a value) by mapping an arbitrary key to a value. This is a very natural, general and efficient mechanism for storing sparse data structures. \ The distribution between processes of items in the container is based upon a function which maps the key to a process. \ The default mapping is a pseudo-random uniform mapping based upon a strong hash function, but the user can provide their own (possibly data-dependent) operator to control the distribution. \ Although it is presently the case that all processes agree on the mapping of a key to a process, this does not have to be the case. 
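As an illustration of such a user-supplied mapping (a sketch of the concept only, not the actual interface class in MADNESS), one might keep whole subtrees of an octtree on the same process by deciding ownership at some coarse level:

\begin{verbatim}
// Hypothetical key->process map: a key is owned by whichever process owns its
// ancestor at 'coarse_level', so deep subtrees stay together.  The default
// mapping instead behaves like hash(key) % nproc.
#include <cstddef>

struct Key1D {
    int n; long l;                                 // level and translation
    std::size_t hash() const { return (std::size_t(n) << 56) ^ std::size_t(l); }
};

struct SubtreeProcMap {
    int nproc;
    int coarse_level;                              // level at which ownership is decided
    int owner(const Key1D& k) const {
        Key1D a = k;                               // walk up to the deciding ancestor
        while (a.n > coarse_level) { a.l >>= 1; --a.n; }
        return int(a.hash() % std::size_t(nproc));
    }
};
\end{verbatim}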
The implementation is designed to support forwarding of remote requests though this code is not yet enabled or tested. The point is that it may be effective to perform local redistributions of data in order to address load or memory problems rather than globally changing the map which must be deferred until a synchronization point. The keys and values associated with containers must be serializble by the MADNESS archive mechanism. Please refer to world/archive/archive.h and documentation therein for information about this. \ In addition, the keys must support \ {}- testing for equality, either by overloading {\textbackslash}c == or by \ specializing {\textbackslash}c std::equal\_to{\textless}key\_type{\textgreater}, and \ {}- computing a hash value by invoking {\textbackslash}c madness::hash(key), which can be done either by providing a member function with signature {\textbackslash}code \ \ \ hashT hash() const; {\textbackslash}endcode \ \ \ or by specializing {\textbackslash}c madness::Hash{\textless}key\_type{\textgreater}. hashT is presently an unsigned 32-bit integer. \ MADNESS provides hash operations for all fundamental types, and variable and fixed dimension arrays of the same. \ Since having a good hash is important, we are using Bob Jenkin's {\textquotedbl}lookup v3{\textquotedbl} hash\footnote{\ http://www.burtleburtle.net/bob/c/lookup3.c}. Here is an example of a key that might be used in an octtree. {\ttfamily \ \ \ struct Key \{} {\ttfamily \ \ \ \ \ \ \ typedef unsigned long ulong;} {\ttfamily \ \ \ \ \ \ \ ulong n, i, j, k;} {\ttfamily \ \ \ \ \ \ \ hashT hashval;} \bigskip {\ttfamily \ \ \ \ \ \ \ Key() \{\};} \bigskip {\ttfamily \ \ \ \ \ \ \ // Precompute the hash function for speed} {\ttfamily \ \ \ \ \ \ \ Key(ulong n, ulong i, ulong j, ulong k)} {\ttfamily \ \ \ \ \ \ \ \ \ \ \ : n(n), i(i), j(j), k(k), hashval(madness::hash(\&this-{\textgreater}n,4,0)) \{\};} \bigskip {\ttfamily \ \ \ \ \ \ \ hashT hash() const \{} {\ttfamily \ \ \ \ \ \ \ \ \ \ \ return hashval;} {\ttfamily \ \ \ \ \ \ \ \};} \bigskip {\ttfamily \ \ \ \ \ \ \ template {\textless}typename Archive{\textgreater}} {\ttfamily \ \ \ \ \ \ \ void serialize(const Archive\& ar) \{} {\ttfamily \ \ \ \ \ \ \ \ \ \ \ ar \& n \& i \& j \& k \& hashval;} {\ttfamily \ \ \ \ \ \ \ \}} \bigskip {\ttfamily \ \ \ \ \ \ \ bool operator==(const Key\& b) const \{} {\ttfamily \ \ \ \ \ \ \ \ \ \ \ // Different keys will probably have a different hash} {\ttfamily \ \ \ \ \ \ \ \ \ \ \ return hashval==b.hashval \&\& n==b.n \&\& i==b.i \&\& j==b.j \&\& k==b.k;} {\ttfamily \ \ \ \ \ \ \ \};} {\ttfamily \ \ \ \};} \bigskip To be added \liststyleLvi \begin{itemize} \item discussion of chaining hashes using initval optional argument \item discussion of overriding the distribution across processes \end{itemize} \bigskip \subsubsection{Tasks, task queues, futures, and dependencies} This is the heart of the matter ... \subsubsection{Serialization} Serialization ...BLAH BLAH ... \section{Recommended programming paradigms} BLAH BLAH .. The recommended approaches to develop scalable and latency tolerant parallel algorithms are either object- or task-centric decompositions rather than the process-centric approach usually forced upon MPI applications. \ The object-centric approach uses distributed containers (or distributed objects) to store application data. \ Computation is expressed by sending tasks or messages to objects, using the task queue to automatically manage dependencies expressed via futures. 
Placement of data and scheduling or placement of computation can be delegated to the container and task queue, unless there are specific performance concerns, in which case the application can have full knowledge and control of these. Items in a container may be accessed largely as if in a standard STL container, but instead of returning an iterator, accessors return a \texttt{Future{\textless}iterator{\textgreater}}. A future is a container for the result of a possibly unevaluated expression. In the case of an accessor, if the requested item is local then the result is immediately available. However, if the item is remote, it may take some time before the data is made available locally. You could immediately try to use the future, which would work, but with the downside of internally waiting for all of the communication to occur. Much better is to keep on working and only use the future when it is ready.

Aside:

\liststyleLvii
\begin{itemize}
\item To avoid a potentially unbounded nested invocation of tasks, which could overflow the stack and also be a source of live/deadlocks, new tasks are not presently started while blocking for communication. This will be relaxed in the near future, which will reduce the negative impact of blocking for an unready future as long as there is work to perform in the task queue.
\item Once fibers or user-space threads are integrated, multiple tasks will always be scheduled and blocking will merely schedule the next fiber.
\end{itemize}
\bigskip

By far the best way to compute with futures is to pass them as arguments to a new task. Once the futures are ready, the task will be automatically scheduled for execution. Tasks that produce a result also return it as a future, so this same mechanism may be used to express dependencies between tasks. Thus, a very natural expression of a parallel algorithm is as a sequence of dependent tasks. For example, in MADNESS many of the algorithms working on distributed, multidimensional trees start with just a single task working on the root of the tree, with all other processes waiting for something to do. That one task starts recursively (depth or breadth first) traversing the tree and generating new tasks for each node. These in turn generate more tasks on their sub-trees.

The execution model is sequentially consistent. That is, from the perspective of a single thread of execution, operations on the same local/remote object behave as if executed sequentially in the same order as programmed. This means that performing a read after a write/modify returns the modified value, as expected. Such behavior applies only to the view of a single thread -- the execution of multiple threads and active messages from different threads may be interleaved arbitrarily.

\subsection{Abstraction overhead}

Creating, executing, and reaping a local, null task with no arguments or results presently takes about 350ns (CentOS 4, 3 GHz Core2, Pathscale 3.0 compiler, -Ofast). The time is dominated by \texttt{new} and \texttt{delete} of the task structure, and as such is unlikely to get any faster except by caching and reusing the task structures. Creating and then executing a chain of dependent tasks with the result of one task fed as the argument of the next task (i.e., the input argument is an unevaluated future that will be assigned by the preceding task) requires about 1us per task (3 GHz Core2). Creating a remote task adds the overhead of interprocess communication, which is on the scale of 1-3us (Cray XT).
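The chain of dependent tasks just described can be written down directly. The following hypothetical sketch assumes the task-queue interface \texttt{world.taskq.add()}, which accepts an ordinary function together with arguments that may be values or still-unassigned futures and returns a future for the result; the functions \texttt{square()} and \texttt{total()} are invented for illustration and \texttt{world} is an existing \texttt{madness::World}.

{\ttfamily \ \ \ double square(double x) \{ return x*x; \}}

{\ttfamily \ \ \ double total(double a, double b) \{ return a+b; \}}

\bigskip

{\ttfamily \ \ \ // Each add() returns immediately with a future for the eventual result ...}

{\ttfamily \ \ \ madness::Future{\textless}double{\textgreater} a = world.taskq.add(square, 2.0);}

{\ttfamily \ \ \ // ... and an unready future may be passed straight into a dependent task}

{\ttfamily \ \ \ madness::Future{\textless}double{\textgreater} b = world.taskq.add(square, a);}

{\ttfamily \ \ \ madness::Future{\textless}double{\textgreater} c = world.taskq.add(total, a, b);}

\bigskip

The tasks producing \texttt{b} and \texttt{c} are not scheduled until their future arguments have been assigned by the earlier tasks, so the sequence of calls expresses the dependency graph and nothing more.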
Note that the 1-3us quoted above is not the actual wall-time latency, since everything is presently performed using asynchronous messaging and polling via MPI. The wall-time latency, which is largely irrelevant to the application if it has expressed enough parallelism, is mostly determined by the polling interval, which is dynamically adjusted depending upon the amount of local work available in order to reduce the overhead from polling. We can improve the runtime software through better aggregation of messages and use of deeper message queues to reduce the overhead of remote task creation to essentially that of a local task.

Thus, circa 1us defines the granularity above which it is worth considering encapsulating work (cf. Hockney's $n_{1/2}$). However, this is just considering the balance between overhead incurred versus useful work performed. The automatic scheduling of tasks dependent upon future arguments confers additional benefits, including

\liststyleLviii
\begin{itemize}
\item hiding the wall-time latency of remote data access,
\item removing from the programmer the burden of correct scheduling of dependent tasks,
\item expressing all parallelism at all scales of the algorithm for facile scaling to heavily multi-core architectures and massively parallel computers, and
\item virtualizing the system resources for maximum future portability and scalability.
\end{itemize}

Available memory limits the number of tasks that can be generated before any are consumed. In addition to application-specific data, each task consumes circa 64 bytes on a 64-bit computer. Thus, a few hundred thousand outstanding tasks per processor are eminently feasible even on the IBM BG/L. Rather than making the application entirely responsible for throttling its own task production (which it can), if the system exceeds a user-settable number of outstanding tasks, it starts to run ready tasks before accepting new tasks. The success of this strategy presupposes that there are ready tasks and that these tasks on average produce less than one new task with unsatisfied dependencies per task run. Ultimately, similar to the Cilk sequentially-consistent execution model, safe algorithms (in the same sense as safe MPI programs) must express tasks so that dependencies can be satisfied without unreasonable expectation of buffering.

In a multiscale approach to parallelism, coarse-grain tasks are first enqueued, and these generate finer-grain tasks, which in turn generate finer and finer grain work. [Expand this discussion and include examples along with work stealing discussion]

Discussion points to add

\liststyleLix
\begin{itemize}
\item Why arguments to tasks and AM via DC or taskQ are passed by value or by const-ref (for remote operations this should be clear; for local operations it is to enable tasks to be stealable). Is there a way to circumvent it? Pointers.
\item Virtualization of other resources
\item Task stealing
\item Controlling distribution in containers
\item Caching in containers
\item Computing with continuations (user space fibers)
\item Priority of hints for tasks
\end{itemize}

\section[C++ Gotchas]{C++ Gotchas}

\subsection{Futures and STL vectors}

A common misconception is that STL containers initialize their contents by invoking the default constructor of each item in the container, since we are told that the items must be default constructible. But this is \textit{incorrect}.
The items are initialized by invoking the copy constructor for each element on a \textit{single} object made with the default constructor. For futures this is a very bad problem. For instance,

\texttt{vector{\textless} Future{\textless}double{\textgreater} {\textgreater} v(3);}

is equivalent to the following with an array of three elements

\texttt{Future{\textless}double{\textgreater} junk;}

{\ttfamily \ Future{\textless}double{\textgreater} v[3] = \{junk,junk,junk\};}

Since the Future copy constructor is by necessity shallow, each element of \texttt{v} ends up referring to the future implementation that underlies \texttt{junk}. When you assign to an element of \texttt{v}, you'll also be assigning to \texttt{junk}. But since futures are single-assignment variables, you can only do that once. Hence, when you assign a second element of \texttt{v} you'll get a runtime exception.

The fix (other than using arrays) is to initialize STL vectors and other containers from the special element returned by \texttt{Future{\textless}T{\textgreater}::default\_initializer()}, which, if passed into the copy constructor, will cause it to behave just like the default constructor. Thus, the following code is what you actually need in order to use an STL vector of futures

\texttt{vector{\textless} Future{\textless}double{\textgreater} {\textgreater} v(3,Future{\textless}double{\textgreater}::default\_initializer());}

which, put politely, is ugly. Thus, we provide the factory function

\texttt{\ template {\textless}typename T{\textgreater}\newline \ \ vector{\textless} Future{\textless}T{\textgreater} {\textgreater} future\_vector\_factory(std::size\_t n);}

that enables one to write

\texttt{vector{\textless} Future{\textless}double{\textgreater} {\textgreater} v = future\_vector\_factory{\textless}double{\textgreater}(3);}

\section{Multi-threading, the task queue, active messages and SMP parallelism}

This design is in flux but the overall objectives are to provide a model and set of abstractions that are compact and well defined, so that they may be readily understood and reasoned about, yet rich enough to achieve compact and high-performance expression of most algorithms.

Key design points:

\liststyleLx
\begin{itemize}
\item MADNESS can be configured to build either with or without threads. If configured with threads, the number of active threads can be adjusted at runtime.
\item The only guaranteed source of SMP/multi-threaded concurrency is from the task queue.
\item Active messages from a given source are executed sequentially in the order sent -- this is so that they may be used to maintain data structures with no additional logic necessary for the application to enforce sequential consistency.
\item Active messages from distinct sources may be executed in any order or concurrently. Presently, one thread is devoted to handling all active messages so there is no concurrency at all within the active message queue, but this will change once the active message queue and task queues are unified.
\item The main thread of execution may execute concurrently with the active message and task pool threads. Presently, the main thread also acts as the active message thread and in the absence of other work will execute tasks from the queue.
\item All World interfaces are thread safe (are there any exceptions?). In particular, WorldContainers and the provided functionality of WorldObjects are thread safe, as is the reference counting in SharedPtr.
\item Parallel applications written using only MADNESS World constructs (tasks, containers, futures, and active messages) will not need additional mechanisms for SMP concurrency or synchronization. However, some applications will have additional shared data structures and provided are classes for facile use of threads, mutexes and locking pointers. At the lowest level are a portable (limited) set of atomic operations on integer data that are presently used to compose the SharedPtr and Mutex classes. \end{itemize} How to think about all this? Where possible compose your application in terms of tasks with dependencies via futures. This provides the maximum concurrency and as additional task attributes are introduced will enable the most efficient scheduling of work. The overhead of making, executing and reaping a task is about 1us, so a task's runtime should be bigger than circa 10us unless there is additional latency (algorithmic or communication) that will be hidden by making a task. If tasks are too long there may be load-balancing problems. Unless it is unduly cumbersome to split the problem into small tasks there is no reason to make tasks larger than circa 1ms (exception to this is pushing/stealing tasks that may have communication overhead). When to use active messages rather than a task? The main benefits of AM are sequential consistency (from the perspective of the sending process) and high priority for their execution. To keep the real AM latency to a minimum (which means you need less concurrency to hide the latency) AM handlers should be lightweight. Presently, MADNESS does no aggregation of messages but this is something that is clearly needed and would, for instance, reduce the overhead of creating many remote tasks to about that of local task creation. \section{Acknowledgments} \bigskip DOE NSF DARPA ORNL NCCS UT \section{References} \bigskip \end{document} madness-0.10/doc/Makefile.am000066400000000000000000000006411253377057000156770ustar00rootroot00000000000000include $(top_srcdir)/config/MakeGlobal.am if CAN_BUILD_DOC .PHONY: all all: $(DOXYGEN) doxygen.cfg .PHONY: clean clean:: /bin/rm -rf html latex /bin/rm -rf *~ veryclean: clean distclean: clean /bin/rm doxygen.cfg install:: -mkdir -p $(installroot)$(prefix)/doc -cp -r html $(installroot)$(prefix)/doc/ install_devel:: else all: doxygen is not available ... cannot build documentation ... sorry endif madness-0.10/doc/applications.dox000066400000000000000000000023031253377057000170420ustar00rootroot00000000000000/* This file is part of MADNESS. Copyright (C) 2015 Stony Brook University This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA For more information please contact: Robert J. Harrison Oak Ridge National Laboratory One Bethel Valley Road P.O. Box 2008, MS-6367 email: harrisonrj@ornl.gov tel: 865-241-3937 fax: 865-572-0680 */ /** \file applications.dox \brief General information about MADNESS applications. 
\addtogroup applications \todo Write this section. Perhaps include the sample code (include MADNESS, etc.) and how to compile it. */ madness-0.10/doc/configuration.dox000066400000000000000000000024121253377057000172240ustar00rootroot00000000000000/* This file is part of MADNESS. Copyright (C) 2015 Stony Brook University This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA For more information please contact: Robert J. Harrison Oak Ridge National Laboratory One Bethel Valley Road P.O. Box 2008, MS-6367 email: harrisonrj@ornl.gov tel: 865-241-3937 fax: 865-572-0680 */ /** \file configuration.dox \brief Notes on installing and configuring MADNESS. \addtogroup configuration \todo Write this section. In the mean time, instructions on installing and configuring MADNESS can be found in the wiki at the MADNESS repository. */ madness-0.10/doc/contribution.dox000066400000000000000000000021561253377057000171010ustar00rootroot00000000000000/* This file is part of MADNESS. Copyright (C) 2015 Stony Brook University This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA For more information please contact: Robert J. Harrison Oak Ridge National Laboratory One Bethel Valley Road P.O. Box 2008, MS-6367 email: harrisonrj@ornl.gov tel: 865-241-3937 fax: 865-572-0680 */ /** \file contribution.dox \brief Notes on how to contribute to MADNESS. \addtogroup contribution \todo Write this section. */ madness-0.10/doc/contribution/000077500000000000000000000000001253377057000163615ustar00rootroot00000000000000madness-0.10/doc/contribution/style.dox000066400000000000000000000106171253377057000202420ustar00rootroot00000000000000/* This file is part of MADNESS. Copyright (C) 2015 Stony Brook University This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. 
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA For more information please contact: Robert J. Harrison Oak Ridge National Laboratory One Bethel Valley Road P.O. Box 2008, MS-6367 email: harrisonrj@ornl.gov tel: 865-241-3937 fax: 865-572-0680 */ /** \file style.dox \brief Coding style guidelines for MADNESS. \addtogroup style MADNESS code should be documented using doxygen-style comment blocks. More information on doxygen can be found at the doxygen website. This module illustrates how to document your files with doxygen. \par General guidelines for doxygen Every file needs \c \\file and \c \\brief near the top, and usually \c \\addtogroup or \c \ingroup. If you have put the file in a group then the file-level documentation acts as documentation for that module or group, otherwise it acts as documentation for the file. Note that doxygen is really picky about placement and association of comments so you always have to check what was generated. Links to known classes (such as \c madness::World), functions (such as \c madness::error()), and files (such as madness.h) are made automatically. \par Subsection title Use the \c \\par directive to make subsections with an optional heading. Doxygen's section and subsection directives should not be used for now. \par Example files Here is an example header file that appropriately uses doxygen comment blocks. \code /// \file example_doc.h /// \brief Brief description of this file's contents. /// \ingroup example_group // Add your code to a group, if desired, using the \addtogroup command. See doc/mainpage.h for available groups. // The @{ and @} commands allow blocks of code to be added to the group. /// \addtogroup possibly_different_example_group /// @{ /// Every class should be documented (this is the brief line). /// This is the full text ... no need to be verbose, but without /// documentation your class is nearly useless. class ExampleClass1 { public: int datum; ///< Each member datum needs at least a brief raison d'etre. /// Each member function should be documented (this is the brief line). /// Full documentation of the member function. /// /// You should document all arguments and return values. /// @param[in] hope Optimism level. /// @param[in] despair Pessimism level. /// @return Cynicism level. int reality(int hope, int despair) const; }; /// Brief documentation of a second example class in the group. /// Full documentation of the second example class. class ExampleClass2 { public: /// Brief documentation of a member function. /// Full documentation of the member function, including parameters /// and return value. /// @param[in] fud The current level of fear, uncertainty, and doubt. /// @param[in,out] bs The current level of ... /// @return Your state of mind. int morose(int fud, int& bs) {}; void fred(double mary); // Fred is documented in example_doc.cc // We recommend, however, placing documentation in header files when possible. }; int stuff; ///< Global variable in the example group. /// Global function in the example group. /// @param[in] a An input argument. /// @return The return value. int example_func(int a); // Closing group membership. /// @} \endcode And here's the corresponding implementation file. \code /// \file example_doc.cc /// \brief Illustrates how to split documentation between files. 
/// \ingroup example_group #include "example_doc.h" /// Brief documentation of a class member declared in the header file. /// Full documentation of the class member. /// @param[in] mary Mary's current bank balance. void ExampleClass2::fred(double mary) { return; } \endcode */ madness-0.10/doc/doxygen.cfg.in000077500000000000000000003126141253377057000164170ustar00rootroot00000000000000# Doxyfile 1.8.9.1 # This file describes the settings to be used by the documentation system # doxygen (www.doxygen.org) for a project. # # All text after a double hash (##) is considered a comment and is placed in # front of the TAG it is preceding. # # All text after a single hash (#) is considered a comment and will be ignored. # The format is: # TAG = value [value, ...] # For lists, items can also be appended using: # TAG += value [value, ...] # Values that contain spaces should be placed between quotes (\" \"). #--------------------------------------------------------------------------- # Project related configuration options #--------------------------------------------------------------------------- # This tag specifies the encoding used for all characters in the config file # that follow. The default is UTF-8 which is also the encoding used for all text # before the first occurrence of this tag. Doxygen uses libiconv (or the iconv # built into libc) for the transcoding. See http://www.gnu.org/software/libiconv # for the list of possible encodings. # The default value is: UTF-8. DOXYFILE_ENCODING = UTF-8 # The PROJECT_NAME tag is a single word (or a sequence of words surrounded by # double-quotes, unless you are using Doxywizard) that should identify the # project for which the documentation is generated. This name is used in the # title of most generated pages and in a few other places. # The default value is: My Project. PROJECT_NAME = MADNESS # The PROJECT_NUMBER tag can be used to enter a project or revision number. This # could be handy for archiving the generated documentation or if some version # control system is used. PROJECT_NUMBER = "version @PACKAGE_VERSION@" # Using the PROJECT_BRIEF tag one can provide an optional one line description # for a project that appears at the top of each page and should give viewer a # quick idea about the purpose of the project. Keep the description short. PROJECT_BRIEF = # With the PROJECT_LOGO tag one can specify a logo or an icon that is included # in the documentation. The maximum height of the logo should not exceed 55 # pixels and the maximum width should not exceed 200 pixels. Doxygen will copy # the logo to the output directory. PROJECT_LOGO = # The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute) path # into which the generated documentation will be written. If a relative path is # entered, it will be relative to the location where doxygen was started. If # left blank the current directory will be used. OUTPUT_DIRECTORY = ./ # If the CREATE_SUBDIRS tag is set to YES then doxygen will create 4096 sub- # directories (in 2 levels) under the output directory of each output format and # will distribute the generated files over these directories. Enabling this # option can be useful when feeding doxygen a huge amount of source files, where # putting all generated files in the same directory would otherwise causes # performance problems for the file system. # The default value is: NO. CREATE_SUBDIRS = NO # If the ALLOW_UNICODE_NAMES tag is set to YES, doxygen will allow non-ASCII # characters to appear in the names of generated files. 
If set to NO, non-ASCII # characters will be escaped, for example _xE3_x81_x84 will be used for Unicode # U+3044. # The default value is: NO. ALLOW_UNICODE_NAMES = NO # The OUTPUT_LANGUAGE tag is used to specify the language in which all # documentation generated by doxygen is written. Doxygen will use this # information to generate all constant output in the proper language. # Possible values are: Afrikaans, Arabic, Armenian, Brazilian, Catalan, Chinese, # Chinese-Traditional, Croatian, Czech, Danish, Dutch, English (United States), # Esperanto, Farsi (Persian), Finnish, French, German, Greek, Hungarian, # Indonesian, Italian, Japanese, Japanese-en (Japanese with English messages), # Korean, Korean-en (Korean with English messages), Latvian, Lithuanian, # Macedonian, Norwegian, Persian (Farsi), Polish, Portuguese, Romanian, Russian, # Serbian, Serbian-Cyrillic, Slovak, Slovene, Spanish, Swedish, Turkish, # Ukrainian and Vietnamese. # The default value is: English. OUTPUT_LANGUAGE = English # If the BRIEF_MEMBER_DESC tag is set to YES, doxygen will include brief member # descriptions after the members that are listed in the file and class # documentation (similar to Javadoc). Set to NO to disable this. # The default value is: YES. BRIEF_MEMBER_DESC = YES # If the REPEAT_BRIEF tag is set to YES, doxygen will prepend the brief # description of a member or function before the detailed description # # Note: If both HIDE_UNDOC_MEMBERS and BRIEF_MEMBER_DESC are set to NO, the # brief descriptions will be completely suppressed. # The default value is: YES. REPEAT_BRIEF = YES # This tag implements a quasi-intelligent brief description abbreviator that is # used to form the text in various listings. Each string in this list, if found # as the leading text of the brief description, will be stripped from the text # and the result, after processing the whole list, is used as the annotated # text. Otherwise, the brief description is used as-is. If left blank, the # following values are used ($name is automatically replaced with the name of # the entity):The $name class, The $name widget, The $name file, is, provides, # specifies, contains, represents, a, an and the. ABBREVIATE_BRIEF = # If the ALWAYS_DETAILED_SEC and REPEAT_BRIEF tags are both set to YES then # doxygen will generate a detailed section even if there is only a brief # description. # The default value is: NO. ALWAYS_DETAILED_SEC = NO # If the INLINE_INHERITED_MEMB tag is set to YES, doxygen will show all # inherited members of a class in the documentation of that class as if those # members were ordinary class members. Constructors, destructors and assignment # operators of the base classes will not be shown. # The default value is: NO. INLINE_INHERITED_MEMB = NO # If the FULL_PATH_NAMES tag is set to YES, doxygen will prepend the full path # before files name in the file list and in the header files. If set to NO the # shortest path that makes the file name unique will be used # The default value is: YES. FULL_PATH_NAMES = NO # The STRIP_FROM_PATH tag can be used to strip a user-defined part of the path. # Stripping is only done if one of the specified strings matches the left-hand # part of the path. The tag can be used to show relative paths in the file list. # If left blank the directory from which doxygen is run is used as the path to # strip. # # Note that you can specify absolute paths here, but also relative paths, which # will be relative from the directory where doxygen is started. 
# This tag requires that the tag FULL_PATH_NAMES is set to YES. STRIP_FROM_PATH = "@abs_top_srcdir@ @top_srcdir@" # The STRIP_FROM_INC_PATH tag can be used to strip a user-defined part of the # path mentioned in the documentation of a class, which tells the reader which # header file to include in order to use a class. If left blank only the name of # the header file containing the class definition is used. Otherwise one should # specify the list of include paths that are normally passed to the compiler # using the -I flag. STRIP_FROM_INC_PATH = # If the SHORT_NAMES tag is set to YES, doxygen will generate much shorter (but # less readable) file names. This can be useful is your file systems doesn't # support long names like on DOS, Mac, or CD-ROM. # The default value is: NO. SHORT_NAMES = NO # If the JAVADOC_AUTOBRIEF tag is set to YES then doxygen will interpret the # first line (until the first dot) of a Javadoc-style comment as the brief # description. If set to NO, the Javadoc-style will behave just like regular Qt- # style comments (thus requiring an explicit @brief command for a brief # description.) # The default value is: NO. JAVADOC_AUTOBRIEF = NO # If the QT_AUTOBRIEF tag is set to YES then doxygen will interpret the first # line (until the first dot) of a Qt-style comment as the brief description. If # set to NO, the Qt-style will behave just like regular Qt-style comments (thus # requiring an explicit \brief command for a brief description.) # The default value is: NO. QT_AUTOBRIEF = NO # The MULTILINE_CPP_IS_BRIEF tag can be set to YES to make doxygen treat a # multi-line C++ special comment block (i.e. a block of //! or /// comments) as # a brief description. This used to be the default behavior. The new default is # to treat a multi-line C++ comment block as a detailed description. Set this # tag to YES if you prefer the old behavior instead. # # Note that setting this tag to YES also means that rational rose comments are # not recognized any more. # The default value is: NO. MULTILINE_CPP_IS_BRIEF = NO # If the INHERIT_DOCS tag is set to YES then an undocumented member inherits the # documentation from any documented member that it re-implements. # The default value is: YES. INHERIT_DOCS = YES # If the SEPARATE_MEMBER_PAGES tag is set to YES then doxygen will produce a new # page for each member. If set to NO, the documentation of a member will be part # of the file/class/namespace that contains it. # The default value is: NO. SEPARATE_MEMBER_PAGES = NO # The TAB_SIZE tag can be used to set the number of spaces in a tab. Doxygen # uses this value to replace tabs by spaces in code fragments. # Minimum value: 1, maximum value: 16, default value: 4. TAB_SIZE = 4 # This tag can be used to specify a number of aliases that act as commands in # the documentation. An alias has the form: # name=value # For example adding # "sideeffect=@par Side Effects:\n" # will allow you to put the command \sideeffect (or @sideeffect) in the # documentation, which will result in a user-defined paragraph with heading # "Side Effects:". You can put \n's in the value part of an alias to insert # newlines. ALIASES = "website=@PACKAGE_URL@" # This tag can be used to specify a number of word-keyword mappings (TCL only). # A mapping has the form "name=value". For example adding "class=itcl::class" # will allow you to use the command class in the itcl::class meaning. TCL_SUBST = # Set the OPTIMIZE_OUTPUT_FOR_C tag to YES if your project consists of C sources # only. 
Doxygen will then generate output that is more tailored for C. For # instance, some of the names that are used will be different. The list of all # members will be omitted, etc. # The default value is: NO. OPTIMIZE_OUTPUT_FOR_C = NO # Set the OPTIMIZE_OUTPUT_JAVA tag to YES if your project consists of Java or # Python sources only. Doxygen will then generate output that is more tailored # for that language. For instance, namespaces will be presented as packages, # qualified scopes will look different, etc. # The default value is: NO. OPTIMIZE_OUTPUT_JAVA = NO # Set the OPTIMIZE_FOR_FORTRAN tag to YES if your project consists of Fortran # sources. Doxygen will then generate output that is tailored for Fortran. # The default value is: NO. OPTIMIZE_FOR_FORTRAN = NO # Set the OPTIMIZE_OUTPUT_VHDL tag to YES if your project consists of VHDL # sources. Doxygen will then generate output that is tailored for VHDL. # The default value is: NO. OPTIMIZE_OUTPUT_VHDL = NO # Doxygen selects the parser to use depending on the extension of the files it # parses. With this tag you can assign which parser to use for a given # extension. Doxygen has a built-in mapping, but you can override or extend it # using this tag. The format is ext=language, where ext is a file extension, and # language is one of the parsers supported by doxygen: IDL, Java, Javascript, # C#, C, C++, D, PHP, Objective-C, Python, Fortran (fixed format Fortran: # FortranFixed, free formatted Fortran: FortranFree, unknown formatted Fortran: # Fortran. In the later case the parser tries to guess whether the code is fixed # or free formatted code, this is the default for Fortran type files), VHDL. For # instance to make doxygen treat .inc files as Fortran files (default is PHP), # and .f files as C (default is Fortran), use: inc=Fortran f=C. # # Note: For files without extension you can use no_extension as a placeholder. # # Note that for custom extensions you also need to set FILE_PATTERNS otherwise # the files are not read by doxygen. EXTENSION_MAPPING = # If the MARKDOWN_SUPPORT tag is enabled then doxygen pre-processes all comments # according to the Markdown format, which allows for more readable # documentation. See http://daringfireball.net/projects/markdown/ for details. # The output of markdown processing is further processed by doxygen, so you can # mix doxygen, HTML, and XML commands with Markdown formatting. Disable only in # case of backward compatibilities issues. # The default value is: YES. MARKDOWN_SUPPORT = YES # When enabled doxygen tries to link words that correspond to documented # classes, or namespaces to their corresponding documentation. Such a link can # be prevented in individual cases by putting a % sign in front of the word or # globally by setting AUTOLINK_SUPPORT to NO. # The default value is: YES. AUTOLINK_SUPPORT = YES # If you use STL classes (i.e. std::string, std::vector, etc.) but do not want # to include (a tag file for) the STL sources as input, then you should set this # tag to YES in order to let doxygen match functions declarations and # definitions whose arguments contain STL classes (e.g. func(std::string); # versus func(std::string) {}). This also make the inheritance and collaboration # diagrams that involve STL classes more complete and accurate. # The default value is: NO. BUILTIN_STL_SUPPORT = NO # If you use Microsoft's C++/CLI language, you should set this option to YES to # enable parsing support. # The default value is: NO. 
CPP_CLI_SUPPORT = NO # Set the SIP_SUPPORT tag to YES if your project consists of sip (see: # http://www.riverbankcomputing.co.uk/software/sip/intro) sources only. Doxygen # will parse them like normal C++ but will assume all classes use public instead # of private inheritance when no explicit protection keyword is present. # The default value is: NO. SIP_SUPPORT = NO # For Microsoft's IDL there are propget and propput attributes to indicate # getter and setter methods for a property. Setting this option to YES will make # doxygen to replace the get and set methods by a property in the documentation. # This will only work if the methods are indeed getting or setting a simple # type. If this is not the case, or you want to show the methods anyway, you # should set this option to NO. # The default value is: YES. IDL_PROPERTY_SUPPORT = YES # If member grouping is used in the documentation and the DISTRIBUTE_GROUP_DOC # tag is set to YES then doxygen will reuse the documentation of the first # member in the group (if any) for the other members of the group. By default # all members of a group must be documented explicitly. # The default value is: NO. DISTRIBUTE_GROUP_DOC = NO # Set the SUBGROUPING tag to YES to allow class member groups of the same type # (for instance a group of public functions) to be put as a subgroup of that # type (e.g. under the Public Functions section). Set it to NO to prevent # subgrouping. Alternatively, this can be done per class using the # \nosubgrouping command. # The default value is: YES. SUBGROUPING = YES # When the INLINE_GROUPED_CLASSES tag is set to YES, classes, structs and unions # are shown inside the group in which they are included (e.g. using \ingroup) # instead of on a separate page (for HTML and Man pages) or section (for LaTeX # and RTF). # # Note that this feature does not work in combination with # SEPARATE_MEMBER_PAGES. # The default value is: NO. INLINE_GROUPED_CLASSES = NO # When the INLINE_SIMPLE_STRUCTS tag is set to YES, structs, classes, and unions # with only public data fields or simple typedef fields will be shown inline in # the documentation of the scope in which they are defined (i.e. file, # namespace, or group documentation), provided this scope is documented. If set # to NO, structs, classes, and unions are shown on a separate page (for HTML and # Man pages) or section (for LaTeX and RTF). # The default value is: NO. INLINE_SIMPLE_STRUCTS = NO # When TYPEDEF_HIDES_STRUCT tag is enabled, a typedef of a struct, union, or # enum is documented as struct, union, or enum with the name of the typedef. So # typedef struct TypeS {} TypeT, will appear in the documentation as a struct # with name TypeT. When disabled the typedef will appear as a member of a file, # namespace, or class. And the struct will be named TypeS. This can typically be # useful for C code in case the coding convention dictates that all compound # types are typedef'ed and only the typedef is referenced, never the tag name. # The default value is: NO. TYPEDEF_HIDES_STRUCT = NO # The size of the symbol lookup cache can be set using LOOKUP_CACHE_SIZE. This # cache is used to resolve symbols given their name and scope. Since this can be # an expensive process and often the same symbol appears multiple times in the # code, doxygen keeps a cache of pre-resolved symbols. If the cache is too small # doxygen will become slower. If the cache is too large, memory is wasted. The # cache size is given by this formula: 2^(16+LOOKUP_CACHE_SIZE). 
The valid range # is 0..9, the default is 0, corresponding to a cache size of 2^16=65536 # symbols. At the end of a run doxygen will report the cache usage and suggest # the optimal cache size from a speed point of view. # Minimum value: 0, maximum value: 9, default value: 0. LOOKUP_CACHE_SIZE = 0 #--------------------------------------------------------------------------- # Build related configuration options #--------------------------------------------------------------------------- # If the EXTRACT_ALL tag is set to YES, doxygen will assume all entities in # documentation are documented, even if no documentation was available. Private # class members and static file members will be hidden unless the # EXTRACT_PRIVATE respectively EXTRACT_STATIC tags are set to YES. # Note: This will also disable the warnings about undocumented members that are # normally produced when WARNINGS is set to YES. # The default value is: NO. EXTRACT_ALL = YES # If the EXTRACT_PRIVATE tag is set to YES, all private members of a class will # be included in the documentation. # The default value is: NO. EXTRACT_PRIVATE = YES # If the EXTRACT_PACKAGE tag is set to YES, all members with package or internal # scope will be included in the documentation. # The default value is: NO. EXTRACT_PACKAGE = NO # If the EXTRACT_STATIC tag is set to YES, all static members of a file will be # included in the documentation. # The default value is: NO. EXTRACT_STATIC = YES # If the EXTRACT_LOCAL_CLASSES tag is set to YES, classes (and structs) defined # locally in source files will be included in the documentation. If set to NO, # only classes defined in header files are included. Does not have any effect # for Java sources. # The default value is: YES. EXTRACT_LOCAL_CLASSES = YES # This flag is only useful for Objective-C code. If set to YES, local methods, # which are defined in the implementation section but not in the interface are # included in the documentation. If set to NO, only methods in the interface are # included. # The default value is: NO. EXTRACT_LOCAL_METHODS = NO # If this flag is set to YES, the members of anonymous namespaces will be # extracted and appear in the documentation as a namespace called # 'anonymous_namespace{file}', where file will be replaced with the base name of # the file that contains the anonymous namespace. By default anonymous namespace # are hidden. # The default value is: NO. EXTRACT_ANON_NSPACES = NO # If the HIDE_UNDOC_MEMBERS tag is set to YES, doxygen will hide all # undocumented members inside documented classes or files. If set to NO these # members will be included in the various overviews, but no documentation # section is generated. This option has no effect if EXTRACT_ALL is enabled. # The default value is: NO. HIDE_UNDOC_MEMBERS = NO # If the HIDE_UNDOC_CLASSES tag is set to YES, doxygen will hide all # undocumented classes that are normally visible in the class hierarchy. If set # to NO, these classes will be included in the various overviews. This option # has no effect if EXTRACT_ALL is enabled. # The default value is: NO. HIDE_UNDOC_CLASSES = NO # If the HIDE_FRIEND_COMPOUNDS tag is set to YES, doxygen will hide all friend # (class|struct|union) declarations. If set to NO, these declarations will be # included in the documentation. # The default value is: NO. HIDE_FRIEND_COMPOUNDS = NO # If the HIDE_IN_BODY_DOCS tag is set to YES, doxygen will hide any # documentation blocks found inside the body of a function. 
If set to NO, these # blocks will be appended to the function's detailed documentation block. # The default value is: NO. HIDE_IN_BODY_DOCS = NO # The INTERNAL_DOCS tag determines if documentation that is typed after a # \internal command is included. If the tag is set to NO then the documentation # will be excluded. Set it to YES to include the internal documentation. # The default value is: NO. INTERNAL_DOCS = NO # If the CASE_SENSE_NAMES tag is set to NO then doxygen will only generate file # names in lower-case letters. If set to YES, upper-case letters are also # allowed. This is useful if you have classes or files whose names only differ # in case and if your file system supports case sensitive file names. Windows # and Mac users are advised to set this option to NO. # The default value is: system dependent. CASE_SENSE_NAMES = YES # If the HIDE_SCOPE_NAMES tag is set to NO then doxygen will show members with # their full class and namespace scopes in the documentation. If set to YES, the # scope will be hidden. # The default value is: NO. HIDE_SCOPE_NAMES = NO # If the HIDE_COMPOUND_REFERENCE tag is set to NO (default) then doxygen will # append additional text to a page's title, such as Class Reference. If set to # YES the compound reference will be hidden. # The default value is: NO. HIDE_COMPOUND_REFERENCE= NO # If the SHOW_INCLUDE_FILES tag is set to YES then doxygen will put a list of # the files that are included by a file in the documentation of that file. # The default value is: YES. SHOW_INCLUDE_FILES = YES # If the SHOW_GROUPED_MEMB_INC tag is set to YES then Doxygen will add for each # grouped member an include statement to the documentation, telling the reader # which file to include in order to use the member. # The default value is: NO. SHOW_GROUPED_MEMB_INC = NO # If the FORCE_LOCAL_INCLUDES tag is set to YES then doxygen will list include # files with double quotes in the documentation rather than with sharp brackets. # The default value is: NO. FORCE_LOCAL_INCLUDES = NO # If the INLINE_INFO tag is set to YES then a tag [inline] is inserted in the # documentation for inline members. # The default value is: YES. INLINE_INFO = YES # If the SORT_MEMBER_DOCS tag is set to YES then doxygen will sort the # (detailed) documentation of file and class members alphabetically by member # name. If set to NO, the members will appear in declaration order. # The default value is: YES. SORT_MEMBER_DOCS = YES # If the SORT_BRIEF_DOCS tag is set to YES then doxygen will sort the brief # descriptions of file, namespace and class members alphabetically by member # name. If set to NO, the members will appear in declaration order. Note that # this will also influence the order of the classes in the class list. # The default value is: NO. SORT_BRIEF_DOCS = YES # If the SORT_MEMBERS_CTORS_1ST tag is set to YES then doxygen will sort the # (brief and detailed) documentation of class members so that constructors and # destructors are listed first. If set to NO the constructors will appear in the # respective orders defined by SORT_BRIEF_DOCS and SORT_MEMBER_DOCS. # Note: If SORT_BRIEF_DOCS is set to NO this option is ignored for sorting brief # member documentation. # Note: If SORT_MEMBER_DOCS is set to NO this option is ignored for sorting # detailed member documentation. # The default value is: NO. SORT_MEMBERS_CTORS_1ST = YES # If the SORT_GROUP_NAMES tag is set to YES then doxygen will sort the hierarchy # of group names into alphabetical order. 
If set to NO the group names will # appear in their defined order. # The default value is: NO. SORT_GROUP_NAMES = NO # If the SORT_BY_SCOPE_NAME tag is set to YES, the class list will be sorted by # fully-qualified names, including namespaces. If set to NO, the class list will # be sorted only by class name, not including the namespace part. # Note: This option is not very useful if HIDE_SCOPE_NAMES is set to YES. # Note: This option applies only to the class list, not to the alphabetical # list. # The default value is: NO. SORT_BY_SCOPE_NAME = NO # If the STRICT_PROTO_MATCHING option is enabled and doxygen fails to do proper # type resolution of all parameters of a function it will reject a match between # the prototype and the implementation of a member function even if there is # only one candidate or it is obvious which candidate to choose by doing a # simple string match. By disabling STRICT_PROTO_MATCHING doxygen will still # accept a match between prototype and implementation in such cases. # The default value is: NO. STRICT_PROTO_MATCHING = NO # The GENERATE_TODOLIST tag can be used to enable (YES) or disable (NO) the todo # list. This list is created by putting \todo commands in the documentation. # The default value is: YES. GENERATE_TODOLIST = YES # The GENERATE_TESTLIST tag can be used to enable (YES) or disable (NO) the test # list. This list is created by putting \test commands in the documentation. # The default value is: YES. GENERATE_TESTLIST = YES # The GENERATE_BUGLIST tag can be used to enable (YES) or disable (NO) the bug # list. This list is created by putting \bug commands in the documentation. # The default value is: YES. GENERATE_BUGLIST = YES # The GENERATE_DEPRECATEDLIST tag can be used to enable (YES) or disable (NO) # the deprecated list. This list is created by putting \deprecated commands in # the documentation. # The default value is: YES. GENERATE_DEPRECATEDLIST= YES # The ENABLED_SECTIONS tag can be used to enable conditional documentation # sections, marked by \if ... \endif and \cond # ... \endcond blocks. ENABLED_SECTIONS = # The MAX_INITIALIZER_LINES tag determines the maximum number of lines that the # initial value of a variable or macro / define can have for it to appear in the # documentation. If the initializer consists of more lines than specified here # it will be hidden. Use a value of 0 to hide initializers completely. The # appearance of the value of individual variables and macros / defines can be # controlled using \showinitializer or \hideinitializer command in the # documentation regardless of this setting. # Minimum value: 0, maximum value: 10000, default value: 30. MAX_INITIALIZER_LINES = 30 # Set the SHOW_USED_FILES tag to NO to disable the list of files generated at # the bottom of the documentation of classes and structs. If set to YES, the # list will mention the files that were used to generate the documentation. # The default value is: YES. SHOW_USED_FILES = YES # Set the SHOW_FILES tag to NO to disable the generation of the Files page. This # will remove the Files entry from the Quick Index and from the Folder Tree View # (if specified). # The default value is: YES. SHOW_FILES = YES # Set the SHOW_NAMESPACES tag to NO to disable the generation of the Namespaces # page. This will remove the Namespaces entry from the Quick Index and from the # Folder Tree View (if specified). # The default value is: YES. 
SHOW_NAMESPACES = YES # The FILE_VERSION_FILTER tag can be used to specify a program or script that # doxygen should invoke to get the current version for each file (typically from # the version control system). Doxygen will invoke the program by executing (via # popen()) the command command input-file, where command is the value of the # FILE_VERSION_FILTER tag, and input-file is the name of an input file provided # by doxygen. Whatever the program writes to standard output is used as the file # version. For an example see the documentation. FILE_VERSION_FILTER = # The LAYOUT_FILE tag can be used to specify a layout file which will be parsed # by doxygen. The layout file controls the global structure of the generated # output files in an output format independent way. To create the layout file # that represents doxygen's defaults, run doxygen with the -l option. You can # optionally specify a file name after the option, if omitted DoxygenLayout.xml # will be used as the name of the layout file. # # Note that if you run doxygen from a directory containing a file called # DoxygenLayout.xml, doxygen will parse it automatically even if the LAYOUT_FILE # tag is left empty. LAYOUT_FILE = # The CITE_BIB_FILES tag can be used to specify one or more bib files containing # the reference definitions. This must be a list of .bib files. The .bib # extension is automatically appended if omitted. This requires the bibtex tool # to be installed. See also http://en.wikipedia.org/wiki/BibTeX for more info. # For LaTeX the style of the bibliography can be controlled using # LATEX_BIB_STYLE. To use this feature you need bibtex and perl available in the # search path. See also \cite for info how to create references. CITE_BIB_FILES = #--------------------------------------------------------------------------- # Configuration options related to warning and progress messages #--------------------------------------------------------------------------- # The QUIET tag can be used to turn on/off the messages that are generated to # standard output by doxygen. If QUIET is set to YES this implies that the # messages are off. # The default value is: NO. QUIET = NO # The WARNINGS tag can be used to turn on/off the warning messages that are # generated to standard error (stderr) by doxygen. If WARNINGS is set to YES # this implies that the warnings are on. # # Tip: Turn warnings on while writing the documentation. # The default value is: YES. WARNINGS = YES # If the WARN_IF_UNDOCUMENTED tag is set to YES then doxygen will generate # warnings for undocumented members. If EXTRACT_ALL is set to YES then this flag # will automatically be disabled. # The default value is: YES. WARN_IF_UNDOCUMENTED = YES # If the WARN_IF_DOC_ERROR tag is set to YES, doxygen will generate warnings for # potential errors in the documentation, such as not documenting some parameters # in a documented function, or documenting parameters that don't exist or using # markup commands wrongly. # The default value is: YES. WARN_IF_DOC_ERROR = YES # This WARN_NO_PARAMDOC option can be enabled to get warnings for functions that # are documented, but have no documentation for their parameters or return # value. If set to NO, doxygen will only warn about wrong or incomplete # parameter documentation, but not about the absence of documentation. # The default value is: NO. WARN_NO_PARAMDOC = YES # The WARN_FORMAT tag determines the format of the warning messages that doxygen # can produce. 
The string should contain the $file, $line, and $text tags, which # will be replaced by the file and line number from which the warning originated # and the warning text. Optionally the format may contain $version, which will # be replaced by the version of the file (if it could be obtained via # FILE_VERSION_FILTER) # The default value is: $file:$line: $text. WARN_FORMAT = "$file:$line: $text" # The WARN_LOGFILE tag can be used to specify a file to which warning and error # messages should be written. If left blank the output is written to standard # error (stderr). WARN_LOGFILE = doxygen-warnings.txt #--------------------------------------------------------------------------- # Configuration options related to the input files #--------------------------------------------------------------------------- # The INPUT tag is used to specify the files and/or directories that contain # documented source files. You may enter file names like myfile.cpp or # directories like /usr/src/myproject. Separate the files or directories with # spaces. # Note: If this tag is empty the current directory is searched. INPUT = "@abs_top_srcdir@/doc" \ "@abs_top_srcdir@/src" \ "@abs_top_srcdir@/bin" # This tag can be used to specify the character encoding of the source files # that doxygen parses. Internally doxygen uses the UTF-8 encoding. Doxygen uses # libiconv (or the iconv built into libc) for the transcoding. See the libiconv # documentation (see: http://www.gnu.org/software/libiconv) for the list of # possible encodings. # The default value is: UTF-8. INPUT_ENCODING = UTF-8 # If the value of the INPUT tag contains directories, you can use the # FILE_PATTERNS tag to specify one or more wildcard patterns (like *.cpp and # *.h) to filter out the source-files in the directories. If left blank the # following patterns are tested:*.c, *.cc, *.cxx, *.cpp, *.c++, *.java, *.ii, # *.ixx, *.ipp, *.i++, *.inl, *.idl, *.ddl, *.odl, *.h, *.hh, *.hxx, *.hpp, # *.h++, *.cs, *.d, *.php, *.php4, *.php5, *.phtml, *.inc, *.m, *.markdown, # *.md, *.mm, *.dox, *.py, *.f90, *.f, *.for, *.tcl, *.vhd, *.vhdl, *.ucf, # *.qsf, *.as and *.js. FILE_PATTERNS = *.dox \ *.h \ *.cpp \ *.cc \ *.c # The RECURSIVE tag can be used to specify whether or not subdirectories should # be searched for input files as well. # The default value is: NO. RECURSIVE = YES # The EXCLUDE tag can be used to specify files and/or directories that should be # excluded from the INPUT source files. This way you can easily exclude a # subdirectory from a directory tree whose root is specified with the INPUT tag. # # Note that relative paths are relative to the directory from which doxygen is # run. EXCLUDE = "@abs_top_srcdir@/src/madness/external" \ "@abs_top_srcdir@/src/apps/DFcode" \ "@abs_top_srcdir@/src/apps/ii" \ "@abs_top_srcdir@/src/apps/jacob" \ "@abs_top_srcdir@/src/apps/nick" \ "@abs_top_srcdir@/src/apps/polar" # The EXCLUDE_SYMLINKS tag can be used to select whether or not files or # directories that are symbolic links (a Unix file system feature) are excluded # from the input. # The default value is: NO. EXCLUDE_SYMLINKS = NO # If the value of the INPUT tag contains directories, you can use the # EXCLUDE_PATTERNS tag to specify one or more wildcard patterns to exclude # certain files from those directories. 
# # Note that the wildcards are matched against the file with absolute path, so to # exclude all test directories for example use the pattern */test/* EXCLUDE_PATTERNS = LIBS.h \ mraX.cc \ mraX.h # The EXCLUDE_SYMBOLS tag can be used to specify one or more symbol names # (namespaces, classes, functions, etc.) that should be excluded from the # output. The symbol name can be a fully qualified name, a word, or if the # wildcard * is used, a substring. Examples: ANamespace, AClass, # AClass::ANamespace, ANamespace::*Test # # Note that the wildcards are matched against the file with absolute path, so to # exclude all test directories use the pattern */test/* EXCLUDE_SYMBOLS = # The EXAMPLE_PATH tag can be used to specify one or more files or directories # that contain example code fragments that are included (see the \include # command). EXAMPLE_PATH = # If the value of the EXAMPLE_PATH tag contains directories, you can use the # EXAMPLE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp and # *.h) to filter out the source-files in the directories. If left blank all # files are included. EXAMPLE_PATTERNS = # If the EXAMPLE_RECURSIVE tag is set to YES then subdirectories will be # searched for input files to be used with the \include or \dontinclude commands # irrespective of the value of the RECURSIVE tag. # The default value is: NO. EXAMPLE_RECURSIVE = NO # The IMAGE_PATH tag can be used to specify one or more files or directories # that contain images that are to be included in the documentation (see the # \image command). IMAGE_PATH = # The INPUT_FILTER tag can be used to specify a program that doxygen should # invoke to filter for each input file. Doxygen will invoke the filter program # by executing (via popen()) the command: # # # # where is the value of the INPUT_FILTER tag, and is the # name of an input file. Doxygen will then use the output that the filter # program writes to standard output. If FILTER_PATTERNS is specified, this tag # will be ignored. # # Note that the filter must not add or remove lines; it is applied before the # code is scanned, but not when the output code is generated. If lines are added # or removed, the anchors will not be placed correctly. INPUT_FILTER = # The FILTER_PATTERNS tag can be used to specify filters on a per file pattern # basis. Doxygen will compare the file name with each pattern and apply the # filter if there is a match. The filters are a list of the form: pattern=filter # (like *.cpp=my_cpp_filter). See INPUT_FILTER for further information on how # filters are used. If the FILTER_PATTERNS tag is empty or if none of the # patterns match the file name, INPUT_FILTER is applied. FILTER_PATTERNS = # If the FILTER_SOURCE_FILES tag is set to YES, the input filter (if set using # INPUT_FILTER) will also be used to filter the input files that are used for # producing the source files to browse (i.e. when SOURCE_BROWSER is set to YES). # The default value is: NO. FILTER_SOURCE_FILES = NO # The FILTER_SOURCE_PATTERNS tag can be used to specify source filters per file # pattern. A pattern will override the setting for FILTER_PATTERN (if any) and # it is also possible to disable source filtering for a specific pattern using # *.ext= (so without naming a filter). # This tag requires that the tag FILTER_SOURCE_FILES is set to YES. FILTER_SOURCE_PATTERNS = # If the USE_MDFILE_AS_MAINPAGE tag refers to the name of a markdown file that # is part of the input, its contents will be placed on the main page # (index.html). 
# If the USE_MDFILE_AS_MAINPAGE tag refers to the name of a markdown file that
# is part of the input, its contents will be placed on the main page
# (index.html). This can be useful if you have a project on for instance GitHub
# and want to reuse the introduction page also for the doxygen output.

USE_MDFILE_AS_MAINPAGE =

#---------------------------------------------------------------------------
# Configuration options related to source browsing
#---------------------------------------------------------------------------

# If the SOURCE_BROWSER tag is set to YES then a list of source files will be
# generated. Documented entities will be cross-referenced with these sources.
#
# Note: To get rid of all source code in the generated output, make sure that
# also VERBATIM_HEADERS is set to NO.
# The default value is: NO.

SOURCE_BROWSER = NO

# Setting the INLINE_SOURCES tag to YES will include the body of functions,
# classes and enums directly into the documentation.
# The default value is: NO.

INLINE_SOURCES = NO

# Setting the STRIP_CODE_COMMENTS tag to YES will instruct doxygen to hide any
# special comment blocks from generated source code fragments. Normal C, C++ and
# Fortran comments will always remain visible.
# The default value is: YES.

STRIP_CODE_COMMENTS = NO

# If the REFERENCED_BY_RELATION tag is set to YES then for each documented
# function all documented functions referencing it will be listed.
# The default value is: NO.

REFERENCED_BY_RELATION = YES

# If the REFERENCES_RELATION tag is set to YES then for each documented function
# all documented entities called/used by that function will be listed.
# The default value is: NO.

REFERENCES_RELATION = YES

# If the REFERENCES_LINK_SOURCE tag is set to YES and SOURCE_BROWSER tag is set
# to YES then the hyperlinks from functions in REFERENCES_RELATION and
# REFERENCED_BY_RELATION lists will link to the source code. Otherwise they will
# link to the documentation.
# The default value is: YES.

REFERENCES_LINK_SOURCE = YES

# If SOURCE_TOOLTIPS is enabled (the default) then hovering a hyperlink in the
# source code will show a tooltip with additional information such as prototype,
# brief description and links to the definition and documentation. Since this
# will make the HTML file larger and loading of large files a bit slower, you
# can opt to disable this feature.
# The default value is: YES.
# This tag requires that the tag SOURCE_BROWSER is set to YES.

SOURCE_TOOLTIPS = YES

# If the USE_HTAGS tag is set to YES then the references to source code will
# point to the HTML generated by the htags(1) tool instead of doxygen built-in
# source browser. The htags tool is part of GNU's global source tagging system
# (see http://www.gnu.org/software/global/global.html). You will need version
# 4.8.6 or higher.
#
# To use it do the following:
# - Install the latest version of global
# - Enable SOURCE_BROWSER and USE_HTAGS in the config file
# - Make sure the INPUT points to the root of the source tree
# - Run doxygen as normal
#
# Doxygen will invoke htags (and that will in turn invoke gtags), so these
# tools must be available from the command line (i.e. in the search path).
#
# The result: instead of the source browser generated by doxygen, the links to
# source code will now point to the output of htags.
# The default value is: NO.
# This tag requires that the tag SOURCE_BROWSER is set to YES.

USE_HTAGS = NO

# If the VERBATIM_HEADERS tag is set to YES then doxygen will generate a
# verbatim copy of the header file for each class for which an include is
# specified. Set to NO to disable this.
# See also: Section \class.
# The default value is: YES.
VERBATIM_HEADERS = YES

#---------------------------------------------------------------------------
# Configuration options related to the alphabetical class index
#---------------------------------------------------------------------------

# If the ALPHABETICAL_INDEX tag is set to YES, an alphabetical index of all
# compounds will be generated. Enable this if the project contains a lot of
# classes, structs, unions or interfaces.
# The default value is: YES.

ALPHABETICAL_INDEX = NO

# The COLS_IN_ALPHA_INDEX tag can be used to specify the number of columns in
# which the alphabetical index list will be split.
# Minimum value: 1, maximum value: 20, default value: 5.
# This tag requires that the tag ALPHABETICAL_INDEX is set to YES.

COLS_IN_ALPHA_INDEX = 5

# In case all classes in a project start with a common prefix, all classes will
# be put under the same header in the alphabetical index. The IGNORE_PREFIX tag
# can be used to specify a prefix (or a list of prefixes) that should be ignored
# while generating the index headers.
# This tag requires that the tag ALPHABETICAL_INDEX is set to YES.

IGNORE_PREFIX =

#---------------------------------------------------------------------------
# Configuration options related to the HTML output
#---------------------------------------------------------------------------

# If the GENERATE_HTML tag is set to YES, doxygen will generate HTML output
# The default value is: YES.

GENERATE_HTML = YES

# The HTML_OUTPUT tag is used to specify where the HTML docs will be put. If a
# relative path is entered the value of OUTPUT_DIRECTORY will be put in front of
# it.
# The default directory is: html.
# This tag requires that the tag GENERATE_HTML is set to YES.

HTML_OUTPUT = html

# The HTML_FILE_EXTENSION tag can be used to specify the file extension for each
# generated HTML page (for example: .htm, .php, .asp).
# The default value is: .html.
# This tag requires that the tag GENERATE_HTML is set to YES.

HTML_FILE_EXTENSION = .html

# The HTML_HEADER tag can be used to specify a user-defined HTML header file for
# each generated HTML page. If the tag is left blank doxygen will generate a
# standard header.
#
# To get valid HTML the header file that includes any scripts and style sheets
# that doxygen needs, which is dependent on the configuration options used (e.g.
# the setting GENERATE_TREEVIEW). It is highly recommended to start with a
# default header using
# doxygen -w html new_header.html new_footer.html new_stylesheet.css
# YourConfigFile
# and then modify the file new_header.html. See also section "Doxygen usage"
# for information on how to generate the default header that doxygen normally
# uses.
# Note: The header is subject to change so you typically have to regenerate the
# default header when upgrading to a newer version of doxygen. For a description
# of the possible markers and block names see the documentation.
# This tag requires that the tag GENERATE_HTML is set to YES.

HTML_HEADER =

# The HTML_FOOTER tag can be used to specify a user-defined HTML footer for each
# generated HTML page. If the tag is left blank doxygen will generate a standard
# footer. See HTML_HEADER for more information on how to generate a default
# footer and what special commands can be used inside the footer. See also
# section "Doxygen usage" for information on how to generate the default footer
# that doxygen normally uses.
# This tag requires that the tag GENERATE_HTML is set to YES.
HTML_FOOTER =

# The HTML_STYLESHEET tag can be used to specify a user-defined cascading style
# sheet that is used by each HTML page. It can be used to fine-tune the look of
# the HTML output. If left blank doxygen will generate a default style sheet.
# See also section "Doxygen usage" for information on how to generate the style
# sheet that doxygen normally uses.
# Note: It is recommended to use HTML_EXTRA_STYLESHEET instead of this tag, as
# it is more robust and this tag (HTML_STYLESHEET) will in the future become
# obsolete.
# This tag requires that the tag GENERATE_HTML is set to YES.

HTML_STYLESHEET =

# The HTML_EXTRA_STYLESHEET tag can be used to specify additional user-defined
# cascading style sheets that are included after the standard style sheets
# created by doxygen. Using this option one can overrule certain style aspects.
# This is preferred over using HTML_STYLESHEET since it does not replace the
# standard style sheet and is therefore more robust against future updates.
# Doxygen will copy the style sheet files to the output directory.
# Note: The order of the extra style sheet files is of importance (e.g. the last
# style sheet in the list overrules the setting of the previous ones in the
# list). For an example see the documentation.
# This tag requires that the tag GENERATE_HTML is set to YES.

HTML_EXTRA_STYLESHEET =

# The HTML_EXTRA_FILES tag can be used to specify one or more extra images or
# other source files which should be copied to the HTML output directory. Note
# that these files will be copied to the base HTML output directory. Use the
# $relpath^ marker in the HTML_HEADER and/or HTML_FOOTER files to load these
# files. In the HTML_STYLESHEET file, use the file name only. Also note that the
# files will be copied as-is; there are no commands or markers available.
# This tag requires that the tag GENERATE_HTML is set to YES.

HTML_EXTRA_FILES =

# The HTML_COLORSTYLE_HUE tag controls the color of the HTML output. Doxygen
# will adjust the colors in the style sheet and background images according to
# this color. Hue is specified as an angle on a colorwheel, see
# http://en.wikipedia.org/wiki/Hue for more information. For instance the value
# 0 represents red, 60 is yellow, 120 is green, 180 is cyan, 240 is blue, 300
# purple, and 360 is red again.
# Minimum value: 0, maximum value: 359, default value: 220.
# This tag requires that the tag GENERATE_HTML is set to YES.

HTML_COLORSTYLE_HUE = 220

# The HTML_COLORSTYLE_SAT tag controls the purity (or saturation) of the colors
# in the HTML output. For a value of 0 the output will use grayscales only. A
# value of 255 will produce the most vivid colors.
# Minimum value: 0, maximum value: 255, default value: 100.
# This tag requires that the tag GENERATE_HTML is set to YES.

HTML_COLORSTYLE_SAT = 100

# The HTML_COLORSTYLE_GAMMA tag controls the gamma correction applied to the
# luminance component of the colors in the HTML output. Values below 100
# gradually make the output lighter, whereas values above 100 make the output
# darker. The value divided by 100 is the actual gamma applied, so 80 represents
# a gamma of 0.8, The value 220 represents a gamma of 2.2, and 100 does not
# change the gamma.
# Minimum value: 40, maximum value: 240, default value: 80.
# This tag requires that the tag GENERATE_HTML is set to YES.

HTML_COLORSTYLE_GAMMA = 80
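
# As a concrete illustration of the styling tags above, the commented-out
# sketch below shows one way the default look could be adjusted without
# replacing doxygen's standard style sheet. It is not used by this
# configuration: madness_extra.css is a hypothetical project-local style sheet
# and the color values are arbitrary picks from the documented ranges.
#
#   HTML_EXTRA_STYLESHEET = madness_extra.css
#   HTML_COLORSTYLE_HUE   = 120
#   HTML_COLORSTYLE_SAT   = 150
#   HTML_COLORSTYLE_GAMMA = 100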
# If the HTML_TIMESTAMP tag is set to YES then the footer of each generated HTML
# page will contain the date and time when the page was generated. Setting this
# to NO can help when comparing the output of multiple runs.
# The default value is: YES.
# This tag requires that the tag GENERATE_HTML is set to YES.

HTML_TIMESTAMP = YES

# If the HTML_DYNAMIC_SECTIONS tag is set to YES then the generated HTML
# documentation will contain sections that can be hidden and shown after the
# page has loaded.
# The default value is: NO.
# This tag requires that the tag GENERATE_HTML is set to YES.

HTML_DYNAMIC_SECTIONS = NO

# With HTML_INDEX_NUM_ENTRIES one can control the preferred number of entries
# shown in the various tree structured indices initially; the user can expand
# and collapse entries dynamically later on. Doxygen will expand the tree to
# such a level that at most the specified number of entries are visible (unless
# a fully collapsed tree already exceeds this amount). So setting the number of
# entries 1 will produce a full collapsed tree by default. 0 is a special value
# representing an infinite number of entries and will result in a full expanded
# tree by default.
# Minimum value: 0, maximum value: 9999, default value: 100.
# This tag requires that the tag GENERATE_HTML is set to YES.

HTML_INDEX_NUM_ENTRIES = 100

# If the GENERATE_DOCSET tag is set to YES, additional index files will be
# generated that can be used as input for Apple's Xcode 3 integrated development
# environment (see: http://developer.apple.com/tools/xcode/), introduced with
# OSX 10.5 (Leopard). To create a documentation set, doxygen will generate a
# Makefile in the HTML output directory. Running make will produce the docset in
# that directory and running make install will install the docset in
# ~/Library/Developer/Shared/Documentation/DocSets so that Xcode will find it at
# startup. See http://developer.apple.com/tools/creatingdocsetswithdoxygen.html
# for more information.
# The default value is: NO.
# This tag requires that the tag GENERATE_HTML is set to YES.

GENERATE_DOCSET = NO

# This tag determines the name of the docset feed. A documentation feed provides
# an umbrella under which multiple documentation sets from a single provider
# (such as a company or product suite) can be grouped.
# The default value is: Doxygen generated docs.
# This tag requires that the tag GENERATE_DOCSET is set to YES.

DOCSET_FEEDNAME = "Doxygen generated docs"

# This tag specifies a string that should uniquely identify the documentation
# set bundle. This should be a reverse domain-name style string, e.g.
# com.mycompany.MyDocSet. Doxygen will append .docset to the name.
# The default value is: org.doxygen.Project.
# This tag requires that the tag GENERATE_DOCSET is set to YES.

DOCSET_BUNDLE_ID = org.doxygen.Project

# The DOCSET_PUBLISHER_ID tag specifies a string that should uniquely identify
# the documentation publisher. This should be a reverse domain-name style
# string, e.g. com.mycompany.MyDocSet.documentation.
# The default value is: org.doxygen.Publisher.
# This tag requires that the tag GENERATE_DOCSET is set to YES.

DOCSET_PUBLISHER_ID = org.doxygen.Publisher

# The DOCSET_PUBLISHER_NAME tag identifies the documentation publisher.
# The default value is: Publisher.
# This tag requires that the tag GENERATE_DOCSET is set to YES.

DOCSET_PUBLISHER_NAME = Publisher

# If the GENERATE_HTMLHELP tag is set to YES then doxygen generates three
# additional HTML index files: index.hhp, index.hhc, and index.hhk.
# The index.hhp is a project file that can be read by Microsoft's HTML Help
# Workshop (see: http://www.microsoft.com/en-us/download/details.aspx?id=21138)
# on Windows.
#
# The HTML Help Workshop contains a compiler that can convert all HTML output
# generated by doxygen into a single compiled HTML file (.chm). Compiled HTML
# files are now used as the Windows 98 help format, and will replace the old
# Windows help format (.hlp) on all Windows platforms in the future. Compressed
# HTML files also contain an index, a table of contents, and you can search for
# words in the documentation. The HTML workshop also contains a viewer for
# compressed HTML files.
# The default value is: NO.
# This tag requires that the tag GENERATE_HTML is set to YES.

GENERATE_HTMLHELP = NO

# The CHM_FILE tag can be used to specify the file name of the resulting .chm
# file. You can add a path in front of the file if the result should not be
# written to the html output directory.
# This tag requires that the tag GENERATE_HTMLHELP is set to YES.

CHM_FILE =

# The HHC_LOCATION tag can be used to specify the location (absolute path
# including file name) of the HTML help compiler (hhc.exe). If non-empty,
# doxygen will try to run the HTML help compiler on the generated index.hhp.
# The file has to be specified with full path.
# This tag requires that the tag GENERATE_HTMLHELP is set to YES.

HHC_LOCATION =

# The GENERATE_CHI flag controls if a separate .chi index file is generated
# (YES) or that it should be included in the master .chm file (NO).
# The default value is: NO.
# This tag requires that the tag GENERATE_HTMLHELP is set to YES.

GENERATE_CHI = NO

# The CHM_INDEX_ENCODING is used to encode HtmlHelp index (hhk), content (hhc)
# and project file content.
# This tag requires that the tag GENERATE_HTMLHELP is set to YES.

CHM_INDEX_ENCODING =

# The BINARY_TOC flag controls whether a binary table of contents is generated
# (YES) or a normal table of contents (NO) in the .chm file. Furthermore it
# enables the Previous and Next buttons.
# The default value is: NO.
# This tag requires that the tag GENERATE_HTMLHELP is set to YES.

BINARY_TOC = NO

# The TOC_EXPAND flag can be set to YES to add extra items for group members to
# the table of contents of the HTML help documentation and to the tree view.
# The default value is: NO.
# This tag requires that the tag GENERATE_HTMLHELP is set to YES.

TOC_EXPAND = NO

# If the GENERATE_QHP tag is set to YES and both QHP_NAMESPACE and
# QHP_VIRTUAL_FOLDER are set, an additional index file will be generated that
# can be used as input for Qt's qhelpgenerator to generate a Qt Compressed Help
# (.qch) of the generated HTML documentation.
# The default value is: NO.
# This tag requires that the tag GENERATE_HTML is set to YES.

GENERATE_QHP = NO

# If the QHG_LOCATION tag is specified, the QCH_FILE tag can be used to specify
# the file name of the resulting .qch file. The path specified is relative to
# the HTML output folder.
# This tag requires that the tag GENERATE_QHP is set to YES.

QCH_FILE =

# The QHP_NAMESPACE tag specifies the namespace to use when generating Qt Help
# Project output. For more information please see Qt Help Project / Namespace
# (see: http://qt-project.org/doc/qt-4.8/qthelpproject.html#namespace).
# The default value is: org.doxygen.Project.
# This tag requires that the tag GENERATE_QHP is set to YES.

QHP_NAMESPACE =
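
# To make the Qt help workflow concrete, the commented-out sketch below shows a
# hypothetical set of values that would produce a .qch file; it is not used by
# this configuration, and it assumes the Qt qhelpgenerator tool is available on
# the PATH. The remaining QHP_* tags it relies on are documented just below.
#
#   GENERATE_QHP       = YES
#   QHP_NAMESPACE      = org.example.madness
#   QHP_VIRTUAL_FOLDER = madness
#   QCH_FILE           = ../madness.qch
#   QHG_LOCATION       = qhelpgenerator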
# The QHP_VIRTUAL_FOLDER tag specifies the namespace to use when generating Qt
# Help Project output. For more information please see Qt Help Project / Virtual
# Folders (see: http://qt-project.org/doc/qt-4.8/qthelpproject.html#virtual-
# folders).
# The default value is: doc.
# This tag requires that the tag GENERATE_QHP is set to YES.

QHP_VIRTUAL_FOLDER = doc

# If the QHP_CUST_FILTER_NAME tag is set, it specifies the name of a custom
# filter to add. For more information please see Qt Help Project / Custom
# Filters (see: http://qt-project.org/doc/qt-4.8/qthelpproject.html#custom-
# filters).
# This tag requires that the tag GENERATE_QHP is set to YES.

QHP_CUST_FILTER_NAME =

# The QHP_CUST_FILTER_ATTRS tag specifies the list of the attributes of the
# custom filter to add. For more information please see Qt Help Project / Custom
# Filters (see: http://qt-project.org/doc/qt-4.8/qthelpproject.html#custom-
# filters).
# This tag requires that the tag GENERATE_QHP is set to YES.

QHP_CUST_FILTER_ATTRS =

# The QHP_SECT_FILTER_ATTRS tag specifies the list of the attributes this
# project's filter section matches. Qt Help Project / Filter Attributes (see:
# http://qt-project.org/doc/qt-4.8/qthelpproject.html#filter-attributes).
# This tag requires that the tag GENERATE_QHP is set to YES.

QHP_SECT_FILTER_ATTRS =

# The QHG_LOCATION tag can be used to specify the location of Qt's
# qhelpgenerator. If non-empty doxygen will try to run qhelpgenerator on the
# generated .qhp file.
# This tag requires that the tag GENERATE_QHP is set to YES.

QHG_LOCATION =

# If the GENERATE_ECLIPSEHELP tag is set to YES, additional index files will be
# generated, together with the HTML files, they form an Eclipse help plugin. To
# install this plugin and make it available under the help contents menu in
# Eclipse, the contents of the directory containing the HTML and XML files needs
# to be copied into the plugins directory of eclipse. The name of the directory
# within the plugins directory should be the same as the ECLIPSE_DOC_ID value.
# After copying Eclipse needs to be restarted before the help appears.
# The default value is: NO.
# This tag requires that the tag GENERATE_HTML is set to YES.

GENERATE_ECLIPSEHELP = NO

# A unique identifier for the Eclipse help plugin. When installing the plugin
# the directory name containing the HTML and XML files should also have this
# name. Each documentation set should have its own identifier.
# The default value is: org.doxygen.Project.
# This tag requires that the tag GENERATE_ECLIPSEHELP is set to YES.

ECLIPSE_DOC_ID = org.doxygen.Project

# If you want full control over the layout of the generated HTML pages it might
# be necessary to disable the index and replace it with your own. The
# DISABLE_INDEX tag can be used to turn on/off the condensed index (tabs) at top
# of each HTML page. A value of NO enables the index and the value YES disables
# it. Since the tabs in the index contain the same information as the navigation
# tree, you can set this option to YES if you also set GENERATE_TREEVIEW to YES.
# The default value is: NO.
# This tag requires that the tag GENERATE_HTML is set to YES.

DISABLE_INDEX = NO

# The GENERATE_TREEVIEW tag is used to specify whether a tree-like index
# structure should be generated to display hierarchical information. If the tag
# value is set to YES, a side panel will be generated containing a tree-like
# index structure (just like the one that is generated for HTML Help). For this
# to work a browser that supports JavaScript, DHTML, CSS and frames is required
# (i.e. any modern browser).
# Windows users are probably better off using the
# HTML help feature. Via custom style sheets (see HTML_EXTRA_STYLESHEET) one can
# further fine-tune the look of the index. As an example, the default style
# sheet generated by doxygen has an example that shows how to put an image at
# the root of the tree instead of the PROJECT_NAME. Since the tree basically has
# the same information as the tab index, you could consider setting
# DISABLE_INDEX to YES when enabling this option.
# The default value is: NO.
# This tag requires that the tag GENERATE_HTML is set to YES.

GENERATE_TREEVIEW = NO

# The ENUM_VALUES_PER_LINE tag can be used to set the number of enum values that
# doxygen will group on one line in the generated HTML documentation.
#
# Note that a value of 0 will completely suppress the enum values from appearing
# in the overview section.
# Minimum value: 0, maximum value: 20, default value: 4.
# This tag requires that the tag GENERATE_HTML is set to YES.

ENUM_VALUES_PER_LINE = 4

# If the treeview is enabled (see GENERATE_TREEVIEW) then this tag can be used
# to set the initial width (in pixels) of the frame in which the tree is shown.
# Minimum value: 0, maximum value: 1500, default value: 250.
# This tag requires that the tag GENERATE_HTML is set to YES.

TREEVIEW_WIDTH = 250

# If the EXT_LINKS_IN_WINDOW option is set to YES, doxygen will open links to
# external symbols imported via tag files in a separate window.
# The default value is: NO.
# This tag requires that the tag GENERATE_HTML is set to YES.

EXT_LINKS_IN_WINDOW = NO

# Use this tag to change the font size of LaTeX formulas included as images in
# the HTML documentation. When you change the font size after a successful
# doxygen run you need to manually remove any form_*.png images from the HTML
# output directory to force them to be regenerated.
# Minimum value: 8, maximum value: 50, default value: 10.
# This tag requires that the tag GENERATE_HTML is set to YES.

FORMULA_FONTSIZE = 10

# Use the FORMULA_TRANSPARENT tag to determine whether or not the images
# generated for formulas are transparent PNGs. Transparent PNGs are not
# supported properly for IE 6.0, but are supported on all modern browsers.
#
# Note that when changing this option you need to delete any form_*.png files in
# the HTML output directory before the changes have effect.
# The default value is: YES.
# This tag requires that the tag GENERATE_HTML is set to YES.

FORMULA_TRANSPARENT = YES

# Enable the USE_MATHJAX option to render LaTeX formulas using MathJax (see
# http://www.mathjax.org) which uses client side Javascript for the rendering
# instead of using pre-rendered bitmaps. Use this if you do not have LaTeX
# installed or if you want the formulas to look prettier in the HTML output.
# When enabled you may also need to install MathJax separately and configure
# the path to it using the MATHJAX_RELPATH option.
# The default value is: NO.
# This tag requires that the tag GENERATE_HTML is set to YES.

USE_MATHJAX = NO
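
# A commented-out sketch of what enabling MathJax could look like; it is not
# used by this configuration. The relative path ../mathjax is an illustration
# (a directory containing MathJax.js placed next to the HTML output directory),
# and the extension names are the ones given as an example for
# MATHJAX_EXTENSIONS below.
#
#   USE_MATHJAX        = YES
#   MATHJAX_RELPATH    = ../mathjax
#   MATHJAX_EXTENSIONS = TeX/AMSmath TeX/AMSsymbols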
# When MathJax is enabled you can set the default output format to be used for
# the MathJax output. See the MathJax site (see:
# http://docs.mathjax.org/en/latest/output.html) for more details.
# Possible values are: HTML-CSS (which is slower, but has the best
# compatibility), NativeMML (i.e. MathML) and SVG.
# The default value is: HTML-CSS.
# This tag requires that the tag USE_MATHJAX is set to YES.

MATHJAX_FORMAT = HTML-CSS

# When MathJax is enabled you need to specify the location relative to the HTML
# output directory using the MATHJAX_RELPATH option. The destination directory
# should contain the MathJax.js script. For instance, if the mathjax directory
# is located at the same level as the HTML output directory, then
# MATHJAX_RELPATH should be ../mathjax. The default value points to the MathJax
# Content Delivery Network so you can quickly see the result without installing
# MathJax. However, it is strongly recommended to install a local copy of
# MathJax from http://www.mathjax.org before deployment.
# The default value is: http://cdn.mathjax.org/mathjax/latest.
# This tag requires that the tag USE_MATHJAX is set to YES.

MATHJAX_RELPATH = http://cdn.mathjax.org/mathjax/latest

# The MATHJAX_EXTENSIONS tag can be used to specify one or more MathJax
# extension names that should be enabled during MathJax rendering. For example
# MATHJAX_EXTENSIONS = TeX/AMSmath TeX/AMSsymbols
# This tag requires that the tag USE_MATHJAX is set to YES.

MATHJAX_EXTENSIONS =

# The MATHJAX_CODEFILE tag can be used to specify a file with javascript pieces
# of code that will be used on startup of the MathJax code. See the MathJax site
# (see: http://docs.mathjax.org/en/latest/output.html) for more details. For an
# example see the documentation.
# This tag requires that the tag USE_MATHJAX is set to YES.

MATHJAX_CODEFILE =

# When the SEARCHENGINE tag is enabled doxygen will generate a search box for
# the HTML output. The underlying search engine uses javascript and DHTML and
# should work on any modern browser. Note that when using HTML help
# (GENERATE_HTMLHELP), Qt help (GENERATE_QHP), or docsets (GENERATE_DOCSET)
# there is already a search function so this one should typically be disabled.
# For large projects the javascript based search engine can be slow, then
# enabling SERVER_BASED_SEARCH may provide a better solution. It is possible to
# search using the keyboard; to jump to the search box use <access key> + S
# (what the <access key> is depends on the OS and browser, but it is typically
# <CTRL>, <ALT>/