cclib-1.3.1/
cclib-1.3.1/LICENSE

GNU LESSER GENERAL PUBLIC LICENSE
Version 2.1, February 1999

Copyright (C) 1991, 1999 Free Software Foundation, Inc. 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

[This is the first released version of the Lesser GPL. It also counts as the successor of the GNU Library Public License, version 2, hence the version number 2.1.]

Preamble

The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public Licenses are intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users.

This license, the Lesser General Public License, applies to some specially designated software packages--typically libraries--of the Free Software Foundation and other authors who decide to use it. You can use it too, but we suggest you first think carefully about whether this license or the ordinary General Public License is the better strategy to use in any particular case, based on the explanations below.

When we speak of free software, we are referring to freedom of use, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish); that you receive source code or can get it if you want it; that you can change the software and use pieces of it in new free programs; and that you are informed that you can do these things.

To protect your rights, we need to make restrictions that forbid distributors to deny you these rights or to ask you to surrender these rights.
These restrictions translate to certain responsibilities for you if you distribute copies of the library or if you modify it. For example, if you distribute copies of the library, whether gratis or for a fee, you must give the recipients all the rights that we gave you. You must make sure that they, too, receive or can get the source code. If you link other code with the library, you must provide complete object files to the recipients, so that they can relink them with the library after making changes to the library and recompiling it. And you must show them these terms so they know their rights. We protect your rights with a two-step method: (1) we copyright the library, and (2) we offer you this license, which gives you legal permission to copy, distribute and/or modify the library. To protect each distributor, we want to make it very clear that there is no warranty for the free library. Also, if the library is modified by someone else and passed on, the recipients should know that what they have is not the original version, so that the original author's reputation will not be affected by problems that might be introduced by others. Finally, software patents pose a constant threat to the existence of any free program. We wish to make sure that a company cannot effectively restrict the users of a free program by obtaining a restrictive license from a patent holder. Therefore, we insist that any patent license obtained for a version of the library must be consistent with the full freedom of use specified in this license. Most GNU software, including some libraries, is covered by the ordinary GNU General Public License. This license, the GNU Lesser General Public License, applies to certain designated libraries, and is quite different from the ordinary General Public License. We use this license for certain libraries in order to permit linking those libraries into non-free programs. 
When a program is linked with a library, whether statically or using a shared library, the combination of the two is legally speaking a combined work, a derivative of the original library. The ordinary General Public License therefore permits such linking only if the entire combination fits its criteria of freedom. The Lesser General Public License permits more lax criteria for linking other code with the library. We call this license the "Lesser" General Public License because it does Less to protect the user's freedom than the ordinary General Public License. It also provides other free software developers Less of an advantage over competing non-free programs. These disadvantages are the reason we use the ordinary General Public License for many libraries. However, the Lesser license provides advantages in certain special circumstances. For example, on rare occasions, there may be a special need to encourage the widest possible use of a certain library, so that it becomes a de-facto standard. To achieve this, non-free programs must be allowed to use the library. A more frequent case is that a free library does the same job as widely used non-free libraries. In this case, there is little to gain by limiting the free library to free software only, so we use the Lesser General Public License. In other cases, permission to use a particular library in non-free programs enables a greater number of people to use a large body of free software. For example, permission to use the GNU C Library in non-free programs enables many more people to use the whole GNU operating system, as well as its variant, the GNU/Linux operating system. Although the Lesser General Public License is Less protective of the users' freedom, it does ensure that the user of a program that is linked with the Library has the freedom and the wherewithal to run that program using a modified version of the Library. The precise terms and conditions for copying, distribution and modification follow. 
Pay close attention to the difference between a "work based on the library" and a "work that uses the library". The former contains code derived from the library, whereas the latter must be combined with the library in order to run. GNU LESSER GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License Agreement applies to any software library or other program which contains a notice placed by the copyright holder or other authorized party saying it may be distributed under the terms of this Lesser General Public License (also called "this License"). Each licensee is addressed as "you". A "library" means a collection of software functions and/or data prepared so as to be conveniently linked with application programs (which use some of those functions and data) to form executables. The "Library", below, refers to any such software library or work which has been distributed under these terms. A "work based on the Library" means either the Library or any derivative work under copyright law: that is to say, a work containing the Library or a portion of it, either verbatim or with modifications and/or translated straightforwardly into another language. (Hereinafter, translation is included without limitation in the term "modification".) "Source code" for a work means the preferred form of the work for making modifications to it. For a library, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the library. Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running a program using the Library is not restricted, and output from such a program is covered only if its contents constitute a work based on the Library (independent of the use of the Library in a tool for writing it). 
Whether that is true depends on what the Library does and what the program that uses the Library does. 1. You may copy and distribute verbatim copies of the Library's complete source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and distribute a copy of this License along with the Library. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Library or any portion of it, thus forming a work based on the Library, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) The modified work must itself be a software library. b) You must cause the files modified to carry prominent notices stating that you changed the files and the date of any change. c) You must cause the whole of the work to be licensed at no charge to all third parties under the terms of this License. d) If a facility in the modified Library refers to a function or a table of data to be supplied by an application program that uses the facility, other than as an argument passed when the facility is invoked, then you must make a good faith effort to ensure that, in the event an application does not supply such function or table, the facility still operates, and performs whatever part of its purpose remains meaningful. (For example, a function in a library to compute square roots has a purpose that is entirely well-defined independent of the application. Therefore, Subsection 2d requires that any application-supplied function or table used by this function must be optional: if the application does not supply it, the square root function must still compute square roots.) 
These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Library, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Library, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Library. In addition, mere aggregation of another work not based on the Library with the Library (or with a work based on the Library) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. You may opt to apply the terms of the ordinary GNU General Public License instead of this License to a given copy of the Library. To do this, you must alter all the notices that refer to this License, so that they refer to the ordinary GNU General Public License, version 2, instead of to this License. (If a newer version than version 2 of the ordinary GNU General Public License has appeared, then you can specify that version instead if you wish.) Do not make any other change in these notices. Once this change is made in a given copy, it is irreversible for that copy, so the ordinary GNU General Public License applies to all subsequent copies and derivative works made from that copy. This option is useful when you wish to copy part of the code of the Library into a program that is not a library. 4. 
You may copy and distribute the Library (or a portion or derivative of it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange. If distribution of object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place satisfies the requirement to distribute the source code, even though third parties are not compelled to copy the source along with the object code. 5. A program that contains no derivative of any portion of the Library, but is designed to work with the Library by being compiled or linked with it, is called a "work that uses the Library". Such a work, in isolation, is not a derivative work of the Library, and therefore falls outside the scope of this License. However, linking a "work that uses the Library" with the Library creates an executable that is a derivative of the Library (because it contains portions of the Library), rather than a "work that uses the library". The executable is therefore covered by this License. Section 6 states terms for distribution of such executables. When a "work that uses the Library" uses material from a header file that is part of the Library, the object code for the work may be a derivative work of the Library even though the source code is not. Whether this is true is especially significant if the work can be linked without the Library, or if the work is itself a library. The threshold for this to be true is not precisely defined by law. 
If such an object file uses only numerical parameters, data structure layouts and accessors, and small macros and small inline functions (ten lines or less in length), then the use of the object file is unrestricted, regardless of whether it is legally a derivative work. (Executables containing this object code plus portions of the Library will still fall under Section 6.) Otherwise, if the work is a derivative of the Library, you may distribute the object code for the work under the terms of Section 6. Any executables containing that work also fall under Section 6, whether or not they are linked directly with the Library itself. 6. As an exception to the Sections above, you may also combine or link a "work that uses the Library" with the Library to produce a work containing portions of the Library, and distribute that work under terms of your choice, provided that the terms permit modification of the work for the customer's own use and reverse engineering for debugging such modifications. You must give prominent notice with each copy of the work that the Library is used in it and that the Library and its use are covered by this License. You must supply a copy of this License. If the work during execution displays copyright notices, you must include the copyright notice for the Library among them, as well as a reference directing the user to the copy of this License. Also, you must do one of these things: a) Accompany the work with the complete corresponding machine-readable source code for the Library including whatever changes were used in the work (which must be distributed under Sections 1 and 2 above); and, if the work is an executable linked with the Library, with the complete machine-readable "work that uses the Library", as object code and/or source code, so that the user can modify the Library and then relink to produce a modified executable containing the modified Library. 
(It is understood that the user who changes the contents of definitions files in the Library will not necessarily be able to recompile the application to use the modified definitions.) b) Use a suitable shared library mechanism for linking with the Library. A suitable mechanism is one that (1) uses at run time a copy of the library already present on the user's computer system, rather than copying library functions into the executable, and (2) will operate properly with a modified version of the library, if the user installs one, as long as the modified version is interface-compatible with the version that the work was made with. c) Accompany the work with a written offer, valid for at least three years, to give the same user the materials specified in Subsection 6a, above, for a charge no more than the cost of performing this distribution. d) If distribution of the work is made by offering access to copy from a designated place, offer equivalent access to copy the above specified materials from the same place. e) Verify that the user has already received a copy of these materials or that you have already sent this user a copy. For an executable, the required form of the "work that uses the Library" must include any data and utility programs needed for reproducing the executable from it. However, as a special exception, the materials to be distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. It may happen that this requirement contradicts the license restrictions of other proprietary libraries that do not normally accompany the operating system. Such a contradiction means you cannot use both them and the Library together in an executable that you distribute. 7. 
You may place library facilities that are a work based on the Library side-by-side in a single library together with other library facilities not covered by this License, and distribute such a combined library, provided that the separate distribution of the work based on the Library and of the other library facilities is otherwise permitted, and provided that you do these two things: a) Accompany the combined library with a copy of the same work based on the Library, uncombined with any other library facilities. This must be distributed under the terms of the Sections above. b) Give prominent notice with the combined library of the fact that part of it is a work based on the Library, and explaining where to find the accompanying uncombined form of the same work. 8. You may not copy, modify, sublicense, link with, or distribute the Library except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense, link with, or distribute the Library is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 9. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Library or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Library (or any work based on the Library), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Library or works based on it. 10. Each time you redistribute the Library (or any work based on the Library), the recipient automatically receives a license from the original licensor to copy, distribute, link with or modify the Library subject to these terms and conditions. 
You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties with this License. 11. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Library at all. For example, if a patent license would not permit royalty-free redistribution of the Library by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Library. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply, and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 12. 
If the distribution and/or use of the Library is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Library under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 13. The Free Software Foundation may publish revised and/or new versions of the Lesser General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Library specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Library does not specify a license version number, you may choose any version ever published by the Free Software Foundation. 14. If you wish to incorporate parts of the Library into other free programs whose distribution conditions are incompatible with these, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Libraries If you develop a new library, and you want it to be of the greatest possible use to the public, we recommend making it free software that everyone can redistribute and change. You can do so by permitting redistribution under these terms (or, alternatively, under the terms of the ordinary General Public License). To apply these terms, attach the following notices to the library. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. 
<one line to give the library's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>

This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

Also add information on how to contact you by electronic and paper mail.

You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the library, if necessary. Here is a sample; alter the names:

Yoyodyne, Inc., hereby disclaims all copyright interest in the library `Frob' (a library for tweaking knobs) written by James Random Hacker.

<signature of Ty Coon>, 1 April 1990
Ty Coon, President of Vice

That's all there is to it!

cclib-1.3.1/PKG-INFO
Metadata-Version: 1.1
Name: cclib
Version: 1.3.1
Summary: cclib: parsers and algorithms for computational chemistry
Home-page: http://cclib.github.io/
Author: cclib development team
Author-email: cclib-users@lists.sourceforge.net
License: LGPL
Description: cclib is a Python library that provides parsers for computational chemistry log files. It also provides a platform to implement algorithms in a package-independent manner.
Platform: Any.
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Topic :: Scientific/Engineering :: Chemistry
Classifier: Topic :: Software Development :: Libraries :: Python Modules

cclib-1.3.1/THANKS
The developers of cclib would like to thank the following (in alphabetical order) who have contributed in some way to cclib:

Nuno Bandeira -- for bug reporting
Björn Baumeier -- for bug reporting
Dermot Brougham -- for bug reporting
Avril Coghlan -- for designing the cclib logo
Björn Dahlgren -- for bug reporting
Yafei Dai -- for bug reporting
Abhishek Dey -- for bug reporting
Matt Ernst -- for patches
Clyde Fare -- for bug reporting and patches
Christos Garoufalis -- for bug reporting
Edward Holland -- for patches
Karen Hemelsoet -- for bug reporting
Ian Hovell -- for bug reporting
Julien Idé -- for bug reporting
csjacky -- for bug reporting
Russell Johnson -- for providing CCCBDB (NIST) logfiles
Jerome Kieffer -- for bug reporting
Greg Magoon -- for bug reporting and patches
Scott McKechnie -- for bug reporting
Rob Paton -- for creating and running Jaguar test jobs
Felix Plasser -- for bug reporting and contributing files
Martin Rahm -- for bug reporting
Marius Retegan -- for bug reporting
Tamilmani S -- for bug reporting
Melchor Sanchez -- for bug reporting
Alex Schild -- for ideas and contributing test files
Jen Schwartz -- for helping create and run Jaguar 6.0 test jobs
Tiago Silva -- for bug reporting
Pavel Solntsev -- for bug reporting
Ben Stein -- for patches
Adam Swanson -- for bug reporting
Joe Townsend -- for giving multiple GAMESS files to test on
Chengju Wang -- for bug reporting
Andrew Warden -- for bug reporting
Samuel Wilson -- for bug reporting
Fedor Zhuravlev -- for patches

Please let us know if we have omitted someone from this list.

cclib-1.3.1/README
cclib is a Python library that provides parsers for output files of computational chemistry packages. It also provides a platform for computational chemists to implement algorithms in a platform-independent way.

For more information, go to http://cclib.github.io. There is a mailing list for questions at cclib-users@lists.sourceforge.net.

The latest stable version of cclib is v1.3.1, whose DOI is 10.5281/zenodo.15108

cclib-1.3.1/ANNOUNCE
On behalf of the cclib development team, we are pleased to announce the release of cclib 1.3.1, which is now available for download from http://cclib.github.io. This is a minor update to version 1.3 that includes bug fixes and small improvements.

cclib is an open source library, written in Python, for parsing and interpreting the results of computational chemistry packages. It currently parses output files from ADF, Firefly, GAMESS (US), GAMESS-UK, Gaussian, Jaguar, Molpro, NWChem, ORCA, Psi and QChem.

Among other data, cclib extracts:
* coordinates and energies
* information about geometry optimization
* atomic orbital information
* molecular orbital information
* information on vibrational modes
* the results of a TD-DFT calculation
* charges and multipole moments
(For a complete list see http://cclib.github.io/data.html).

cclib also provides some calculation methods for interpreting the electronic properties of molecules using analyses such as:
* Mulliken and Lowdin population analyses
* Overlap population analysis
* Calculation of Mayer's bond orders.
(For a complete list see http://cclib.github.io/methods.html).
For information on how to use cclib, see http://cclib.github.io/tutorial.html

If you need help, find a bug, want new features or have any questions, please send an email to our mailing list:
https://lists.sourceforge.net/lists/listinfo/cclib-users

If your published work uses cclib, please support its development by citing the following article:
N. M. O'Boyle, A. L. Tenderholt, K. M. Langner, cclib: a library for package-independent computational chemistry algorithms, J. Comp. Chem. 29 (5), 839-845 (2008)

You can also specifically reference the newest stable version of cclib as:
Eric Berquist, Karol M. Langner, Noel M. O'Boyle, and Adam L. Tenderholt. Release of cclib version 1.3.1. 2015. http://dx.doi.org/10.5281/zenodo.15108

Regards,
The cclib development team

---

Major changes since cclib 1.3:
* New attribute nooccnos for natural orbital occupation numbers (GAMESS, GAMESS-UK, Gaussian)
* Add ability to read XYZ files with ccget (OpenBabel is needed)
* Improve parsing atomic and molecular orbitals (Molpro, Psi)
* Improve parsing TDDFT transitions (QChem)
* Bugfixes, thanks to Ben Albrecht, Clyde Fare, Felix Plasser, Axel Schild and others

cclib-1.3.1/src/
cclib-1.3.1/src/cclib/
cclib-1.3.1/src/cclib/progress/
cclib-1.3.1/src/cclib/progress/qt4progress.py
# This file is part of cclib (http://cclib.github.io), a library for parsing
# and interpreting the results of computational chemistry packages.
#
# Copyright (C) 2006, the cclib development team
#
# The library is free software, distributed under the terms of
# the GNU Lesser General Public version 2.1 or later. You should have
# received a copy of the license along with cclib.
You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. from PyQt4 import QtGui,QtCore class Qt4Progress(QtGui.QProgressDialog): def __init__(self, title, parent=None): QtGui.QProgressDialog.__init__(self, parent) self.nstep = 0 self.text = None self.oldprogress = 0 self.progress = 0 self.calls = 0 self.loop=QtCore.QEventLoop(self) self.setWindowTitle(title) def initialize(self, nstep, text=None): self.nstep = nstep self.text = text self.setRange(0,nstep) if text: self.setLabelText(text) self.setValue(1) #sys.stdout.write("\n") def update(self, step, text=None): if text: self.setLabelText(text) self.setValue(step) self.loop.processEvents(QtCore.QEventLoop.ExcludeUserInputEvents) cclib-1.3.1/src/cclib/progress/__init__.py0000644000175100016050000000114212425223271020267 0ustar kmlcclib00000000000000# This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. import sys if 'PyQt4' in list(sys.modules.keys()): from .qt4progress import Qt4Progress from .textprogress import TextProgress cclib-1.3.1/src/cclib/progress/textprogress.py0000644000175100016050000000312212425223271021301 0ustar kmlcclib00000000000000# This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. 
You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. from __future__ import print_function import sys class TextProgress: def __init__(self): self.nstep = 0 self.text = None self.oldprogress = 0 self.progress = 0 self.calls = 0 def initialize(self, nstep, text=None): self.nstep = float(nstep) self.text = text #sys.stdout.write("\n") def update(self, step, text=None): self.progress = int(step * 100 / self.nstep) if self.progress/2 >= self.oldprogress/2 + 1 or self.text != text: # just went through at least an interval of ten, ie. from 39 to 41, # so update mystr = "\r[" prog = int(self.progress / 10) mystr += prog * "=" + (10-prog) * "-" mystr += "] %3i" % self.progress + "%" if text: mystr += " "+text sys.stdout.write("\r" + 70 * " ") sys.stdout.flush() sys.stdout.write(mystr) sys.stdout.flush() self.oldprogress = self.progress if self.progress >= 100 and text == "Done": print(" ") return cclib-1.3.1/src/cclib/__init__.py0000644000175100016050000000314212467431166016437 0ustar kmlcclib00000000000000# This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2015 the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """A library for parsing and interpreting results from computational chemistry packages. The goals of cclib are centered around the reuse of data obtained from various computational chemistry programs and typically contained in output files. Specifically, cclib extracts (parses) data from the output files generated by multiple programs and provides a consistent interface to access them. 
Currently supported programs: ADF, Firefly, GAMESS(US), GAMESS-UK, Gaussian, Jaguar, Molpro, NWChem, ORCA, Psi, Q-Chem Another aim is to facilitate the implementation of algorithms that are not specific to any particular computational chemistry package and to maximise interoperability with other open source computational chemistry and cheminformatic software libraries. To this end, cclib provides a number of bridges to help transfer data to other libraries as well as example methods that take parsed data as input. """ __version__ = "1.3.1" from . import parser from . import progress from . import method from . import bridge # The test module can be imported if it was installed with cclib. try: from . import test except ImportError: pass cclib-1.3.1/src/cclib/method/0002755000175100016050000000000012467450677015620 5ustar kmlcclib00000000000000cclib-1.3.1/src/cclib/method/volume.py0000644000175100016050000002374412467423323017475 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Calculation methods related to volume based on cclib data.""" from __future__ import print_function import copy import numpy try: from PyQuante.CGBF import CGBF from cclib.bridge import cclib2pyquante module_pyq = True except: module_pyq = False try: from pyvtk import * from pyvtk.DataSetAttr import * module_pyvtk = True except: module_pyvtk = False from cclib.parser.utils import convertor class Volume(object): """Represent a volume in space. 
Required parameters: origin -- the bottom left hand corner of the volume topcorner -- the top right hand corner spacing -- the distance between the points in the cube Attributes: data -- a numpy array of values for each point in the volume (set to zero at initialisation) numpts -- the numbers of points in the (x,y,z) directions """ def __init__(self, origin, topcorner, spacing): self.origin = origin self.spacing = spacing self.topcorner = topcorner self.numpts = [] for i in range(3): self.numpts.append(int((self.topcorner[i]-self.origin[i])/self.spacing[i] + 1) ) self.data = numpy.zeros( tuple(self.numpts), "d") def __str__(self): """Return a string representation.""" return "Volume %s to %s (density: %s)" % (self.origin, self.topcorner, self.spacing) def write(self, filename, format="Cube"): """Write the volume to file.""" format = format.upper() if format.upper() not in ["VTK", "CUBE"]: raise ValueError("Format must be either VTK or Cube") elif format=="VTK": self.writeasvtk(filename) else: self.writeascube(filename) def writeasvtk(self, filename): if not module_pyvtk: raise Exception("You need to have pyvtk installed") ranges = (numpy.arange(self.data.shape[2]), numpy.arange(self.data.shape[1]), numpy.arange(self.data.shape[0])) v = VtkData(RectilinearGrid(*ranges), "Test", PointData(Scalars(self.data.ravel(), "from cclib", "default"))) v.tofile(filename) def integrate(self): boxvol = (self.spacing[0] * self.spacing[1] * self.spacing[2] * convertor(1, "Angstrom", "bohr")**3) return sum(self.data.ravel()) * boxvol def integrate_square(self): boxvol = (self.spacing[0] * self.spacing[1] * self.spacing[2] * convertor(1, "Angstrom", "bohr")**3) return sum(self.data.ravel()**2) * boxvol def writeascube(self, filename): # Remember that the units are bohr, not Angstroms convert = lambda x : convertor(x, "Angstrom", "bohr") ans = [] ans.append("Cube file generated by cclib") ans.append("") format = "%4d%12.6f%12.6f%12.6f" origin = [convert(x) for x in self.origin] ans.append(format
% (0, origin[0], origin[1], origin[2])) ans.append(format % (self.data.shape[0], convert(self.spacing[0]), 0.0, 0.0)) ans.append(format % (self.data.shape[1], 0.0, convert(self.spacing[1]), 0.0)) ans.append(format % (self.data.shape[2], 0.0, 0.0, convert(self.spacing[2]))) line = [] for i in range(self.data.shape[0]): for j in range(self.data.shape[1]): for k in range(self.data.shape[2]): line.append(scinotation(self.data[i][j][k])) if len(line)==6: ans.append(" ".join(line)) line = [] if line: ans.append(" ".join(line)) line = [] outputfile = open(filename, "w") outputfile.write("\n".join(ans)) outputfile.close() def scinotation(num): """Write in scientific notation >>> scinotation(1./654) ' 1.52905E-03' >>> scinotation(-1./654) '-1.52905E-03' """ ans = "%10.5E" % num broken = ans.split("E") exponent = int(broken[1]) if exponent<-99: return " 0.000E+00" if exponent<0: sign="-" else: sign="+" return ("%sE%s%s" % (broken[0],sign,broken[1][-2:])).rjust(12) def getbfs(coords, gbasis): """Convenience function for both wavefunction and density based on PyQuante Ints.py.""" # makepyquante lives in the cclib2pyquante bridge module imported at the top of this file. mymol = cclib2pyquante.makepyquante(coords, [0 for x in coords]) sym2powerlist = { 'S' : [(0,0,0)], 'P' : [(1,0,0),(0,1,0),(0,0,1)], 'D' : [(2,0,0),(0,2,0),(0,0,2),(1,1,0),(0,1,1),(1,0,1)], 'F' : [(3,0,0),(2,1,0),(2,0,1),(1,2,0),(1,1,1),(1,0,2), (0,3,0),(0,2,1),(0,1,2), (0,0,3)] } bfs = [] for i,atom in enumerate(mymol): bs = gbasis[i] for sym,prims in bs: for power in sym2powerlist[sym]: bf = CGBF(atom.pos(),power) for expnt,coef in prims: bf.add_primitive(expnt,coef) bf.normalize() bfs.append(bf) return bfs def wavefunction(coords, mocoeffs, gbasis, volume): """Calculate the magnitude of the wavefunction at every point in a volume.
Attributes: coords -- the coordinates of the atoms mocoeffs -- mocoeffs for one eigenvalue gbasis -- gbasis from a parser object volume -- a template Volume object (will not be altered) """ bfs = getbfs(coords, gbasis) wavefn = copy.copy(volume) wavefn.data = numpy.zeros( wavefn.data.shape, "d") conversion = convertor(1,"bohr","Angstrom") x = numpy.arange(wavefn.origin[0], wavefn.topcorner[0]+wavefn.spacing[0], wavefn.spacing[0]) / conversion y = numpy.arange(wavefn.origin[1], wavefn.topcorner[1]+wavefn.spacing[1], wavefn.spacing[1]) / conversion z = numpy.arange(wavefn.origin[2], wavefn.topcorner[2]+wavefn.spacing[2], wavefn.spacing[2]) / conversion for bs in range(len(bfs)): data = numpy.zeros( wavefn.data.shape, "d") for i,xval in enumerate(x): for j,yval in enumerate(y): for k,zval in enumerate(z): data[i, j, k] = bfs[bs].amp(xval,yval,zval) numpy.multiply(data, mocoeffs[bs], data) numpy.add(wavefn.data, data, wavefn.data) return wavefn def electrondensity(coords, mocoeffslist, gbasis, volume): """Calculate the magnitude of the electron density at every point in a volume. Attributes: coords -- the coordinates of the atoms mocoeffs -- mocoeffs for all of the occupied eigenvalues gbasis -- gbasis from a parser object volume -- a template Volume object (will not be altered) Note: mocoeffs is a list of numpy arrays. The list will be of length 1 for restricted calculations, and length 2 for unrestricted. 
""" bfs = getbfs(coords, gbasis) density = copy.copy(volume) density.data = numpy.zeros( density.data.shape, "d") conversion = convertor(1,"bohr","Angstrom") x = numpy.arange(density.origin[0], density.topcorner[0]+density.spacing[0], density.spacing[0]) / conversion y = numpy.arange(density.origin[1], density.topcorner[1]+density.spacing[1], density.spacing[1]) / conversion z = numpy.arange(density.origin[2], density.topcorner[2]+density.spacing[2], density.spacing[2]) / conversion for mocoeffs in mocoeffslist: for mocoeff in mocoeffs: wavefn = numpy.zeros( density.data.shape, "d") for bs in range(len(bfs)): data = numpy.zeros( density.data.shape, "d") for i,xval in enumerate(x): for j,yval in enumerate(y): tmp = [] for k,zval in enumerate(z): tmp.append(bfs[bs].amp(xval, yval, zval)) data[i,j,:] = tmp numpy.multiply(data, mocoeff[bs], data) numpy.add(wavefn, data, wavefn) density.data += wavefn**2 if len(mocoeffslist) == 1: density.data = density.data*2. # doubly-occupied return density if __name__=="__main__": try: import psyco psyco.full() except ImportError: pass from cclib.parser import ccopen import logging a = ccopen("../../../data/Gaussian/basicGaussian03/dvb_sp_basis.log") a.logger.setLevel(logging.ERROR) c = a.parse() b = ccopen("../../../data/Gaussian/basicGaussian03/dvb_sp.out") b.logger.setLevel(logging.ERROR) d = b.parse() vol = Volume( (-3.0,-6,-2.0), (3.0, 6, 2.0), spacing=(0.25,0.25,0.25) ) wavefn = wavefunction(d.atomcoords[0], d.mocoeffs[0][d.homos[0]], c.gbasis, vol) assert abs(wavefn.integrate())<1E-6 # not necessarily true for all wavefns assert abs(wavefn.integrate_square() - 1.00)<1E-3 # true for all wavefns print(wavefn.integrate(), wavefn.integrate_square()) vol = Volume( (-3.0,-6,-2.0), (3.0, 6, 2.0), spacing=(0.25,0.25,0.25) ) frontierorbs = [d.mocoeffs[0][(d.homos[0]-3):(d.homos[0]+1)]] density = electrondensity(d.atomcoords[0], frontierorbs, c.gbasis, vol) assert abs(density.integrate()-8.00)<1E-2 print("Combined Density of 4 Frontier 
orbitals=",density.integrate()) cclib-1.3.1/src/cclib/method/__init__.py0000644000175100016050000000144012467423323017712 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Example analyses and calculations based on data parsed by cclib.""" from .density import Density from .cspa import CSPA from .mpa import MPA from .lpa import LPA from .opa import OPA from .mbo import MBO from .nuclear import Nuclear from .fragments import FragmentAnalysis from .cda import CDA cclib-1.3.1/src/cclib/method/mbo.py0000644000175100016050000001061612467423323016735 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Calculation of Mayer's bond orders based on data parsed by cclib.""" import random import numpy from .density import Density class MBO(Density): """Mayer's bond orders""" def __init__(self, *args): # Call the __init__ method of the superclass. 
super(MBO, self).__init__(logname="MBO", *args) def __str__(self): """Return a string representation of the object.""" return "Mayer's bond order of %s" % (self.data) def __repr__(self): """Return a representation of the object.""" return 'Mayer\'s bond order("%s")' % (self.data) def calculate(self, indices=None, fupdate=0.05): """Calculate Mayer's bond orders.""" retval = super(MBO, self).calculate(fupdate) if not retval: #making density didn't work return False # Do we have the needed info in the ccData object? if not (hasattr(self.data, "aooverlaps") or hasattr(self.data, "fooverlaps")): self.logger.error("Missing overlap matrix") return False #let the caller of function know we didn't finish if not indices: # Build list of groups of orbitals in each atom for atomresults. if hasattr(self.data, "aonames"): names = self.data.aonames overlaps = self.data.aooverlaps elif hasattr(self.data, "fonames"): names = self.data.fonames overlaps = self.data.fooverlaps else: self.logger.error("Missing aonames or fonames") return False atoms = [] indices = [] name = names[0].split('_')[0] atoms.append(name) indices.append([0]) for i in range(1, len(names)): name = names[i].split('_')[0] try: index = atoms.index(name) except ValueError: #not found in atom list atoms.append(name) indices.append([i]) else: indices[index].append(i) self.logger.info("Creating attribute fragresults: array[3]") size = len(indices) # Determine number of steps, and whether process involves beta orbitals. PS = [] PS.append(numpy.dot(self.density[0], overlaps)) nstep = size**2 #approximately quadratic in size unrestricted = (len(self.data.mocoeffs) == 2) if unrestricted: self.fragresults = numpy.zeros([2, size, size], "d") PS.append(numpy.dot(self.density[1], overlaps)) else: self.fragresults = numpy.zeros([1, size, size], "d") # Initialize progress if available.
if self.progress: self.progress.initialize(nstep) step = 0 for i in range(len(indices)): if self.progress and random.random() < fupdate: self.progress.update(step, "Mayer's Bond Order") for j in range(i+1, len(indices)): tempsumA = 0 tempsumB = 0 for a in indices[i]: for b in indices[j]: if unrestricted: tempsumA += 2 * PS[0][a][b] * PS[0][b][a] tempsumB += 2 * PS[1][a][b] * PS[1][b][a] else: tempsumA += PS[0][a][b] * PS[0][b][a] self.fragresults[0][i, j] = tempsumA self.fragresults[0][j, i] = tempsumA if unrestricted: self.fragresults[1][i, j] = tempsumB self.fragresults[1][j, i] = tempsumB if self.progress: self.progress.update(nstep, "Done") return True cclib-1.3.1/src/cclib/method/cspa.py0000644000175100016050000000764212467423323017113 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """C-squared population analysis.""" import random import numpy from .population import Population class CSPA(Population): """The C-squared population analysis.""" def __init__(self, *args): # Call the __init__ method of the superclass. super(CSPA, self).__init__(logname="CSPA", *args) def __str__(self): """Return a string representation of the object.""" return "CSPA of %s" % (self.data) def __repr__(self): """Return a representation of the object.""" return 'CSPA("%s")' % (self.data) def calculate(self, indices=None, fupdate=0.05): """Perform the C squared population analysis. Inputs: indices - list of lists containing atomic orbital indices of fragments """ # Do we have the needed info in the parser?
if not hasattr(self.data, "mocoeffs"): self.logger.error("Missing mocoeffs") return False if not hasattr(self.data, "nbasis"): self.logger.error("Missing nbasis") return False if not hasattr(self.data, "homos"): self.logger.error("Missing homos") return False self.logger.info("Creating attribute aoresults: array[3]") # Determine number of steps, and whether process involves beta orbitals. unrestricted = (len(self.data.mocoeffs)==2) nbasis = self.data.nbasis self.aoresults = [] alpha = len(self.data.mocoeffs[0]) self.aoresults.append(numpy.zeros([alpha, nbasis], "d")) nstep = alpha if unrestricted: beta = len(self.data.mocoeffs[1]) self.aoresults.append(numpy.zeros([beta, nbasis], "d")) nstep += beta # Initialize progress if available. if self.progress: self.progress.initialize(nstep) step = 0 for spin in range(len(self.data.mocoeffs)): for i in range(len(self.data.mocoeffs[spin])): if self.progress and random.random() < fupdate: self.progress.update(step, "C^2 Population Analysis") submocoeffs = self.data.mocoeffs[spin][i] scale = numpy.inner(submocoeffs, submocoeffs) tempcoeffs = numpy.multiply(submocoeffs, submocoeffs) tempvec = tempcoeffs/scale self.aoresults[spin][i] = numpy.divide(tempcoeffs, scale).astype("d") step += 1 if self.progress: self.progress.update(nstep, "Done") retval = super(CSPA, self).partition(indices) if not retval: self.logger.error("Error in partitioning results") return False self.logger.info("Creating fragcharges: array[1]") size = len(self.fragresults[0][0]) self.fragcharges = numpy.zeros([size], "d") for spin in range(len(self.fragresults)): for i in range(self.data.homos[spin] + 1): temp = numpy.reshape(self.fragresults[spin][i], (size,)) self.fragcharges = numpy.add(self.fragcharges, temp) if not unrestricted: self.fragcharges = numpy.multiply(self.fragcharges, 2) return True if __name__ == "__main__": import doctest, cspa doctest.testmod(cspa, verbose=False)
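The per-orbital normalisation at the heart of CSPA.calculate can be sketched without numpy as follows. This is a simplified, hypothetical helper (c_squared_populations is not part of cclib, and the coefficients are invented); it only mirrors the step where the squared coefficients of one molecular orbital are divided by their sum.

```python
def c_squared_populations(coeffs):
    """Weight of each basis function in one MO: c_a**2 / sum_b c_b**2."""
    scale = sum(c * c for c in coeffs)   # same role as numpy.inner(c, c)
    return [c * c / scale for c in coeffs]


# Invented coefficients for a single molecular orbital.
weights = c_squared_populations([0.6, -0.3, 0.1])
print(weights, sum(weights))
```

Because overlap is ignored, the weights are non-negative and sum to one for any non-zero coefficient vector, regardless of the sign of the individual coefficients.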
cclib-1.3.1/src/cclib/method/lpa.py0000644000175100016050000001157712460570633016743 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2007-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Löwdin population analysis.""" import random import numpy from .population import Population class LPA(Population): """The Löwdin population analysis""" def __init__(self, *args): # Call the __init__ method of the superclass. super(LPA, self).__init__(logname="LPA", *args) def __str__(self): """Return a string representation of the object.""" return "LPA of %s" % (self.data) def __repr__(self): """Return a representation of the object.""" return 'LPA("%s")' % (self.data) def calculate(self, indices=None, x=0.5, fupdate=0.05): """Perform a calculation of Löwdin population analysis. Inputs: indices - list of lists containing atomic orbital indices of fragments x - overlap matrix exponent in wavefunction projection (x=0.5 for Lowdin) """ # Do we have the needed info in the parser? if not hasattr(self.data,"mocoeffs"): self.logger.error("Missing mocoeffs") return False if not (hasattr(self.data, "aooverlaps") \ or hasattr(self.data, "fooverlaps") ): self.logger.error("Missing overlap matrix") return False if not hasattr(self.data, "nbasis"): self.logger.error("Missing nbasis") return False if not hasattr(self.data, "homos"): self.logger.error("Missing homos") return False unrestricted = (len(self.data.mocoeffs) == 2) nbasis = self.data.nbasis # Determine number of steps, and whether process involves beta orbitals.
self.logger.info("Creating attribute aoresults: [array[2]]") alpha = len(self.data.mocoeffs[0]) self.aoresults = [ numpy.zeros([alpha, nbasis], "d") ] nstep = alpha if unrestricted: beta = len(self.data.mocoeffs[1]) self.aoresults.append(numpy.zeros([beta, nbasis], "d")) nstep += beta #initialize progress if available if self.progress: self.progress.initialize(nstep) if hasattr(self.data, "aooverlaps"): S = self.data.aooverlaps elif hasattr(self.data, "fooverlaps"): S = self.data.fooverlaps # Get eigenvalues and matrix of eigenvectors for transformation decomposition (U). # Find roots of diagonal elements, and transform backwards using eigenvectors. # We need two matrices here, one for S^x, another for S^(1-x). # We don't need to invert U, since S is symmetrical. eigenvalues, U = numpy.linalg.eig(S) UI = U.transpose() Sdiagroot1 = numpy.identity(len(S))*numpy.power(eigenvalues, x) Sdiagroot2 = numpy.identity(len(S))*numpy.power(eigenvalues, 1-x) Sroot1 = numpy.dot(U, numpy.dot(Sdiagroot1, UI)) Sroot2 = numpy.dot(U, numpy.dot(Sdiagroot2, UI)) step = 0 for spin in range(len(self.data.mocoeffs)): for i in range(len(self.data.mocoeffs[spin])): if self.progress and random.random() < fupdate: self.progress.update(step, "Lowdin Population Analysis") ci = self.data.mocoeffs[spin][i] temp1 = numpy.dot(ci, Sroot1) temp2 = numpy.dot(ci, Sroot2) self.aoresults[spin][i] = numpy.multiply(temp1, temp2).astype("d") step += 1 if self.progress: self.progress.update(nstep, "Done") retval = super(LPA, self).partition(indices) if not retval: self.logger.error("Error in partitioning results") return False # Create array for charges.
self.logger.info("Creating fragcharges: array[1]") size = len(self.fragresults[0][0]) self.fragcharges = numpy.zeros([size], "d") for spin in range(len(self.fragresults)): for i in range(self.data.homos[spin] + 1): temp = numpy.reshape(self.fragresults[spin][i], (size,)) self.fragcharges = numpy.add(self.fragcharges, temp) if not unrestricted: self.fragcharges = numpy.multiply(self.fragcharges, 2) return True if __name__ == "__main__": import doctest, lpa doctest.testmod(lpa, verbose=False) cclib-1.3.1/src/cclib/method/mpa.py0000644000175100016050000001106312467423323016732 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Calculation of Mulliken population analysis (MPA) based on data parsed by cclib.""" import random import numpy from .population import Population class MPA(Population): """Mulliken population analysis.""" def __init__(self, *args): # Call the __init__ method of the superclass. super(MPA, self).__init__(logname="MPA", *args) def __str__(self): """Return a string representation of the object.""" return "MPA of %s" % (self.data) def __repr__(self): """Return a representation of the object.""" return 'MPA("%s")' % (self.data) def calculate(self, indices=None, fupdate=0.05): """Perform a Mulliken population analysis.""" # Do we have the needed attributes in the data object?
if not hasattr(self.data, "mocoeffs"): self.logger.error("Missing mocoeffs") return False if not (hasattr(self.data, "aooverlaps") \ or hasattr(self.data, "fooverlaps") ): self.logger.error("Missing overlap matrix") return False if not hasattr(self.data, "nbasis"): self.logger.error("Missing nbasis") return False if not hasattr(self.data, "homos"): self.logger.error("Missing homos") return False # Determine number of steps, and whether process involves beta orbitals. self.logger.info("Creating attribute aoresults: [array[2]]") nbasis = self.data.nbasis alpha = len(self.data.mocoeffs[0]) self.aoresults = [ numpy.zeros([alpha, nbasis], "d") ] nstep = alpha unrestricted = (len(self.data.mocoeffs) == 2) if unrestricted: beta = len(self.data.mocoeffs[1]) self.aoresults.append(numpy.zeros([beta, nbasis], "d")) nstep += beta # Initialize progress if available. if self.progress: self.progress.initialize(nstep) step = 0 for spin in range(len(self.data.mocoeffs)): for i in range(len(self.data.mocoeffs[spin])): if self.progress and random.random() < fupdate: self.progress.update(step, "Mulliken Population Analysis") #X_{ai} = \sum_b c_{ai} c_{bi} S_{ab} # = c_{ai} \sum_b c_{bi} S_{ab} # = c_{ai} C(i) \cdot S(a) # X = C(i) * [C(i) \cdot S] # C(i) is 1xn and S is nxn, result of matrix mult is 1xn ci = self.data.mocoeffs[spin][i] if hasattr(self.data, "aooverlaps"): temp = numpy.dot(ci, self.data.aooverlaps) #handle spin-unrestricted beta case elif hasattr(self.data, "fooverlaps2") and spin == 1: temp = numpy.dot(ci, self.data.fooverlaps2) elif hasattr(self.data, "fooverlaps"): temp = numpy.dot(ci, self.data.fooverlaps) self.aoresults[spin][i] = numpy.multiply(ci, temp).astype("d") step += 1 if self.progress: self.progress.update(nstep, "Done") retval = super(MPA, self).partition(indices) if not retval: self.logger.error("Error in partitioning results") return False # Create array for Mulliken charges.
self.logger.info("Creating fragcharges: array[1]") size = len(self.fragresults[0][0]) self.fragcharges = numpy.zeros([size], "d") for spin in range(len(self.fragresults)): for i in range(self.data.homos[spin] + 1): temp = numpy.reshape(self.fragresults[spin][i], (size,)) self.fragcharges = numpy.add(self.fragcharges, temp) if not unrestricted: self.fragcharges = numpy.multiply(self.fragcharges, 2) return True if __name__ == "__main__": import doctest, mpa doctest.testmod(mpa, verbose=False) cclib-1.3.1/src/cclib/method/fragments.py0000644000175100016050000001321212467423323020141 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Fragment analysis based on parsed ADF data.""" import random import numpy numpy.inv = numpy.linalg.inv from .calculationmethod import * class FragmentAnalysis(Method): """Convert a molecule's basis functions from atomic-based to fragment MO-based""" def __init__(self, data, progress=None, loglevel=logging.INFO, logname="FragmentAnalysis of"): # Call the __init__ method of the superclass. 
super(FragmentAnalysis, self).__init__(data, progress, loglevel, logname) self.parsed = False def __str__(self): """Return a string representation of the object.""" return "Fragment molecule basis of %s" % (self.data) def __repr__(self): """Return a representation of the object.""" return 'Fragment molecular basis("%s")' % (self.data) def calculate(self, fragments, cupdate=0.05): nFragBasis = 0 nFragAlpha = 0 nFragBeta = 0 self.fonames = [] unrestricted = ( len(self.data.mocoeffs) == 2 ) self.logger.info("Creating attribute fonames[]") # Collect basis info on the fragments. for j in range(len(fragments)): nFragBasis += fragments[j].nbasis nFragAlpha += fragments[j].homos[0] + 1 if unrestricted and len(fragments[j].homos) == 1: nFragBeta += fragments[j].homos[0] + 1 #assume restricted fragment elif unrestricted and len(fragments[j].homos) == 2: nFragBeta += fragments[j].homos[1] + 1 #assume unrestricted fragment #assign fonames based on fragment name and MO number for i in range(fragments[j].nbasis): if hasattr(fragments[j],"name"): self.fonames.append("%s_%i"%(fragments[j].name,i+1)) else: self.fonames.append("noname%i_%i"%(j,i+1)) nBasis = self.data.nbasis nAlpha = self.data.homos[0] + 1 if unrestricted: nBeta = self.data.homos[1] + 1 # Check to make sure calcs have the right properties. 
if nBasis != nFragBasis: self.logger.error("Basis functions don't match") return False if nAlpha != nFragAlpha: self.logger.error("Alpha electrons don't match") return False if unrestricted and nBeta != nFragBeta: self.logger.error("Beta electrons don't match") return False if len(self.data.atomcoords) != 1: self.logger.warning("Molecule calc appears to be an optimization") for frag in fragments: if len(frag.atomcoords) != 1: msg = "One or more fragment appears to be an optimization" self.logger.warning(msg) break last = 0 for frag in fragments: size = frag.natom if self.data.atomcoords[0][last:last+size].tolist() != \ frag.atomcoords[0].tolist(): self.logger.error("Atom coordinates aren't aligned") return False if self.data.atomnos[last:last+size].tolist() != \ frag.atomnos.tolist(): self.logger.error("Elements don't match") return False last += size # And let's begin! self.mocoeffs = [] self.logger.info("Creating mocoeffs in new fragment MO basis: mocoeffs[]") for spin in range(len(self.data.mocoeffs)): blockMatrix = numpy.zeros((nBasis,nBasis), "d") pos = 0 # Build up block-diagonal matrix from fragment mocoeffs. # Need to switch ordering from [mo,ao] to [ao,mo]. for i in range(len(fragments)): size = fragments[i].nbasis if len(fragments[i].mocoeffs) == 1: temp = numpy.transpose(fragments[i].mocoeffs[0]) blockMatrix[pos:pos+size, pos:pos+size] = temp else: temp = numpy.transpose(fragments[i].mocoeffs[spin]) blockMatrix[pos:pos+size, pos:pos+size] = temp pos += size # Invert and multiply to result in fragment MOs as basis.
iBlockMatrix = numpy.linalg.inv(blockMatrix) temp = numpy.transpose(self.data.mocoeffs[spin]) results = numpy.transpose(numpy.dot(iBlockMatrix, temp)) self.mocoeffs.append(results) if hasattr(self.data, "aooverlaps"): tempMatrix = numpy.dot(self.data.aooverlaps, blockMatrix) tBlockMatrix = numpy.transpose(blockMatrix) if spin == 0: self.fooverlaps = numpy.dot(tBlockMatrix, tempMatrix) self.logger.info("Creating fooverlaps: array[x,y]") elif spin == 1: self.fooverlaps2 = numpy.dot(tBlockMatrix, tempMatrix) self.logger.info("Creating fooverlaps (beta): array[x,y]") else: self.logger.warning("Overlap matrix missing") self.parsed = True self.nbasis = nBasis self.homos = self.data.homos return True cclib-1.3.1/src/cclib/method/density.py0000644000175100016050000000631512467423323017640 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Building the density matrix from data parsed by cclib.""" import logging import random import numpy from .calculationmethod import Method class Density(Method): """Calculate the density matrix""" def __init__(self, data, progress=None, loglevel=logging.INFO, logname="Density"): # Call the __init__ method of the superclass.
super(Density, self).__init__(data, progress, loglevel, logname) def __str__(self): """Return a string representation of the object.""" return "Density matrix of %s" % (self.data) def __repr__(self): """Return a representation of the object.""" return 'Density matrix("%s")' % (self.data) def calculate(self, fupdate=0.05): """Calculate the density matrix.""" # Do we have the needed info in the data object? if not hasattr(self.data, "mocoeffs"): self.logger.error("Missing mocoeffs") return False if not hasattr(self.data,"nbasis"): self.logger.error("Missing nbasis") return False if not hasattr(self.data,"homos"): self.logger.error("Missing homos") return False self.logger.info("Creating attribute density: array[3]") size = self.data.nbasis unrestricted = (len(self.data.mocoeffs) == 2) #determine number of steps, and whether process involves beta orbitals nstep = self.data.homos[0] + 1 if unrestricted: self.density = numpy.zeros([2, size, size], "d") nstep += self.data.homos[1] + 1 else: self.density = numpy.zeros([1, size, size], "d") #initialize progress if available if self.progress: self.progress.initialize(nstep) step = 0 for spin in range(len(self.data.mocoeffs)): for i in range(self.data.homos[spin] + 1): if self.progress and random.random() < fupdate: self.progress.update(step, "Density Matrix") col = numpy.reshape(self.data.mocoeffs[spin][i], (size, 1)) colt = numpy.reshape(col, (1, size)) tempdensity = numpy.dot(col, colt) self.density[spin] = numpy.add(self.density[spin], tempdensity) step += 1 if not unrestricted: #multiply by two to account for second electron self.density[0] = numpy.add(self.density[0], self.density[0]) if self.progress: self.progress.update(nstep, "Done") return True #let caller know we finished density cclib-1.3.1/src/cclib/method/population.py0000644000175100016050000000632512467423323020354 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and
interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Population analyses based on cclib data.""" import logging import numpy from .calculationmethod import Method class Population(Method): """An abstract base class for population-type methods.""" def __init__(self, data, progress=None, \ loglevel=logging.INFO, logname="Log"): # Call the __init__ method of the superclass. super(Population, self).__init__(data, progress, loglevel, logname) self.fragresults = None def __str__(self): """Return a string representation of the object.""" return "Population" def __repr__(self): """Return a representation of the object.""" return "Population" def partition(self, indices=None): if not hasattr(self, "aoresults"): self.calculate() if not indices: # Build list of groups of orbitals in each atom for atomresults. if hasattr(self.data, "aonames"): names = self.data.aonames elif hasattr(self.data, "fonames"): names = self.data.fonames atoms = [] indices = [] name = names[0].split('_')[0] atoms.append(name) indices.append([0]) for i in range(1, len(names)): name = names[i].split('_')[0] try: index = atoms.index(name) except ValueError: #not found in atom list atoms.append(name) indices.append([i]) else: indices[index].append(i) natoms = len(indices) nmocoeffs = len(self.aoresults[0]) # Build results numpy array[3]. alpha = len(self.aoresults[0]) results = [] results.append(numpy.zeros([alpha, natoms], "d")) if len(self.aoresults) == 2: beta = len(self.aoresults[1]) results.append(numpy.zeros([beta, natoms], "d")) # For each spin, splice numpy array at ao index, # and add to correct result row. 
for spin in range(len(results)): for i in range(natoms): # Number of groups. for j in range(len(indices[i])): # For each group. temp = self.aoresults[spin][:, indices[i][j]] results[spin][:, i] = numpy.add(results[spin][:, i], temp) self.logger.info("Saving partitioned results in fragresults: [array[2]]") self.fragresults = results return True if __name__ == "__main__": import doctest, population doctest.testmod(population, verbose=False) cclib-1.3.1/src/cclib/method/calculationmethod.py0000644000175100016050000000331012467423323021650 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. import logging import sys """Abstract based method class.""" class Method(object): """Abstract class for method classes. Subclasses defined by cclib: CDA, CSPA, Density, FragmentAnalysis, LPA, MBO, MPA, Nuclear, OPA, Population, Volume All the modules containing methods should be importable: >>> import cda, cspa, density, fragments, lpa, mbo, mpa, nuclear, opa, population, volume """ def __init__(self, data, progress=None, loglevel=logging.INFO, logname="Log"): """Initialise the Logfile object. This constructor is typically called by the constructor of a subclass. 
""" self.data = data self.progress = progress self.loglevel = loglevel self.logname = logname self.logger = logging.getLogger('%s %s' % (self.logname, self.data)) self.logger.setLevel(self.loglevel) self.logformat = "[%(name)s %(levelname)s] %(message)s" handler = logging.StreamHandler(sys.stdout) handler.setFormatter(logging.Formatter(self.logformat)) self.logger.addHandler(handler) if __name__ == "__main__": import doctest doctest.testmod(verbose=False) cclib-1.3.1/src/cclib/method/nuclear.py0000644000175100016050000000317012467423323017606 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. 
"""Calculate properties of nuclei based on data parsed by cclib.""" import logging import numpy from .calculationmethod import Method class Nuclear(Method): """A container for methods pertaining to atomic nuclei.""" def __init__(self, data, progress=None, loglevel=logging.INFO, logname="Log"): super(Nuclear, self).__init__(data, progress, loglevel, logname) def __str__(self): """Return a string representation of the object.""" return "Nuclear" def __repr__(self): """Return a representation of the object.""" return "Nuclear" def repulsion_energy(self): """Return the nuclear repulsion energy.""" nre = 0.0 for i in range(self.data.natom): ri = self.data.atomcoords[0][i] zi = self.data.atomnos[i] for j in range(i+1, self.data.natom): rj = self.data.atomcoords[0][j] zj = self.data.atomnos[j] d = numpy.linalg.norm(ri-rj) nre += zi*zj/d return nre if __name__ == "__main__": import doctest doctest.testmod(verbose=True) cclib-1.3.1/src/cclib/method/opa.py0000644000175100016050000001173212467423323016737 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Calculation of overlap population analysis based on cclib data.""" import random import numpy from .calculationmethod import Method def func(x): if x==1: return 1 else: return x+func(x-1) class OPA(Method): """Overlap population analysis.""" def __init__(self, *args): # Call the __init__ method of the superclass. 
super(OPA, self).__init__(logname="OPA", *args) def __str__(self): """Return a string representation of the object.""" return "OPA of %s" % (self.data) def __repr__(self): """Return a representation of the object.""" return 'OPA("%s")' % (self.data) def calculate(self, indices=None, fupdate=0.05): """Perform an overlap population analysis given the results of a parser""" # Do we have the needed info in the ccData object? if not hasattr(self.data, "mocoeffs") \ or not ( hasattr(self.data, "aooverlaps") \ or hasattr(self.data, "fooverlaps") ) \ or not hasattr(self.data, "nbasis"): self.logger.error("Missing mocoeffs, aooverlaps/fooverlaps or nbasis") return False #let the caller of function know we didn't finish if not indices: # Build list of groups of orbitals in each atom for atomresults. if hasattr(self.data, "aonames"): names = self.data.aonames elif hasattr(self.data, "fonames"): names = self.data.fonames atoms = [] indices = [] name = names[0].split('_')[0] atoms.append(name) indices.append([0]) for i in range(1, len(names)): name = names[i].split('_')[0] try: index = atoms.index(name) except ValueError: #not found in atom list atoms.append(name) indices.append([i]) else: indices[index].append(i) # Determine number of steps, and whether process involves beta orbitals.
nfrag = len(indices) #nfrag nstep = func(nfrag - 1) unrestricted = (len(self.data.mocoeffs) == 2) alpha = len(self.data.mocoeffs[0]) nbasis = self.data.nbasis self.logger.info("Creating attribute results: array[4]") results= [ numpy.zeros([nfrag, nfrag, alpha], "d") ] if unrestricted: beta = len(self.data.mocoeffs[1]) results.append(numpy.zeros([nfrag, nfrag, beta], "d")) nstep *= 2 if hasattr(self.data, "aooverlaps"): overlap = self.data.aooverlaps elif hasattr(self.data,"fooverlaps"): overlap = self.data.fooverlaps #initialize progress if available if self.progress: self.progress.initialize(nstep) size = len(self.data.mocoeffs[0]) step = 0 preresults = [] for spin in range(len(self.data.mocoeffs)): two = numpy.array([2.0]*len(self.data.mocoeffs[spin]),"d") # OP_{AB,i} = \sum_{a in A} \sum_{b in B} 2 c_{ai} c_{bi} S_{ab} for A in range(len(indices)-1): for B in range(A+1, len(indices)): if self.progress: #usually only a handful of updates, so remove random part self.progress.update(step, "Overlap Population Analysis") for a in indices[A]: ca = self.data.mocoeffs[spin][:,a] for b in indices[B]: cb = self.data.mocoeffs[spin][:,b] temp = ca * cb * two *overlap[a,b] results[spin][A,B] = numpy.add(results[spin][A,B],temp) results[spin][B,A] = numpy.add(results[spin][B,A],temp) step += 1 temparray2 = numpy.swapaxes(results[0],1,2) self.results = [ numpy.swapaxes(temparray2,0,1) ] if unrestricted: temparray2 = numpy.swapaxes(results[1],1,2) self.results.append(numpy.swapaxes(temparray2, 0, 1)) if self.progress: self.progress.update(nstep, "Done") return True if __name__ == "__main__": import doctest, opa doctest.testmod(opa, verbose=False) cclib-1.3.1/src/cclib/method/cda.py0000644000175100016050000001124312467423323016704 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages.
# # Copyright (C) 2007-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Charge Decomposition Analysis (CDA)""" from __future__ import print_function import random import numpy from .fragments import FragmentAnalysis class CDA(FragmentAnalysis): """Charge Decomposition Analysis (CDA)""" def __init__(self, *args): # Call the __init__ method of the superclass. super(FragmentAnalysis, self).__init__(logname="CDA", *args) def __str__(self): """Return a string representation of the object.""" return "CDA of %s" % (self.data) def __repr__(self): """Return a representation of the object.""" return 'CDA("%s")' % (self.data) def calculate(self, fragments, cupdate=0.05): """Perform the charge decomposition analysis. Inputs: fragments - list of ccData data objects """ retval = super(CDA, self).calculate(fragments, cupdate) if not retval: return False # At this point, there should be a mocoeffs and fooverlaps # in analogy to a ccData object. donations = [] bdonations = [] repulsions = [] residuals = [] if len(self.mocoeffs) == 2: occs = 1 else: occs = 2 # Initialize progress if available. nstep = self.data.homos[0] if len(self.data.homos) == 2: nstep += self.data.homos[1] if self.progress: self.progress.initialize(nstep) # Begin the actual method.
step = 0 for spin in range(len(self.mocoeffs)): size = len(self.mocoeffs[spin]) homo = self.data.homos[spin] if len(fragments[0].homos) == 2: homoa = fragments[0].homos[spin] else: homoa = fragments[0].homos[0] if len(fragments[1].homos) == 2: homob = fragments[1].homos[spin] else: homob = fragments[1].homos[0] print("handling spin unrestricted") if spin == 0: fooverlaps = self.fooverlaps elif spin == 1 and hasattr(self, "fooverlaps2"): fooverlaps = self.fooverlaps2 offset = fragments[0].nbasis self.logger.info("Creating donations, bdonations, and repulsions: array[]") donations.append(numpy.zeros(size, "d")) bdonations.append(numpy.zeros(size, "d")) repulsions.append(numpy.zeros(size, "d")) residuals.append(numpy.zeros(size, "d")) for i in range(self.data.homos[spin] + 1): # Calculate donation for each MO. for k in range(0, homoa + 1): for n in range(offset + homob + 1, self.data.nbasis): donations[spin][i] += 2 * occs * self.mocoeffs[spin][i,k] \ * self.mocoeffs[spin][i,n] * fooverlaps[k][n] for l in range(offset, offset + homob + 1): for m in range(homoa + 1, offset): bdonations[spin][i] += 2 * occs * self.mocoeffs[spin][i,l] \ * self.mocoeffs[spin][i,m] * fooverlaps[l][m] for k in range(0, homoa + 1): for m in range(offset, offset+homob + 1): repulsions[spin][i] += 2 * occs * self.mocoeffs[spin][i,k] \ * self.mocoeffs[spin][i, m] * fooverlaps[k][m] for m in range(homoa + 1, offset): for n in range(offset + homob + 1, self.data.nbasis): residuals[spin][i] += 2 * occs * self.mocoeffs[spin][i,m] \ * self.mocoeffs[spin][i, n] * fooverlaps[m][n] step += 1 if self.progress and random.random() < cupdate: self.progress.update(step, "Charge Decomposition Analysis...") if self.progress: self.progress.update(nstep, "Done.") self.donations = donations self.bdonations = bdonations self.repulsions = repulsions self.residuals = residuals return True cclib-1.3.1/src/cclib/parser/0002755000175100016050000000000012467450677015634 5ustar 
kmlcclib00000000000000cclib-1.3.1/src/cclib/parser/data.py0000644000175100016050000002765112464212717017114 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2007-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Classes and tools for storing and handling parsed data""" import numpy class ccData(object): """Stores data extracted by cclib parsers Description of cclib attributes: aonames -- atomic orbital names (list of strings) aooverlaps -- atomic orbital overlap matrix (array[2]) atombasis -- indices of atomic orbitals on each atom (list of lists) atomcharges -- atomic partial charges (dict of arrays[1]) atomcoords -- atom coordinates (array[3], angstroms) atommasses -- atom masses (array[1], daltons) atomnos -- atomic numbers (array[1]) atomspins -- atomic spin densities (dict of arrays[1]) charge -- net charge of the system (integer) ccenergies -- molecular energies with Coupled-Cluster corrections (array[2], eV) coreelectrons -- number of core electrons in atom pseudopotentials (array[1]) enthalpy -- sum of electronic and thermal enthalpies (float, hartree/particle) entropy -- entropy (float, hartree/particle) etenergies -- energies of electronic transitions (array[1], 1/cm) etoscs -- oscillator strengths of electronic transitions (array[1]) etrotats -- rotatory strengths of electronic transitions (array[1], ??) 
etsecs -- singly-excited configurations for electronic transitions (list of lists) etsyms -- symmetries of electronic transitions (list of string) freeenergy -- sum of electronic and thermal free energies (float, hartree/particle) fonames -- fragment orbital names (list of strings) fooverlaps -- fragment orbital overlap matrix (array[2]) fragnames -- names of fragments (list of strings) frags -- indices of atoms in a fragment (list of lists) gbasis -- coefficients and exponents of Gaussian basis functions (PyQuante format) geotargets -- targets for convergence of geometry optimization (array[1]) geovalues -- current values for convergence of geometry optimization (array[1]) grads -- current values of forces (gradients) in geometry optimization (array[3]) hessian -- elements of the force constant matrix (array[1]) homos -- molecular orbital indices of HOMO(s) (array[1]) mocoeffs -- molecular orbital coefficients (list of arrays[2]) moenergies -- molecular orbital energies (list of arrays[1], eV) moments -- molecular multipole moments (list of arrays[], a.u.)
mosyms -- orbital symmetries (list of lists) mpenergies -- molecular electronic energies with Møller-Plesset corrections (array[2], eV) mult -- multiplicity of the system (integer) natom -- number of atoms (integer) nbasis -- number of basis functions (integer) nmo -- number of molecular orbitals (integer) nocoeffs -- natural orbital coefficients (array[2]) nooccnos -- natural orbital occupation numbers (array[1]) optdone -- flags whether an optimization has converged (Boolean) scancoords -- geometries of each scan step (array[3], angstroms) scanenergies -- energies of potential energy surface (list) scannames -- names of variables scanned (list of strings) scanparm -- values of parameters in potential energy surface (list of tuples) scfenergies -- molecular electronic energies after SCF (Hartree-Fock, DFT) (array[1], eV) scftargets -- targets for convergence of the SCF (array[2]) scfvalues -- current values for convergence of the SCF (list of arrays[2]) temperature -- temperature used for Thermochemistry (float, kelvin) vibanharms -- vibrational anharmonicity constants (array[2], 1/cm) vibdisps -- cartesian displacement vectors (array[3], delta angstrom) vibfreqs -- vibrational frequencies (array[1], 1/cm) vibirs -- IR intensities (array[1], km/mol) vibramans -- Raman intensities (array[1], A^4/Da) vibsyms -- symmetries of vibrations (list of strings) (1) The term 'array' refers to a numpy array (2) The number of dimensions of an array is given in square brackets (3) Python indexes arrays/lists starting at zero, so if homos==[10], then the 11th molecular orbital is the HOMO """ # The expected types for all supported attributes.
_attrtypes = { "aonames": list, "aooverlaps": numpy.ndarray, "atombasis": list, "atomcharges": dict, "atomcoords": numpy.ndarray, "atommasses": numpy.ndarray, "atomnos": numpy.ndarray, "atomspins": dict, "ccenergies": numpy.ndarray, "charge": int, "coreelectrons": numpy.ndarray, "enthalpy": float, "entropy": float, "etenergies": numpy.ndarray, "etoscs": numpy.ndarray, "etrotats": numpy.ndarray, "etsecs": list, "etsyms": list, "freeenergy": float, "fonames": list, "fooverlaps": numpy.ndarray, "fragnames": list, "frags": list, 'gbasis': list, "geotargets": numpy.ndarray, "geovalues": numpy.ndarray, "grads": numpy.ndarray, "hessian": numpy.ndarray, "homos": numpy.ndarray, "mocoeffs": list, "moenergies": list, "moments": list, "mosyms": list, "mpenergies": numpy.ndarray, "mult": int, "natom": int, "nbasis": int, "nmo": int, "nocoeffs": numpy.ndarray, "nooccnos": numpy.ndarray, "optdone": bool, "scancoords": numpy.ndarray, "scanenergies": list, "scannames": list, "scanparm": list, "scfenergies": numpy.ndarray, "scftargets": numpy.ndarray, "scfvalues": list, "temperature": float, "vibanharms": numpy.ndarray, "vibdisps": numpy.ndarray, "vibfreqs": numpy.ndarray, "vibirs": numpy.ndarray, "vibramans": numpy.ndarray, "vibsyms": list, } # The name of all attributes can be generated from the dictionary above. _attrlist = sorted(_attrtypes.keys()) # Arrays are double precision by default, but these will be integer arrays. _intarrays = ['atomnos', 'coreelectrons', 'homos'] # Attributes that should be lists of arrays (double precision). _listsofarrays = ['mocoeffs', 'moenergies', 'moments', 'scfvalues'] # Attributes that should be dictionaries of arrays (double precision). _dictsofarrays = ["atomcharges", "atomspins"] def __init__(self, attributes={}): """Initialize the cclibData object. Normally called in the parse() method of a Logfile subclass. 
Inputs: attributes - optional dictionary of attributes to load as data """ if attributes: self.setattributes(attributes) def listify(self): """Converts all attributes that are arrays or lists/dicts of arrays to lists.""" attrlist = [k for k in self._attrlist if hasattr(self, k)] for k in attrlist: v = self._attrtypes[k] if v == numpy.ndarray: setattr(self, k, getattr(self, k).tolist()) elif v == list and k in self._listsofarrays: setattr(self, k, [x.tolist() for x in getattr(self, k)]) elif v == dict and k in self._dictsofarrays: items = getattr(self, k).items() pairs = [(key, val.tolist()) for key, val in items] setattr(self, k, dict(pairs)) def arrayify(self): """Converts appropriate attributes to arrays or lists/dicts of arrays.""" attrlist = [k for k in self._attrlist if hasattr(self, k)] for k in attrlist: v = self._attrtypes[k] precision = 'd' if k in self._intarrays: precision = 'i' if v == numpy.ndarray: a = getattr(self, k) setattr(self, k, numpy.array(getattr(self, k), precision)) elif v == list and k in self._listsofarrays: setattr(self, k, [numpy.array(x, precision) for x in getattr(self, k)]) elif v == dict and k in self._dictsofarrays: items = getattr(self, k).items() pairs = [(key, numpy.array(val, precision)) for key, val in items] setattr(self, k, dict(pairs)) def getattributes(self, tolists=False): """Returns a dictionary of existing data attributes. Inputs: tolists - flag to convert attributes to lists where applicable """ if tolists: self.listify() attributes = {} for attr in self._attrlist: if hasattr(self, attr): attributes[attr] = getattr(self, attr) if tolists: self.arrayify() return attributes def setattributes(self, attributes): """Sets data attributes given in a dictionary.
Inputs: attributes - dictionary of attributes to set Outputs: invalid - list of attributes names that were not set, which means they are not specified in self._attrlist """ if type(attributes) is not dict: raise TypeError("attributes must be in a dictionary") valid = [a for a in attributes if a in self._attrlist] invalid = [a for a in attributes if a not in self._attrlist] for attr in valid: setattr(self, attr, attributes[attr]) self.arrayify() self.typecheck() return invalid def typecheck(self): """Check the types of all attributes. If an attribute does not match the expected type, then attempt to convert; if that fails, only then raise a TypeError. """ self.arrayify() for attr in [a for a in self._attrlist if hasattr(self, a)]: val = getattr(self, attr) if type(val) == self._attrtypes[attr]: continue try: val = self._attrtypes[attr](val) except ValueError: args = (attr, type(val), self._attrtypes[attr]) raise TypeError("attribute %s is %s instead of %s and could not be converted" % args) class ccData_optdone_bool(ccData): """This is the version of ccData where optdone is a Boolean.""" def __init__(self, *args, **kwargs): super(ccData_optdone_bool, self).__init__(*args, **kwargs) self._attrtypes['optdone'] = bool def setattributes(self, *args, **kwargs): invalid = super(ccData_optdone_bool, self).setattributes(*args, **kwargs) # Reduce optdone to a Boolean, because it will be parsed as a list. If this list has any element, # it means that there was an optimized structure and optdone should be True. if hasattr(self, 'optdone'): self.optdone = len(self.optdone) > 0 # Return the unset attribute names, as the base class does. return invalid cclib-1.3.1/src/cclib/parser/orcaparser.py0000644000175100016050000010565612425223271020340 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages.
# # Copyright (C) 2007-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Parser for ORCA output files""" from __future__ import print_function import numpy from . import logfileparser from . import utils class ORCA(logfileparser.Logfile): """An ORCA log file.""" def __init__(self, *args, **kwargs): # Call the __init__ method of the superclass super(ORCA, self).__init__(logname="ORCA", *args, **kwargs) def __str__(self): """Return a string representation of the object.""" return "ORCA log file %s" % (self.filename) def __repr__(self): """Return a representation of the object.""" return 'ORCA("%s")' % (self.filename) def normalisesym(self, label): """Use standard symmetry labels instead of Gaussian labels. To normalise: (1) If label is one of [SG, PI, PHI, DLTA], replace by [sigma, pi, phi, delta] (2) replace any G or U by their lowercase equivalent >>> sym = Gaussian("dummyfile").normalisesym >>> labels = ['A1', 'AG', 'A1G', "SG", "PI", "PHI", "DLTA", 'DLTU', 'SGG'] >>> map(sym, labels) ['A1', 'Ag', 'A1g', 'sigma', 'pi', 'phi', 'delta', 'delta.u', 'sigma.g'] """ def before_parsing(self): # A geometry optimization is started only when # we parse a cycle (so it will be larger than zero(). 
self.gopt_cycle = 0 # Keep track of whether this is a relaxed scan calculation self.is_relaxed_scan = False def extract(self, inputfile, line): """Extract information from the file object inputfile.""" if line[0:15] == "Number of atoms": natom = int(line.split()[-1]) self.set_attribute('natom', natom) if line[1:13] == "Total Charge": charge = int(line.split()[-1]) self.set_attribute('charge', charge) line = next(inputfile) mult = int(line.split()[-1]) self.set_attribute('mult', mult) # SCF convergence output begins with: # # -------------- # SCF ITERATIONS # -------------- # # However, there are two common formats which need to be handled, implemented as separate functions. if "SCF ITERATIONS" in line: self.skip_line(inputfile, 'dashes') line = next(inputfile) colums = line.split() if colums[1] == "Energy": self.parse_scf_condensed_format(inputfile, colums) elif colums[1] == "Starting": self.parse_scf_expanded_format(inputfile, colums) # Information about the final iteration, which also includes the convergence # targets and the convergence values, is printed separately, in a section like this: # # ***************************************************** # * SUCCESS * # * SCF CONVERGED AFTER 9 CYCLES * # ***************************************************** # # ... # # Total Energy : -382.04963064 Eh -10396.09898 eV # # ... # # ------------------------- ---------------- # FINAL SINGLE POINT ENERGY -382.049630637 # ------------------------- ---------------- # # We cannot use this last message as a stop condition in general, because # often there is vibrational output before it. So we use the 'Total Energy' # line. However, what comes after that is different for single point calculations # and in the inner steps of geometry optimizations. 
if "SCF CONVERGED AFTER" in line: if not hasattr(self, "scfenergies"): self.scfenergies = [] if not hasattr(self, "scfvalues"): self.scfvalues = [] if not hasattr(self, "scftargets"): self.scftargets = [] while not "Total Energy :" in line: line = next(inputfile) energy = float(line.split()[5]) self.scfenergies.append(energy) self._append_scfvalues_scftargets(inputfile, line) # Sometimes the SCF does not converge, but this does not halt # the run (like in bug 3184890). In this case, we should # remain consistent and use the energy from the last reported # SCF cycle. In this case, ORCA prints a banner like this: # # ***************************************************** # * ERROR * # * SCF NOT CONVERGED AFTER 8 CYCLES * # ***************************************************** if "SCF NOT CONVERGED AFTER" in line: if not hasattr(self, "scfenergies"): self.scfenergies = [] if not hasattr(self, "scfvalues"): self.scfvalues = [] if not hasattr(self, "scftargets"): self.scftargets = [] energy = self.scfvalues[-1][-1][0] self.scfenergies.append(energy) self._append_scfvalues_scftargets(inputfile, line) # The convergence targets for geometry optimizations are printed at the # beginning of the output, although the order and their description is # different than later on. So, try to standardize the names of the criteria # and save them for later so that we can get the order right. # # ***************************** # * Geometry Optimization Run * # ***************************** # # Geometry optimization settings: # Update method Update .... BFGS # Choice of coordinates CoordSys .... Redundant Internals # Initial Hessian InHess .... Almoef's Model # # Convergence Tolerances: # Energy Change TolE .... 5.0000e-06 Eh # Max. Gradient TolMAXG .... 3.0000e-04 Eh/bohr # RMS Gradient TolRMSG .... 1.0000e-04 Eh/bohr # Max. Displacement TolMAXD .... 4.0000e-03 bohr # RMS Displacement TolRMSD ....
2.0000e-03 bohr # if line[25:50] == "Geometry Optimization Run": stars = next(inputfile) blank = next(inputfile) line = next(inputfile) while line[0:23] != "Convergence Tolerances:": line = next(inputfile) if hasattr(self, 'geotargets'): self.logger.warning('The geotargets attribute should not exist yet. There is a problem in the parser.') self.geotargets = [] self.geotargets_names = [] # There should always be five tolerance values printed here. for i in range(5): line = next(inputfile) name = line[:25].strip().lower().replace('.','').replace('displacement', 'step') target = float(line.split()[-2]) self.geotargets_names.append(name) self.geotargets.append(target) # The convergence targets for relaxed surface scan steps are printed at the # beginning of the output, although the order and their description is # different than later on. So, try to standardize the names of the criteria # and save them for later so that we can get the order right. # # ************************************************************* # * RELAXED SURFACE SCAN STEP 12 * # * * # * Dihedral ( 11, 10, 3, 4) : 180.00000000 * # ************************************************************* # # Geometry optimization settings: # Update method Update .... BFGS # Choice of coordinates CoordSys .... Redundant Internals # Initial Hessian InHess .... Almoef's Model # # Convergence Tolerances: # Energy Change TolE .... 5.0000e-06 Eh # Max. Gradient TolMAXG .... 3.0000e-04 Eh/bohr # RMS Gradient TolRMSG .... 1.0000e-04 Eh/bohr # Max. Displacement TolMAXD .... 4.0000e-03 bohr # RMS Displacement TolRMSD .... 2.0000e-03 bohr if line[25:50] == "RELAXED SURFACE SCAN STEP": self.is_relaxed_scan = True blank = next(inputfile) info = next(inputfile) stars = next(inputfile) blank = next(inputfile) line = next(inputfile) while line[0:23] != "Convergence Tolerances:": line = next(inputfile) self.geotargets = [] self.geotargets_names = [] # There should always be five tolerance values printed here. 
for i in range(5): line = next(inputfile) name = line[:25].strip().lower().replace('.','').replace('displacement', 'step') target = float(line.split()[-2]) self.geotargets_names.append(name) self.geotargets.append(target) # After each geometry optimization step, ORCA prints the current convergence # parameters and the targets (again), so it is a good idea to check that they # have not changed. Note that the order of these criteria here are different # than at the beginning of the output, so make use of the geotargets_names created # before and save the new geovalues in correct order. # # ----------------------|Geometry convergence|--------------------- # Item value Tolerance Converged # ----------------------------------------------------------------- # Energy change 0.00006021 0.00000500 NO # RMS gradient 0.00031313 0.00010000 NO # RMS step 0.01596159 0.00200000 NO # MAX step 0.04324586 0.00400000 NO # .................................................... # Max(Bonds) 0.0218 Max(Angles) 2.48 # Max(Dihed) 0.00 Max(Improp) 0.00 # ----------------------------------------------------------------- # if line[33:53] == "Geometry convergence": if not hasattr(self, "geovalues"): self.geovalues = [] headers = next(inputfile) dashes = next(inputfile) names = [] values = [] targets = [] line = next(inputfile) while list(set(line.strip())) != ["."]: name = line[10:28].strip().lower() value = float(line.split()[2]) target = float(line.split()[3]) names.append(name) values.append(value) targets.append(target) line = next(inputfile) # The energy change is normally not printed in the first iteration, because # there was no previous energy -- in that case assume zero, but check that # no previous geovalues were parsed. 
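# --- Illustrative sketch (not part of cclib) ---------------------------------
# The reordering described above can be tried in isolation. This is a minimal
# sketch under assumptions: the helper name reorder_geovalues, the
# whitespace-based row parsing, and the sample rows (including the made-up
# MAX gradient value) are hypothetical and exist only for this example.
def reorder_geovalues(table_rows, target_names):
    """Return the convergence values in the order given by target_names.

    table_rows are the rows of a 'Geometry convergence' table, up to but not
    including the dotted separator. Each row is assumed to read
    '<two-word name> <value> <tolerance> <YES/NO>'.
    """
    parsed = {}
    for row in table_rows:
        tokens = row.split()
        name = " ".join(tokens[:2]).lower()
        parsed[name] = float(tokens[2])

    ordered = []
    for name in target_names:
        if name == "energy change" and name not in parsed:
            # No previous energy exists in the first step, so assume zero.
            ordered.append(0.0)
        else:
            ordered.append(parsed[name])
    return ordered


sample_rows = [
    "          RMS gradient        0.00031313            0.00010000      NO",
    "          MAX gradient        0.00080000            0.00030000      NO",
    "          RMS step            0.01596159            0.00200000      NO",
    "          MAX step            0.04324586            0.00400000      NO",
]
sample_names = ["energy change", "max gradient", "rms gradient", "max step", "rms step"]
sample_geovalues = reorder_geovalues(sample_rows, sample_names)
# --- End of illustrative sketch ----------------------------------------------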
            newvalues = []
            for i, n in enumerate(self.geotargets_names):
                if (n == "energy change") and (n not in names):
                    if not self.is_relaxed_scan:
                        assert len(self.geovalues) == 0
                    newvalues.append(0.0)
                else:
                    newvalues.append(values[names.index(n)])
                    assert targets[names.index(n)] == self.geotargets[i]

            self.geovalues.append(newvalues)

        # If this is not an optimization, determine the structure used.
        if line[0:21] == "CARTESIAN COORDINATES" and not hasattr(self, "atomcoords"):

            self.skip_line(inputfile, 'dashes')

            atomnos = []
            atomcoords = []
            line = next(inputfile)
            while len(line) > 1:
                broken = line.split()
                atomnos.append(self.table.number[broken[0]])
                atomcoords.append(list(map(float, broken[1:4])))
                line = next(inputfile)

            self.set_attribute('natom', len(atomnos))
            self.set_attribute('atomnos', atomnos)
            self.atomcoords = [atomcoords]

        # There's always a banner announcing the next geometry optimization cycle,
        # which looks something like this:
        #
        # *************************************************************
        # * GEOMETRY OPTIMIZATION CYCLE 2 *
        # *************************************************************
        if "GEOMETRY OPTIMIZATION CYCLE" in line:

            # Keep track of the current cycle just in case, because some things
            # are printed differently inside the first/last and other cycles.
self.gopt_cycle = int(line.split()[4]) self.skip_lines(inputfile, ['s', 'd', 'text', 'd']) if not hasattr(self,"atomcoords"): self.atomcoords = [] atomnos = [] atomcoords = [] for i in range(self.natom): line = next(inputfile) broken = line.split() atomnos.append(self.table.number[broken[0]]) atomcoords.append(list(map(float, broken[1:4]))) self.atomcoords.append(atomcoords) self.set_attribute('atomnos', atomnos) if line[21:68] == "FINAL ENERGY EVALUATION AT THE STATIONARY POINT": if not hasattr(self, 'optdone'): self.optdone = [] self.optdone.append(len(self.atomcoords)) self.skip_lines(inputfile, ['text', 's', 'd', 'text', 'd']) atomcoords = [] for i in range(self.natom): line = next(inputfile) broken = line.split() atomcoords.append(list(map(float, broken[1:4]))) self.atomcoords.append(atomcoords) if "The optimization did not converge" in line: if not hasattr(self, 'optdone'): self.optdone = [] if line[0:16] == "ORBITAL ENERGIES": self.skip_lines(inputfile, ['d', 'text', 'text']) self.moenergies = [[]] self.homos = [[0]] line = next(inputfile) while len(line) > 20: #restricted calcs are terminated by ------ info = line.split() self.moenergies[0].append(float(info[3])) if float(info[1]) > 0.00: #might be 1 or 2, depending on restricted-ness self.homos[0] = int(info[0]) line = next(inputfile) line = next(inputfile) #handle beta orbitals if line[17:35] == "SPIN DOWN ORBITALS": text = next(inputfile) self.moenergies.append([]) self.homos.append(0) line = next(inputfile) while len(line) > 20: #actually terminated by ------ info = line.split() self.moenergies[1].append(float(info[3])) if float(info[1]) == 1.00: self.homos[1] = int(info[0]) line = next(inputfile) # So nbasis was parsed at first with the first pattern, but it turns out that # semiempirical methods (at least AM1 as reported by Julien Idé) do not use this. # For this reason, also check for the second patterns, and use it as an assert # if nbasis was already parsed. 
Regression PCB_1_122.out covers this test case. if line[1:32] == "# of contracted basis functions": self.set_attribute('nbasis', int(line.split()[-1])) if line[1:27] == "Basis Dimension Dim": self.set_attribute('nbasis', int(line.split()[-1])) if line[0:14] == "OVERLAP MATRIX": self.skip_line(inputfile, 'dashes') self.aooverlaps = numpy.zeros( (self.nbasis, self.nbasis), "d") for i in range(0, self.nbasis, 6): self.updateprogress(inputfile, "Overlap") header = next(inputfile) size = len(header.split()) for j in range(self.nbasis): line = next(inputfile) broken = line.split() self.aooverlaps[j, i:i+size] = list(map(float, broken[1:size+1])) # Molecular orbital coefficients. # This is also where atombasis is parsed. if line[0:18] == "MOLECULAR ORBITALS": self.skip_line(inputfile, 'dashes') mocoeffs = [ numpy.zeros((self.nbasis, self.nbasis), "d") ] self.aonames = [] self.atombasis = [] for n in range(self.natom): self.atombasis.append([]) for spin in range(len(self.moenergies)): if spin == 1: self.skip_line(inputfile, 'blank') mocoeffs.append(numpy.zeros((self.nbasis, self.nbasis), "d")) for i in range(0, self.nbasis, 6): self.updateprogress(inputfile, "Coefficients") self.skip_lines(inputfile, ['numbers', 'energies', 'occs']) dashes = next(inputfile) broken = dashes.split() size = len(broken) for j in range(self.nbasis): line = next(inputfile) broken = line.split() #only need this on the first time through if spin == 0 and i == 0: atomname = line[3:5].split()[0] num = int(line[0:3]) orbital = broken[1].upper() self.aonames.append("%s%i_%s"%(atomname, num+1, orbital)) self.atombasis[num].append(j) temp = [] vals = line[16:-1] #-1 to remove the last blank space for k in range(0, len(vals), 10): temp.append(float(vals[k:k+10])) mocoeffs[spin][i:i+size, j] = temp self.mocoeffs = mocoeffs if line[0:18] == "TD-DFT/TDA EXCITED": # Could be singlets or triplets if line.find("SINGLETS") >= 0: sym = "Singlet" elif line.find("TRIPLETS") >= 0: sym = "Triplet" else: sym = "Not 
specified" if not hasattr(self, "etenergies"): self.etsecs = [] self.etenergies = [] self.etsyms = [] lookup = {'a':0, 'b':1} line = next(inputfile) while line.find("STATE") < 0: line = next(inputfile) # Contains STATE or is blank while line.find("STATE") >= 0: broken = line.split() self.etenergies.append(float(broken[-2])) self.etsyms.append(sym) line = next(inputfile) sec = [] # Contains SEC or is blank while line.strip(): start = line[0:8].strip() start = (int(start[:-1]), lookup[start[-1]]) end = line[10:17].strip() end = (int(end[:-1]), lookup[end[-1]]) contrib = float(line[35:47].strip()) sec.append([start, end, contrib]) line = next(inputfile) self.etsecs.append(sec) line = next(inputfile) # This will parse etoscs for TD calculations, but note that ORCA generally # prints two sets, one based on the length form of transition dipole moments, # the other based on the velocity form. Although these should be identical # in the basis set limit, in practice they are rarely the same. Here we will # effectively parse just the spectrum based on the length-form. if (line[25:44] == "ABSORPTION SPECTRUM" or \ line[9:28] == "ABSORPTION SPECTRUM") and not hasattr(self, "etoscs"): self.skip_lines(inputfile, ['d', 'header', 'header', 'd']) self.etoscs = [] for x in self.etsyms: osc = next(inputfile).split()[3] if osc == "spin": # "spin forbidden" osc = 0 else: osc = float(osc) self.etoscs.append(osc) if line[0:23] == "VIBRATIONAL FREQUENCIES": self.skip_lines(inputfile, ['d', 'b']) self.vibfreqs = numpy.zeros((3 * self.natom,),"d") for i in range(3 * self.natom): line = next(inputfile) self.vibfreqs[i] = float(line.split()[1]) if numpy.any(self.vibfreqs[0:6] != 0): msg = "Modes corresponding to rotations/translations " msg += "may be non-zero." 
self.logger.warning(msg) self.vibfreqs = self.vibfreqs[6:] if line[0:12] == "NORMAL MODES": """ Format: NORMAL MODES ------------ These modes are the cartesian displacements weighted by the diagonal matrix M(i,i)=1/sqrt(m[i]) where m[i] is the mass of the displaced atom Thus, these vectors are normalized but *not* orthogonal 0 1 2 3 4 5 0 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 2 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... """ self.vibdisps = numpy.zeros(( 3 * self.natom, self.natom, 3), "d") self.skip_lines(inputfile, ['d', 'b', 'text', 'text', 'text', 'b']) for mode in range(0, 3 * self.natom, 6): header = next(inputfile) for atom in range(self.natom): x = next(inputfile).split()[1:] y = next(inputfile).split()[1:] z = next(inputfile).split()[1:] self.vibdisps[mode:mode + 6, atom, 0] = x self.vibdisps[mode:mode + 6, atom, 1] = y self.vibdisps[mode:mode + 6, atom, 2] = z self.vibdisps = self.vibdisps[6:] if line[0:11] == "IR SPECTRUM": self.skip_lines(inputfile, ['d', 'b', 'header', 'd']) self.vibirs = numpy.zeros((3 * self.natom,),"d") line = next(inputfile) while len(line) > 2: num = int(line[0:4]) self.vibirs[num] = float(line.split()[2]) line = next(inputfile) self.vibirs = self.vibirs[6:] if line[0:14] == "RAMAN SPECTRUM": self.skip_lines(inputfile, ['d', 'b', 'header', 'd']) self.vibramans = numpy.zeros((3 * self.natom,),"d") line = next(inputfile) while len(line) > 2: num = int(line[0:4]) self.vibramans[num] = float(line.split()[2]) line = next(inputfile) self.vibramans = self.vibramans[6:] # ORCA will print atomic charges along with the spin populations, # so care must be taken about choosing the proper column. # Population analyses are performed usually only at the end # of a geometry optimization or other run, so we want to # leave just the final atom charges. 
# Here is an example for Mulliken charges: # -------------------------------------------- # MULLIKEN ATOMIC CHARGES AND SPIN POPULATIONS # -------------------------------------------- # 0 H : 0.126447 0.002622 # 1 C : -0.613018 -0.029484 # 2 H : 0.189146 0.015452 # 3 H : 0.320041 0.037434 # ... # Sum of atomic charges : -0.0000000 # Sum of atomic spin populations: 1.0000000 if line[:23] == "MULLIKEN ATOMIC CHARGES": has_spins = "AND SPIN POPULATIONS" in line if not hasattr(self, "atomcharges"): self.atomcharges = { } if has_spins and not hasattr(self, "atomspins"): self.atomspins = {} self.skip_line(inputfile, 'dashes') charges = [] if has_spins: spins = [] line = next(inputfile) while line[:21] != "Sum of atomic charges": charges.append(float(line[8:20])) if has_spins: spins.append(float(line[20:])) line = next(inputfile) self.atomcharges["mulliken"] = charges if has_spins: self.atomspins["mulliken"] = spins # Things are the same for Lowdin populations, except that the sums # are not printed (there is a blank line at the end). if line[:22] == "LOEWDIN ATOMIC CHARGES": has_spins = "AND SPIN POPULATIONS" in line if not hasattr(self, "atomcharges"): self.atomcharges = { } if has_spins and not hasattr(self, "atomspins"): self.atomspins = {} self.skip_line(inputfile, 'dashes') charges = [] if has_spins: spins = [] line = next(inputfile) while line.strip(): charges.append(float(line[8:20])) if has_spins: spins.append(float(line[20:])) line = next(inputfile) self.atomcharges["lowdin"] = charges if has_spins: self.atomspins["lowdin"] = spins # It is not stated explicitely, but the dipole moment components printed by ORCA # seem to be in atomic units, so they will need to be converted. Also, they # are most probably calculated with respect to the origin . 
# # ------------- # DIPOLE MOMENT # ------------- # X Y Z # Electronic contribution: 0.00000 -0.00000 -0.00000 # Nuclear contribution : 0.00000 0.00000 0.00000 # ----------------------------------------- # Total Dipole Moment : 0.00000 -0.00000 -0.00000 # ----------------------------------------- # Magnitude (a.u.) : 0.00000 # Magnitude (Debye) : 0.00000 # if line.strip() == "DIPOLE MOMENT": self.skip_lines(inputfile, ['d', 'XYZ', 'electronic', 'nuclear', 'd']) total = next(inputfile) assert "Total Dipole Moment" in total reference = [0.0, 0.0, 0.0] dipole = numpy.array([float(d) for d in total.split()[-3:]]) dipole = utils.convertor(dipole, "ebohr", "Debye") if not hasattr(self, 'moments'): self.moments = [reference, dipole] else: try: assert numpy.all(self.moments[1] == dipole) except AssertionError: self.logger.warning('Overwriting previous multipole moments with new values') self.moments = [reference, dipole] def parse_scf_condensed_format(self, inputfile, line): """ Parse the SCF convergence information in condensed format """ # This is what it looks like # ITER Energy Delta-E Max-DP RMS-DP [F,P] Damp # *** Starting incremental Fock matrix formation *** # 0 -384.5203638934 0.000000000000 0.03375012 0.00223249 0.1351565 0.7000 # 1 -384.5792776162 -0.058913722842 0.02841696 0.00175952 0.0734529 0.7000 # ***Turning on DIIS*** # 2 -384.6074211837 -0.028143567475 0.04968025 0.00326114 0.0310435 0.0000 # 3 -384.6479682063 -0.040547022616 0.02097477 0.00121132 0.0361982 0.0000 # 4 -384.6571124353 -0.009144228947 0.00576471 0.00035160 0.0061205 0.0000 # 5 -384.6574659959 -0.000353560584 0.00191156 0.00010160 0.0025838 0.0000 # 6 -384.6574990782 -0.000033082375 0.00052492 0.00003800 0.0002061 0.0000 # 7 -384.6575005762 -0.000001497987 0.00020257 0.00001146 0.0001652 0.0000 # 8 -384.6575007321 -0.000000155848 0.00008572 0.00000435 0.0000745 0.0000 # **** Energy Check signals convergence **** assert line[2] == "Delta-E" assert line[3] == "Max-DP" if not hasattr(self, 
"scfvalues"): self.scfvalues = [] self.scfvalues.append([]) # Try to keep track of the converger (NR, DIIS, SOSCF, etc.). diis_active = True while not line == []: if 'Newton-Raphson' in line: diis_active = False elif 'SOSCF' in line: diis_active = False elif line[0].isdigit() and diis_active: energy = float(line[1]) deltaE = float(line[2]) maxDP = float(line[3]) rmsDP = float(line[4]) self.scfvalues[-1].append([deltaE, maxDP, rmsDP]) elif line[0].isdigit() and not diis_active: energy = float(line[1]) deltaE = float(line[2]) maxDP = float(line[5]) rmsDP = float(line[6]) self.scfvalues[-1].append([deltaE, maxDP, rmsDP]) line = next(inputfile).split() def parse_scf_expanded_format(self, inputfile, line): """ Parse SCF convergence when in expanded format. """ # The following is an example of the format # ----------------------------------------- # # *** Starting incremental Fock matrix formation *** # # ---------------------------- # ! ITERATION 0 ! # ---------------------------- # Total Energy : -377.960836651297 Eh # Energy Change : -377.960836651297 Eh # MAX-DP : 0.100175793695 # RMS-DP : 0.004437973661 # Actual Damping : 0.7000 # Actual Level Shift : 0.2500 Eh # Int. Num. El. : 43.99982197 (UP= 21.99991099 DN= 21.99991099) # Exchange : -34.27550826 # Correlation : -2.02540957 # # # ---------------------------- # ! ITERATION 1 ! # ---------------------------- # Total Energy : -378.118458080109 Eh # Energy Change : -0.157621428812 Eh # MAX-DP : 0.053240648588 # RMS-DP : 0.002375092508 # Actual Damping : 0.7000 # Actual Level Shift : 0.2500 Eh # Int. Num. El. : 43.99994143 (UP= 21.99997071 DN= 21.99997071) # Exchange : -34.00291075 # Correlation : -2.01607243 # # ***Turning on DIIS*** # # ---------------------------- # ! ITERATION 2 ! # ---------------------------- # .... 
# if not hasattr(self, "scfvalues"): self.scfvalues = [] self.scfvalues.append([]) line = "Foo" # dummy argument to enter loop while line.find("******") < 0: line = next(inputfile) info = line.split() if len(info) > 1 and info[1] == "ITERATION": dashes = next(inputfile) energy_line = next(inputfile).split() energy = float(energy_line[3]) deltaE_line = next(inputfile).split() deltaE = float(deltaE_line[3]) if energy == deltaE: deltaE = 0 maxDP_line = next(inputfile).split() maxDP = float(maxDP_line[2]) rmsDP_line = next(inputfile).split() rmsDP = float(rmsDP_line[2]) self.scfvalues[-1].append([deltaE, maxDP, rmsDP]) return # end of parse_scf_expanded_format def _append_scfvalues_scftargets(self, inputfile, line): # The SCF convergence targets are always printed after this, but apparently # not all of them always -- for example the RMS Density is missing for geometry # optimization steps. So, assume the previous value is still valid if it is # not found. For additional certainty, assert that the other targets are unchanged. 
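# --- Illustrative sketch (not part of cclib) ---------------------------------
# The fallback described above can be shown on its own. This is a minimal
# sketch under assumptions: the helper names split_value_and_target and
# summarize_scf are hypothetical, and the sample lines are constructed to
# match the token positions (4 and 7) that the parser indexes, not copied
# from a real ORCA output.
def split_value_and_target(line):
    """Return (value, target) from a 'Last ... change' summary line."""
    tokens = line.split()
    # tokens: Last, <quantity>, change, ..., <value>, Tolerance, :, <target>
    return float(tokens[4]), float(tokens[7])


def summarize_scf(lines, previous_rms):
    """Collect [deltaE, maxDP, rmsDP] values and targets from summary lines.

    previous_rms is a (value, target) pair that is reused when the
    RMS-Density line is not printed.
    """
    found = {}
    for line in lines:
        for quantity in ("Energy", "MAX-Density", "RMS-Density"):
            if "Last %s change" % quantity in line:
                found[quantity] = split_value_and_target(line)
    rms = found.get("RMS-Density", previous_rms)
    values = [found["Energy"][0], found["MAX-Density"][0], rms[0]]
    targets = [found["Energy"][1], found["MAX-Density"][1], rms[1]]
    return values, targets


sample_summary = [
    "Last Energy change         ...   -1.5585e-07  Tolerance :   1.0000e-06",
    "Last MAX-Density change    ...    8.5720e-05  Tolerance :   1.0000e-05",
]
# The RMS-Density line is missing here, so the previous pair is carried over.
sample_values, sample_targets = summarize_scf(sample_summary, (4.35e-06, 1.0e-06))
# --- End of illustrative sketch ----------------------------------------------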
        while not "Last Energy change" in line:
            line = next(inputfile)
        deltaE_value = float(line.split()[4])
        deltaE_target = float(line.split()[7])
        line = next(inputfile)
        if "Last MAX-Density change" in line:
            maxDP_value = float(line.split()[4])
            maxDP_target = float(line.split()[7])
            line = next(inputfile)
            if "Last RMS-Density change" in line:
                rmsDP_value = float(line.split()[4])
                rmsDP_target = float(line.split()[7])
            else:
                rmsDP_value = self.scfvalues[-1][-1][2]
                rmsDP_target = self.scftargets[-1][2]
                assert deltaE_target == self.scftargets[-1][0]
                assert maxDP_target == self.scftargets[-1][1]
            self.scfvalues[-1].append([deltaE_value, maxDP_value, rmsDP_value])
            self.scftargets.append([deltaE_target, maxDP_target, rmsDP_target])


if __name__ == "__main__":
    import sys
    import doctest, orcaparser

    if len(sys.argv) == 1:
        doctest.testmod(orcaparser, verbose=False)

    # Parse the file whenever one is given, so that data is defined
    # before any requested attributes are printed.
    if len(sys.argv) >= 2:
        parser = orcaparser.ORCA(sys.argv[1])
        data = parser.parse()

    if len(sys.argv) > 2:
        for i in range(len(sys.argv[2:])):
            if hasattr(data, sys.argv[2 + i]):
                print(getattr(data, sys.argv[2 + i]))
cclib-1.3.1/src/cclib/parser/__init__.py0000644000175100016050000000237112467425265017741 0ustar kmlcclib00000000000000
# -*- coding: utf-8 -*-
#
# This file is part of cclib (http://cclib.github.io), a library for parsing
# and interpreting the results of computational chemistry packages.
#
# Copyright (C) 2006-2014, the cclib development team
#
# The library is free software, distributed under the terms of
# the GNU Lesser General Public version 2.1 or later. You should have
# received a copy of the license along with cclib. You can also access
# the full license online at http://www.gnu.org/copyleft/lgpl.html.

"""Contains parsers for all supported programs"""

# These import statements are added for the convenience of users...
# Rather than having to type: # from cclib.parser.gaussianparser import Gaussian # they can use: # from cclib.parser import Gaussian from .adfparser import ADF from .gamessparser import GAMESS from .gamessukparser import GAMESSUK from .gaussianparser import Gaussian from .jaguarparser import Jaguar from .molproparser import Molpro from .nwchemparser import NWChem from .orcaparser import ORCA from .psiparser import Psi from .qchemparser import QChem # This allow users to type: # from cclib.parser import ccopen # from cclib.parser import ccread from .ccopen import ccopen from .ccopen import ccread from .data import ccData cclib-1.3.1/src/cclib/parser/gaussianparser.py0000644000175100016050000017123312467423323021226 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Parser for Gaussian output files""" from __future__ import print_function import re import numpy from . import logfileparser from . import utils class Gaussian(logfileparser.Logfile): """A Gaussian 98/03 log file.""" def __init__(self, *args, **kwargs): # Call the __init__ method of the superclass super(Gaussian, self).__init__(logname="Gaussian", *args, **kwargs) def __str__(self): """Return a string representation of the object.""" return "Gaussian log file %s" % (self.filename) def __repr__(self): """Return a representation of the object.""" return 'Gaussian("%s")' % (self.filename) def normalisesym(self, label): """Use standard symmetry labels instead of Gaussian labels. 
        To normalise:
        (1) If label is one of [SG, PI, PHI, DLTA], replace by [sigma, pi, phi, delta]
        (2) replace any G or U by their lowercase equivalent

        >>> sym = Gaussian("dummyfile").normalisesym
        >>> labels = ['A1', 'AG', 'A1G', "SG", "PI", "PHI", "DLTA", 'DLTU', 'SGG']
        >>> list(map(sym, labels))
        ['A1', 'Ag', 'A1g', 'sigma', 'pi', 'phi', 'delta', 'delta.u', 'sigma.g']
        """
        # note: DLT must come after DLTA
        greek = [('SG', 'sigma'), ('PI', 'pi'), ('PHI', 'phi'),
                 ('DLTA', 'delta'), ('DLT', 'delta')]
        for k, v in greek:
            if label.startswith(k):
                tmp = label[len(k):]
                label = v
                if tmp:
                    label = v + "." + tmp

        ans = label.replace("U", "u").replace("G", "g")
        return ans

    def before_parsing(self):

        # Used to index self.scftargets[].
        SCFRMS, SCFMAX, SCFENERGY = list(range(3))

        # Flag for identifying Coupled Cluster runs.
        self.coupledcluster = False

        # Fragment number for counterpoise or fragment guess calculations
        # (normally zero).
        self.counterpoise = 0

        # Flag for identifying ONIOM calculations.
        self.oniom = False

    def after_parsing(self):

        # Correct the percent values in the etsecs in the case of
        # a restricted calculation. The following has the
        # effect of including each transition twice.
        if hasattr(self, "etsecs") and len(self.homos) == 1:
            new_etsecs = [[(x[0], x[1], x[2] * numpy.sqrt(2)) for x in etsec]
                          for etsec in self.etsecs]
            self.etsecs = new_etsecs
        if hasattr(self, "scanenergies"):
            self.scancoords = []
            self.scancoords = self.atomcoords

        if (hasattr(self, 'enthalpy') and hasattr(self, 'temperature')
                and hasattr(self, 'freeenergy')):
            self.set_attribute('entropy', (self.enthalpy - self.freeenergy) / self.temperature)

        # This bit is needed in order to trim coordinates that are printed a second time
        # at the end of geometry optimizations. Note that we need to do this for both atomcoords
        # and inputcoords.
        # The reason is that normally a standard orientation is printed and that
        # is what we parse into atomcoords, but inputcoords stores the input (unmodified) coordinates
        # and that is copied over to atomcoords if no standard orientation was printed, which happens
        # for example for jobs with no symmetry. This last step, however, is now generic for all parsers.
        # Perhaps then this part should also be generic code...
        # Regression that tests this: Gaussian03/cyclopropenyl.rhf.g03.cut.log
        if hasattr(self, 'optdone') and len(self.optdone) > 0:
            last_point = self.optdone[-1]
            if hasattr(self, 'atomcoords'):
                self.atomcoords = self.atomcoords[:last_point + 1]
            if hasattr(self, 'inputcoords'):
                self.inputcoords = self.inputcoords[:last_point + 1]

    def extract(self, inputfile, line):
        """Extract information from the file object inputfile."""

        # This block contains some general information as well as coordinates,
        # which could be parsed in the future:
        #
        # Symbolic Z-matrix:
        # Charge = 0 Multiplicity = 1
        # C 0.73465 0. 0.
        # C 1.93465 0. 0.
        # C
        # ...
        #
        # It also lists fragments, if there are any, which is potentially valuable:
        #
        # Symbolic Z-matrix:
        # Charge = 0 Multiplicity = 1 in supermolecule
        # Charge = 0 Multiplicity = 1 in fragment 1.
        # Charge = 0 Multiplicity = 1 in fragment 2.
        # B(Fragment=1) 0.06457 -0.0279 0.01364
        # H(Fragment=1) 0.03117 -0.02317 1.21604
        # ...
        #
        # Note, however, that currently we only parse information for the whole system
        # or supermolecule as Gaussian calls it.
        if line.strip() == "Symbolic Z-matrix:":

            self.updateprogress(inputfile, "Symbolic Z-matrix", self.fupdate)

            line = next(inputfile)
            while line.split()[0] == 'Charge':

                # For the supermolecule, we can parse the charge and multiplicity.
                regex = r".*=(.*)Mul.*=\s*-?(\d+).*"
                match = re.match(regex, line)
                assert match, "Something unusual about the line: '%s'" % line

                self.set_attribute('charge', int(match.groups()[0]))
                self.set_attribute('mult', int(match.groups()[1]))

                if line.split()[-2] == "fragment":
                    self.nfragments = int(line.split()[-1].strip('.'))

                if line.strip()[-13:] == "model system.":
                    self.nmodels = getattr(self, 'nmodels', 0) + 1

                line = next(inputfile)

            # The remaining part will allow us to get the atom count.
            # When coordinates are given, there is a blank line at the end, but if
            # there is a Z-matrix here, there will also be variables and we need to
            # stop at those to get the right atom count.
            # Also, in older versions there is no blank line (G98 regressions),
            # so we need to watch out for leaving the link.
            natom = 0
            while line.split() and not "Variables" in line and not "Leave Link" in line:
                natom += 1
                line = next(inputfile)
            self.set_attribute('natom', natom)

        # Continuing from above, there is not always a symbolic matrix, for example
        # if the Z-matrix was in the input file. In such cases, try to match the
        # line and get at the charge and multiplicity.
        #
        # Charge = 0 Multiplicity = 1 in supermolecule
        # Charge = 0 Multiplicity = 1 in fragment 1.
        # Charge = 0 Multiplicity = 1 in fragment 2.
        if line[1:7] == 'Charge' and line.find("Multiplicity") >= 0:

            self.updateprogress(inputfile, "Charge and Multiplicity", self.fupdate)

            if line.split()[-1] == "supermolecule" or not "fragment" in line:

                regex = r".*=(.*)Mul.*=\s*-?(\d+).*"
                match = re.match(regex, line)
                assert match, "Something unusual about the line: '%s'" % line

                self.set_attribute('charge', int(match.groups()[0]))
                self.set_attribute('mult', int(match.groups()[1]))

            if line.split()[-2] == "fragment":
                self.nfragments = int(line.split()[-1].strip('.'))

        # Number of atoms is also explicitly printed after the above.
if line[1:8] == "NAtoms=": self.updateprogress(inputfile, "Attributes", self.fupdate) natom = int(line.split()[1]) self.set_attribute('natom', natom) # Catch message about completed optimization. if line[1:23] == "Optimization completed": if not hasattr(self, 'optdone'): self.optdone = [] self.optdone.append(len(self.geovalues) - 1) # Catch message about stopped optimization (not converged). if line[1:21] == "Optimization stopped": if not hasattr(self, "optdone"): self.optdone = [] # Extract the atomic numbers and coordinates from the input orientation, # in the event the standard orientation isn't available. if line.find("Input orientation") > -1 or line.find("Z-Matrix orientation") > -1: # If this is a counterpoise calculation, this output means that # the supermolecule is now being considered, so we can set: self.counterpoise = 0 self.updateprogress(inputfile, "Attributes", self.cupdate) if not hasattr(self, "inputcoords"): self.inputcoords = [] self.inputatoms = [] self.skip_lines(inputfile, ['d', 'cols', 'cols', 'd']) atomcoords = [] line = next(inputfile) while list(set(line.strip())) != ["-"]: broken = line.split() self.inputatoms.append(int(broken[1])) atomcoords.append(list(map(float, broken[3:6]))) line = next(inputfile) self.inputcoords.append(atomcoords) self.set_attribute('atomnos', self.inputatoms) self.set_attribute('natom', len(self.inputatoms)) # Extract the atomic masses. 
# Typical section: # Isotopes and Nuclear Properties: #(Nuclear quadrupole moments (NQMom) in fm**2, nuclear magnetic moments (NMagM) # in nuclear magnetons) # # Atom 1 2 3 4 5 6 7 8 9 10 # IAtWgt= 12 12 12 12 12 1 1 1 12 12 # AtmWgt= 12.0000000 12.0000000 12.0000000 12.0000000 12.0000000 1.0078250 1.0078250 1.0078250 12.0000000 12.0000000 # NucSpn= 0 0 0 0 0 1 1 1 0 0 # AtZEff= -3.6000000 -3.6000000 -3.6000000 -3.6000000 -3.6000000 -1.0000000 -1.0000000 -1.0000000 -3.6000000 -3.6000000 # NQMom= 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 # NMagM= 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 2.7928460 2.7928460 2.7928460 0.0000000 0.0000000 # ... with blank lines dividing blocks of ten, and Leave Link 101 at the end. # This is generally parsed before coordinates, so atomnos is not defined. # Note that in Gaussian03 the comments are not there yet and the labels are different. if line.strip() == "Isotopes and Nuclear Properties:": if not hasattr(self, "atommasses"): self.atommasses = [] line = next(inputfile) while line[1:16] != "Leave Link 101": if line[1:8] == "AtmWgt=": self.atommasses.extend(list(map(float,line.split()[1:]))) line = next(inputfile) # Extract the atomic numbers and coordinates of the atoms. 
if line.strip() == "Standard orientation:": self.updateprogress(inputfile, "Attributes", self.cupdate) # If this is a counterpoise calculation, this output means that # the supermolecule is now being considered, so we can set: self.counterpoise = 0 if not hasattr(self, "atomcoords"): self.atomcoords = [] self.skip_lines(inputfile, ['d', 'cols', 'cols', 'd']) atomnos = [] atomcoords = [] line = next(inputfile) while list(set(line.strip())) != ["-"]: broken = line.split() atomnos.append(int(broken[1])) atomcoords.append(list(map(float, broken[-3:]))) line = next(inputfile) self.atomcoords.append(atomcoords) self.set_attribute('natom', len(atomnos)) self.set_attribute('atomnos', atomnos) # This is a bit of a hack for regression Gaussian09/BH3_fragment_guess.pop_minimal.log # to skip output for all fragments, assuming the supermolecule is always printed first. # Eventually we want to make this more general, or even better parse the output for # all fragment, but that will happen in a newer version of cclib. if line[1:16] == "Fragment guess:" and getattr(self, 'nfragments', 0) > 1: if not "full" in line: inputfile.file.seek(0, 2) # Another hack for regression Gaussian03/ortho_prod_prod_freq.log, which is an ONIOM job. # Basically for now we stop parsing after the output for the real system, because # currently we don't support changes in system size or fragments in cclib. When we do, # we will want to parse the model systems, too, and that is what nmodels could track. if "ONIOM: generating point" in line and line.strip()[-13:] == 'model system.' 
and getattr(self, 'nmodels', 0) > 0: inputfile.file.seek(0,2) # With the gfinput keyword, the atomic basis set functions are: # # AO basis set in the form of general basis input (Overlap normalization): # 1 0 # S 3 1.00 0.000000000000 # 0.7161683735D+02 0.1543289673D+00 # 0.1304509632D+02 0.5353281423D+00 # 0.3530512160D+01 0.4446345422D+00 # SP 3 1.00 0.000000000000 # 0.2941249355D+01 -0.9996722919D-01 0.1559162750D+00 # 0.6834830964D+00 0.3995128261D+00 0.6076837186D+00 # 0.2222899159D+00 0.7001154689D+00 0.3919573931D+00 # **** # 2 0 # S 3 1.00 0.000000000000 # 0.7161683735D+02 0.1543289673D+00 # ... # # The same is also printed when the gfprint keyword is used, but the # interstitial lines differ and there are no stars between atoms: # # AO basis set (Overlap normalization): # Atom C1 Shell 1 S 3 bf 1 - 1 0.509245180608 -2.664678875191 0.000000000000 # 0.7161683735D+02 0.1543289673D+00 # 0.1304509632D+02 0.5353281423D+00 # 0.3530512160D+01 0.4446345422D+00 # Atom C1 Shell 2 SP 3 bf 2 - 5 0.509245180608 -2.664678875191 0.000000000000 # 0.2941249355D+01 -0.9996722919D-01 0.1559162750D+00 # ... # ONIOM calculations result in basis sets being reported for atoms that are not in order of atom number, which breaks this code (line 390 relies on atoms coming in order) if line[1:13] == "AO basis set" and not self.oniom: self.gbasis = [] # For counterpoise fragment calculations, skip these lines. if self.counterpoise != 0: return atom_line = next(inputfile) self.gfprint = atom_line.split()[0] == "Atom" self.gfinput = not self.gfprint # Note how the shell information is on a separate line for gfinput, # whereas for gfprint it is on the same line as atom information.
if self.gfinput: shell_line = next(inputfile) shell = [] while len(self.gbasis) < self.natom: if self.gfprint: cols = atom_line.split() subshells = cols[4] ngauss = int(cols[5]) else: cols = shell_line.split() subshells = cols[0] ngauss = int(cols[1]) parameters = [] for ig in range(ngauss): line = next(inputfile) parameters.append(list(map(self.float, line.split()))) for iss, ss in enumerate(subshells): contractions = [] for param in parameters: exponent = param[0] coefficient = param[iss+1] contractions.append((exponent, coefficient)) subshell = (ss, contractions) shell.append(subshell) if self.gfprint: line = next(inputfile) if line.split()[0] == "Atom": atomnum = int(re.sub(r"\D", "", line.split()[1])) if atomnum == len(self.gbasis) + 2: self.gbasis.append(shell) shell = [] atom_line = line else: self.gbasis.append(shell) else: line = next(inputfile) if line.strip() == "****": self.gbasis.append(shell) shell = [] atom_line = next(inputfile) shell_line = next(inputfile) else: shell_line = line # Find the targets for SCF convergence (QM calcs). if line[1:44] == 'Requested convergence on RMS density matrix': if not hasattr(self, "scftargets"): self.scftargets = [] # The following can happen with ONIOM calculations, which mix SCF # and semi-empirical methods if type(self.scftargets) == type(numpy.array([])): self.scftargets = [] scftargets = [] # The RMS density matrix. scftargets.append(self.float(line.split('=')[1].split()[0])) line = next(inputfile) # The MAX density matrix. scftargets.append(self.float(line.strip().split('=')[1][:-1])) line = next(inputfile) # For G03, there's also the energy (not for G98). if line[1:10] == "Requested": scftargets.append(self.float(line.strip().split('=')[1][:-1])) self.scftargets.append(scftargets) # Extract SCF convergence information (QM calcs).
if line[1:10] == 'Cycle 1': if not hasattr(self, "scfvalues"): self.scfvalues = [] scfvalues = [] line = next(inputfile) while line.find("SCF Done") == -1: self.updateprogress(inputfile, "QM convergence", self.fupdate) if line.find(' E=') == 0: self.logger.debug(line) # RMSDP=3.74D-06 MaxDP=7.27D-05 DE=-1.73D-07 OVMax= 3.67D-05 # or # RMSDP=1.13D-05 MaxDP=1.08D-04 OVMax= 1.66D-04 if line.find(" RMSDP") == 0: parts = line.split() newlist = [self.float(x.split('=')[1]) for x in parts[0:2]] energy = 1.0 if len(parts) > 4: energy = parts[2].split('=')[1] if energy == "": energy = self.float(parts[3]) else: energy = self.float(energy) if len(self.scftargets[0]) == 3: # Only add the energy if it's a target criteria newlist.append(energy) scfvalues.append(newlist) try: line = next(inputfile) # May be interupted by EOF. except StopIteration: break self.scfvalues.append(scfvalues) # Extract SCF convergence information (AM1, INDO and other semi-empirical calcs). # The output (for AM1) looks like this: # Ext34=T Pulay=F Camp-King=F BShift= 0.00D+00 # It= 1 PL= 0.103D+01 DiagD=T ESCF= 31.564733 Diff= 0.272D+02 RMSDP= 0.152D+00. # It= 2 PL= 0.114D+00 DiagD=T ESCF= 7.265370 Diff=-0.243D+02 RMSDP= 0.589D-02. # ... # It= 11 PL= 0.184D-04 DiagD=F ESCF= 4.687669 Diff= 0.260D-05 RMSDP= 0.134D-05. # It= 12 PL= 0.105D-04 DiagD=F ESCF= 4.687669 Diff=-0.686D-07 RMSDP= 0.215D-05. # 4-point extrapolation. # It= 13 PL= 0.110D-05 DiagD=F ESCF= 4.687669 Diff=-0.111D-06 RMSDP= 0.653D-07. # Energy= 0.172272018655 NIter= 14. 
if line[1:4] == 'It=': scftargets = numpy.array([1E-7], "d") # This is the target value for the rms scfvalues = [[]] while line.find(" Energy") == -1: self.updateprogress(inputfile, "AM1 Convergence") if line[1:4] == "It=": parts = line.strip().split() scfvalues[0].append(self.float(parts[-1][:-1])) line = next(inputfile) # If an AM1 or INDO guess is used (Guess=INDO in the input, for example), # this will be printed after a single iteration, so that is the line # that should trigger a break from this loop. At least that's what we see # for regression Gaussian/Gaussian09/guessIndo_modified_ALT.out if line[:14] == " Initial guess": break # Attach the attributes to the object Only after the energy is found . if line.find(" Energy") == 0: self.scftargets = scftargets self.scfvalues = scfvalues # Note: this needs to follow the section where 'SCF Done' is used # to terminate a loop when extracting SCF convergence information. if line[1:9] == 'SCF Done': if not hasattr(self, "scfenergies"): self.scfenergies = [] self.scfenergies.append(utils.convertor(self.float(line.split()[4]), "hartree", "eV")) # gmagoon 5/27/09: added scfenergies reading for PM3 case # Example line: " Energy= -0.077520562724 NIter= 14." # See regression Gaussian03/QVGXLLKOCUKJST-UHFFFAOYAJmult3Fixed.out if line[1:8] == 'Energy=': if not hasattr(self, "scfenergies"): self.scfenergies = [] self.scfenergies.append(utils.convertor(self.float(line.split()[1]), "hartree", "eV")) # Total energies after Moller-Plesset corrections. # Second order correction is always first, so its first occurance # triggers creation of mpenergies (list of lists of energies). # Further MP2 corrections are appended as found. # # Example MP2 output line: # E2 = -0.9505918144D+00 EUMP2 = -0.28670924198852D+03 # Warning! 
this output line is subtly different for MP3/4/5 runs if "EUMP2" in line[27:34]: if not hasattr(self, "mpenergies"): self.mpenergies = [] self.mpenergies.append([]) mp2energy = self.float(line.split("=")[2]) self.mpenergies[-1].append(utils.convertor(mp2energy, "hartree", "eV")) # Example MP3 output line: # E3= -0.10518801D-01 EUMP3= -0.75012800924D+02 if line[34:39] == "EUMP3": mp3energy = self.float(line.split("=")[2]) self.mpenergies[-1].append(utils.convertor(mp3energy, "hartree", "eV")) # Example MP4 output lines: # E4(DQ)= -0.31002157D-02 UMP4(DQ)= -0.75015901139D+02 # E4(SDQ)= -0.32127241D-02 UMP4(SDQ)= -0.75016013648D+02 # E4(SDTQ)= -0.32671209D-02 UMP4(SDTQ)= -0.75016068045D+02 # Energy for most substitutions is used only (SDTQ by default) if line[34:42] == "UMP4(DQ)": mp4energy = self.float(line.split("=")[2]) line = next(inputfile) if line[34:43] == "UMP4(SDQ)": mp4energy = self.float(line.split("=")[2]) line = next(inputfile) if line[34:44] == "UMP4(SDTQ)": mp4energy = self.float(line.split("=")[2]) self.mpenergies[-1].append(utils.convertor(mp4energy, "hartree", "eV")) # Example MP5 output line: # DEMP5 = -0.11048812312D-02 MP5 = -0.75017172926D+02 if line[29:32] == "MP5": mp5energy = self.float(line.split("=")[2]) self.mpenergies[-1].append(utils.convertor(mp5energy, "hartree", "eV")) # Total energies after Coupled Cluster corrections. # Second order MBPT energies (MP2) are also calculated for these runs, # but the output is the same as when parsing for mpenergies. # Read the consecutive correlated energies # but append only the last one to ccenergies. # Only the highest level energy is appended - ex. CCSD(T), not CCSD. 
if line[1:10] == "DE(Corr)=" and line[27:35] == "E(CORR)=": self.ccenergy = self.float(line.split()[3]) if line[1:10] == "T5(CCSD)=": line = next(inputfile) if line[1:9] == "CCSD(T)=": self.ccenergy = self.float(line.split()[1]) if line[12:53] == "Population analysis using the SCF density": if hasattr(self, "ccenergy"): if not hasattr(self, "ccenergies"): self.ccenergies = [] self.ccenergies.append(utils.convertor(self.ccenergy, "hartree", "eV")) del self.ccenergy # Geometry convergence information. if line[49:59] == 'Converged?': if not hasattr(self, "geotargets"): self.geovalues = [] self.geotargets = numpy.array([0.0, 0.0, 0.0, 0.0], "d") newlist = [0]*4 for i in range(4): line = next(inputfile) self.logger.debug(line) parts = line.split() try: value = self.float(parts[2]) except ValueError: self.logger.error("Problem parsing the value for geometry optimisation: %s is not a number." % parts[2]) else: newlist[i] = value self.geotargets[i] = self.float(parts[3]) self.geovalues.append(newlist) # Gradients. # Read in the cartesian energy gradients (forces) from a block like this: # ------------------------------------------------------------------- # Center Atomic Forces (Hartrees/Bohr) # Number Number X Y Z # ------------------------------------------------------------------- # 1 1 -0.012534744 -0.021754635 -0.008346094 # 2 6 0.018984731 0.032948887 -0.038003451 # 3 1 -0.002133484 -0.006226040 0.023174772 # 4 1 -0.004316502 -0.004968213 0.023174772 # -2 -0.001830728 -0.000743108 -0.000196625 # ------------------------------------------------------------------ # # The "-2" line is for a dummy atom # # Then optimization is done in internal coordinates, Gaussian also # print the forces in internal coordinates, which can be produced from # the above. 
This block looks like this: # Variable Old X -DE/DX Delta X Delta X Delta X New X # (Linear) (Quad) (Total) # ch 2.05980 0.01260 0.00000 0.01134 0.01134 2.07114 # hch 1.75406 0.09547 0.00000 0.24861 0.24861 2.00267 # hchh 2.09614 0.01261 0.00000 0.16875 0.16875 2.26489 # Item Value Threshold Converged? if line[37:43] == "Forces": if not hasattr(self, "grads"): self.grads = [] self.skip_lines(inputfile, ['header', 'd']) forces = [] line = next(inputfile) while list(set(line.strip())) != ['-']: tmpforces = [] for N in range(3): # Fx, Fy, Fz force = line[23+N*15:38+N*15] if force.startswith("*"): force = "NaN" tmpforces.append(float(force)) forces.append(tmpforces) line = next(inputfile) self.grads.append(forces) # Extract PES scan data #Summary of the potential surface scan: # N A SCF #---- --------- ----------- # 1 109.0000 -76.43373 # 2 119.0000 -76.43011 # 3 129.0000 -76.42311 # 4 139.0000 -76.41398 # 5 149.0000 -76.40420 # 6 159.0000 -76.39541 # 7 169.0000 -76.38916 # 8 179.0000 -76.38664 # 9 189.0000 -76.38833 # 10 199.0000 -76.39391 # 11 209.0000 -76.40231 #---- --------- ----------- if "Summary of the potential surface scan:" in line: scanenergies = [] scanparm = [] colmnames = next(inputfile) hyphens = next(inputfile) line = next(inputfile) while line != hyphens: broken = line.split() scanenergies.append(float(broken[-1])) # Materialize the map so that a list (not a one-shot iterator) is stored under Python 3. scanparm.append(list(map(float, broken[1:-1]))) line = next(inputfile) if not hasattr(self, "scanenergies"): self.scanenergies = [] self.scanenergies = scanenergies if not hasattr(self, "scanparm"): self.scanparm = [] self.scanparm = scanparm if not hasattr(self, "scannames"): self.scannames = colmnames.split()[1:-1] # Orbital symmetries. if line[1:20] == 'Orbital symmetries:' and not hasattr(self, "mosyms"): # For counterpoise fragments, skip these lines.
if self.counterpoise != 0: return self.updateprogress(inputfile, "MO Symmetries", self.fupdate) self.mosyms = [[]] line = next(inputfile) unres = False if line.find("Alpha Orbitals") == 1: unres = True line = next(inputfile) i = 0 while len(line) > 18 and line[17] == '(': if line.find('Virtual') >= 0: self.homos = numpy.array([i-1], "i") # 'HOMO' indexes the HOMO in the arrays parts = line[17:].split() for x in parts: self.mosyms[0].append(self.normalisesym(x.strip('()'))) i += 1 line = next(inputfile) if unres: line = next(inputfile) # Repeat with beta orbital information i = 0 self.mosyms.append([]) while len(line) > 18 and line[17] == '(': if line.find('Virtual')>=0: # Here we consider beta # If there was also an alpha virtual orbital, # we will store two indices in the array # Otherwise there is no alpha virtual orbital, # only beta virtual orbitals, and we initialize # the array with one element. See the regression # QVGXLLKOCUKJST-UHFFFAOYAJmult3Fixed.out # donated by Gregory Magoon (gmagoon). if (hasattr(self, "homos")): # Extend the array to two elements # 'HOMO' indexes the HOMO in the arrays self.homos.resize([2]) self.homos[1] = i-1 else: # 'HOMO' indexes the HOMO in the arrays self.homos = numpy.array([i-1], "i") parts = line[17:].split() for x in parts: self.mosyms[1].append(self.normalisesym(x.strip('()'))) i += 1 line = next(inputfile) # Some calculations won't explicitely print the number of basis sets used, # and will occasionally drop some without warning. We can infer the number, # however, from the MO symmetries printed here. Specifically, this fixes # regression Gaussian/Gaussian09/dvb_sp_terse.log (#23 on github). self.set_attribute('nmo', len(self.mosyms[-1])) # Alpha/Beta electron eigenvalues. if line[1:6] == "Alpha" and line.find("eigenvalues") >= 0: # For counterpoise fragments, skip these lines. if self.counterpoise != 0: return # For ONIOM calcs, ignore this section in order to bypass assertion failure. 
if self.oniom: return self.updateprogress(inputfile, "Eigenvalues", self.fupdate) self.moenergies = [[]] HOMO = -2 while line.find('Alpha') == 1: if line.split()[1] == "virt." and HOMO == -2: # If there aren't any symmetries, this is a good way to find the HOMO. # Also, check for consistency if homos was already parsed. HOMO = len(self.moenergies[0])-1 if hasattr(self, "homos"): assert HOMO == self.homos[0] else: self.homos = numpy.array([HOMO], "i") # Convert to floats and append to moenergies, but sometimes Gaussian # doesn't print correctly so test for ValueError (bug 1756789). part = line[28:] i = 0 while i*10+4 < len(part): s = part[i*10:(i+1)*10] try: x = self.float(s) except ValueError: x = numpy.nan self.moenergies[0].append(utils.convertor(x, "hartree", "eV")) i += 1 line = next(inputfile) # If, at this point, self.homos is unset, then there were not # any alpha virtual orbitals if not hasattr(self, "homos"): HOMO = len(self.moenergies[0])-1 self.homos = numpy.array([HOMO], "i") if line.find('Beta') == 2: self.moenergies.append([]) HOMO = -2 while line.find('Beta') == 2: if line.split()[1] == "virt." and HOMO == -2: # If there aren't any symmetries, this is a good way to find the HOMO. # Also, check for consistency if homos was already parsed. HOMO = len(self.moenergies[1])-1 if len(self.homos) == 2: assert HOMO == self.homos[1] else: self.homos.resize([2]) self.homos[1] = HOMO part = line[28:] i = 0 while i*10+4 < len(part): x = part[i*10:(i+1)*10] self.moenergies[1].append(utils.convertor(self.float(x), "hartree", "eV")) i += 1 line = next(inputfile) self.moenergies = [numpy.array(x, "d") for x in self.moenergies] # Start of the IR/Raman frequency section. # Caution is advised here, as additional frequency blocks # can be printed by Gaussian (with slightly different formats), # often doubling the information printed. 
# See, for a non-standard exmaple, regression Gaussian98/test_H2.log if line[1:14] == "Harmonic freq": self.updateprogress(inputfile, "Frequency Information", self.fupdate) removeold = False # The whole block should not have any blank lines. while line.strip() != "": # The line with indices if line[1:15].strip() == "" and line[15:23].strip().isdigit(): freqbase = int(line[15:23]) if freqbase == 1 and hasattr(self, 'vibfreqs'): # This is a reparse of this information removeold = True # Lines with symmetries and symm. indices begin with whitespace. if line[1:15].strip() == "" and not line[15:23].strip().isdigit(): if not hasattr(self, 'vibsyms'): self.vibsyms = [] syms = line.split() self.vibsyms.extend(syms) if line[1:15] == "Frequencies --": if not hasattr(self, 'vibfreqs'): self.vibfreqs = [] if removeold: # This is a reparse, so throw away the old info if hasattr(self, "vibsyms"): # We have already parsed the vibsyms so don't throw away! self.vibsyms = self.vibsyms[-len(line[15:].split()):] if hasattr(self, "vibirs"): self.vibirs = [] if hasattr(self, 'vibfreqs'): self.vibfreqs = [] if hasattr(self, 'vibramans'): self.vibramans = [] if hasattr(self, 'vibdisps'): self.vibdisps = [] removeold = False freqs = [self.float(f) for f in line[15:].split()] self.vibfreqs.extend(freqs) if line[1:15] == "IR Inten --": if not hasattr(self, 'vibirs'): self.vibirs = [] irs = [] for ir in line[15:].split(): try: irs.append(self.float(ir)) except ValueError: irs.append(self.float('nan')) self.vibirs.extend(irs) if line[1:15] == "Raman Activ --": if not hasattr(self, 'vibramans'): self.vibramans = [] ramans = [] for raman in line[15:].split(): try: ramans.append(self.float(raman)) except ValueError: ramans.append(self.float('nan')) self.vibramans.extend(ramans) # Block with displacement should start with this. 
if line.strip().split()[0:3] == ["Atom", "AN", "X"]: if not hasattr(self, 'vibdisps'): self.vibdisps = [] disps = [] for n in range(self.natom): line = next(inputfile) numbers = [float(s) for s in line[10:].split()] N = len(numbers) // 3 if not disps: for n in range(N): disps.append([]) for n in range(N): disps[n].append(numbers[3*n:3*n+3]) self.vibdisps.extend(disps) line = next(inputfile) # Electronic transitions. if line[1:14] == "Excited State": if not hasattr(self, "etenergies"): self.etenergies = [] self.etoscs = [] self.etsyms = [] self.etsecs = [] # Need to deal with lines like: # (restricted calc) # Excited State 1: Singlet-BU 5.3351 eV 232.39 nm f=0.1695 # (unrestricted calc) (first excited state is 2!) # Excited State 2: ?Spin -A 0.1222 eV 10148.75 nm f=0.0000 # (Gaussian 09 ZINDO) # Excited State 1: Singlet-?Sym 2.5938 eV 478.01 nm f=0.0000 <S**2>=0.000 # The two groups are read positionally below: symmetry label first, then the energy in eV. p = re.compile(r":(.*?)(-?\d*\.\d*) eV") groups = p.search(line).groups() self.etenergies.append(utils.convertor(self.float(groups[1]), "eV", "cm-1")) self.etoscs.append(self.float(line.split("f=")[-1].split()[0])) self.etsyms.append(groups[0].strip()) line = next(inputfile) p = re.compile(r"(\d+)") CIScontrib = [] while line.find(" ->") >= 0: # This is a contribution to the transition parts = line.split("->") self.logger.debug(parts) # Has to deal with lines like: # 32 -> 38 0.04990 # 35A -> 45A 0.01921 frommoindex = 0 # For restricted or alpha unrestricted fromMO = parts[0].strip() if fromMO[-1] == "B": frommoindex = 1 # For beta unrestricted fromMO = int(p.match(fromMO).group())-1 # subtract 1 so that it is an index into moenergies t = parts[1].split() tomoindex = 0 toMO = t[0] if toMO[-1] == "B": tomoindex = 1 toMO = int(p.match(toMO).group())-1 # subtract 1 so that it is an index into moenergies percent = self.float(t[1]) # For restricted calculations, the percentage will be corrected # after parsing (see after_parsing() above).
CIScontrib.append([(fromMO, frommoindex), (toMO, tomoindex), percent]) line = next(inputfile) self.etsecs.append(CIScontrib) # Circular dichroism data (different for G03 vs G09) # # G03 # # ## <0|r|b> * (Au), Rotatory Strengths (R) in # ## cgs (10**-40 erg-esu-cm/Gauss) # ## state X Y Z R(length) # ## 1 0.0006 0.0096 -0.0082 -0.4568 # ## 2 0.0251 -0.0025 0.0002 -5.3846 # ## 3 0.0168 0.4204 -0.3707 -15.6580 # ## 4 0.0721 0.9196 -0.9775 -3.3553 # # G09 # # ## 1/2[<0|r|b>* + (<0|rxdel|b>*)*] # ## Rotatory Strengths (R) in cgs (10**-40 erg-esu-cm/Gauss) # ## state XX YY ZZ R(length) R(au) # ## 1 -0.3893 -6.7546 5.7736 -0.4568 -0.0010 # ## 2 -17.7437 1.7335 -0.1435 -5.3845 -0.0114 # ## 3 -11.8655 -297.2604 262.1519 -15.6580 -0.0332 if (line[1:52] == "<0|r|b> * (Au), Rotatory Strengths (R)" or line[1:50] == "1/2[<0|r|b>* + (<0|rxdel|b>*)*]"): self.etrotats = [] self.skip_lines(inputfile, ['units']) headers = next(inputfile) Ncolms = len(headers.split()) line = next(inputfile) parts = line.strip().split() while len(parts) == Ncolms: try: R = self.float(parts[4]) except ValueError: # nan or -nan if there is no first excited state # (for unrestricted calculations) pass else: self.etrotats.append(R) line = next(inputfile) temp = line.strip().split() parts = line.strip().split() self.etrotats = numpy.array(self.etrotats, "d") # Number of basis sets functions. # Has to deal with lines like: # NBasis = 434 NAE= 97 NBE= 97 NFC= 34 NFV= 0 # and... # NBasis = 148 MinDer = 0 MaxDer = 0 # Although the former is in every file, it doesn't occur before # the overlap matrix is printed. if line[1:7] == "NBasis" or line[4:10] == "NBasis": # For counterpoise fragment, skip these lines. if self.counterpoise != 0: return # For ONIOM calcs, ignore this section in order to bypass assertion failure. if self.oniom: return # If nbasis was already parsed, check if it changed. If it did, issue a warning. 
# In the future, we will probably want to have nbasis, as well as nmo below, # as a list so that we don't need to pick one value when it changes. nbasis = int(line.split('=')[1].split()[0]) if hasattr(self, "nbasis"): try: assert nbasis == self.nbasis except AssertionError: self.logger.warning("Number of basis functions (nbasis) has changed from %i to %i" % (self.nbasis, nbasis)) self.nbasis = nbasis # Number of linearly-independent basis sets. if line[1:7] == "NBsUse": # For counterpoise fragment, skip these lines. if self.counterpoise != 0: return # For ONIOM calcs, ignore this section in order to bypass assertion failure. if self.oniom: return nmo = int(line.split('=')[1].split()[0]) self.set_attribute('nmo', nmo) # For AM1 calculations, set nbasis by a second method, # as nmo may not always be explicitly stated. if line[7:22] == "basis functions, ": nbasis = int(line.split()[0]) self.set_attribute('nbasis', nbasis) # Molecular orbital overlap matrix. # Has to deal with lines such as: # *** Overlap *** # ****** Overlap ****** # Note that Gaussian sometimes drops basis functions, # causing the overlap matrix as parsed below to not be # symmetric (which is a problem for population analyses, etc.) 
if line[1:4] == "***" and (line[5:12] == "Overlap" or line[8:15] == "Overlap"): # Ensure that this is the main calc and not a fragment if self.counterpoise != 0: return self.aooverlaps = numpy.zeros( (self.nbasis, self.nbasis), "d") # Overlap integrals for basis fn#1 are in aooverlaps[0] base = 0 colmNames = next(inputfile) while base < self.nbasis: self.updateprogress(inputfile, "Overlap", self.fupdate) for i in range(self.nbasis-base): # Fewer lines this time line = next(inputfile) parts = line.split() for j in range(len(parts)-1): # Some lines are longer than others k = float(parts[j+1].replace("D", "E")) self.aooverlaps[base+j, i+base] = k self.aooverlaps[i+base, base+j] = k base += 5 colmNames = next(inputfile) self.aooverlaps = numpy.array(self.aooverlaps, "d") # Molecular orbital coefficients (mocoeffs). # Essentially only produced for SCF calculations. # This is also the place where aonames and atombasis are parsed. if line[5:35] == "Molecular Orbital Coefficients" or line[5:41] == "Alpha Molecular Orbital Coefficients" or line[5:40] == "Beta Molecular Orbital Coefficients": # If counterpoise fragment, return without parsing orbital info if self.counterpoise != 0: return # Skip this for ONIOM calcs if self.oniom: return if line[5:40] == "Beta Molecular Orbital Coefficients": beta = True if self.popregular: return # This was continue before refactoring the parsers. 
#continue # Not going to extract mocoeffs # Need to add an extra array to self.mocoeffs self.mocoeffs.append(numpy.zeros((self.nmo, self.nbasis), "d")) else: beta = False self.aonames = [] self.atombasis = [] mocoeffs = [numpy.zeros((self.nmo, self.nbasis), "d")] base = 0 self.popregular = False for base in range(0, self.nmo, 5): self.updateprogress(inputfile, "Coefficients", self.fupdate) colmNames = next(inputfile) if not colmNames.split(): self.logger.warning("Molecular coefficients header found but no coefficients.") break if base == 0 and int(colmNames.split()[0]) != 1: # Implies that this is a POP=REGULAR calculation # and so, only aonames (not mocoeffs) will be extracted self.popregular = True symmetries = next(inputfile) eigenvalues = next(inputfile) for i in range(self.nbasis): line = next(inputfile) if i == 0: # Find location of the start of the basis function name start_of_basis_fn_name = line.find(line.split()[3]) - 1 if base == 0 and not beta: # Just do this the first time 'round parts = line[:start_of_basis_fn_name].split() if len(parts) > 1: # New atom if i > 0: self.atombasis.append(atombasis) atombasis = [] atomname = "%s%s" % (parts[2], parts[1]) orbital = line[start_of_basis_fn_name:20].strip() self.aonames.append("%s_%s" % (atomname, orbital)) atombasis.append(i) part = line[21:].replace("D", "E").rstrip() temp = [] for j in range(0, len(part), 10): temp.append(float(part[j:j+10])) # Use integer division so that the slice bound stays an int under Python 3. if beta: self.mocoeffs[1][base:base + len(part) // 10, i] = temp else: mocoeffs[0][base:base + len(part) // 10, i] = temp if base == 0 and not beta: # Do the last update of atombasis self.atombasis.append(atombasis) if self.popregular: # We now have aonames, so no need to continue break if not self.popregular and not beta: self.mocoeffs = mocoeffs # Natural orbital coefficients (nocoeffs) and occupation numbers (nooccnos), # which respectively define the eigenvectors and eigenvalues of the # diagonalized one-electron density matrix.
These orbitals are formed after # configuration interaction (CI) calculations, among others. Similarly to mocoeffs, # we can parse and check aonames and atombasis here. # # Natural Orbital Coefficients: # 1 2 3 4 5 # Eigenvalues -- 2.01580 2.00363 2.00000 2.00000 1.00000 # 1 1 O 1S 0.00000 -0.15731 -0.28062 0.97330 0.00000 # 2 2S 0.00000 0.75440 0.57746 0.07245 0.00000 # ... # if line[5:33] == "Natural Orbital Coefficients": self.aonames = [] self.atombasis = [] nocoeffs = numpy.zeros((self.nmo, self.nbasis), "d") nooccnos = [] base = 0 self.popregular = False for base in range(0, self.nmo, 5): self.updateprogress(inputfile, "Natural orbitals", self.fupdate) colmNames = next(inputfile) if base == 0 and int(colmNames.split()[0]) != 1: # Implies that this is a POP=REGULAR calculation # and so, only aonames (not mocoeffs) will be extracted self.popregular = True eigenvalues = next(inputfile) nooccnos.extend(map(float, eigenvalues.split()[2:])) for i in range(self.nbasis): line = next(inputfile) # Just do this the first time 'round. if base == 0: # Changed below from :12 to :11 to deal with Elmar Neumann's example. parts = line[:11].split() # New atom. if len(parts) > 1: if i > 0: self.atombasis.append(atombasis) atombasis = [] atomname = "%s%s" % (parts[2], parts[1]) orbital = line[11:20].strip() self.aonames.append("%s_%s" % (atomname, orbital)) atombasis.append(i) part = line[21:].replace("D", "E").rstrip() temp = [] for j in range(0, len(part), 10): temp.append(float(part[j:j+10])) # Use integer division so that the slice bound stays an int under Python 3. nocoeffs[base:base + len(part) // 10, i] = temp # Do the last update of atombasis. if base == 0: self.atombasis.append(atombasis) # We now have aonames, so no need to continue.
if self.popregular: break if not self.popregular: self.nocoeffs = nocoeffs self.nooccnos = nooccnos # For FREQ=Anharm, extract anharmonicity constants if line[1:40] == "X matrix of Anharmonic Constants (cm-1)": Nvibs = len(self.vibfreqs) self.vibanharms = numpy.zeros( (Nvibs, Nvibs), "d") base = 0 colmNames = next(inputfile) while base < Nvibs: for i in range(Nvibs-base): # Fewer lines this time line = next(inputfile) parts = line.split() for j in range(len(parts)-1): # Some lines are longer than others k = float(parts[j+1].replace("D", "E")) self.vibanharms[base+j, i+base] = k self.vibanharms[i+base, base+j] = k base += 5 colmNames = next(inputfile) # Pseudopotential charges. if line.find("Pseudopotential Parameters") > -1: self.skip_lines(inputfile, ['e', 'label1', 'label2', 'e']) line = next(inputfile) if line.find("Centers:") < 0: return # This was continue before parser refactoring. # continue # Needs to handle code like the following: # # Center Atomic Valence Angular Power Coordinates # Number Number Electrons Momentum of R Exponent Coefficient X Y Z # =================================================================================================================================== # Centers: 1 # Centers: 16 # Centers: 21 24 # Centers: 99100101102 # 1 44 16 -4.012684 -0.696698 0.006750 # F and up # 0 554.3796303 -0.05152700 centers = [] while line.find("Centers:") >= 0: temp = line[10:] for i in range(0, len(temp)-3, 3): centers.append(int(temp[i:i+3])) line = next(inputfile) centers.sort() # Not always in increasing order self.coreelectrons = numpy.zeros(self.natom, "i") for center in centers: front = line[:10].strip() while not (front and int(front) == center): line = next(inputfile) front = line[:10].strip() info = line.split() self.coreelectrons[center-1] = int(info[1]) - int(info[2]) line = next(inputfile) # This will be printed for counterpoise calcualtions only. # To prevent crashing, we need to know which fragment is being considered. 
# Other information is also printed in lines that start like this. if line[1:14] == 'Counterpoise:': if line[42:50] == "fragment": self.counterpoise = int(line[51:54]) # This will be printed only during ONIOM calcs; use it to set a flag # that will allow assertion failures to be bypassed in the code. if line[1:7] == "ONIOM:": self.oniom = True # Atomic charges are straightforward to parse, although the header # has changed over time somewhat. # # Mulliken charges: # 1 # 1 C -0.004513 # 2 C -0.077156 # ... # Sum of Mulliken charges = 0.00000 # Mulliken charges with hydrogens summed into heavy atoms: # 1 # 1 C -0.004513 # 2 C 0.002063 # ... # if line[1:25] == "Mulliken atomic charges:" or line[1:18] == "Mulliken charges:" or \ line[1:23] == "Lowdin Atomic Charges:" or line[1:16] == "Lowdin charges:": if not hasattr(self, "atomcharges"): self.atomcharges = {} ones = next(inputfile) charges = [] nline = next(inputfile) while not "Sum of" in nline: charges.append(float(nline.split()[2])) nline = next(inputfile) if "Mulliken" in line: self.atomcharges["mulliken"] = charges else: self.atomcharges["lowdin"] = charges if line.strip() == "Natural Population": if not hasattr(self, 'atomcharges'): self.atomcharges = {} line1 = next(inputfile) line2 = next(inputfile) if line1.split()[0] == 'Natural' and line2.split()[2] == 'Charge': dashes = next(inputfile) charges = [] for i in range(self.natom): nline = next(inputfile) charges.append(float(nline.split()[2])) self.atomcharges["natural"] = charges #Extract Thermochemistry #Temperature 298.150 Kelvin. Pressure 1.00000 Atm. #Zero-point correction= 0.342233 (Hartree/ #Thermal correction to Energy= 0. #Thermal correction to Enthalpy= 0. 
#Thermal correction to Gibbs Free Energy= 0.302940 #Sum of electronic and zero-point Energies= -563.649744 #Sum of electronic and thermal Energies= -563.636699 #Sum of electronic and thermal Enthalpies= -563.635755 #Sum of electronic and thermal Free Energies= -563.689037 if "Sum of electronic and thermal Enthalpies" in line: self.set_attribute('enthalpy', float(line.split()[6])) if "Sum of electronic and thermal Free Energies=" in line: self.set_attribute('freenergy', float(line.split()[7])) if line[1:12] == "Temperature": self.set_attribute('temperature', float(line.split()[1])) if __name__ == "__main__": import doctest, gaussianparser, sys if len(sys.argv) == 1: doctest.testmod(gaussianparser, verbose=False) if len(sys.argv) >= 2: parser = gaussianparser.Gaussian(sys.argv[1]) data = parser.parse() if len(sys.argv) > 2: for i in range(len(sys.argv[2:])): if hasattr(data, sys.argv[2 + i]): print(getattr(data, sys.argv[2 + i])) cclib-1.3.1/src/cclib/parser/nwchemparser.py0000644000175100016050000012514412467423323020675 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2008-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Parser for NWChem output files""" import itertools import re import numpy from . import logfileparser from . 
import utils class NWChem(logfileparser.Logfile): """An NWChem log file.""" def __init__(self, *args, **kwargs): # Call the __init__ method of the superclass super(NWChem, self).__init__(logname="NWChem", *args, **kwargs) def __str__(self): """Return a string representation of the object.""" return "NWChem log file %s" % (self.filename) def __repr__(self): """Return a representation of the object.""" return 'NWChem("%s")' % (self.filename) def normalisesym(self, label): """Use standard symmetry labels instead of NWChem labels. To normalise: (1) If label is one of [SG, PI, PHI, DLTA], replace by [sigma, pi, phi, delta] (2) replace any G or U by their lowercase equivalent >>> sym = NWChem("dummyfile").normalisesym >>> labels = ['A1', 'AG', 'A1G', "SG", "PI", "PHI", "DLTA", 'DLTU', 'SGG'] >>> map(sym, labels) ['A1', 'Ag', 'A1g', 'sigma', 'pi', 'phi', 'delta', 'delta.u', 'sigma.g'] """ # FIXME if necessary return label name2element = lambda self, lbl: "".join(itertools.takewhile(str.isalpha, str(lbl))) def extract(self, inputfile, line): """Extract information from the file object inputfile.""" # This is printed in the input module, so should always be the first coordinates, # and contains some basic information we want to parse as well. However, this is not # the only place where the coordinates are printed during geometry optimization, # since the gradients module has a separate coordinate printout, which happens # alongside the coordinate gradients. This geometry printout happens at the # beginning of each optimization step only. 
if line.strip() == 'Geometry "geometry" -> ""' or line.strip() == 'Geometry "geometry" -> "geometry"': self.skip_lines(inputfile, ['dashes', 'blank', 'units', 'blank', 'header', 'dashes']) if not hasattr(self, 'atomcoords'): self.atomcoords = [] line = next(inputfile) coords = [] atomnos = [] while line.strip(): # The column labeled 'tag' is usually empty, but I'm not sure whether it can have spaces, # so for now assume that it can and that there will be seven columns in that case. if len(line.split()) == 6: index, atomname, nuclear, x, y, z = line.split() else: index, atomname, tag, nuclear, x, y, z = line.split() coords.append(list(map(float, [x,y,z]))) atomnos.append(int(float(nuclear))) line = next(inputfile) self.atomcoords.append(coords) self.set_attribute('atomnos', atomnos) # If the geometry is printed in XYZ format, it will have the number of atoms. if line[12:31] == "XYZ format geometry": self.skip_line(inputfile, 'dashes') natom = int(next(inputfile).strip()) self.set_attribute('natom', natom) if line.strip() == "NWChem Geometry Optimization": self.skip_lines(inputfile, ['d', 'b', 'b', 'b', 'b', 'title', 'b', 'b']) line = next(inputfile) while line.strip(): if "maximum gradient threshold" in line: gmax = float(line.split()[-1]) if "rms gradient threshold" in line: grms = float(line.split()[-1]) if "maximum cartesian step threshold" in line: xmax = float(line.split()[-1]) if "rms cartesian step threshold" in line: xrms = float(line.split()[-1]) line = next(inputfile) self.set_attribute('geotargets', [gmax, grms, xmax, xrms]) # NWChem does not normally print the basis set for each atom, but rather # chooses the concise option of printing Gaussian coefficients for each # atom type/element only once. Therefore, we need to first parse those # coefficients and afterwards build the appropriate gbasis attribute based # on that and atom types/elements already parsed (atomnos). 
However, if atoms
        # are given different names (number after element, like H1 and H2), then NWChem
        # generally prints the Gaussian parameters for all unique names, like this:
        #
        # Basis "ao basis" -> "ao basis" (cartesian)
        # -----
        # O (Oxygen)
        # ----------
        # Exponent Coefficients
        # -------------- ---------------------------------------------------------
        # 1 S 1.30709320E+02 0.154329
        # 1 S 2.38088610E+01 0.535328
        # (...)
        #
        # H1 (Hydrogen)
        # -------------
        # Exponent Coefficients
        # -------------- ---------------------------------------------------------
        # 1 S 3.42525091E+00 0.154329
        # (...)
        #
        # H2 (Hydrogen)
        # -------------
        # Exponent Coefficients
        # -------------- ---------------------------------------------------------
        # 1 S 3.42525091E+00 0.154329
        # (...)
        #
        # The current parsing code below assumes all atoms of the same element
        # use the same basis set, but that might not be true, and this will probably
        # need to be considered in the future when such a logfile appears.
        if line.strip() == """Basis "ao basis" -> "ao basis" (cartesian)""":

            self.skip_line(inputfile, 'dashes')

            gbasis_dict = {}

            line = next(inputfile)
            while line.strip():

                atomname = line.split()[0]
                atomelement = self.name2element(atomname)
                gbasis_dict[atomelement] = []

                self.skip_lines(inputfile, ['d', 'labels', 'd'])

                shells = []
                line = next(inputfile)
                while line.strip() and line.split()[0].isdigit():
                    shell = None
                    while line.strip():
                        nshell, type, exp, coeff = line.split()
                        nshell = int(nshell)
                        assert len(shells) == nshell - 1
                        if not shell:
                            shell = (type, [])
                        else:
                            assert shell[0] == type
                        exp = float(exp)
                        coeff = float(coeff)
                        shell[1].append((exp,coeff))
                        line = next(inputfile)
                    shells.append(shell)
                    line = next(inputfile)
                gbasis_dict[atomelement].extend(shells)

            gbasis = []
            for i in range(self.natom):
                atomtype = utils.PeriodicTable().element[self.atomnos[i]]
                gbasis.append(gbasis_dict[atomtype])

            self.set_attribute('gbasis', gbasis)

        # Normally the indexes of AOs assigned to specific atoms are also not printed,
        # so
we need to infer that. We could do that from the previous section, # it might be worthwhile to take numbers from two different places, hence # the code below, which builds atombasis based on the number of functions # listed in this summary of the AO basis. Similar to previous section, here # we assume all atoms of the same element have the same basis sets, but # this will probably need to be revised later. if line.strip() == """Summary of "ao basis" -> "ao basis" (cartesian)""": self.skip_lines(inputfile, ['d', 'headers', 'd']) atombasis_dict = {} line = next(inputfile) while line.strip(): atomname, desc, shells, funcs, types = line.split() atomelement = self.name2element(atomname) atombasis_dict[atomelement] = int(funcs) line = next(inputfile) last = 0 atombasis = [] for i in range(self.natom): atomelement = utils.PeriodicTable().element[self.atomnos[i]] nfuncs = atombasis_dict[atomelement] atombasis.append(list(range(last,last+nfuncs))) last = atombasis[-1][-1] + 1 self.set_attribute('atombasis', atombasis) # This section contains general parameters for Hartree-Fock calculations, # which do not contain the 'General Information' section like most jobs. if line.strip() == "NWChem SCF Module": self.skip_lines(inputfile, ['d', 'b', 'b', 'title', 'b', 'b', 'b']) line = next(inputfile) while line.strip(): if line[2:8] == "charge": charge = int(float(line.split()[-1])) self.set_attribute('charge', charge) if line[2:13] == "open shells": unpaired = int(line.split()[-1]) self.set_attribute('mult', 2*unpaired + 1) if line[2:7] == "atoms": natom = int(line.split()[-1]) self.set_attribute('natom', natom) if line[2:11] == "functions": nfuncs = int(line.split()[-1]) self.set_attribute("nbasis", nfuncs) line = next(inputfile) # This section contains general parameters for DFT calculations, as well as # for the many-electron theory module. if line.strip() == "General Information": if hasattr(self, 'linesearch') and self.linesearch: return while line.strip(): if "No. 
of atoms" in line: self.set_attribute('natom', int(line.split()[-1])) if "Charge" in line: self.set_attribute('charge', int(line.split()[-1])) if "Spin multiplicity" in line: mult = line.split()[-1] if mult == "singlet": mult = 1 self.set_attribute('mult', int(mult)) if "AO basis - number of function" in line: nfuncs = int(line.split()[-1]) self.set_attribute('nbasis', nfuncs) # These will be present only in the DFT module. if "Convergence on energy requested" in line: target_energy = float(line.split()[-1].replace('D', 'E')) if "Convergence on density requested" in line: target_density = float(line.split()[-1].replace('D', 'E')) if "Convergence on gradient requested" in line: target_gradient = float(line.split()[-1].replace('D', 'E')) line = next(inputfile) # Pretty nasty temporary hack to set scftargets only in the SCF module. if "target_energy" in dir() and "target_density" in dir() and "target_gradient" in dir(): if not hasattr(self, 'scftargets'): self.scftargets = [] self.scftargets.append([target_energy, target_density, target_gradient]) if line.strip() in ("The SCF is already converged", "The DFT is already converged"): if self.linesearch: return self.scftargets.append(self.scftargets[-1]) self.scfvalues.append(self.scfvalues[-1]) # The default (only?) SCF algorithm for Hartree-Fock is a preconditioned conjugate # gradient method that apparently "always" converges, so this header should reliably # signal a start of the SCF cycle. The convergence targets are also printed here. if line.strip() == "Quadratically convergent ROHF": if hasattr(self, 'linesearch') and self.linesearch: return while not "Final" in line: # Only the norm of the orbital gradient is used to test convergence. if line[:22] == " Convergence threshold": target = float(line.split()[-1]) if not hasattr(self, "scftargets"): self.scftargets = [] self.scftargets.append([target]) # This is critical for the stop condition of the section, # because the 'Final Fock-matrix accuracy' is along the way. 
                    # It would be prudent to find a more robust stop condition.
                    while list(set(line.strip())) != ["-"]:
                        line = next(inputfile)

                if line.split() == ['iter', 'energy', 'gnorm', 'gmax', 'time']:
                    values = []
                    self.skip_line(inputfile, 'dashes')
                    line = next(inputfile)
                    while line.strip():
                        iter,energy,gnorm,gmax,time = line.split()
                        gnorm = float(gnorm.replace('D','E'))
                        values.append([gnorm])
                        line = next(inputfile)
                    if not hasattr(self, 'scfvalues'):
                        self.scfvalues = []
                    self.scfvalues.append(values)

                line = next(inputfile)

        # The SCF for DFT does not use the same algorithm as Hartree-Fock, but always
        # seems to use the following format to report SCF convergence:
        # convergence iter energy DeltaE RMS-Dens Diis-err time
        # ---------------- ----- ----------------- --------- --------- --------- ------
        # d= 0,ls=0.0,diis 1 -382.2544324446 -8.28D+02 1.42D-02 3.78D-01 23.2
        # d= 0,ls=0.0,diis 2 -382.3017298534 -4.73D-02 6.99D-03 3.82D-02 39.3
        # d= 0,ls=0.0,diis 3 -382.2954343173 6.30D-03 4.21D-03 7.95D-02 55.3
        # ...
        if line.split() == ['convergence', 'iter', 'energy', 'DeltaE', 'RMS-Dens', 'Diis-err', 'time']:

            if hasattr(self, 'linesearch') and self.linesearch:
                return

            self.skip_line(inputfile, 'dashes')
            line = next(inputfile)
            values = []
            while line.strip():

                # Sometimes there are lines in between iterations with fewer columns,
                # and we most probably want to skip those. An exception might be
                # unrestricted calculations, which show extra RMS density and DIIS
                # errors, although it is not clear yet whether these are for the
                # beta orbitals or something else. The iterations look like this in that case:
                # convergence iter energy DeltaE RMS-Dens Diis-err time
                # ---------------- ----- ----------------- --------- --------- --------- ------
                # d= 0,ls=0.0,diis 1 -382.0243202601 -8.28D+02 7.77D-03 1.04D-01 30.0
                # 7.68D-03 1.02D-01
                # d= 0,ls=0.0,diis 2 -382.0647539758 -4.04D-02 4.64D-03 1.95D-02 59.2
                # 5.39D-03 2.36D-02
                # ...
                if len(line[17:].split()) == 6:
                    iter, energy, deltaE, dens, diis, time = line[17:].split()
                    val_energy = float(deltaE.replace('D', 'E'))
                    val_density = float(dens.replace('D', 'E'))
                    val_gradient = float(diis.replace('D', 'E'))
                    values.append([val_energy, val_density, val_gradient])

                line = next(inputfile)

            if not hasattr(self, 'scfvalues'):
                self.scfvalues = []

            self.scfvalues.append(values)

        # These triggers are supposed to catch the current step in a geometry optimization search
        # and determine whether we are currently in the main (initial) SCF cycle of that step
        # or in the subsequent line search. The step is printed between dashes like this:
        #
        # --------
        # Step 0
        # --------
        #
        # and the summary lines that describe the main SCF cycle for the first step look like this:
        #
        #@ Step Energy Delta E Gmax Grms Xrms Xmax Walltime
        #@ ---- ---------------- -------- -------- -------- -------- -------- --------
        #@ 0 -379.76896249 0.0D+00 0.04567 0.01110 0.00000 0.00000 4.2
        # ok ok
        #
        # However, for subsequent steps the format is a bit different:
        #
        # Step Energy Delta E Gmax Grms Xrms Xmax Walltime
        # ---- ---------------- -------- -------- -------- -------- -------- --------
        #@ 2 -379.77794602 -7.4D-05 0.00118 0.00023 0.00440 0.01818 14.8
        # ok
        #
        # There is also a summary of the line search (which we don't use now), like this:
        #
        # Line search:
        # step= 1.00 grad=-1.8D-05 hess= 8.9D-06 energy= -379.777955 mode=accept
        # new step= 1.00 predicted energy= -379.777955
        #
        if line[10:14] == "Step":
            self.geostep = int(line.split()[-1])
            self.skip_line(inputfile, 'dashes')
            self.linesearch = False
        if line[0] == "@" and line.split()[1] == "Step":
            at_and_dashes = next(inputfile)
            line = next(inputfile)
            assert int(line.split()[1]) == self.geostep == 0
            gmax = float(line.split()[4])
            grms = float(line.split()[5])
            xrms = float(line.split()[6])
            xmax = float(line.split()[7])
            if not hasattr(self, 'geovalues'):
                self.geovalues = []
            self.geovalues.append([gmax, grms, xmax, xrms])
            self.linesearch = True
        if line[2:6]
== "Step": self.skip_line(inputfile, 'dashes') line = next(inputfile) assert int(line.split()[1]) == self.geostep if self.linesearch: #print(line) return gmax = float(line.split()[4]) grms = float(line.split()[5]) xrms = float(line.split()[6]) xmax = float(line.split()[7]) if not hasattr(self, 'geovalues'): self.geovalues = [] self.geovalues.append([gmax, grms, xmax, xrms]) self.linesearch = True # There is a clear message when the geometry optimization has converged: # # ---------------------- # Optimization converged # ---------------------- # if line.strip() == "Optimization converged": self.skip_line(inputfile, 'dashes') if not hasattr(self, 'optdone'): self.optdone = [] self.optdone.append(len(self.geovalues) - 1) if "Failed to converge" in line and hasattr(self, 'geovalues'): if not hasattr(self, 'optdone'): self.optdone = [] # The line containing the final SCF energy seems to be always identifiable like this. if "Total SCF energy" in line or "Total DFT energy" in line: # NWChem often does a line search during geometry optimization steps, reporting # the SCF information but not the coordinates (which are not necessarily 'intermediate' # since the step size can become smaller). We want to skip these SCF cycles, # unless the coordinates can also be extracted (possibly from the gradients?). if hasattr(self, 'linesearch') and self.linesearch: return if not hasattr(self, "scfenergies"): self.scfenergies = [] energy = float(line.split()[-1]) energy = utils.convertor(energy, "hartree", "eV") self.scfenergies.append(energy) # The final MO orbitals are printed in a simple list, but apparently not for # DFT calcs, and often this list does not contain all MOs, so make sure to # parse them from the MO analysis below if possible. 
This section will be like this: # # Symmetry analysis of molecular orbitals - final # ----------------------------------------------- # # Numbering of irreducible representations: # # 1 ag 2 au 3 bg 4 bu # # Orbital symmetries: # # 1 bu 2 ag 3 bu 4 ag 5 bu # 6 ag 7 bu 8 ag 9 bu 10 ag # ... if line.strip() == "Symmetry analysis of molecular orbitals - final": self.skip_lines(inputfile, ['d', 'b', 'numbering', 'b', 'reps', 'b', 'syms', 'b']) if not hasattr(self, 'mosyms'): self.mosyms = [[None]*self.nbasis] line = next(inputfile) while line.strip(): ncols = len(line.split()) assert ncols % 2 == 0 for i in range(ncols//2): index = int(line.split()[i*2]) - 1 sym = line.split()[i*2+1] sym = sym[0].upper() + sym[1:] if self.mosyms[0][index]: if self.mosyms[0][index] != sym: self.logger.warning("Symmetry of MO %i has changed" % (index+1)) self.mosyms[0][index] = sym line = next(inputfile) # The same format is used for HF and DFT molecular orbital analysis. We want to parse # the MO energies from this section, although it is printed already before this with # less precision (might be useful to parse that if this is not available). Also, this # section contains coefficients for the leading AO contributions, so it might also # be useful to parse and use those values if the full vectors are not printed. # # The block looks something like this (two separate alpha/beta blocks in the unrestricted case): # # ROHF Final Molecular Orbital Analysis # ------------------------------------- # # Vector 1 Occ=2.000000D+00 E=-1.104059D+01 Symmetry=bu # MO Center= 1.4D-17, 0.0D+00, -6.5D-37, r^2= 2.1D+00 # Bfn. Coefficient Atom+Function Bfn. Coefficient Atom+Function # ----- ------------ --------------- ----- ------------ --------------- # 1 0.701483 1 C s 6 -0.701483 2 C s # # Vector 2 Occ=2.000000D+00 E=-1.104052D+01 Symmetry=ag # ... # Vector 12 Occ=2.000000D+00 E=-1.020253D+00 Symmetry=bu # MO Center= -1.4D-17, -5.6D-17, 2.9D-34, r^2= 7.9D+00 # Bfn. Coefficient Atom+Function Bfn. 
Coefficient Atom+Function # ----- ------------ --------------- ----- ------------ --------------- # 36 -0.298699 11 C s 41 0.298699 12 C s # 2 0.270804 1 C s 7 -0.270804 2 C s # 48 -0.213655 15 C s 53 0.213655 16 C s # ... # if "Final" in line and "Molecular Orbital Analysis" in line: # Unrestricted jobs have two such blocks, for alpha and beta orbitals, and # we need to keep track of which one we're parsing (always alpha in restricted case). unrestricted = ("Alpha" in line) or ("Beta" in line) alphabeta = int("Beta" in line) self.skip_lines(inputfile, ['dashes', 'blank']) energies = [] symmetries = [None]*self.nbasis line = next(inputfile) homo = 0 while line[:7] == " Vector": # Note: the vector count starts from 1 in NWChem. nvector = int(line[7:12]) # A nonzero occupancy for SCF jobs means the orbital is occupied. if ("Occ=2.0" in line) or ("Occ=1.0" in line): homo = nvector-1 # If the printout does not start from the first MO, assume None for all previous orbitals. if len(energies) == 0 and nvector > 1: for i in range(1,nvector): energies.append(None) energy = float(line[34:47].replace('D','E')) energy = utils.convertor(energy, "hartree", "eV") energies.append(energy) # When symmetry is not used, this part of the line is missing. if line[47:58].strip() == "Symmetry=": sym = line[58:].strip() sym = sym[0].upper() + sym[1:] symmetries[nvector-1] = sym line = next(inputfile) if "MO Center" in line: line = next(inputfile) if "Bfn." 
in line:
                    line = next(inputfile)
                if "-----" in line:
                    line = next(inputfile)
                while line.strip():
                    line = next(inputfile)
                line = next(inputfile)

            self.set_attribute('nmo', nvector)

            if not hasattr(self, 'moenergies') or (len(self.moenergies) > alphabeta):
                self.moenergies = []
            self.moenergies.append(energies)

            if not hasattr(self, 'mosyms') or (len(self.mosyms) > alphabeta):
                self.mosyms = []
            self.mosyms.append(symmetries)

            if not hasattr(self, 'homos') or (len(self.homos) > alphabeta):
                self.homos = []
            self.homos.append(homo)

        # This is where the full MO vectors are printed, but a special directive is needed for it:
        #
        # Final MO vectors
        # ----------------
        #
        #
        # global array: alpha evecs[1:60,1:60], handle: -995
        #
        # 1 2 3 4 5 6
        # ----------- ----------- ----------- ----------- ----------- -----------
        # 1 -0.69930 -0.69930 -0.02746 -0.02769 -0.00313 -0.02871
        # 2 -0.03156 -0.03135 0.00410 0.00406 0.00078 0.00816
        # 3 0.00002 -0.00003 0.00067 0.00065 -0.00526 -0.00120
        # ...
        #
        if line.strip() == "Final MO vectors":

            if not hasattr(self, 'mocoeffs'):
                self.mocoeffs = []

            self.skip_lines(inputfile, ['d', 'b', 'b'])

            # The columns are MOs and the rows AOs, but that's an educated guess since no
            # atom information is printed alongside the indices. This next line gives
            # the dimensions, which we can check if they were set before this. Also, this line
            # specifies whether we are dealing with alpha or beta vectors.
array_info = next(inputfile) while ("global array" in array_info): alphabeta = int(line.split()[2] == "beta") size = array_info.split('[')[1].split(']')[0] nbasis = int(size.split(',')[0].split(':')[1]) nmo = int(size.split(',')[1].split(':')[1]) self.set_attribute('nbasis', nbasis) self.set_attribute('nmo', nmo) self.skip_line(inputfile, 'blank') mocoeffs = [] while len(mocoeffs) < self.nmo: nmos = list(map(int,next(inputfile).split())) assert len(mocoeffs) == nmos[0]-1 for n in nmos: mocoeffs.append([]) self.skip_line(inputfile, 'dashes') for nb in range(nbasis): line = next(inputfile) index = int(line.split()[0]) assert index == nb+1 coefficients = list(map(float,line.split()[1:])) assert len(coefficients) == len(nmos) for i,c in enumerate(coefficients): mocoeffs[nmos[i]-1].append(c) self.skip_line(inputfile, 'blank') self.mocoeffs.append(mocoeffs) array_info = next(inputfile) # For Hartree-Fock, the atomic Mulliken charges are typically printed like this: # # Mulliken analysis of the total density # -------------------------------------- # # Atom Charge Shell Charges # ----------- ------ ------------------------------------------------------- # 1 C 6 6.00 1.99 1.14 2.87 # 2 C 6 6.00 1.99 1.14 2.87 # ... 
        if line.strip() == "Mulliken analysis of the total density":

            if not hasattr(self, "atomcharges"):
                self.atomcharges = {}

            self.skip_lines(inputfile, ['d', 'b', 'header', 'd'])

            charges = []
            line = next(inputfile)
            while line.strip():
                index, atomname, nuclear, atom = line.split()[:4]
                shells = line.split()[4:]
                charges.append(float(atom)-float(nuclear))
                line = next(inputfile)
            self.atomcharges['mulliken'] = charges

        # If the full overlap matrix is printed, it looks like this:
        #
        # ----------------------------
        # Mulliken population analysis
        # ----------------------------
        #
        # ----- Total overlap population -----
        #
        # 1 2 3 4 5 6 7
        #
        # 1 1 C s 2.0694818227 -0.0535883400 -0.0000000000 -0.0000000000 -0.0000000000 -0.0000000000 0.0000039991
        # 2 1 C s -0.0535883400 0.8281341291 0.0000000000 -0.0000000000 0.0000000000 0.0000039991 -0.0009906747
        # ...
        #
        # Also, DFT does not seem to print the separate listing of Mulliken charges
        # by default, but they are printed by this module later on. They are also printed
        # for Hartree-Fock runs, though, so in that case make sure they are consistent.
        if line.strip() == "Mulliken population analysis":

            self.skip_lines(inputfile, ['d', 'b', 'total_overlap_population', 'b'])

            overlaps = []
            line = next(inputfile)
            while all([c.isdigit() for c in line.split()]):

                # There is always a line with the MO indices printed in this block.
                indices = [int(i)-1 for i in line.split()]
                for i in indices:
                    overlaps.append([])

                # There is usually a blank line after the MO indices, but
                # there are exceptions, so check if line is blank first.
                line = next(inputfile)
                if not line.strip():
                    line = next(inputfile)

                # Now we can iterate over atomic orbitals.
                for nao in range(self.nbasis):
                    data = list(map(float, line.split()[4:]))
                    for i,d in enumerate(data):
                        overlaps[indices[i]].append(d)
                    line = next(inputfile)

                line = next(inputfile)

            self.aooverlaps = overlaps

            # This header should be printed later, before the charges are printed, which of course
            # are just sums of the overlaps and could be calculated. But we just go ahead and
            # parse them, make sure they're consistent with previously parsed values and
            # use these since they are more precise (previous precision could have been just 0.01).
            while "Total gross population on atoms" not in line:
                line = next(inputfile)
            self.skip_line(inputfile, 'blank')

            charges = []
            for i in range(self.natom):
                line = next(inputfile)
                iatom, element, ncharge, epop = line.split()
                iatom = int(iatom)
                ncharge = float(ncharge)
                epop = float(epop)
                assert iatom == (i+1)
                charges.append(epop-ncharge)

            if not hasattr(self, 'atomcharges'):
                self.atomcharges = {}
            if not "mulliken" in self.atomcharges:
                self.atomcharges['mulliken'] = charges
            else:
                assert max(self.atomcharges['mulliken'] - numpy.array(charges)) < 0.01
                self.atomcharges['mulliken'] = charges

        # NWChem prints the dipole moment in atomic units first, and we could just fast forward
        # to the values in Debye, which are also printed. But we can also just convert them
        # right away and so parse a little bit less. Note how the reference point is printed
        # here within the block nicely, as it is for all moments later.
        #
        # -------------
        # Dipole Moment
        # -------------
        #
        # Center of charge (in au) is the expansion point
        # X = 0.0000000 Y = 0.0000000 Z = 0.0000000
        #
        # Dipole moment 0.0000000000 Debye(s)
        # DMX 0.0000000000 DMXEFC 0.0000000000
        # DMY 0.0000000000 DMYEFC 0.0000000000
        # DMZ -0.0000000000 DMZEFC 0.0000000000
        #
        # ...
        #
        if line.strip() == "Dipole Moment":

            self.skip_lines(inputfile, ['d', 'b'])

            reference_comment = next(inputfile)
            assert "(in au)" in reference_comment
            reference = next(inputfile).split()
            self.reference = [reference[-7], reference[-4], reference[-1]]
            self.reference = numpy.array([float(x) for x in self.reference])
            self.reference = utils.convertor(self.reference, 'bohr', 'Angstrom')

            self.skip_line(inputfile, 'blank')

            magnitude = next(inputfile)
            assert magnitude.split()[-1] == "A.U."

            dipole = []
            for i in range(3):
                line = next(inputfile)
                dipole.append(float(line.split()[1]))

            dipole = utils.convertor(numpy.array(dipole), "ebohr", "Debye")

            if not hasattr(self, 'moments'):
                self.moments = [self.reference, dipole]
            else:
                self.moments[1] = dipole

        # The quadrupole moment is pretty straightforward to parse. There are several
        # blocks printed, and the first one called 'second moments' contains the raw
        # moments, and later traceless values are printed. The moments, however, are
        # not in lexicographical order, so we need to sort them. Also, the first block
        # is in atomic units, so remember to convert to Buckinghams along the way.
        #
        # -----------------
        # Quadrupole Moment
        # -----------------
        #
        # Center of charge (in au) is the expansion point
        # X = 0.0000000 Y = 0.0000000 Z = 0.0000000
        #
        # < R**2 > = ********** a.u. ( 1 a.u. = 0.280023 10**(-16) cm**2 )
        # ( also called diamagnetic susceptibility )
        #
        # Second moments in atomic units
        #
        # Component Electronic+nuclear Point charges Total
        # --------------------------------------------------------------------------
        # XX -38.3608511210 0.0000000000 -38.3608511210
        # YY -39.0055467347 0.0000000000 -39.0055467347
        # ...
        #
        if line.strip() == "Quadrupole Moment":

            self.skip_lines(inputfile, ['d', 'b'])

            reference_comment = next(inputfile)
            assert "(in au)" in reference_comment
            reference = next(inputfile).split()
            self.reference = [reference[-7], reference[-4], reference[-1]]
            self.reference = numpy.array([float(x) for x in self.reference])
            self.reference = utils.convertor(self.reference, 'bohr', 'Angstrom')

            self.skip_lines(inputfile, ['b', 'units', 'susc', 'b'])

            line = next(inputfile)
            assert line.strip() == "Second moments in atomic units"

            self.skip_lines(inputfile, ['b', 'header', 'd'])

            # Parse into a dictionary and then sort by the component key.
            quadrupole = {}
            for i in range(6):
                line = next(inputfile)
                quadrupole[line.split()[0]] = float(line.split()[-1])
            lex = sorted(quadrupole.keys())
            quadrupole = [quadrupole[key] for key in lex]

            quadrupole = utils.convertor(numpy.array(quadrupole), "ebohr2", "Buckingham")

            # The checking of potential previous values is a bit more involved here,
            # because it turns out NWChem has separate keywords for dipole, quadrupole
            # and octupole output. So, it is perfectly possible to print the quadrupole
            # and not the dipole. If that is the case, set the dipole to None and
            # issue a warning. Also, a regression has been added to cover this case.
            if not hasattr(self, 'moments') or len(self.moments) < 2:
                self.logger.warning("Found quadrupole moments but no previous dipole")
                self.moments = [self.reference, None, quadrupole]
            else:
                if len(self.moments) == 2:
                    self.moments.append(quadrupole)
                else:
                    assert self.moments[2] == quadrupole

        # The octupole moment is analogous to the quadrupole, but there are more components
        # and the checking of previously parsed dipole and quadrupole moments is more involved,
        # with a corresponding test also added to regressions.
# # --------------- # Octupole Moment # --------------- # # Center of charge (in au) is the expansion point # X = 0.0000000 Y = 0.0000000 Z = 0.0000000 # # Third moments in atomic units # # Component Electronic+nuclear Point charges Total # -------------------------------------------------------------------------- # XXX -0.0000000000 0.0000000000 -0.0000000000 # YYY -0.0000000000 0.0000000000 -0.0000000000 # ... # if line.strip() == "Octupole Moment": self.skip_lines(inputfile, ['d', 'b']) reference_comment = next(inputfile) assert "(in au)" in reference_comment reference = next(inputfile).split() self.reference = [reference[-7], reference[-4], reference[-1]] self.reference = numpy.array([float(x) for x in self.reference]) self.reference = utils.convertor(self.reference, 'bohr', 'Angstrom') self.skip_line(inputfile, 'blank') line = next(inputfile) assert line.strip() == "Third moments in atomic units" self.skip_lines(inputfile, ['b', 'header', 'd']) octupole = {} for i in range(10): line = next(inputfile) octupole[line.split()[0]] = float(line.split()[-1]) lex = sorted(octupole.keys()) octupole = [octupole[key] for key in lex] octupole = utils.convertor(numpy.array(octupole), "ebohr3", "Debye.ang2") if not hasattr(self, 'moments') or len(self.moments) < 2: self.logger.warning("Found octupole moments but no previous dipole or quadrupole moments") self.moments = [self.reference, None, None, octupole] elif len(self.moments) == 2: self.logger.warning("Found octupole moments but no previous quadrupole moments") self.moments.append(None) self.moments.append(octupole) else: if len(self.moments) == 3: self.moments.append(octupole) else: assert self.moments[3] == octupole if "Total MP2 energy" in line: mpenerg = float(line.split()[-1]) if not hasattr(self, "mpenergies"): self.mpenergies = [] self.mpenergies.append([]) self.mpenergies[-1].append(utils.convertor(mpenerg, "hartree", "eV")) if "CCSD(T) total energy / hartree" in line: ccenerg = float(line.split()[-1]) if not 
hasattr(self, "ccenergies"):
                self.ccenergies = []
            self.ccenergies.append([])
            self.ccenergies[-1].append(utils.convertor(ccenerg, "hartree", "eV"))


if __name__ == "__main__":
    import doctest, nwchemparser
    doctest.testmod(nwchemparser, verbose=False)

cclib-1.3.1/src/cclib/parser/jaguarparser.py

# -*- coding: utf-8 -*-
#
# This file is part of cclib (http://cclib.github.io), a library for parsing
# and interpreting the results of computational chemistry packages.
#
# Copyright (C) 2006-2014, the cclib development team
#
# The library is free software, distributed under the terms of
# the GNU Lesser General Public version 2.1 or later. You should have
# received a copy of the license along with cclib. You can also access
# the full license online at http://www.gnu.org/copyleft/lgpl.html.

"""Parser for Jaguar output files"""


import re

import numpy

from . import logfileparser
from . import utils


class Jaguar(logfileparser.Logfile):
    """A Jaguar output file"""

    def __init__(self, *args, **kwargs):

        # Call the __init__ method of the superclass
        super(Jaguar, self).__init__(logname="Jaguar", *args, **kwargs)

    def __str__(self):
        """Return a string representation of the object."""
        return "Jaguar output file %s" % (self.filename)

    def __repr__(self):
        """Return a representation of the object."""
        return 'Jaguar("%s")' % (self.filename)

    def normalisesym(self, label):
        """Normalise the symmetries used by Jaguar.

        To normalise, three rules need to be applied:
        (1) To handle orbitals of E symmetry, retain everything before the /
        (2) Replace two p's by "
        (3) Replace any remaining single p's by '

        >>> t = Jaguar("dummyfile").normalisesym
        >>> labels = ['A', 'A1', 'Ag', 'Ap', 'App', "A1p", "A1pp", "E1pp/Ap"]
        >>> [t(label) for label in labels]
        ['A', 'A1', 'Ag', "A'", 'A"', "A1'", 'A1"', 'E1"']
        """
        ans = label.split("/")[0].replace("pp", '"').replace("p", "'")
        return ans

    def before_parsing(self):

        # We need to track whether we are inside geometry optimization in order
        # to parse SCF targets/values correctly.
        self.geoopt = False

    def after_parsing(self):

        # This is to make sure we always have optdone after geometry optimizations,
        # even if it is to be empty for unconverged runs. We have yet to test this
        # with a regression for Jaguar, though.
        if self.geoopt and not hasattr(self, 'optdone'):
            self.optdone = []

    def extract(self, inputfile, line):
        """Extract information from the file object inputfile."""

        # Extract charge and multiplicity
        if line[2:22] == "net molecular charge":
            self.set_attribute('charge', int(line.split()[-1]))
            self.set_attribute('mult', int(next(inputfile).split()[-1]))

        # The Gaussian basis set information is printed before the geometry, and we need
        # to do some indexing to get this into cclib format, because fn increments
        # for each angular momentum component, but cclib does not (we have just P instead of
        # all three X/Y/Z with the same parameters). On the other hand, fn enumerates
        # the atomic orbitals correctly, so use it to build atombasis.
        #
        # Gaussian basis set information
        #
        # renorm mfac*renorm
        # atom fn prim L z coef coef coef
        # -------- ----- ---- --- ------------- ----------- ----------- -----------
        # C1 1 1 S 7.161684E+01 1.5433E-01 2.7078E+00 2.7078E+00
        # C1 1 2 S 1.304510E+01 5.3533E-01 2.6189E+00 2.6189E+00
        # ...
# C1 3 6 X 2.941249E+00 2.2135E-01 1.2153E+00 1.2153E+00 # 4 Y 1.2153E+00 # 5 Z 1.2153E+00 # C1 2 8 S 2.222899E-01 1.0000E+00 2.3073E-01 2.3073E-01 # C1 3 7 X 6.834831E-01 8.6271E-01 7.6421E-01 7.6421E-01 # ... # C2 6 1 S 7.161684E+01 1.5433E-01 2.7078E+00 2.7078E+00 # ... # if line.strip() == "Gaussian basis set information": self.skip_lines(inputfile, ['b', 'renorm', 'header', 'd']) # This is probably the only place we can get this information from Jaguar. self.gbasis = [] atombasis = [] line = next(inputfile) fn_per_atom = [] while line.strip(): if len(line.split()) > 3: aname = line.split()[0] fn = int(line.split()[1]) prim = int(line.split()[2]) L = line.split()[3] z = float(line.split()[4]) coef = float(line.split()[5]) # The primitive count is reset for each atom, so use that for adding # new elements to atombasis and gbasis. We could also probably do this # using the atom name, although that perhaps might not always be unique. if prim == 1: atombasis.append([]) fn_per_atom = [] self.gbasis.append([]) # Remember that fn is repeated when functions are contracted. if not fn-1 in atombasis[-1]: atombasis[-1].append(fn-1) # Here we use fn only to know when a new contraction is encountered, # so we don't need to decrement it, and we don't even use all values. # What's more, since we only wish to save the parameters for each subshell # once, we don't even need to consider lines for orbitals other than # those for X*, making things a bit easier. if not fn in fn_per_atom: fn_per_atom.append(fn) label = {'S': 'S', 'X': 'P', 'XX': 'D', 'XXX': 'F'}[L] self.gbasis[-1].append((label, [])) igbasis = fn_per_atom.index(fn) self.gbasis[-1][igbasis][1].append([z, coef]) else: fn = int(line.split()[0]) L = line.split()[1] # Some AO indices are only printed in these lines, for L > 0. if not fn-1 in atombasis[-1]: atombasis[-1].append(fn-1) line = next(inputfile) # The indices for atombasis can also be read later from the molecular orbital output. 
self.set_attribute('atombasis', atombasis) # This length of atombasis should always be the number of atoms. self.set_attribute('natom', len(self.atombasis)) # Effective Core Potential # # Atom Electrons represented by ECP # Mo 36 # Maximum angular term 3 # F Potential 1/r^n Exponent Coefficient # ----- -------- ----------- # 0 140.4577691 -0.0469492 # 1 89.4739342 -24.9754989 # ... # S-F Potential 1/r^n Exponent Coefficient # ----- -------- ----------- # 0 33.7771969 2.9278406 # 1 10.0120020 34.3483716 # ... # O 0 # Cl 10 # Maximum angular term 2 # D Potential 1/r^n Exponent Coefficient # ----- -------- ----------- # 1 94.8130000 -10.0000000 # ... if line.strip() == "Effective Core Potential": self.skip_line(inputfile, 'blank') line = next(inputfile) assert line.split()[0] == "Atom" assert " ".join(line.split()[1:]) == "Electrons represented by ECP" self.coreelectrons = [] line = next(inputfile) while line.strip(): if len(line.split()) == 2: self.coreelectrons.append(int(line.split()[1])) line = next(inputfile) if line[2:14] == "new geometry" or line[1:21] == "Symmetrized geometry" or line.find("Input geometry") > 0: # Get the atom coordinates if not hasattr(self, "atomcoords") or line[1:21] == "Symmetrized geometry": # Wipe the "Input geometry" if "Symmetrized geometry" present self.atomcoords = [] p = re.compile("(\D+)\d+") # One/more letters followed by a number atomcoords = [] atomnos = [] angstrom = next(inputfile) title = next(inputfile) line = next(inputfile) while line.strip(): temp = line.split() element = p.findall(temp[0])[0] atomnos.append(self.table.number[element]) atomcoords.append(list(map(float, temp[1:]))) line = next(inputfile) self.atomcoords.append(atomcoords) self.atomnos = numpy.array(atomnos, "i") self.set_attribute('natom', len(atomcoords)) # Hartree-Fock energy after SCF if line[1:18] == "SCFE: SCF energy:": if not hasattr(self, "scfenergies"): self.scfenergies = [] temp = line.strip().split() scfenergy = float(temp[temp.index("hartrees") 
- 1]) scfenergy = utils.convertor(scfenergy, "hartree", "eV") self.scfenergies.append(scfenergy) # Energy after LMP2 correction if line[1:18] == "Total LMP2 Energy": if not hasattr(self, "mpenergies"): self.mpenergies = [[]] lmp2energy = float(line.split()[-1]) lmp2energy = utils.convertor(lmp2energy, "hartree", "eV") self.mpenergies[-1].append(lmp2energy) if line[15:45] == "Geometry optimization complete": if not hasattr(self, 'optdone'): self.optdone = [] self.optdone.append(len(self.geovalues) - 1) if line.find("number of occupied orbitals") > 0: # Get number of MOs occs = int(line.split()[-1]) line = next(inputfile) virts = int(line.split()[-1]) self.nmo = occs + virts self.homos = numpy.array([occs-1], "i") self.unrestrictedflag = False if line[1:28] == "number of occupied orbitals": self.homos = numpy.array([float(line.strip().split()[-1])-1], "i") if line[2:27] == "number of basis functions": nbasis = int(line.strip().split()[-1]) self.set_attribute('nbasis', nbasis) if line.find("number of alpha occupied orb") > 0: # Get number of MOs for an unrestricted calc aoccs = int(line.split()[-1]) line = next(inputfile) avirts = int(line.split()[-1]) line = next(inputfile) boccs = int(line.split()[-1]) line = next(inputfile) bvirt = int(line.split()[-1]) self.nmo = aoccs + avirts self.homos = numpy.array([aoccs-1, boccs-1], "i") self.unrestrictedflag = True if line[0:4] == "etot": # Get SCF convergence information if not hasattr(self, "scfvalues"): self.scfvalues = [] self.scftargets = [[5E-5, 5E-6]] values = [] while line[0:4] == "etot": # Jaguar 4.2 # etot 1 N N 0 N -382.08751886450 2.3E-03 1.4E-01 # etot 2 Y Y 0 N -382.27486023153 1.9E-01 1.4E-03 5.7E-02 # Jaguar 6.5 # etot 1 N N 0 N -382.08751881733 2.3E-03 1.4E-01 # etot 2 Y Y 0 N -382.27486018708 1.9E-01 1.4E-03 5.7E-02 temp = line.split()[7:] if len(temp)==3: denergy = float(temp[0]) else: denergy = 0 # Should really be greater than target value # or should we just ignore the values in this line ddensity = 
float(temp[-2]) maxdiiserr = float(temp[-1]) if not self.geoopt: values.append([denergy, ddensity]) else: values.append([ddensity]) line = next(inputfile) self.scfvalues.append(values) # MO energies and symmetries. # Jaguar 7.0: provides energies and symmetries for both # restricted and unrestricted calculations, like this: # Alpha Orbital energies/symmetry label: # -10.25358 Bu -10.25353 Ag -10.21931 Bu -10.21927 Ag # -10.21792 Bu -10.21782 Ag -10.21773 Bu -10.21772 Ag # ... # Jaguar 6.5: prints both only for restricted calculations, # so for unrestricted calculations the output looks like this: # Alpha Orbital energies: # -10.25358 -10.25353 -10.21931 -10.21927 -10.21792 -10.21782 # -10.21773 -10.21772 -10.21537 -10.21537 -1.02078 -0.96193 # ... # Presence of 'Orbital energies' is enough to catch all versions. if "Orbital energies" in line: # Parsing results is identical for restricted/unrestricted # calculations, just assert later that alpha/beta order is OK. spin = int(line[2:6] == "Beta") # Check if symmetries are printed also. issyms = "symmetry label" in line if not hasattr(self, "moenergies"): self.moenergies = [] if issyms and not hasattr(self, "mosyms"): self.mosyms = [] # Grow moenergies/mosyms and make sure they are empty when # parsed multiple times - currently cclib returns only # the final output (e.g. in a geometry optimization).
if len(self.moenergies) < spin+1: self.moenergies.append([]) self.moenergies[spin] = [] if issyms: if len(self.mosyms) < spin+1: self.mosyms.append([]) self.mosyms[spin] = [] line = next(inputfile).split() while len(line) > 0: if issyms: energies = [float(line[2*i]) for i in range(len(line)//2)] syms = [line[2*i+1] for i in range(len(line)//2)] else: energies = [float(e) for e in line] energies = [utils.convertor(e, "hartree", "eV") for e in energies] self.moenergies[spin].extend(energies) if issyms: syms = [self.normalisesym(s) for s in syms] self.mosyms[spin].extend(syms) line = next(inputfile).split() line = next(inputfile) # The second trigger string is in the version 8.3 unit test and the first one was # encountered in version 6.x and is followed by a bit different format. In particular, # the line with occupations is missing in each block. Here is a fragment of this block # from version 8.3: # # ***************************************** # # occupied + virtual orbitals: final wave function # # ***************************************** # # # 1 2 3 4 5 # eigenvalues- -11.04064 -11.04058 -11.03196 -11.03196 -11.02881 # occupations- 2.00000 2.00000 2.00000 2.00000 2.00000 # 1 C1 S 0.70148 0.70154 -0.00958 -0.00991 0.00401 # 2 C1 S 0.02527 0.02518 0.00380 0.00374 0.00371 # ... 
# if line.find("Occupied + virtual Orbitals- final wvfn") > 0 or \ line.find("occupied + virtual orbitals: final wave function") > 0: self.skip_lines(inputfile, ['b', 's', 'b', 'b']) if not hasattr(self,"mocoeffs"): self.mocoeffs = [] aonames = [] lastatom = "X" readatombasis = False if not hasattr(self, "atombasis"): self.atombasis = [] for i in range(self.natom): self.atombasis.append([]) readatombasis = True offset = 0 spin = 1 + int(self.unrestrictedflag) for s in range(spin): mocoeffs = numpy.zeros((len(self.moenergies[s]), self.nbasis), "d") if s == 1: #beta case self.skip_lines(inputfile, ['s', 'b', 'title', 'b', 's', 'b', 'b']) for k in range(0, len(self.moenergies[s]), 5): self.updateprogress(inputfile, "Coefficients") # All known versions have a line with indices followed by the eigenvalues. self.skip_lines(inputfile, ['numbers', 'eigens']) # Newer versions also have a line with occupation numbers here. line = next(inputfile) if "occupations-" in line: line = next(inputfile) for i in range(self.nbasis): info = line.split() # Fill atombasis only first time around.
if readatombasis and k == 0: orbno = int(info[0]) atom = info[1] if atom[1].isalpha(): atomno = int(atom[2:]) else: atomno = int(atom[1:]) self.atombasis[atomno-1].append(orbno-1) if not hasattr(self,"aonames"): if lastatom != info[1]: scount = 1 pcount = 3 dcount = 6 #six d orbitals in Jaguar if info[2] == 'S': aonames.append("%s_%i%s"%(info[1], scount, info[2])) scount += 1 if info[2] == 'X' or info[2] == 'Y' or info[2] == 'Z': aonames.append("%s_%iP%s"%(info[1], pcount / 3, info[2])) pcount += 1 if info[2] == 'XX' or info[2] == 'YY' or info[2] == 'ZZ' or \ info[2] == 'XY' or info[2] == 'XZ' or info[2] == 'YZ': aonames.append("%s_%iD%s"%(info[1], dcount / 6, info[2])) dcount += 1 lastatom = info[1] for j in range(len(info[3:])): mocoeffs[j+k, i] = float(info[3+j]) line = next(inputfile) if not hasattr(self,"aonames"): self.aonames = aonames offset += 5 self.mocoeffs.append(mocoeffs) # Atomic charges from Mulliken population analysis: # # Atom C1 C2 C3 C4 C5 # Charge 0.00177 -0.06075 -0.05956 0.00177 -0.06075 # # Atom H6 H7 H8 C9 C10 # ... if line.strip() == "Atomic charges from Mulliken population analysis:": if not hasattr(self, 'atomcharges'): self.atomcharges = {} charges = [] self.skip_line(inputfile, "blank") line = next(inputfile) while "sum of atomic charges" not in line: assert line.split()[0] == "Atom" line = next(inputfile) assert line.split()[0] == "Charge" charges.extend([float(c) for c in line.split()[1:]]) self.skip_line(inputfile, "blank") line = next(inputfile) self.atomcharges['mulliken'] = charges if (line[2:6] == "olap") or (line.strip() == "overlap matrix:"): if line[6] == "-": return # This was continue (in loop) before parser refactoring. 
# continue # avoid "olap-dev" self.aooverlaps = numpy.zeros((self.nbasis, self.nbasis), "d") for i in range(0, self.nbasis, 5): self.updateprogress(inputfile, "Overlap") self.skip_lines(inputfile, ['b', 'header']) for j in range(i, self.nbasis): temp = list(map(float, next(inputfile).split()[1:])) self.aooverlaps[j, i:(i+len(temp))] = temp self.aooverlaps[i:(i+len(temp)), j] = temp if line[2:24] == "start of program geopt": if not self.geoopt: # Need to keep only the RMS density change info # if this is a geometry optimization self.scftargets = [[self.scftargets[0][0]]] if hasattr(self, "scfvalues"): self.scfvalues[0] = [[x[0]] for x in self.scfvalues[0]] self.geoopt = True else: self.scftargets.append([5E-5]) # Get Geometry Opt convergence information # # geometry optimization step 7 # energy: -382.30219111487 hartrees # [ turning on trust-radius adjustment ] # ** restarting optimization from step 6 ** # # # Level shifts adjusted to satisfy step-size constraints # Step size: 0.0360704 # Cos(theta): 0.8789215 # Final level shift: -8.6176299E-02 # # energy change: 2.5819E-04 . ( 5.0000E-05 ) # gradient maximum: 5.0947E-03 . ( 4.5000E-04 ) # gradient rms: 1.2996E-03 . ( 3.0000E-04 ) # displacement maximum: 1.3954E-02 . ( 1.8000E-03 ) # displacement rms: 4.6567E-03 . ( 1.2000E-03 ) # if line[2:28] == "geometry optimization step": if not hasattr(self, "geovalues"): self.geovalues = [] self.geotargets = numpy.zeros(5, "d") gopt_step = int(line.split()[-1]) energy = next(inputfile) blank = next(inputfile) # A quick hack for messages that show up right after the energy # at this point, which include: # ** restarting optimization from step 2 ** # [ turning on trust-radius adjustment ] # as found in regression file ptnh3_2_H2O_2_2plus.out and other logfiles. restarting_from_1 = False while blank.strip(): if blank.strip() == "** restarting optimization from step 1 **": restarting_from_1 = True blank = next(inputfile) # One or more blank lines, depending on content.
line = next(inputfile) while not line.strip(): line = next(inputfile) # Note that the level shift message is followed by a blank, too. if "Level shifts adjusted" in line: while line.strip(): line = next(inputfile) line = next(inputfile) # The first optimization step does not produce an energy change, and # there is also no energy change when the optimization is restarted # from step 1 (since step 1 had no change). values = [] target_index = 0 if (gopt_step == 1) or restarting_from_1: values.append(0.0) target_index = 1 while line.strip(): if len(line) > 40 and line[41] == "(": # A new geo convergence value values.append(float(line[26:37])) self.geotargets[target_index] = float(line[43:54]) target_index += 1 line = next(inputfile) self.geovalues.append(values) # IR output looks like this: # frequencies 72.45 113.25 176.88 183.76 267.60 312.06 # symmetries Au Bg Au Bu Ag Bg # intensities 0.07 0.00 0.28 0.52 0.00 0.00 # reduc. mass 1.90 0.74 1.06 1.42 1.19 0.85 # force const 0.01 0.01 0.02 0.03 0.05 0.05 # C1 X 0.00000 0.00000 0.00000 -0.05707 -0.06716 0.00000 # C1 Y 0.00000 0.00000 0.00000 0.00909 -0.02529 0.00000 # C1 Z 0.04792 -0.06032 -0.01192 0.00000 0.00000 0.11613 # C2 X 0.00000 0.00000 0.00000 -0.06094 -0.04635 0.00000 # ... etc. ... # This is a complete output; some files will not have intensities, # and older Jaguar versions sometimes skip the symmetries. if line[2:23] == "start of program freq": self.skip_line(inputfile, 'blank') # Version 8.3 has two blank lines here, earlier versions just one. line = next(inputfile) if not line.strip(): line = next(inputfile) self.vibfreqs = [] self.vibdisps = [] forceconstants = False intensities = False while line.strip(): if "force const" in line: forceconstants = True if "intensities" in line: intensities = True line = next(inputfile) # In older versions, the last block had an extra blank line after it, # which could be caught.
This is not true in newer version (including 8.3), # but in general it would be better to bound this loop more strictly. freqs = next(inputfile) while freqs.strip() and not "imaginary frequencies" in freqs: # Number of modes (columns printed in this block). nmodes = len(freqs.split())-1 # Append the frequencies. self.vibfreqs.extend(list(map(float, freqs.split()[1:]))) line = next(inputfile).split() # May skip symmetries (older Jaguar versions). if line[0] == "symmetries": if not hasattr(self, "vibsyms"): self.vibsyms = [] self.vibsyms.extend(list(map(self.normalisesym, line[1:]))) line = next(inputfile).split() if intensities: if not hasattr(self, "vibirs"): self.vibirs = [] self.vibirs.extend(list(map(float, line[1:]))) line = next(inputfile).split() if forceconstants: line = next(inputfile) # Start parsing the displacements. # Variable 'q' holds up to 7 lists of triplets. q = [ [] for i in range(7) ] for n in range(self.natom): # Variable 'p' holds up to 7 triplets. p = [ [] for i in range(7) ] for i in range(3): line = next(inputfile) disps = [float(disp) for disp in line.split()[2:]] for j in range(nmodes): p[j].append(disps[j]) for i in range(nmodes): q[i].append(p[i]) self.vibdisps.extend(q[:nmodes]) self.skip_line(inputfile, 'blank') freqs = next(inputfile) # Convert new data to arrays. self.vibfreqs = numpy.array(self.vibfreqs, "d") self.vibdisps = numpy.array(self.vibdisps, "d") if hasattr(self, "vibirs"): self.vibirs = numpy.array(self.vibirs, "d") # Parse excited state output (for CIS calculations). # Jaguar calculates only singlet states. 
if line[2:15] == "Excited State": if not hasattr(self, "etenergies"): self.etenergies = [] if not hasattr(self, "etoscs"): self.etoscs = [] if not hasattr(self, "etsecs"): self.etsecs = [] self.etsyms = [] etenergy = float(line.split()[3]) etenergy = utils.convertor(etenergy, "eV", "cm-1") self.etenergies.append(etenergy) self.skip_lines(inputfile, ['line', 'line', 'line', 'line']) line = next(inputfile) self.etsecs.append([]) # Jaguar calculates only singlet states. self.etsyms.append('Singlet-A') while line.strip() != "": fromMO = int(line.split()[0])-1 toMO = int(line.split()[2])-1 coeff = float(line.split()[-1]) self.etsecs[-1].append([(fromMO, 0), (toMO, 0), coeff]) line = next(inputfile) # Skip 3 lines for i in range(4): line = next(inputfile) strength = float(line.split()[-1]) self.etoscs.append(strength) if __name__ == "__main__": import doctest, jaguarparser doctest.testmod(jaguarparser, verbose=False) cclib-1.3.1/src/cclib/parser/psiparser.py0000644000175100016050000011547012467423323020210 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2008-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Parser for Psi3 and Psi4 output files""" import re import numpy from . import logfileparser from . 
import utils class Psi(logfileparser.Logfile): """A Psi log file.""" def __init__(self, *args, **kwargs): # Call the __init__ method of the superclass super(Psi, self).__init__(logname="Psi", *args, **kwargs) def __str__(self): """Return a string representation of the object.""" return "Psi log file %s" % (self.filename) def __repr__(self): """Return a representation of the object.""" return 'Psi("%s")' % (self.filename) def before_parsing(self): # There are some major differences between the output of Psi3 and Psi4, # so it will be useful to register which one we are dealing with. self.version = None # This is just used to track which part of the output we are in for Psi4, # with changes triggered by ==> things like this <== (Psi3 does not have this) self.section = None def normalisesym(self, label): """Use standard symmetry labels instead of NWChem labels. To normalise: (1) If label is one of [SG, PI, PHI, DLTA], replace by [sigma, pi, phi, delta] (2) replace any G or U by their lowercase equivalent >>> sym = NWChem("dummyfile").normalisesym >>> labels = ['A1', 'AG', 'A1G', "SG", "PI", "PHI", "DLTA", 'DLTU', 'SGG'] >>> map(sym, labels) ['A1', 'Ag', 'A1g', 'sigma', 'pi', 'phi', 'delta', 'delta.u', 'sigma.g'] """ # FIXME if necessary return label def extract(self, inputfile, line): """Extract information from the file object inputfile.""" # The version should always be detected. if "PSI3: An Open-Source Ab Initio" in line: self.version = 3 if "PSI4: An Open-Source Ab Initio" in line: self.version = 4 # This will automatically change the section attribute for Psi4, when encountering # a line that <== looks like this ==>, to whatever is in between. 
if (line.strip()[:3] == "==>") and (line.strip()[-3:] == "<=="): self.section = line.strip()[4:-4] # Psi3 prints the coordinates in several configurations, and we will parse the # canonical coordinate system in Angstroms as the first coordinate set, # although it is actually somewhere later in the input, after basis set, etc. # We can also get or verify the number of atoms and atomic numbers from this block. if (self.version == 3) and (line.strip() == "-Geometry in the canonical coordinate system (Angstrom):"): self.skip_lines(inputfile, ['header', 'd']) coords = [] numbers = [] line = next(inputfile) while line.strip(): element = line.split()[0] numbers.append(self.table.number[element]) x = float(line.split()[1]) y = float(line.split()[2]) z = float(line.split()[3]) coords.append([x,y,z]) line = next(inputfile) self.set_attribute('natom', len(coords)) self.set_attribute('atomnos', numbers) if not hasattr(self, 'atomcoords'): self.atomcoords = [] self.atomcoords.append(coords) # ==> Geometry <== # # Molecular point group: c2h # Full point group: C2h # # Geometry (in Angstrom), charge = 0, multiplicity = 1: # # Center X Y Z # ------------ ----------------- ----------------- ----------------- # C -1.415253322400 0.230221785400 0.000000000000 # C 1.415253322400 -0.230221785400 0.000000000000 # ... # if (self.section == "Geometry") and ("Geometry (in Angstrom), charge" in line): assert line.split()[3] == "charge" charge = int(line.split()[5].strip(',')) self.set_attribute('charge', charge) assert line.split()[6] == "multiplicity" mult = int(line.split()[8].strip(':')) self.set_attribute('mult', mult) self.skip_line(inputfile, "blank") line = next(inputfile) # Usually there is the header and dashes, but, for example, the coordinates # printed when a geometry optimization finishes do not have it.
if line.split()[0] == "Center": self.skip_line(inputfile, "dashes") line = next(inputfile) elements = [] coords = [] while line.strip(): el, x, y, z = line.split() elements.append(el) coords.append([float(x), float(y), float(z)]) line = next(inputfile) self.set_attribute('atomnos', [self.table.number[el] for el in elements]) if not hasattr(self, 'atomcoords'): self.atomcoords = [] # This condition discards any repeated coordinates that Psi prints. For example, # geometry optimizations will print the coordinates at the beginning of an SCF # section and also at the start of the gradient calculation. if len(self.atomcoords) == 0 or self.atomcoords[-1] != coords: self.atomcoords.append(coords) # In Psi3 there are these two helpful sections. if (self.version == 3) and (line.strip() == '-SYMMETRY INFORMATION:'): line = next(inputfile) while line.strip(): if "Number of atoms" in line: self.set_attribute('natom', int(line.split()[-1])) line = next(inputfile) if (self.version == 3) and (line.strip() == "-BASIS SET INFORMATION:"): line = next(inputfile) while line.strip(): if "Number of AO" in line: self.set_attribute('nbasis', int(line.split()[-1])) line = next(inputfile) # Psi4 repeats the charge and multiplicity after the geometry. if (self.section == "Geometry") and (line[2:16].lower() == "charge ="): charge = int(line.split()[-1]) self.set_attribute('charge', charge) if (self.section == "Geometry") and (line[2:16].lower() == "multiplicity ="): mult = int(line.split()[-1]) self.set_attribute('mult', mult) # In Psi3, the section with the contraction scheme can be used to infer atombasis.
if (self.version == 3) and line.strip() == "-Contraction Scheme:": self.skip_lines(inputfile, ['header', 'd']) indices = [] line = next(inputfile) while line.strip(): shells = line.split('//')[-1] expression = shells.strip().replace(' ', '+') expression = expression.replace('s', '*1') expression = expression.replace('p', '*3') expression = expression.replace('d', '*6') nfuncs = eval(expression) if len(indices) == 0: indices.append(range(nfuncs)) else: start = indices[-1][-1] + 1 indices.append(range(start, start+nfuncs)) line = next(inputfile) self.set_attribute('atombasis', indices) # In Psi3, the integrals program prints useful information when invoked. if (self.version == 3) and (line.strip() == "CINTS: An integrals program written in C"): self.skip_lines(inputfile, ['authors', 'd', 'b', 'b']) line = next(inputfile) assert line.strip() == "-OPTIONS:" while line.strip(): line = next(inputfile) line = next(inputfile) assert line.strip() == "-CALCULATION CONSTANTS:" while line.strip(): if "Number of atoms" in line: natom = int(line.split()[-1]) self.set_attribute('natom', natom) if "Number of atomic orbitals" in line: nbasis = int(line.split()[-1]) self.set_attribute('nbasis', nbasis) line = next(inputfile) # In Psi3, this part contains a lot of important data, mostly but not only pertaining to the SCF: if (self.version == 3) and (line.strip() == "CSCF3.0: An SCF program written in C"): self.skip_lines(inputfile, ['b', 'authors', 'b', 'd', 'b', 'mult', 'mult_comment', 'b']) line = next(inputfile) while line.strip(): if line.split()[0] == "multiplicity": mult = int(line.split()[-1]) self.set_attribute('mult', mult) if line.split()[0] == "charge": charge = int(line.split()[-1]) self.set_attribute('charge', charge) if line.split()[0] == "convergence": conv = float(line.split()[-1]) line = next(inputfile) if not hasattr(self, 'scftargets'): self.scftargets = [] self.scftargets.append([conv]) # The printout for Psi4 has a more obvious trigger for the SCF parameter printout.
if (self.section == "Algorithm") and (line.strip() == "==> Algorithm <=="): self.skip_line(inputfile, 'blank') line = next(inputfile) while line.strip(): if "Energy threshold" in line: etarget = float(line.split()[-1]) if "Density threshold" in line: dtarget = float(line.split()[-1]) line = next(inputfile) if not hasattr(self, "scftargets"): self.scftargets = [] self.scftargets.append([etarget, dtarget]) # This section prints contraction information before the atomic basis set functions and # is a good place to parse atombasis indices as well as atomnos. However, the section this line # is in differs between HF and DFT outputs. # # -Contraction Scheme: # Atom Type All Primitives // Shells: # ------ ------ -------------------------- # 1 C 6s 3p // 2s 1p # 2 C 6s 3p // 2s 1p # 3 C 6s 3p // 2s 1p # ... if (self.section == "Primary Basis" or self.section == "DFT Potential") and line.strip() == "-Contraction Scheme:": self.skip_lines(inputfile, ['headers', 'd']) atomnos = [] atombasis = [] atombasis_pos = 0 line = next(inputfile) while line.strip(): element = line.split()[1] atomnos.append(self.table.number[element]) # To count the number of atomic orbitals for the atom, sum up the orbitals # in each type of shell, times the numbers of shells. Currently, we assume # the multiplier is a single digit and that there are only s and p shells, # which will need to be extended later when considering larger basis sets, # with corrections for the cartesian/spherical cases. 
ao_count = 0 shells = line.split('//')[1].split() for s in shells: count, type = s multiplier = 3*(type=='p') or 1 ao_count += multiplier*int(count) if len(atombasis) > 0: atombasis_pos = atombasis[-1][-1] + 1 atombasis.append(list(range(atombasis_pos, atombasis_pos+ao_count))) line = next(inputfile) self.set_attribute('natom', len(atomnos)) self.set_attribute('atomnos', atomnos) self.set_attribute('atombasis', atombasis) # The atomic basis set is straightforward to parse, but there are some complications # when symmetry is used, because in that case Psi4 only prints the symmetry-unique atoms, # and the list of symmetry-equivalent ones is not printed. Therefore, for simplicity here # when an atom is missing (atom indices are printed) assume the atomic orbitals of the # last atom of the same element before it. This might not work if a mixture of basis sets # is used somehow... but it should cover almost all cases for now. # # Note that Psi also prints normalized coefficients (details below). # # ==> AO Basis Functions <== # # [ STO-3G ] # spherical # **** # C 1 # S 3 1.00 # 71.61683700 2.70781445 # 13.04509600 2.61888016 # ...
if (self.section == "AO Basis Functions") and (line.strip() == "==> AO Basis Functions <=="): def get_symmetry_atom_basis(gbasis): """Get symmetry atom by replicating the last atom in gbasis of the same element.""" missing_index = len(gbasis) missing_atomno = self.atomnos[missing_index] ngbasis = len(gbasis) last_same = ngbasis - self.atomnos[:ngbasis][::-1].index(missing_atomno) - 1 return gbasis[last_same] dfact = lambda n: (n <= 0) or n * dfact(n-2) def get_normalization_factor(exp, lx, ly, lz): norm_s = (2*exp/numpy.pi)**0.75 if lx + ly + lz > 0: nom = (4*exp)**((lx+ly+lz)/2.0) den = numpy.sqrt(dfact(2*lx-1) * dfact(2*ly-1) * dfact(2*lz-1)) return norm_s * nom / den else: return norm_s self.skip_lines(inputfile, ['b', 'basisname']) line = next(inputfile) spherical = line.strip() == "spherical" if hasattr(self, 'spherical_basis'): assert self.spherical_basis == spherical else: self.spherical_basis = spherical gbasis = [] self.skip_line(inputfile, 'stars') line = next(inputfile) while line.strip(): element, index = line.split() atomno = self.table.number[element] index = int(index) # This is the code that adds missing atoms when symmetry atoms are excluded # from the basis set printout. Again, this will work only if all atoms of # the same element use the same basis set. while index > len(gbasis) + 1: gbasis.append(get_symmetry_atom_basis(gbasis)) gbasis.append([]) line = next(inputfile) while line.find("*") == -1: # The shell type and primitive count is in the first line. shell_type, nprimitives, smthg = line.split() nprimitives = int(nprimitives) # Get the angular momentum for this shell type. momentum = { 'S' : 0, 'P' : 1, 'D' : 2, 'F' : 3, 'G' : 4 }[shell_type.upper()] # Read in the primitives. primitives_lines = [next(inputfile) for i in range(nprimitives)] primitives = [list(map(float, pl.split())) for pl in primitives_lines] # Un-normalize the coefficients. 
Psi prints the normalized coefficient # of the highest polynomial, namely XX for D orbitals, XXX for F, and so on. for iprim, prim in enumerate(primitives): exp, coef = prim coef = coef / get_normalization_factor(exp, momentum, 0, 0) primitives[iprim] = [exp, coef] primitives = [tuple(p) for p in primitives] shell = [shell_type, primitives] gbasis[-1].append(shell) line = next(inputfile) line = next(inputfile) # We will also need to add symmetry atoms that are missing from the input # at the end of this block, if the symmetry atoms are last. while len(gbasis) < self.natom: gbasis.append(get_symmetry_atom_basis(gbasis)) self.gbasis = gbasis # A block called 'Calculation Information' prints these before starting the SCF. if (self.section == "Pre-Iterations") and ("Number of atoms" in line): natom = int(line.split()[-1]) self.set_attribute('natom', natom) if (self.section == "Pre-Iterations") and ("Number of atomic orbitals" in line): nbasis = int(line.split()[-1]) self.set_attribute('nbasis', nbasis) # ==> Iterations <== # Psi3 converges just the density elements, although it reports in the iterations # changes in the energy as well as the DIIS error. psi3_iterations_header = "iter total energy delta E delta P diiser" if (self.version == 3) and (line.strip() == psi3_iterations_header): if not hasattr(self, 'scfvalues'): self.scfvalues = [] self.scfvalues.append([]) line = next(inputfile) while line.strip(): ddensity = float(line.split()[-2]) self.scfvalues[-1].append([ddensity]) line = next(inputfile) # Psi4 converges both the SCF energy and density elements and reports both in the # iterations printout. However, the default convergence scheme involves a density-fitted # algorithm for efficiency, and this is often followed by a calculation with exact electron # repulsion integrals.
In that case, there are actually two convergence cycles performed, # one for the density-fitted algorithm and one for the exact one, and the iterations are # printed in two blocks separated by some set-up information. if (self.section == "Iterations") and (line.strip() == "==> Iterations <=="): if not hasattr(self, 'scfvalues'): self.scfvalues = [] self.skip_line(inputfile, 'blank') header = next(inputfile) assert header.strip() == "Total Energy Delta E RMS |[F,P]|" scfvals = [] self.skip_line(inputfile, 'blank') line = next(inputfile) while line.strip() != "==> Post-Iterations <==": if line.strip() and line.split()[0] in ["@DF-RHF", "@RHF", "@DF-RKS", "@RKS"]: denergy = float(line.split()[4]) ddensity = float(line.split()[5]) scfvals.append([denergy, ddensity]) line = next(inputfile) self.section = "Post-Iterations" self.scfvalues.append(scfvals) # This section, from which we parse molecular orbital symmetries and # orbital energies, is quite similar for both Psi3 and Psi4, and in fact # the format for orbitals is the same, although the headers and spacers # are a bit different. Let's try to get both parsed with one code block. # # Here is what the block looks like for Psi4: # # Orbital Energies (a.u.) # ----------------------- # # Doubly Occupied: # # 1Bu -11.040586 1Ag -11.040524 2Bu -11.031589 # 2Ag -11.031589 3Bu -11.028950 3Ag -11.028820 # (...) # 15Ag -0.415620 1Bg -0.376962 2Au -0.315126 # 2Bg -0.278361 3Bg -0.222189 # # Virtual: # # 3Au 0.198995 4Au 0.268517 4Bg 0.308826 # 5Au 0.397078 5Bg 0.521759 16Ag 0.565017 # (...) # 24Ag 0.990287 24Bu 1.027266 25Ag 1.107702 # 25Bu 1.124938 # # The case is different in the trigger string. if "orbital energies (a.u.)" in line.lower(): # If this is Psi4, we will be in the appropriate section. assert (self.version == 3) or (self.section == "Post-Iterations") self.moenergies = [[]] self.mosyms = [[]] # Psi4 has dashes under the trigger line, but Psi3 did not.
            if self.version == 4:
                self.skip_line(inputfile, 'dashes')
            self.skip_line(inputfile, 'blank')

            # Both versions have this case-insensitive substring.
            doubly = next(inputfile)
            assert "doubly occupied" in doubly.lower()

            # Psi4 now has a blank line, Psi3 does not.
            if self.version == 4:
                self.skip_line(inputfile, 'blank')

            line = next(inputfile)
            while line.strip():
                for i in range(len(line.split())//2):
                    self.mosyms[0].append(line.split()[i*2][-2:])
                    self.moenergies[0].append(line.split()[i*2+1])
                line = next(inputfile)

            # The last orbital energy read so far represents the HOMO.
            self.homos = [len(self.moenergies[0])-1]

            # Different numbers of blank lines in Psi3 and Psi4.
            if self.version == 3:
                self.skip_line(inputfile, 'blank')

            # The header for virtual orbitals is different for the two versions.
            unoccupied = next(inputfile)
            if self.version == 3:
                assert unoccupied.strip() == "Unoccupied orbitals"
            else:
                assert unoccupied.strip() == "Virtual:"

            # Psi4 now has a blank line, Psi3 does not.
            if self.version == 4:
                self.skip_line(inputfile, 'blank')

            line = next(inputfile)
            while line.strip():
                for i in range(len(line.split())//2):
                    self.mosyms[0].append(line.split()[i*2][-2:])
                    self.moenergies[0].append(line.split()[i*2+1])
                line = next(inputfile)

        # Both Psi3 and Psi4 print the final SCF energy right after the orbital energies,
        # but the label is different. Psi4 also does DFT, and the label is also different in that case.
        if (self.version == 3 and "* SCF total energy" in line) or \
           (self.section == "Post-Iterations" and ("@RHF Final Energy:" in line or "@RKS Final Energy" in line)):
            e = float(line.split()[-1])
            if not hasattr(self, 'scfenergies'):
                self.scfenergies = []
            self.scfenergies.append(utils.convertor(e, 'hartree', 'eV'))

        #  ==> Molecular Orbitals <==
        #
        #                 1            2            3            4            5
        #
        #    1    0.7014827    0.7015412    0.0096801    0.0100168    0.0016438
        #    2    0.0252630    0.0251793   -0.0037890   -0.0037346    0.0016447
        # ...
# 59 0.0000133 -0.0000067 0.0000005 -0.0047455 -0.0047455 # 60 0.0000133 0.0000067 0.0000005 0.0047455 -0.0047455 # # Ene -11.0288198 -11.0286067 -11.0285837 -11.0174766 -11.0174764 # Sym Ag Bu Ag Bu Ag # Occ 2 2 2 2 2 # # # 11 12 13 14 15 # # 1 0.1066946 0.1012709 0.0029709 0.0120562 0.1002765 # 2 -0.2753689 -0.2708037 -0.0102079 -0.0329973 -0.2790813 # ... # if (self.section == "Molecular Orbitals") and (line.strip() == "==> Molecular Orbitals <=="): self.skip_line(inputfile, 'blank') mocoeffs = [] indices = next(inputfile) while indices.strip(): indices = [int(i) for i in indices.split()] if len(mocoeffs) < indices[-1]: for i in range(len(indices)): mocoeffs.append([]) else: assert len(mocoeffs) == indices[-1] self.skip_line(inputfile, 'blank') line = next(inputfile) while line.strip(): iao = int(line.split()[0]) coeffs = [float(c) for c in line.split()[1:]] for i,c in enumerate(coeffs): mocoeffs[indices[i]-1].append(c) line = next(inputfile) energies = next(inputfile) symmetries = next(inputfile) occupancies = next(inputfile) self.skip_lines(inputfile, ['b', 'b']) indices = next(inputfile) if not hasattr(self, 'mocoeffs'): self.mocoeffs = [] self.mocoeffs.append(mocoeffs) # The formats for Mulliken and Lowdin atomic charges are the same, just with # the name changes, so use the same code for both. # # Properties computed using the SCF density density matrix # Mulliken Charges: (a.u.) # Center Symbol Alpha Beta Spin Total # 1 C 2.99909 2.99909 0.00000 0.00182 # 2 C 2.99909 2.99909 0.00000 0.00182 # ... 
        for pop_type in ["Mulliken", "Lowdin"]:
            if line.strip() == "%s Charges: (a.u.)" % pop_type:
                if not hasattr(self, 'atomcharges'):
                    self.atomcharges = {}
                header = next(inputfile)

                line = next(inputfile)
                while not line.strip():
                    line = next(inputfile)

                charges = []
                while line.strip():
                    ch = float(line.split()[-1])
                    charges.append(ch)
                    line = next(inputfile)
                self.atomcharges[pop_type.lower()] = charges

        mp_trigger = "MP2 Total Energy (a.u.)"
        if line.strip()[:len(mp_trigger)] == mp_trigger:
            mpenergy = utils.convertor(float(line.split()[-1]), 'hartree', 'eV')
            if not hasattr(self, 'mpenergies'):
                self.mpenergies = []
            self.mpenergies.append([mpenergy])

        # Note this is just a start and needs to be modified for CCSD(T), etc.
        ccsd_trigger = "* CCSD total energy"
        if line.strip()[:len(ccsd_trigger)] == ccsd_trigger:
            ccsd_energy = utils.convertor(float(line.split()[-1]), 'hartree', 'eV')
            # Check the correctly spelled attribute name, otherwise the list
            # would be reset every time a CCSD energy is found.
            if not hasattr(self, "ccenergies"):
                self.ccenergies = []
            self.ccenergies.append(ccsd_energy)

        # The geometry convergence targets and values are printed in a table, with the legends
        # describing the convergence annotation. Exact slicing of the line probably needs
        # to be done in order to extract the numbers correctly. If there are no values for
        # a particular target it means they are not used (marked also with an 'o'), and in this case
        # we will set a value of numpy.inf so that any value will be smaller.
        #
        #  ==> Convergence Check <==
        #
        #  Measures of convergence in internal coordinates in au.
        #  Criteria marked as inactive (o), active & met (*), and active & unmet ( ).
        #  ---------------------------------------------------------------------------------------------
        #   Step     Total Energy     Delta E     MAX Force     RMS Force      MAX Disp      RMS Disp
        #  ---------------------------------------------------------------------------------------------
        #    Convergence Criteria    1.00e-06 *    3.00e-04 *             o    1.20e-03 *             o
        #  ---------------------------------------------------------------------------------------------
        #      2    -379.77675264   -7.79e-03      1.88e-02      4.37e-03 o    2.29e-02      6.76e-03 o  ~
        #  ---------------------------------------------------------------------------------------------
        #
        if (self.section == "Convergence Check") and line.strip() == "==> Convergence Check <==":

            self.skip_lines(inputfile, ['b', 'units', 'comment', 'dash+tilde', 'header', 'dash+tilde'])

            # These are the positions in the line at which the numbers should start.
            starts = [27, 41, 55, 69, 83]

            criteria = next(inputfile)
            geotargets = []
            for istart in starts:
                if criteria[istart:istart+9].strip():
                    geotargets.append(float(criteria[istart:istart+9]))
                else:
                    geotargets.append(numpy.inf)

            self.skip_line(inputfile, 'dashes')

            values = next(inputfile)
            geovalues = []
            for istart in starts:
                if values[istart:istart+9].strip():
                    geovalues.append(float(values[istart:istart+9]))

            # This assertion may be too restrictive, but we haven't seen the geotargets change.
            # If such an example comes up, update the value since we're interested in the last ones.
            if not hasattr(self, 'geotargets'):
                self.geotargets = geotargets
            else:
                assert self.geotargets == geotargets

            if not hasattr(self, 'geovalues'):
                self.geovalues = []
            self.geovalues.append(geovalues)

        # This message signals a converged optimization, in which case we want
        # to append the index for this step to optdone, which should be equal
        # to the number of geovalues gathered so far.
        if line.strip() == "**** Optimization is complete! ****":
            if not hasattr(self, 'optdone'):
                self.optdone = []
            self.optdone.append(len(self.geovalues))

        # This message means that optimization has stopped for some reason, but we
        # still want optdone to exist in this case, although it will be an empty list.
        if line.strip() == "Optimizer: Did not converge!":
            if not hasattr(self, 'optdone'):
                self.optdone = []

        # The reference point at which properties are evaluated in Psi4 is explicitly stated,
        # so we can save it for later. It is not, however, a part of the Properties section,
        # but it appears before it and also in other places where properties that might depend
        # on it are printed.
        #
        # Properties will be evaluated at   0.000000,   0.000000,   0.000000 Bohr
        #
        if (self.version == 4) and ("Properties will be evaluated at" in line.strip()):
            self.reference = numpy.array([float(x.strip(',')) for x in line.split()[-4:-1]])
            assert line.split()[-1] == "Bohr"
            self.reference = utils.convertor(self.reference, 'bohr', 'Angstrom')

        # The properties section prints the molecular dipole moment:
        #
        #  ==> Properties <==
        #
        #
        #Properties computed using the SCF density density matrix
        #  Nuclear Dipole Moment: (a.u.)
        #     X:     0.0000      Y:     0.0000      Z:     0.0000
        #
        #  Electronic Dipole Moment: (a.u.)
        #     X:     0.0000      Y:     0.0000      Z:     0.0000
        #
        #  Dipole Moment: (a.u.)
# X: 0.0000 Y: 0.0000 Z: 0.0000 Total: 0.0000 # if (self.section == "Properties") and line.strip() == "Dipole Moment: (a.u.)": line = next(inputfile) dipole = numpy.array([float(line.split()[1]), float(line.split()[3]), float(line.split()[5])]) dipole = utils.convertor(dipole, "ebohr", "Debye") if not hasattr(self, 'moments'): self.moments = [self.reference, dipole] else: try: assert numpy.all(self.moments[1] == dipole) except AssertionError: self.logger.warning('Overwriting previous multipole moments with new values') self.logger.warning('This could be from post-HF properties or geometry optimization') self.moments = [self.reference, dipole] # Higher multipole moments are printed separately, on demand, in lexicographical order. # # Multipole Moments: # # ------------------------------------------------------------------------------------ # Multipole Electric (a.u.) Nuclear (a.u.) Total (a.u.) # ------------------------------------------------------------------------------------ # # L = 1. Multiply by 2.5417462300 to convert to Debye # Dipole X : 0.0000000 0.0000000 0.0000000 # Dipole Y : 0.0000000 0.0000000 0.0000000 # Dipole Z : 0.0000000 0.0000000 0.0000000 # # L = 2. Multiply by 1.3450341749 to convert to Debye.ang # Quadrupole XX : -1535.8888701 1496.8839996 -39.0048704 # Quadrupole XY : -11.5262958 11.4580038 -0.0682920 # ... # if line.strip() == "Multipole Moments:": self.skip_lines(inputfile, ['b', 'd', 'header', 'd', 'b']) # The reference used here should have been printed somewhere # before the properties and parsed above. 
moments = [self.reference] line = next(inputfile) while "----------" not in line.strip(): rank = int(line.split()[2].strip('.')) multipole = [] line = next(inputfile) while line.strip(): value = float(line.split()[-1]) fromunits = "ebohr" + (rank>1)*("%i" % rank) tounits = "Debye" + (rank>1)*".ang" + (rank>2)*("%i" % (rank-1)) value = utils.convertor(value, fromunits, tounits) multipole.append(value) line = next(inputfile) multipole = numpy.array(multipole) moments.append(multipole) line = next(inputfile) if not hasattr(self, 'moments'): self.moments = moments else: for im,m in enumerate(moments): if len(self.moments) <= im: self.moments.append(m) else: assert numpy.all(self.moments[im] == m) # We can also get some higher moments in Psi3, although here the dipole is not printed # separately and the order is not lexicographical. However, the numbers seem # kind of strange -- the quadrupole seems to be traceless, although I'm not sure # whether the standard transformation has been used. So, until we know what kind # of moment these are and how to make them raw again, we will only parse the dipole. # # -------------------------------------------------------------- # *** Electric multipole moments *** # -------------------------------------------------------------- # # CAUTION : The system has non-vanishing dipole moment, therefore # quadrupole and higher moments depend on the reference point. # # -Coordinates of the reference point (a.u.) : # x y z # -------------------- -------------------- -------------------- # 0.0000000000 0.0000000000 0.0000000000 # # -Electric dipole moment (expectation values) : # # mu(X) = -0.00000 D = -1.26132433e-43 C*m = -0.00000000 a.u. # mu(Y) = 0.00000 D = 3.97987832e-44 C*m = 0.00000000 a.u. # mu(Z) = 0.00000 D = 0.00000000e+00 C*m = 0.00000000 a.u. # |mu| = 0.00000 D = 1.32262368e-43 C*m = 0.00000000 a.u. # # -Components of electric quadrupole moment (expectation values) (a.u.) 
:
        #
        #    Q(XX) =  10.62340220    Q(YY) =   1.11816843    Q(ZZ) = -11.74157063
        #    Q(XY) =   3.64633112    Q(XZ) =   0.00000000    Q(YZ) =   0.00000000
        #
        if (self.version == 3) and line.strip() == "*** Electric multipole moments ***":

            self.skip_lines(inputfile, ['d', 'b', 'caution1', 'caution2', 'b'])

            coordinates = next(inputfile)
            assert coordinates.split()[-2] == "(a.u.)"
            self.skip_lines(inputfile, ['xyz', 'd'])
            line = next(inputfile)
            self.reference = numpy.array([float(x) for x in line.split()])
            self.reference = utils.convertor(self.reference, 'bohr', 'Angstrom')

            self.skip_line(inputfile, "blank")
            line = next(inputfile)
            assert "Electric dipole moment" in line
            self.skip_line(inputfile, "blank")

            # Make sure to use the column that has the value in Debyes.
            dipole = []
            for i in range(3):
                line = next(inputfile)
                dipole.append(float(line.split()[2]))

            if not hasattr(self, 'moments'):
                self.moments = [self.reference, dipole]
            else:
                assert self.moments[1] == dipole


if __name__ == "__main__":
    import doctest, psiparser
    doctest.testmod(psiparser, verbose=False)

# cclib-1.3.1/src/cclib/parser/utils.py
# -*- coding: utf-8 -*-
#
# This file is part of cclib (http://cclib.github.io), a library for parsing
# and interpreting the results of computational chemistry packages.
#
# Copyright (C) 2006-2014, the cclib development team
#
# The library is free software, distributed under the terms of
# the GNU Lesser General Public version 2.1 or later. You should have
# received a copy of the license along with cclib. You can also access
# the full license online at http://www.gnu.org/copyleft/lgpl.html.

"""Utilities often used by cclib parsers"""


def convertor(value, fromunits, tounits):
    """Convert from one set of units to another.
Sources: NIST 2010 CODATA (http://physics.nist.gov/cuu/Constants/index.html) Documentation of GAMESS-US or other programs as noted >>> print "%.1f" % convertor(8, "eV", "cm-1") 64524.8 """ _convertor = { "Angstrom_to_bohr": lambda x: x * 1.8897261245, "bohr_to_Angstrom": lambda x: x * 0.5291772109, "cm-1_to_eV": lambda x: x / 8065.54429, "cm-1_to_hartree": lambda x: x / 219474.6313708, "cm-1_to_kcal": lambda x: x / 349.7550112, "cm-1_to_kJmol-1": lambda x: x / 83.5934722814, "cm-1_to_nm": lambda x: 1e7 / x, "eV_to_cm-1": lambda x: x * 8065.54429, "eV_to_hartree": lambda x: x / 27.21138505, "eV_to_kcal": lambda x: x * 23.060548867, "eV_to_kJmol-1": lambda x: x * 96.4853364596, "hartree_to_cm-1": lambda x: x * 219474.6313708, "hartree_to_eV": lambda x: x * 27.21138505, "hartree_to_kcal": lambda x: x * 627.50947414, "hartree_to_kJmol-1":lambda x: x * 2625.4996398, "kcal_to_cm-1": lambda x: x * 349.7550112, "kcal_to_eV": lambda x: x / 23.060548867, "kcal_to_hartree": lambda x: x / 627.50947414, "kcal_to_kJmol-1": lambda x: x * 4.184, "kJmol-1_to_cm-1": lambda x: x * 83.5934722814, "kJmol-1_to_eV": lambda x: x / 96.4853364596, "kJmol-1_to_hartree": lambda x: x / 2625.49963978, "kJmol-1_to_kcal": lambda x: x / 4.184, "nm_to_cm-1": lambda x: 1e7 / x, # Taken from GAMESS docs, "Further information", # "Molecular Properties and Conversion Factors" "Debye^2/amu-Angstrom^2_to_km/mol": lambda x: x * 42.255, # Conversion for charges and multipole moments. 
"e_to_coulomb": lambda x: x * 1.602176565 * 1e-19, "e_to_statcoulomb": lambda x: x * 4.80320425 * 1e-10, "coulomb_to_e": lambda x: x * 0.6241509343 * 1e19, "statcoulomb_to_e": lambda x: x * 0.2081943527 * 1e10, "ebohr_to_Debye": lambda x: x * 2.5417462300, "ebohr2_to_Buckingham": lambda x: x * 1.3450341749, "ebohr2_to_Debye.ang": lambda x: x * 1.3450341749, "ebohr3_to_Debye.ang2": lambda x: x * 0.7117614302, "ebohr4_to_Debye.ang3": lambda x: x * 0.3766479268, "ebohr5_to_Debye.ang4": lambda x: x * 0.1993134985, } return _convertor["%s_to_%s" % (fromunits, tounits)] (value) class PeriodicTable(object): """Allows conversion between element name and atomic no. >>> t = PeriodicTable() >>> t.element[6] 'C' >>> t.number['C'] 6 >>> t.element[44] 'Ru' >>> t.number['Au'] 79 """ def __init__(self): self.element = [None, 'H', 'He', 'Li', 'Be', 'B', 'C', 'N', 'O', 'F', 'Ne', 'Na', 'Mg', 'Al', 'Si', 'P', 'S', 'Cl', 'Ar', 'K', 'Ca', 'Sc', 'Ti', 'V', 'Cr', 'Mn', 'Fe', 'Co', 'Ni', 'Cu', 'Zn', 'Ga', 'Ge', 'As', 'Se', 'Br', 'Kr', 'Rb', 'Sr', 'Y', 'Zr', 'Nb', 'Mo', 'Tc', 'Ru', 'Rh', 'Pd', 'Ag', 'Cd', 'In', 'Sn', 'Sb', 'Te', 'I', 'Xe', 'Cs', 'Ba', 'La', 'Ce', 'Pr', 'Nd', 'Pm', 'Sm', 'Eu', 'Gd', 'Tb', 'Dy', 'Ho', 'Er', 'Tm', 'Yb', 'Lu', 'Hf', 'Ta', 'W', 'Re', 'Os', 'Ir', 'Pt', 'Au', 'Hg', 'Tl', 'Pb', 'Bi', 'Po', 'At', 'Rn', 'Fr', 'Ra', 'Ac', 'Th', 'Pa', 'U', 'Np', 'Pu', 'Am', 'Cm', 'Bk', 'Cf', 'Es', 'Fm', 'Md', 'No', 'Lr', 'Rf', 'Db', 'Sg', 'Bh', 'Hs', 'Mt', 'Ds', 'Rg', 'Uub'] self.number = {} for i in range(1, len(self.element)): self.number[self.element[i]] = i if __name__ == "__main__": import doctest, utils doctest.testmod(utils, verbose=False) cclib-1.3.1/src/cclib/parser/gamessparser.py0000644000175100016050000015654312467423323020702 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. 
#
# Copyright (C) 2006-2014, the cclib development team
#
# The library is free software, distributed under the terms of
# the GNU Lesser General Public version 2.1 or later. You should have
# received a copy of the license along with cclib. You can also access
# the full license online at http://www.gnu.org/copyleft/lgpl.html.

"""Parser for GAMESS(US) output files"""


from __future__ import print_function
import re

import numpy

from . import logfileparser
from . import utils


class GAMESS(logfileparser.Logfile):
    """A GAMESS/Firefly log file."""

    # Used to index self.scftargets[].
    SCFRMS, SCFMAX, SCFENERGY = list(range(3))

    def __init__(self, *args, **kwargs):

        # Call the __init__ method of the superclass
        super(GAMESS, self).__init__(logname="GAMESS", *args, **kwargs)

    def __str__(self):
        """Return a string representation of the object."""
        return "GAMESS log file %s" % (self.filename)

    def __repr__(self):
        """Return a representation of the object."""
        return 'GAMESS("%s")' % (self.filename)

    def normalisesym(self, label):
        """Normalise the symmetries used by GAMESS.

        To normalise, two rules need to be applied:
        (1) Occurrences of U/G in the 2/3 position of the label
            must be lower-cased
        (2) Two single quotation marks must be replaced by a double

        >>> t = GAMESS("dummyfile").normalisesym
        >>> labels = ['A', 'A1', 'A1G', "A'", "A''", "AG"]
        >>> answers = list(map(t, labels))
        >>> print(answers)
        ['A', 'A1', 'A1g', "A'", 'A"', 'Ag']
        """

        if label[1:] == "''":
            end = '"'
        else:
            end = label[1:].replace("U", "u").replace("G", "g")
        return label[0] + end

    def before_parsing(self):

        self.firststdorient = True  # Used to decide whether to wipe the atomcoords clean
        self.cihamtyp = "none"  # Type of CI Hamiltonian: saps or dets.
        self.scftype = "none"  # Type of SCF calculation: BLYP, RHF, ROHF, etc.

    def extract(self, inputfile, line):
        """Extract information from the file object inputfile."""

        if line[1:12] == "INPUT CARD>":
            return

        # We are looking for this line:
        #           PARAMETERS CONTROLLING GEOMETRY SEARCH ARE
        #           ...
# OPTTOL = 1.000E-04 RMIN = 1.500E-03 if line[10:18] == "OPTTOL =": if not hasattr(self, "geotargets"): opttol = float(line.split()[2]) self.geotargets = numpy.array([opttol, 3. / opttol], "d") # Has to deal with such lines as: # FINAL R-B3LYP ENERGY IS -382.0507446475 AFTER 10 ITERATIONS # FINAL ENERGY IS -379.7594673378 AFTER 9 ITERATIONS # ...so take the number after the "IS" if line.find("FINAL") == 1: if not hasattr(self, "scfenergies"): self.scfenergies = [] temp = line.split() self.scfenergies.append(utils.convertor(float(temp[temp.index("IS") + 1]), "hartree", "eV")) # For total energies after Moller-Plesset corrections, the output looks something like this: # # RESULTS OF MOLLER-PLESSET 2ND ORDER CORRECTION ARE # E(0)= -285.7568061536 # E(1)= 0.0 # E(2)= -0.9679419329 # E(MP2)= -286.7247480864 # where E(MP2) = E(0) + E(2) # # With GAMESS-US 12 Jan 2009 (R3), the preceding text is different: # DIRECT 4-INDEX TRANSFORMATION # SCHWARZ INEQUALITY TEST SKIPPED 0 INTEGRAL BLOCKS # E(SCF)= -76.0088477471 # E(2)= -0.1403745370 # E(MP2)= -76.1492222841 # if line.find("RESULTS OF MOLLER-PLESSET") >= 0 or line[6:37] == "SCHWARZ INEQUALITY TEST SKIPPED": if not hasattr(self, "mpenergies"): self.mpenergies = [] # Each iteration has a new print-out self.mpenergies.append([]) # GAMESS-US presently supports only second order corrections (MP2) # PC GAMESS also has higher levels (3rd and 4th), with different output # Only the highest level MP4 energy is gathered (SDQ or SDTQ) while re.search("DONE WITH MP(\d) ENERGY", line) is None: line = next(inputfile) if len(line.split()) > 0: # Only up to MP2 correction if line.split()[0] == "E(MP2)=": mp2energy = float(line.split()[1]) self.mpenergies[-1].append(utils.convertor(mp2energy, "hartree", "eV")) # MP2 before higher order calculations if line.split()[0] == "E(MP2)": mp2energy = float(line.split()[2]) self.mpenergies[-1].append(utils.convertor(mp2energy, "hartree", "eV")) if line.split()[0] == "E(MP3)": mp3energy = 
float(line.split()[2]) self.mpenergies[-1].append(utils.convertor(mp3energy, "hartree", "eV")) if line.split()[0] in ["E(MP4-SDQ)", "E(MP4-SDTQ)"]: mp4energy = float(line.split()[2]) self.mpenergies[-1].append(utils.convertor(mp4energy, "hartree", "eV")) # Total energies after Coupled Cluster calculations # Only the highest Coupled Cluster level result is gathered if line[12:23] == "CCD ENERGY:": if not hasattr(self, "ccenergies"): self.ccenergies = [] ccenergy = float(line.split()[2]) self.ccenergies.append(utils.convertor(ccenergy, "hartree", "eV")) if line.find("CCSD") >= 0 and line.split()[0:2] == ["CCSD", "ENERGY:"]: if not hasattr(self, "ccenergies"): self.ccenergies = [] ccenergy = float(line.split()[2]) line = next(inputfile) if line[8:23] == "CCSD[T] ENERGY:": ccenergy = float(line.split()[2]) line = next(inputfile) if line[8:23] == "CCSD(T) ENERGY:": ccenergy = float(line.split()[2]) self.ccenergies.append(utils.convertor(ccenergy, "hartree", "eV")) # Also collect MP2 energies, which are always calculated before CC if line [8:23] == "MBPT(2) ENERGY:": if not hasattr(self, "mpenergies"): self.mpenergies = [] self.mpenergies.append([]) mp2energy = float(line.split()[2]) self.mpenergies[-1].append(utils.convertor(mp2energy, "hartree", "eV")) # Extract charge and multiplicity if line[1:19] == "CHARGE OF MOLECULE": charge = int(line.split()[-1]) self.set_attribute('charge', charge) line = next(inputfile) mult = int(line.split()[-1]) self.set_attribute('mult', mult) # etenergies (originally used only for CIS runs, but now also TD-DFT) if "EXCITATION ENERGIES" in line and line.find("DONE WITH") < 0: if not hasattr(self, "etenergies"): self.etenergies = [] get_etosc = False header = next(inputfile).rstrip() if header.endswith("OSC. 
STR."): # water_cis_dets.out does not have the oscillator strength # in this table...it is extracted from a different section below get_etosc = True self.etoscs = [] self.skip_line(inputfile, 'dashes') line = next(inputfile) broken = line.split() while len(broken) > 0: # Take hartree value with more numbers, and convert. # Note that the values listed after this are also less exact! etenergy = float(broken[1]) self.etenergies.append(utils.convertor(etenergy, "hartree", "cm-1")) if get_etosc: etosc = float(broken[-1]) self.etoscs.append(etosc) broken = next(inputfile).split() # Detect the CI hamiltonian type, if applicable. # Should always be detected if CIS is done. if line[8:64] == "RESULTS FROM SPIN-ADAPTED ANTISYMMETRIZED PRODUCT (SAPS)": self.cihamtyp = "saps" if line[8:64] == "RESULTS FROM DETERMINANT BASED ATOMIC ORBITAL CI-SINGLES": self.cihamtyp = "dets" # etsecs (used only for CIS runs for now) if line[1:14] == "EXCITED STATE": if not hasattr(self, 'etsecs'): self.etsecs = [] if not hasattr(self, 'etsyms'): self.etsyms = [] statenumber = int(line.split()[2]) spin = int(float(line.split()[7])) if spin == 0: sym = "Singlet" if spin == 1: sym = "Triplet" sym += '-' + line.split()[-1] self.etsyms.append(sym) # skip 5 lines for i in range(5): line = next(inputfile) line = next(inputfile) CIScontribs = [] while line.strip()[0] != "-": MOtype = 0 # alpha/beta are specified for hamtyp=dets if self.cihamtyp == "dets": if line.split()[0] == "BETA": MOtype = 1 fromMO = int(line.split()[-3])-1 toMO = int(line.split()[-2])-1 coeff = float(line.split()[-1]) # With the SAPS hamiltonian, the coefficients are multiplied # by sqrt(2) so that they normalize to 1. # With DETS, both alpha and beta excitations are printed. 
                # if self.cihamtyp == "saps":
                #    coeff /= numpy.sqrt(2.0)
                CIScontribs.append([(fromMO,MOtype),(toMO,MOtype),coeff])
                line = next(inputfile)
            self.etsecs.append(CIScontribs)

        # etoscs (used only for CIS runs now)
        if line[1:50] == "TRANSITION FROM THE GROUND STATE TO EXCITED STATE":
            if not hasattr(self, "etoscs"):
                self.etoscs = []

            # This was suggested as a fix in issue #61, and it does allow
            # the parser to finish without crashing. However, it seems that
            # etoscs is shorter in this case than the other transition attributes,
            # so that should be somehow corrected and tested for.
            if "OPTICALLY" in line:
                pass
            else:
                statenumber = int(line.split()[-1])
                # Skip 7 lines and stop on the 8th, which holds the oscillator strength.
                for i in range(8):
                    line = next(inputfile)
                strength = float(line.split()[3])
                self.etoscs.append(strength)

        # TD-DFT for GAMESS-US.
        # The format for excitations has changed a bit between 2007 and 2012.
        # Original format parser was written for:
        #
        #          -------------------
        #          TRIPLET EXCITATIONS
        #          -------------------
        #
        # STATE #   1  ENERGY =    3.027228 EV
        # OSCILLATOR STRENGTH =    0.000000
        #         DRF    COEF       OCC      VIR
        #         ---    ----       ---      ---
        #          35 -1.105383     35  ->   36
        #          69 -0.389181     34  ->   37
        #         103 -0.405078     33  ->   38
        #         137  0.252485     32  ->   39
        #         168 -0.158406     28  ->   40
        #
        # STATE #   2  ENERGY =    4.227763 EV
        # ...
        #
        # Here is the corresponding 2012 version:
        #
        #          -------------------
        #          TRIPLET EXCITATIONS
        #          -------------------
        #
        # STATE #   1  ENERGY =    3.027297 EV
        # OSCILLATOR STRENGTH =    0.000000
        # LAMBDA DIAGNOSTIC   =    0.925 (RYDBERG/CHARGE TRANSFER CHARACTER)
        # SYMMETRY OF STATE   =    A
        #                 EXCITATION  DE-EXCITATION
        #     OCC     VIR  AMPLITUDE      AMPLITUDE
        #      I       A     X(I->A)        Y(A->I)
        #     ---     ---   --------       --------
        #     35      36   -0.929190      -0.176167
        #     34      37   -0.279823      -0.109414
        # ...
        #
        # We discern these two by the presence of the arrow in the old version.
        #
        # The "LET EXCITATIONS" pattern used below catches both
        # singlet and triplet excitations output.
if line[14:29] == "LET EXCITATIONS": self.etenergies = [] self.etoscs = [] self.etsecs = [] etsyms = [] self.skip_lines(inputfile, ['d', 'b']) # Loop while states are still being printed. line = next(inputfile) while line[1:6] == "STATE": self.updateprogress(inputfile, "Excited States") etenergy = utils.convertor(float(line.split()[-2]), "eV", "cm-1") etoscs = float(next(inputfile).split()[-1]) self.etenergies.append(etenergy) self.etoscs.append(etoscs) # Symmetry is not always present, especially in old versions. # Newer versions, on the other hand, can also provide a line # with lambda diagnostic and some extra headers. line = next(inputfile) if "LAMBDA DIAGNOSTIC" in line: line = next(inputfile) if "SYMMETRY" in line: etsyms.append(line.split()[-1]) line = next(inputfile) if "EXCITATION" in line and "DE-EXCITATION" in line: line = next(inputfile) if line.count("AMPLITUDE") == 2: line = next(inputfile) self.skip_line(inputfile, 'dashes') CIScontribs = [] line = next(inputfile) while line.strip(): cols = line.split() if "->" in line: i_occ_vir = [2, 4] i_coeff = 1 else: i_occ_vir = [0, 1] i_coeff = 2 fromMO, toMO = [int(cols[i]) - 1 for i in i_occ_vir] coeff = float(cols[i_coeff]) CIScontribs.append([(fromMO, 0), (toMO, 0), coeff]) line = next(inputfile) self.etsecs.append(CIScontribs) line = next(inputfile) # The symmetries are not always present. if etsyms: self.etsyms = etsyms # Maximum and RMS gradients. if "MAXIMUM GRADIENT" in line or "RMS GRADIENT" in line: parts = line.split() # Avoid parsing the following... ## YOU SHOULD RESTART "OPTIMIZE" RUNS WITH THE COORDINATES ## WHOSE ENERGY IS LOWEST. RESTART "SADPOINT" RUNS WITH THE ## COORDINATES WHOSE RMS GRADIENT IS SMALLEST. THESE ARE NOT ## ALWAYS THE LAST POINT COMPUTED! 
            if parts[0] not in ["MAXIMUM", "RMS", "(1)"]:
                return

            if not hasattr(self, "geovalues"):
                self.geovalues = []

            # Newer versions (around 2006) have both maximum and RMS on one line:
            #       MAXIMUM GRADIENT =  0.0531540    RMS GRADIENT = 0.0189223
            if len(parts) == 8:
                maximum = float(parts[3])
                rms = float(parts[7])

            # In older versions of GAMESS, this spanned two lines, like this:
            #       MAXIMUM GRADIENT =    0.057578167
            #           RMS GRADIENT =    0.027589766
            if len(parts) == 4:
                maximum = float(parts[3])
                line = next(inputfile)
                parts = line.split()
                rms = float(parts[3])

            # FMO also prints two final one- and two-body gradients (see exam37):
            #   (1) MAXIMUM GRADIENT =  0.0531540 RMS GRADIENT = 0.0189223
            if len(parts) == 9:
                maximum = float(parts[4])
                rms = float(parts[8])

            self.geovalues.append([maximum, rms])

        # This is the input orientation, which is the only data available for
        # SP calcs, but which should be overwritten by the standard orientation
        # values, which is the only information available for all geoopt cycles.
        if line[11:50] == "ATOMIC COORDINATES":

            if not hasattr(self, "atomcoords"):
                self.atomcoords = []

            line = next(inputfile)

            atomcoords = []
            atomnos = []
            line = next(inputfile)
            while line.strip():
                temp = line.strip().split()
                atomcoords.append([utils.convertor(float(x), "bohr", "Angstrom") for x in temp[2:5]])
                atomnos.append(int(round(float(temp[1]))))  # Don't use the atom name as this is arbitrary
                line = next(inputfile)

            self.set_attribute('atomnos', atomnos)
            self.atomcoords.append(atomcoords)

        if line[12:40] == "EQUILIBRIUM GEOMETRY LOCATED":
            # Prevent extraction of the final geometry twice
            if not hasattr(self, 'optdone'):
                self.optdone = []
            self.optdone.append(len(self.geovalues) - 1)

        # Make sure we always have optdone for geometry optimization, even if not converged.
        if "GEOMETRY SEARCH IS NOT CONVERGED" in line:
            if not hasattr(self, 'optdone'):
                self.optdone = []

        # This is the standard orientation, which is the only coordinate
        # information available for all geometry optimisation cycles.
# The input orientation will be overwritten if this is a geometry optimisation # We assume that a previous Input Orientation has been found and # used to extract the atomnos if line[1:29] == "COORDINATES OF ALL ATOMS ARE" and (not hasattr(self, "optdone") or self.optdone == []): self.updateprogress(inputfile, "Coordinates") if self.firststdorient: self.firststdorient = False # Wipes out the single input coordinate at the start of the file self.atomcoords = [] self.skip_lines(inputfile, ['line', '-']) atomcoords = [] line = next(inputfile) for i in range(self.natom): temp = line.strip().split() atomcoords.append(list(map(float, temp[2:5]))) line = next(inputfile) self.atomcoords.append(atomcoords) # Section with SCF information. # # The space at the start of the search string is to differentiate from MCSCF. # Everything before the search string is stored as the type of SCF. # SCF types may include: BLYP, RHF, ROHF, UHF, etc. # # For example, in exam17 the section looks like this (note that this is GVB): # ------------------------ # ROHF-GVB SCF CALCULATION # ------------------------ # GVB STEP WILL USE 119875 WORDS OF MEMORY. # # MAXIT= 30 NPUNCH= 2 SQCDF TOL=1.0000E-05 # NUCLEAR ENERGY= 6.1597411978 # EXTRAP=T DAMP=F SHIFT=F RSTRCT=F DIIS=F SOSCF=F # # ITER EX TOTAL ENERGY E CHANGE SQCDF DIIS ERROR # 0 0 -38.298939963 -38.298939963 0.131784454 0.000000000 # 1 1 -38.332044339 -0.033104376 0.026019716 0.000000000 # ... and will be terminated by a blank line. if line.rstrip()[-16:] == " SCF CALCULATION": # Remember the type of SCF. self.scftype = line.strip()[:-16] self.skip_line(inputfile, 'dashes') while line [:5] != " ITER": self.updateprogress(inputfile, "Attributes") # GVB uses SQCDF for checking convergence (for example in exam17). if "GVB" in self.scftype and "SQCDF TOL=" in line: scftarget = float(line.split("=")[-1]) # Normally, however, the density is used as the convergence criterium. 
                # Deal with various versions:
                #   (GAMESS VERSION = 12 DEC 2003)
                #     DENSITY MATRIX CONV=  2.00E-05  DFT GRID SWITCH THRESHOLD=  3.00E-04
                #   (GAMESS VERSION = 22 FEB 2006)
                #     DENSITY MATRIX CONV=  1.00E-05
                #   (PC GAMESS version 6.2, Not DFT?)
                #     DENSITY CONV=  1.00E-05
                elif "DENSITY CONV" in line or "DENSITY MATRIX CONV" in line:
                    scftarget = float(line.split()[-1])

                line = next(inputfile)

            if not hasattr(self, "scftargets"):
                self.scftargets = []

            self.scftargets.append([scftarget])

            if not hasattr(self,"scfvalues"):
                self.scfvalues = []

            # Normally the iterations print in 6 columns.
            # For ROHF, however, it is 5 columns, thus this extra parameter.
            if "ROHF" in self.scftype:
                self.scf_valcol = 4
            else:
                self.scf_valcol = 5

            line = next(inputfile)

            # SCF iterations are terminated by a blank line.
            # The first four characters usually contain the step number.
            # However, lines can also contain messages, including:
            #   * * *   INITIATING DIIS PROCEDURE   * * *
            #   CONVERGED TO SWOFF, SO DFT CALCULATION IS NOW SWITCHED ON
            #   DFT CODE IS SWITCHING BACK TO THE FINER GRID
            values = []
            while line.strip():
                try:
                    temp = int(line[0:4])
                except ValueError:
                    pass
                else:
                    values.append([float(line.split()[self.scf_valcol])])
                line = next(inputfile)
            self.scfvalues.append(values)

        # Sometimes, only the first SCF cycle has the banner parsed for above,
        # so we must identify them from the header before the SCF iterations.
        # The example we have for this is the GeoOpt unittest for Firefly8.
        if line[1:8] == "ITER EX":

            # In this case, the convergence targets are not printed, so we assume
            # they do not change.
            self.scftargets.append(self.scftargets[-1])

            values = []
            line = next(inputfile)
            while line.strip():
                try:
                    temp = int(line[0:4])
                except ValueError:
                    pass
                else:
                    values.append([float(line.split()[self.scf_valcol])])
                line = next(inputfile)
            self.scfvalues.append(values)

        # Extract normal coordinate analysis, including vibrational frequencies (vibfreqs),
        # IR intensities (vibirs) and displacements (vibdisps).
# # This section typically looks like the following in GAMESS-US: # # MODES 1 TO 6 ARE TAKEN AS ROTATIONS AND TRANSLATIONS. # # FREQUENCIES IN CM**-1, IR INTENSITIES IN DEBYE**2/AMU-ANGSTROM**2, # REDUCED MASSES IN AMU. # # 1 2 3 4 5 # FREQUENCY: 52.49 41.45 17.61 9.23 10.61 # REDUCED MASS: 3.92418 3.77048 5.43419 6.44636 5.50693 # IR INTENSITY: 0.00013 0.00001 0.00004 0.00000 0.00003 # # ...or in the case of a numerical Hessian job... # # MODES 1 TO 5 ARE TAKEN AS ROTATIONS AND TRANSLATIONS. # # FREQUENCIES IN CM**-1, IR INTENSITIES IN DEBYE**2/AMU-ANGSTROM**2, # REDUCED MASSES IN AMU. # # 1 2 3 4 5 # FREQUENCY: 0.05 0.03 0.03 30.89 30.94 # REDUCED MASS: 8.50125 8.50137 8.50136 1.06709 1.06709 # # ...whereas PC-GAMESS has... # # MODES 1 TO 6 ARE TAKEN AS ROTATIONS AND TRANSLATIONS. # # FREQUENCIES IN CM**-1, IR INTENSITIES IN DEBYE**2/AMU-ANGSTROM**2 # # 1 2 3 4 5 # FREQUENCY: 5.89 1.46 0.01 0.01 0.01 # IR INTENSITY: 0.00000 0.00000 0.00000 0.00000 0.00000 # # If Raman is present we have (for PC-GAMESS)... # # MODES 1 TO 6 ARE TAKEN AS ROTATIONS AND TRANSLATIONS. # # FREQUENCIES IN CM**-1, IR INTENSITIES IN DEBYE**2/AMU-ANGSTROM**2 # RAMAN INTENSITIES IN ANGSTROM**4/AMU, DEPOLARIZATIONS ARE DIMENSIONLESS # # 1 2 3 4 5 # FREQUENCY: 5.89 1.46 0.04 0.03 0.01 # IR INTENSITY: 0.00000 0.00000 0.00000 0.00000 0.00000 # RAMAN INTENSITY: 12.675 1.828 0.000 0.000 0.000 # DEPOLARIZATION: 0.750 0.750 0.124 0.009 0.750 # # If GAMESS-US or PC-GAMESS has not reached the stationary point we have # and additional warning, repeated twice, like so (see n_water.log for an example): # # ******************************************************* # * THIS IS NOT A STATIONARY POINT ON THE MOLECULAR PES * # * THE VIBRATIONAL ANALYSIS IS NOT VALID !!! 
        #                                                         *
        #     *******************************************************
        #
        # There can also be additional warnings about the selection of modes, for example:
        #
        #     * * * WARNING, MODE 6 HAS BEEN CHOSEN AS A VIBRATION
        #     WHILE MODE12 IS ASSUMED TO BE A TRANSLATION/ROTATION.
        #     PLEASE VERIFY THE PROGRAM'S DECISION MANUALLY!
        #
        if "NORMAL COORDINATE ANALYSIS IN THE HARMONIC APPROXIMATION" in line:

            self.vibfreqs = []
            self.vibirs = []
            self.vibdisps = []

            # Need to get to the modes line, which is often preceded by
            # a list of atomic weights and some possible warnings.
            # Pass the warnings to the logger if they are there.
            while "MODES" not in line:
                self.updateprogress(inputfile, "Frequency Information")

                line = next(inputfile)
                if "THIS IS NOT A STATIONARY POINT" in line:
                    msg = "\n This is not a stationary point on the molecular PES"
                    msg += "\n The vibrational analysis is not valid!!!"
                    self.logger.warning(msg)
                if "* * * WARNING, MODE" in line:
                    line1 = line.strip()
                    line2 = next(inputfile).strip()
                    line3 = next(inputfile).strip()
                    self.logger.warning("\n " + "\n ".join((line1, line2, line3)))

            # In at least one case (regression zolm_dft3a.log) for an older version of GAMESS-US,
            # the header concerning the range of modes is formatted wrong and can look like so:
            #     MODES 9 TO14 ARE TAKEN AS ROTATIONS AND TRANSLATIONS.
            # ... although it's unclear whether this happens for all two-digit values.
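            # A quick illustration of the two header layouts handled below (a sketch only,
            # assuming the header is split on whitespace as the following code does):
            #     "MODES 1 TO 6 ARE TAKEN ...".split() -> ['MODES', '1', 'TO', '6', 'ARE', ...]
            #         the third token is 'TO' (length 2), so endrot = int('6') = 6
            #     "MODES 9 TO14 ARE TAKEN ...".split() -> ['MODES', '9', 'TO14', 'ARE', ...]
            #         the third token is 'TO14', so endrot = int('TO14'[2:]) = 14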
startrot = int(line.split()[1]) if len(line.split()[2]) == 2: endrot = int(line.split()[3]) else: endrot = int(line.split()[2][2:]) self.skip_line(inputfile, 'blank') # Continue down to the first frequencies line = next(inputfile) while not line.strip() or not line.startswith(" 1"): line = next(inputfile) while not "SAYVETZ" in line: self.updateprogress(inputfile, "Frequency Information") # Note: there may be imaginary frequencies like this (which we make negative): # FREQUENCY: 825.18 I 111.53 12.62 10.70 0.89 # # A note for debuggers: some of these frequencies will be removed later, # assumed to be translations or rotations (see startrot/endrot above). for col in next(inputfile).split()[1:]: if col == "I": self.vibfreqs[-1] *= -1 else: self.vibfreqs.append(float(col)) line = next(inputfile) # Skip the symmetry (appears in newer versions), fixes bug #3476063. if line.find("SYMMETRY") >= 0: line = next(inputfile) # Skip the reduced mass (not always present). if line.find("REDUCED") >= 0: line = next(inputfile) # Not present in numerical Hessian calculations. if line.find("IR INTENSITY") >= 0: irIntensity = map(float, line.strip().split()[2:]) self.vibirs.extend([utils.convertor(x, "Debye^2/amu-Angstrom^2", "km/mol") for x in irIntensity]) line = next(inputfile) # Read in Raman vibrational intensities if present. if line.find("RAMAN") >= 0: if not hasattr(self,"vibramans"): self.vibramans = [] ramanIntensity = line.strip().split() self.vibramans.extend(list(map(float, ramanIntensity[2:]))) depolar = next(inputfile) line = next(inputfile) # This line seems always to be blank. assert line.strip() == '' # Extract the Cartesian displacement vectors. 
p = [ [], [], [], [], [] ] for j in range(self.natom): q = [ [], [], [], [], [] ] for coord in "xyz": line = next(inputfile)[21:] cols = list(map(float, line.split())) for i, val in enumerate(cols): q[i].append(val) for k in range(len(cols)): p[k].append(q[k]) self.vibdisps.extend(p[:len(cols)]) # Skip the Sayvetz stuff at the end. for j in range(10): line = next(inputfile) self.skip_line(inputfile, 'blank') line = next(inputfile) # Exclude rotations and translations. self.vibfreqs = numpy.array(self.vibfreqs[:startrot-1]+self.vibfreqs[endrot:], "d") self.vibirs = numpy.array(self.vibirs[:startrot-1]+self.vibirs[endrot:], "d") self.vibdisps = numpy.array(self.vibdisps[:startrot-1]+self.vibdisps[endrot:], "d") if hasattr(self, "vibramans"): self.vibramans = numpy.array(self.vibramans[:startrot-1]+self.vibramans[endrot:], "d") if line[5:21] == "ATOMIC BASIS SET": self.gbasis = [] line = next(inputfile) while line.find("SHELL")<0: line = next(inputfile) self.skip_lines(inputfile, ['blank', 'atomname']) # shellcounter stores the shell no of the last shell # in the previous set of primitives shellcounter = 1 while line.find("TOTAL NUMBER")<0: self.skip_line(inputfile, 'blank') line = next(inputfile) shellno = int(line.split()[0]) shellgap = shellno - shellcounter gbasis = [] # Stores basis sets on one atom shellsize = 0 while len(line.split())!=1 and line.find("TOTAL NUMBER")<0: shellsize += 1 coeff = {} # coefficients and symmetries for a block of rows while line.strip(): temp = line.strip().split() sym = temp[1] assert sym in ['S', 'P', 'D', 'F', 'G', 'L'] if sym == "L": # L refers to SP if len(temp)==6: # GAMESS US coeff.setdefault("S", []).append( (float(temp[3]), float(temp[4])) ) coeff.setdefault("P", []).append( (float(temp[3]), float(temp[5])) ) else: # PC GAMESS assert temp[6][-1] == temp[9][-1] == ')' coeff.setdefault("S", []).append( (float(temp[3]), float(temp[6][:-1])) ) coeff.setdefault("P", []).append( (float(temp[3]), float(temp[9][:-1])) ) else: if 
len(temp)==5: # GAMESS US coeff.setdefault(sym, []).append( (float(temp[3]), float(temp[4])) ) else: # PC GAMESS assert temp[6][-1] == ')' coeff.setdefault(sym, []).append( (float(temp[3]), float(temp[6][:-1])) ) line = next(inputfile) # either a blank or a continuation of the block if sym == "L": gbasis.append( ('S', coeff['S'])) gbasis.append( ('P', coeff['P'])) else: gbasis.append( (sym, coeff[sym])) line = next(inputfile) # either the start of the next block or the start of a new atom or # the end of the basis function section numtoadd = 1 + (shellgap // shellsize) shellcounter = shellno + shellsize for x in range(numtoadd): self.gbasis.append(gbasis) # The eigenvectors, which also include MO energies and symmetries, follow # the *final* report of evalues and the last list of symmetries in the log file: # # ------------ # EIGENVECTORS # ------------ # # 1 2 3 4 5 # -10.0162 -10.0161 -10.0039 -10.0039 -10.0029 # BU AG BU AG AG # 1 C 1 S 0.699293 0.699290 -0.027566 0.027799 0.002412 # 2 C 1 S 0.031569 0.031361 0.004097 -0.004054 -0.000605 # 3 C 1 X 0.000908 0.000632 -0.004163 0.004132 0.000619 # 4 C 1 Y -0.000019 0.000033 0.000668 -0.000651 0.005256 # 5 C 1 Z 0.000000 0.000000 0.000000 0.000000 0.000000 # 6 C 2 S -0.699293 0.699290 0.027566 0.027799 0.002412 # 7 C 2 S -0.031569 0.031361 -0.004097 -0.004054 -0.000605 # 8 C 2 X 0.000908 -0.000632 -0.004163 -0.004132 -0.000619 # 9 C 2 Y -0.000019 -0.000033 0.000668 0.000651 -0.005256 # 10 C 2 Z 0.000000 0.000000 0.000000 0.000000 0.000000 # 11 C 3 S -0.018967 -0.019439 0.011799 -0.014884 -0.452328 # 12 C 3 S -0.007748 -0.006932 0.000680 -0.000695 -0.024917 # 13 C 3 X 0.002628 0.002997 0.000018 0.000061 -0.003608 # ... # # There are blanks lines between each block. # # Warning! There are subtle differences between GAMESS-US and PC-GAMES # in the formatting of the first four columns. 
        # In particular, for F orbitals,
        # PC GAMESS:
        #   19 C 1 YZ 0.000000 0.000000 0.000000 0.000000 0.000000
        #   20 C XXX 0.000000 0.000000 0.000000 0.000000 0.002249
        #   21 C YYY 0.000000 0.000000 -0.025555 0.000000 0.000000
        #   22 C ZZZ 0.000000 0.000000 0.000000 0.002249 0.000000
        #   23 C XXY 0.000000 0.000000 0.001343 0.000000 0.000000
        # GAMESS US
        #   55 C 1 XYZ 0.000000 0.000000 0.000000 0.000000 0.000000
        #   56 C 1XXXX -0.000014 -0.000067 0.000000 0.000000 0.000000
        #
        if line.find("EIGENVECTORS") == 10 or line.find("MOLECULAR ORBITALS") == 10:

            # This is the stuff that we can read from these blocks.
            self.moenergies = [[]]
            self.mosyms = [[]]
            if not hasattr(self, "nmo"):
                self.nmo = self.nbasis
            self.mocoeffs = [numpy.zeros((self.nmo, self.nbasis), "d")]

            readatombasis = False
            if not hasattr(self, "atombasis"):
                self.atombasis = []
                self.aonames = []
                for i in range(self.natom):
                    self.atombasis.append([])
                readatombasis = True

            self.skip_line(inputfile, 'dashes')

            for base in range(0, self.nmo, 5):
                self.updateprogress(inputfile, "Coefficients")

                line = next(inputfile)

                # This makes sure that this section does not end prematurely,
                # which happens in regression 2CO.ccsd.aug-cc-pVDZ.out.
                if line.strip() != "":
                    break

                numbers = next(inputfile)  # Eigenvector numbers.

                # Sometimes there are some blank lines here.
                while not line.strip():
                    line = next(inputfile)

                # Eigenvalues for these orbitals (in hartrees).
                try:
                    self.moenergies[0].extend([utils.convertor(float(x), "hartree", "eV") for x in line.split()])
                except:
                    self.logger.warning('MO section found but could not be parsed!')
                    break

                # Orbital symmetries.
                line = next(inputfile)
                if line.strip():
                    self.mosyms[0].extend(list(map(self.normalisesym, line.split())))

                # Now we have nbasis lines. We will use the same method as in normalise_aonames() before.
p = re.compile("(\d+)\s*([A-Z][A-Z]?)\s*(\d+)\s*([A-Z]+)") oldatom = '0' i_atom = 0 # counter to keep track of n_atoms > 99 flag_w = True # flag necessary to keep from adding 100's at wrong time for i in range(self.nbasis): line = next(inputfile) # If line is empty, break (ex. for FMO in exam37 which is a regression). if not line.strip(): break # Fill atombasis and aonames only first time around if readatombasis and base == 0: aonames = [] start = line[:17].strip() m = p.search(start) if m: g = m.groups() g2 = int(g[2]) # atom index in GAMESS file; changes to 0 after 99 # Check if we have moved to a hundred # if so, increment the counter and add it to the parsed value # There will be subsequent 0's as that atoms AO's are parsed # so wait until the next atom is parsed before resetting flag if g2 == 0 and flag_w: i_atom = i_atom + 100 flag_w = False # handle subsequent AO's if g2 != 0: flag_w = True # reset flag g2 = g2 + i_atom aoname = "%s%i_%s" % (g[1].capitalize(), g2, g[3]) oldatom = str(g2) atomno = g2-1 orbno = int(g[0])-1 else: # For F orbitals, as shown above g = [x.strip() for x in line.split()] aoname = "%s%s_%s" % (g[1].capitalize(), oldatom, g[2]) atomno = int(oldatom)-1 orbno = int(g[0])-1 self.atombasis[atomno].append(orbno) self.aonames.append(aoname) coeffs = line[15:] # Strip off the crud at the start. j = 0 while j*11+4 < len(coeffs): self.mocoeffs[0][base+j, i] = float(coeffs[j * 11:(j + 1) * 11]) j += 1 line = next(inputfile) # If it's a restricted calc and no more properties, we have: # # ...... END OF RHF/DFT CALCULATION ...... # # If there are more properties (such as the density matrix): # -------------- # # If it's an unrestricted calculation, however, we now get the beta orbitals: # # ----- BETA SET ----- # # ------------ # EIGENVECTORS # ------------ # # 1 2 3 4 5 # ... # line = next(inputfile) # This can come in between the alpha and beta orbitals (see #130). 
if line.strip() == "LZ VALUE ANALYSIS FOR THE MOS": while line.strip(): line = next(inputfile) line = next(inputfile) if line[2:22] == "----- BETA SET -----": self.mocoeffs.append(numpy.zeros((self.nmo, self.nbasis), "d")) self.moenergies.append([]) self.mosyms.append([]) for i in range(4): line = next(inputfile) for base in range(0, self.nmo, 5): self.updateprogress(inputfile, "Coefficients") blank = next(inputfile) line = next(inputfile) # Eigenvector no line = next(inputfile) self.moenergies[1].extend([utils.convertor(float(x), "hartree", "eV") for x in line.split()]) line = next(inputfile) self.mosyms[1].extend(list(map(self.normalisesym, line.split()))) for i in range(self.nbasis): line = next(inputfile) temp = line[15:] # Strip off the crud at the start j = 0 while j * 11 + 4 < len(temp): self.mocoeffs[1][base+j, i] = float(temp[j * 11:(j + 1) * 11]) j += 1 line = next(inputfile) self.moenergies = [numpy.array(x, "d") for x in self.moenergies] # Natural orbital coefficients and occupation numbers, presently supported only # for CIS calculations. Looks the same as eigenvectors, without symmetry labels. # # -------------------- # CIS NATURAL ORBITALS # -------------------- # # 1 2 3 4 5 # # 2.0158 2.0036 2.0000 2.0000 1.0000 # # 1 O 1 S 0.000000 -0.157316 0.999428 0.164938 0.000000 # 2 O 1 S 0.000000 0.754402 0.004472 -0.581970 0.000000 # ... # if line[10:30] == "CIS NATURAL ORBITALS": self.nocoeffs = numpy.zeros((self.nmo, self.nbasis), "d") self.nooccnos = [] self.skip_line(inputfile, 'dashes') for base in range(0, self.nmo, 5): self.skip_lines(inputfile, ['blank', 'numbers']) # The eigenvalues that go along with these natural orbitals are # their occupation numbers. Sometimes there are blank lines before them. line = next(inputfile) while not line.strip(): line = next(inputfile) eigenvalues = map(float, line.split()) self.nooccnos.extend(eigenvalues) # Orbital symemtry labels are normally here for MO coefficients. 
line = next(inputfile) # Now we have nbasis lines with the coefficients. for i in range(self.nbasis): line = next(inputfile) coeffs = line[15:] j = 0 while j*11+4 < len(coeffs): self.nocoeffs[base+j, i] = float(coeffs[j * 11:(j + 1) * 11]) j += 1 # We cannot trust this self.homos until we come to the phrase: # SYMMETRIES FOR INITAL GUESS ORBITALS FOLLOW # which either is followed by "ALPHA" or "BOTH" at which point we can say # for certain that it is an un/restricted calculations. # Note that MCSCF calcs also print this search string, so make sure # that self.homos does not exist yet. if line[1:28] == "NUMBER OF OCCUPIED ORBITALS" and not hasattr(self,'homos'): homos = [int(line.split()[-1])-1] line = next(inputfile) homos.append(int(line.split()[-1])-1) self.set_attribute('homos', homos) if line.find("SYMMETRIES FOR INITIAL GUESS ORBITALS FOLLOW") >= 0: # Not unrestricted, so lop off the second index. # In case the search string above was not used (ex. FMO in exam38), # we can try to use the next line which should also contain the # number of occupied orbitals. if line.find("BOTH SET(S)") >= 0: nextline = next(inputfile) if "ORBITALS ARE OCCUPIED" in nextline: homos = int(nextline.split()[0])-1 if hasattr(self,"homos"): try: assert self.homos[0] == homos except AssertionError: self.logger.warning("Number of occupied orbitals not consistent. This is normal for ECP and FMO jobs.") else: self.homos = [homos] self.homos = numpy.resize(self.homos, [1]) # Set the total number of atoms, only once. # Normally GAMESS print TOTAL NUMBER OF ATOMS, however in some cases # this is slightly different (ex. lower case for FMO in exam37). 
if not hasattr(self,"natom") and "NUMBER OF ATOMS" in line.upper(): natom = int(line.split()[-1]) self.set_attribute('natom', natom) # The first is from Julien's Example and the second is from Alexander's # I think it happens if you use a polar basis function instead of a cartesian one if line.find("NUMBER OF CARTESIAN GAUSSIAN BASIS") == 1 or line.find("TOTAL NUMBER OF BASIS FUNCTIONS") == 1: nbasis = int(line.strip().split()[-1]) self.set_attribute('nbasis', nbasis) elif line.find("TOTAL NUMBER OF CONTAMINANTS DROPPED") >= 0: nmos_dropped = int(line.split()[-1]) if hasattr(self, "nmo"): self.set_attribute('nmo', self.nmo - nmos_dropped) else: self.set_attribute('nmo', self.nbasis - nmos_dropped) # Note that this line is present if ISPHER=1, e.g. for C_bigbasis elif line.find("SPHERICAL HARMONICS KEPT IN THE VARIATION SPACE") >= 0: nmo = int(line.strip().split()[-1]) self.set_attribute('nmo', nmo) # Note that this line is not always present, so by default # NBsUse is set equal to NBasis (see below). 
        elif line.find("TOTAL NUMBER OF MOS IN VARIATION SPACE") == 1:
            nmo = int(line.split()[-1])
            self.set_attribute('nmo', nmo)

        elif line.find("OVERLAP MATRIX") == 0 or line.find("OVERLAP MATRIX") == 1:
            # The first is for PC-GAMESS, the second for GAMESS
            # Read 1-electron overlap matrix
            if not hasattr(self, "aooverlaps"):
                self.aooverlaps = numpy.zeros((self.nbasis, self.nbasis), "d")
            else:
                self.logger.info("Reading additional aooverlaps...")
            base = 0
            while base < self.nbasis:
                self.updateprogress(inputfile, "Overlap")

                self.skip_lines(inputfile, ['b', 'basis_fn_number', 'b'])

                for i in range(self.nbasis - base):  # Fewer lines each time
                    line = next(inputfile)
                    temp = line.split()
                    for j in range(4, len(temp)):
                        self.aooverlaps[base+j-4, i+base] = float(temp[j])
                        self.aooverlaps[i+base, base+j-4] = float(temp[j])
                base += 5

        # ECP Pseudopotential information
        if "ECP POTENTIALS" in line:
            if not hasattr(self, "coreelectrons"):
                self.coreelectrons = [0]*self.natom

            self.skip_lines(inputfile, ['d', 'b'])

            header = next(inputfile)
            while header.split()[0] == "PARAMETERS":

                name = header[17:25]
                atomnum = int(header[34:40])
                # The pseudopotential is given explicitly
                if header[40:50] == "WITH ZCORE":
                    zcore = int(header[50:55])
                    lmax = int(header[63:67])
                    self.coreelectrons[atomnum-1] = zcore
                # The pseudopotential is copied from another atom
                if header[40:55] == "ARE THE SAME AS":
                    atomcopy = int(header[60:])
                    self.coreelectrons[atomnum-1] = self.coreelectrons[atomcopy-1]

                line = next(inputfile)
                while line.split() != []:
                    line = next(inputfile)
                header = next(inputfile)

        # This was used before refactoring the parser, geotargets was set here after parsing.
        #if not hasattr(self, "geotargets"):
        #    opttol = 1e-4
        #    self.geotargets = numpy.array([opttol, 3. / opttol], "d")
        #if hasattr(self,"geovalues"): self.geovalues = numpy.array(self.geovalues, "d")

        # This is quite simple to parse, but some files seem to print certain lines twice,
        # repeating the populations without charges, but not in proper order.
# The unrestricted calculations are a bit tricky, since GAMESS-US prints populations # for both alpha and beta orbitals in the same format and with the same title, # but it still prints the charges only at the very end. if "TOTAL MULLIKEN AND LOWDIN ATOMIC POPULATIONS" in line: if not hasattr(self, "atomcharges"): self.atomcharges = {} header = next(inputfile) line = next(inputfile) # It seems that when population are printed twice (without charges), # there is a blank line along the way (after the first header), # so let's get a flag out of that circumstance. doubles_printed = line.strip() == "" if doubles_printed: title = next(inputfile) header = next(inputfile) line = next(inputfile) # Only go further if the header had five columns, which should # be the case when both populations and charges are printed. # This is pertinent for both double printing and unrestricted output. if not len(header.split()) == 5: return mulliken, lowdin = [], [] while line.strip(): if line.strip() and doubles_printed: line = next(inputfile) mulliken.append(float(line.split()[3])) lowdin.append(float(line.split()[5])) line = next(inputfile) self.atomcharges["mulliken"] = mulliken self.atomcharges["lowdin"] = lowdin # --------------------- # ELECTROSTATIC MOMENTS # --------------------- # # POINT 1 X Y Z (BOHR) CHARGE # -0.000000 0.000000 0.000000 -0.00 (A.U.) # DX DY DZ /D/ (DEBYE) # 0.000000 -0.000000 0.000000 0.000000 # if line.strip() == "ELECTROSTATIC MOMENTS": self.skip_lines(inputfile, ['d', 'b']) line = next(inputfile) # The old PC-GAMESS prints memory assignment information here. if "MEMORY ASSIGNMENT" in line: memory_assignment = next(inputfile) line = next(inputfile) # If something else ever comes up, we should get a signal from this assert. assert line.split()[0] == "POINT" # We can get the reference point from here, as well as # check here that the net charge of the molecule is correct. 
coords_and_charge = next(inputfile) assert coords_and_charge.split()[-1] == '(A.U.)' reference = numpy.array([float(x) for x in coords_and_charge.split()[:3]]) reference = utils.convertor(reference, 'bohr', 'Angstrom') charge = float(coords_and_charge.split()[-2]) self.set_attribute('charge', charge) dipoleheader = next(inputfile) assert dipoleheader.split()[:3] == ['DX', 'DY', 'DZ'] assert dipoleheader.split()[-1] == "(DEBYE)" dipoleline = next(inputfile) dipole = [float(d) for d in dipoleline.split()[:3]] # The dipole is always the first multipole moment to be printed, # so if it already exists, we will overwrite all moments since we want # to leave just the last printed value (could change in the future). if not hasattr(self, 'moments'): self.moments = [reference, dipole] else: try: assert self.moments[1] == dipole except AssertionError: self.logger.warning('Overwriting previous multipole moments with new values') self.logger.warning('This could be from post-HF properties or geometry optimization') self.moments = [reference, dipole] if __name__ == "__main__": import doctest, gamessparser, sys if len(sys.argv) == 1: doctest.testmod(gamessparser, verbose=False) if len(sys.argv) >= 2: parser = gamessparser.GAMESS(sys.argv[1]) data = parser.parse() if len(sys.argv) > 2: for i in range(len(sys.argv[2:])): if hasattr(data, sys.argv[2 + i]): print(getattr(data, sys.argv[2 + i])) cclib-1.3.1/src/cclib/parser/molproparser.py0000644000175100016050000011221312467423323020715 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2007-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. 
# You can also access
# the full license online at http://www.gnu.org/copyleft/lgpl.html.

"""Parser for Molpro output files"""


import itertools

import numpy

from . import logfileparser
from . import utils


def create_atomic_orbital_names(orbitals):
    """Generate all atomic orbital names that could be used by Molpro.

    The names are returned in a dictionary, organized by subshell (S, P, D and so on).
    """

    # We can write out the first two manually, since there are not that many.
    atomic_orbital_names = {
        'S': ['s', '1s'],
        'P': ['x', 'y', 'z', '2px', '2py', '2pz'],
    }

    # Although we could write out all names for the other subshells, it is better
    # to generate them if we need to expand further, since the number of functions quickly
    # grows and there are both Cartesian and spherical variants to consider.
    # For D orbitals, the Cartesian functions are xx, yy, zz, xy, xz and yz, and the
    # spherical ones are called 3d0, 3d1-, 3d1+, 3d2- and 3d2+. For F orbitals, the Cartesians
    # are xxx, xxy, xxz, xyy, ... and the sphericals are 4f0, 4f1-, 4f1+ and so on.
    for i, orb in enumerate(orbitals):

        # Cartesian can be generated directly by combinations.
        cartesian = list(map(''.join, itertools.combinations_with_replacement(['x', 'y', 'z'], i+2)))

        # For spherical functions, we need to construct the names.
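        # Worked example (for illustration only): with i = 0 and orb = 'D', the prefix is '3d',
        # so the spherical names come out as 3d0, 3d1-, 3d1+, 3d2-, 3d2+, and the Cartesian
        # names (in the order itertools generates them) are xx, xy, xz, yy, yz, zz.
        # With i = 1 and orb = 'F', the prefix is '4f' and there are seven spherical names,
        # from 4f0 up to 4f3+.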
        pre = str(i+3) + orb.lower()
        spherical = [pre + '0'] + [pre + str(j) + s for j in range(1, i+3) for s in ['-', '+']]
        atomic_orbital_names[orb] = cartesian + spherical

    return atomic_orbital_names


class Molpro(logfileparser.Logfile):
    """Molpro file parser"""

    atomic_orbital_names = create_atomic_orbital_names(['D', 'F', 'G'])

    def __init__(self, *args, **kwargs):
        # Call the __init__ method of the superclass
        super(Molpro, self).__init__(logname="Molpro", *args, **kwargs)

    def __str__(self):
        """Return a string representation of the object."""
        return "Molpro log file %s" % (self.filename)

    def __repr__(self):
        """Return a representation of the object."""
        return 'Molpro("%s")' % (self.filename)

    def normalisesym(self, label):
        """Normalise the symmetries used by Molpro."""
        ans = label.replace("`", "'").replace("``", "''")
        return ans

    def before_parsing(self):

        self.electronorbitals = ""
        self.insidescf = False

    def after_parsing(self):

        # If optimization thresholds are default, they are normally not printed and we need
        # to set them to the default after parsing. Make sure to set them in the same order that
        # they appear in the geometry optimization progress printed in the output,
        # namely: energy difference, maximum gradient, maximum step.
        if not hasattr(self, "geotargets"):
            self.geotargets = []
            # Default THRENERG (required accuracy of the optimized energy).
            self.geotargets.append(1E-6)
            # Default THRGRAD (required accuracy of the optimized gradient).
            self.geotargets.append(3E-4)
            # Default THRSTEP (convergence threshold for the geometry optimization step).
self.geotargets.append(3E-4) def extract(self, inputfile, line): """Extract information from the file object inputfile.""" if line[1:19] == "ATOMIC COORDINATES": if not hasattr(self,"atomcoords"): self.atomcoords = [] atomcoords = [] atomnos = [] self.skip_lines(inputfile, ['line', 'line', 'line']) line = next(inputfile) while line.strip(): temp = line.strip().split() atomcoords.append([utils.convertor(float(x), "bohr", "Angstrom") for x in temp[3:6]]) #bohrs to angs atomnos.append(int(round(float(temp[2])))) line = next(inputfile) self.atomcoords.append(atomcoords) self.set_attribute('atomnos', atomnos) self.set_attribute('natom', len(self.atomnos)) # Use BASIS DATA to parse input for gbasis, aonames and atombasis. If symmetry is used, # the function number starts from 1 for each irrep (the irrep index comes after the dot). # # BASIS DATA # # Nr Sym Nuc Type Exponents Contraction coefficients # # 1.1 A 1 1s 71.616837 0.154329 # 13.045096 0.535328 # 3.530512 0.444635 # 2.1 A 1 1s 2.941249 -0.099967 # 0.683483 0.399513 # ... # if line[1:11] == "BASIS DATA": # We can do a sanity check with the header. self.skip_line(inputfile, 'blank') header = next(inputfile) assert header.split() == ["Nr", "Sym", "Nuc", "Type", "Exponents", "Contraction", "coefficients"] self.skip_line(inputfile, 'blank') aonames = [] atombasis = [[] for i in range(self.natom)] gbasis = [[] for i in range(self.natom)] while line.strip(): # We need to read the line at the start of the loop here, because the last function # will be added when a blank line signalling the end of the block is encountered. line = next(inputfile) # The formatting here can exhibit subtle differences, including the number of spaces # or indentation size. However, we will rely on explicit slices since not all components # are always available. In fact, components not being there has some meaning (see below). 
line_nr = line[1:6].strip() line_sym = line[7:9].strip() line_nuc = line[11:14].strip() line_type = line[16:22].strip() line_exp = line[25:38].strip() line_coeffs = line[38:].strip() # If a new function type is printed or the BASIS DATA block ends with a blank line, # then add the previous function to gbasis, except for the first function since # there was no preceeding one. When translating the Molpro function name to gbasis, # note that Molpro prints all components, but we want it only once, with the proper # shell type (S,P,D,F,G). Molpro names also differ between Cartesian/spherical representations. if (line_type and aonames) or line.strip() == "": # All the possible AO names are created with the class. The function should always # find a match in that dictionary, so we can check for that here and will need to # update the dict if something unexpected comes up. funcbasis = None for fb, names in self.atomic_orbital_names.items(): if functype in names: funcbasis = fb assert funcbasis # There is a separate basis function for each column of contraction coefficients. Since all # atomic orbitals for a subshell will have the same parameters, we can simply check if # the function tuple is already in gbasis[i] before adding it. for i in range(len(coefficients[0])): func = (funcbasis, []) for j in range(len(exponents)): func[1].append((exponents[j], coefficients[j][i])) if func not in gbasis[funcatom-1]: gbasis[funcatom-1].append(func) # If it is a new type, set up the variables for the next shell(s). An exception is symmetry functions, # which we want to copy from the previous function and don't have a new number on the line. For them, # we just want to update the nuclear index. if line_type: if line_nr: exponents = [] coefficients = [] functype = line_type funcatom = int(line_nuc) # Add any exponents and coefficients to lists. 
                if line_exp and line_coeffs:
                    funcexp = float(line_exp)
                    funccoeffs = [float(s) for s in line_coeffs.split()]
                    exponents.append(funcexp)
                    coefficients.append(funccoeffs)

                # If the function number is present then add to atombasis and aonames, which is different from
                # adding to gbasis since it enumerates AOs rather than basis functions. The number counts functions
                # in each irrep from 1 and we could add up the functions for each irrep to get the global count,
                # but it is simpler to just see how many aonames we have already parsed. Any symmetry functions
                # are also printed, but they don't get numbers so they are not parsed.
                if line_nr:
                    element = self.table.element[self.atomnos[funcatom-1]]
                    aoname = "%s%i_%s" % (element, funcatom, functype)
                    aonames.append(aoname)
                    funcnr = len(aonames)
                    atombasis[funcatom-1].append(funcnr-1)

            self.set_attribute('aonames', aonames)
            self.set_attribute('atombasis', atombasis)
            self.set_attribute('gbasis', gbasis)

        if line[1:23] == "NUMBER OF CONTRACTIONS":
            nbasis = int(line.split()[3])
            self.set_attribute('nbasis', nbasis)

        # This is used to signal whether we are inside an SCF calculation.
        if line[1:8] == "PROGRAM" and line[14:18] == "-SCF":
            self.insidescf = True

        # Use this information instead of 'SETTING ...', in case the defaults are standard.
        # Note that this is sometimes printed in each geometry optimization step.
        if line[1:20] == "NUMBER OF ELECTRONS":
            spinup = int(line.split()[3][:-1])
            spindown = int(line.split()[4][:-1])
            # Nuclear charges (atomnos) should be parsed by now.
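            # Worked example (hypothetical values, not taken from a real output file): if the
            # fourth and fifth whitespace-separated tokens of the line are '5+' and '4-', then
            # spinup = 5 and spindown = 4 once the trailing sign is stripped. With nuclear
            # charges summing to 9, this gives charge = 9 - 5 - 4 = 0 and
            # mult = 5 - 4 + 1 = 2 (a doublet).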
            nuclear = numpy.sum(self.atomnos)
            charge = nuclear - spinup - spindown
            self.set_attribute('charge', charge)
            mult = spinup - spindown + 1
            self.set_attribute('mult', mult)

        # Convergence thresholds for the SCF cycle, which should be contained in a line such as:
        #   CONVERGENCE THRESHOLDS: 1.00E-05 (Density) 1.40E-07 (Energy)
        if self.insidescf and line[1:24] == "CONVERGENCE THRESHOLDS:":

            if not hasattr(self, "scftargets"):
                self.scftargets = []

            scftargets = list(map(float, line.split()[2::2]))
            self.scftargets.append(scftargets)
            # Usually two criteria, but save the names just in case.
            self.scftargetnames = line.split()[3::2]

        # Read in the print out of the SCF cycle - for scfvalues. For RHF looks like:
        # ITERATION DDIFF GRAD ENERGY 2-EL.EN. DIPOLE MOMENTS DIIS
        #   1 0.000D+00 0.000D+00 -379.71523700 1159.621171 0.000000 0.000000 0.000000 0
        #   2 0.000D+00 0.898D-02 -379.74469736 1162.389787 0.000000 0.000000 0.000000 1
        #   3 0.817D-02 0.144D-02 -379.74635529 1162.041033 0.000000 0.000000 0.000000 2
        #   4 0.213D-02 0.571D-03 -379.74658063 1162.159929 0.000000 0.000000 0.000000 3
        #   5 0.799D-03 0.166D-03 -379.74660889 1162.144256 0.000000 0.000000 0.000000 4
        if self.insidescf and line[1:10] == "ITERATION":

            if not hasattr(self, "scfvalues"):
                self.scfvalues = []

            line = next(inputfile)
            energy = 0.0
            scfvalues = []
            while line.strip() != "":
                if line.split()[0].isdigit():

                    # Molpro prints Fortran-style exponents (D), which float() does not accept.
                    ddiff = float(line.split()[1].replace('D', 'E'))
                    newenergy = float(line.split()[3])
                    ediff = newenergy - energy
                    energy = newenergy

                    # The convergence thresholds must have been read above.
                    # Presently, we recognize MAX DENSITY and MAX ENERGY thresholds.
                    numtargets = len(self.scftargetnames)
                    values = [numpy.nan]*numtargets
                    for n, name in enumerate(self.scftargetnames):
                        if "ENERGY" in name.upper():
                            values[n] = ediff
                        elif "DENSITY" in name.upper():
                            values[n] = ddiff
                    scfvalues.append(values)

                line = next(inputfile)
            self.scfvalues.append(numpy.array(scfvalues))

        # SCF result - RHF/UHF and DFT (RKS) energies.
        if (line[1:5] in ["!RHF", "!UHF", "!RKS"] and line[16:22].lower() == "energy"):

            if not hasattr(self, "scfenergies"):
                self.scfenergies = []
            scfenergy = float(line.split()[4])
            self.scfenergies.append(utils.convertor(scfenergy, "hartree", "eV"))

            # We are now done with SCF cycle (after a few lines).
            self.insidescf = False

        # MP2 energies.
        if line[1:5] == "!MP2":

            if not hasattr(self, 'mpenergies'):
                self.mpenergies = []
            mp2energy = float(line.split()[-1])
            mp2energy = utils.convertor(mp2energy, "hartree", "eV")
            self.mpenergies.append([mp2energy])

        # MP2 energies if MP3 or MP4 is also calculated.
        if line[1:5] == "MP2:":

            if not hasattr(self, 'mpenergies'):
                self.mpenergies = []
            mp2energy = float(line.split()[2])
            mp2energy = utils.convertor(mp2energy, "hartree", "eV")
            self.mpenergies.append([mp2energy])

        # MP3 (D) and MP4 (DQ or SDQ) energies.
        if line[1:8] == "MP3(D):":

            mp3energy = float(line.split()[2])
            mp3energy = utils.convertor(mp3energy, "hartree", "eV")
            line = next(inputfile)
            self.mpenergies[-1].append(mp3energy)
            if line[1:9] == "MP4(DQ):":
                mp4energy = float(line.split()[2])
                line = next(inputfile)
                if line[1:10] == "MP4(SDQ):":
                    mp4energy = float(line.split()[2])
                mp4energy = utils.convertor(mp4energy, "hartree", "eV")
                self.mpenergies[-1].append(mp4energy)

        # The CCSD program operates all closed-shell coupled cluster runs.
        if line[1:15] == "PROGRAM * CCSD":

            if not hasattr(self, "ccenergies"):
                self.ccenergies = []
            while line[1:20] != "Program statistics:":
                # The last energy (most exact) will be read last and thus saved.
                if line[1:5] == "!CCD" or line[1:6] == "!CCSD" or line[1:9] == "!CCSD(T)":
                    ccenergy = float(line.split()[-1])
                    ccenergy = utils.convertor(ccenergy, "hartree", "eV")
                line = next(inputfile)
            self.ccenergies.append(ccenergy)

        # Read the occupancy (index of the HOMOs).
        # For restricted calculations, there is one line here. For unrestricted, two:
        #   Final alpha occupancy:  ...
        #   Final beta  occupancy:  ...
        if line[1:17] == "Final occupancy:":
            self.homos = [int(line.split()[-1])-1]
        if line[1:23] == "Final alpha occupancy:":
            self.homos = [int(line.split()[-1])-1]
            line = next(inputfile)
            self.homos.append(int(line.split()[-1])-1)

        # Dipole is always printed on one line after the final RHF energy, and by default
        # it seems Molpro uses the origin as the reference point.
        if line.strip()[:13] == "Dipole moment":

            assert line.split()[2] == "/Debye"

            reference = [0.0, 0.0, 0.0]
            dipole = [float(d) for d in line.split()[-3:]]

            if not hasattr(self, 'moments'):
                self.moments = [reference, dipole]
            else:
                # Keep the most recently printed dipole (this was a no-op comparison before).
                self.moments[1] = dipole

        # From this block aonames, atombasis, moenergies and mocoeffs can be parsed. The data is
        # flipped compared to most programs (GAMESS, Gaussian), since the MOs are in rows. Also, Molpro
        # does not cut the table into parts, rather each MO row has as many lines as it takes to print
        # all of the MO coefficients. Each row normally has 10 coefficients, although this can be less
        # for the last row and when symmetry is used (each irrep has its own block).
        #
        # ELECTRON ORBITALS
        # =================
        #
        #
        #   Orb  Occ    Energy  Couls-En    Coefficients
        #
        #                                   1 1s      1 1s      1 2px     1 2py     1 2pz     2 1s   (...)
        #                                   3 1s      3 1s      3 2px     3 2py     3 2pz     4 1s   (...)
        #                                   (...)
        #
        #   1.1   2   -11.0351  -43.4915  0.701460  0.025696 -0.000365 -0.000006  0.000000  0.006922 (...)
        #                                -0.006450  0.004742 -0.001028 -0.002955  0.000000 -0.701460 (...)
        #                                   (...)
        #
        if line[1:18] == "ELECTRON ORBITALS" or self.electronorbitals:
            # For unrestricted calculations, ELECTRON ORBITALS is followed on the same line
            # by FOR POSITIVE SPIN or FOR NEGATIVE SPIN as appropriate.
            spin = (line[19:36] == "FOR NEGATIVE SPIN") or (self.electronorbitals[19:36] == "FOR NEGATIVE SPIN")

            if not self.electronorbitals:
                self.skip_line(inputfile, 'equals')
            self.skip_lines(inputfile, ['b', 'b', 'headers', 'b'])

            aonames = []
            atombasis = [[] for i in range(self.natom)]
            moenergies = []
            mocoeffs = []

            line = next(inputfile)

            # Besides a double blank line, stop when the next orbitals are encountered for unrestricted jobs
            # or if there are stars on the line which always signifies the end of the block.
            while line.strip() and (not "ORBITALS" in line) and (not set(line.strip()) == {'*'}):

                # The function names are normally printed just once, but if symmetry is used then each irrep
                # has its own mocoeff block with a preceding list of names.
                is_aonames = line[:25].strip() == ""
                if is_aonames:

                    # We need to save this offset for parsing the coefficients later.
                    offset = len(aonames)

                    aonum = len(aonames)
                    while line.strip():
                        for s in line.split():
                            if s.isdigit():
                                atomno = int(s)
                                atombasis[atomno-1].append(aonum)
                                aonum += 1
                            else:
                                functype = s
                                element = self.table.element[self.atomnos[atomno-1]]
                                aoname = "%s%i_%s" % (element, atomno, functype)
                                aonames.append(aoname)
                        line = next(inputfile)

                    # Now there can be one or two blank lines.
                    while not line.strip():
                        line = next(inputfile)

                # Newer versions of Molpro (for example, 2012 test files) will print some
                # more things here, such as HOMO and LUMO, but these have less than 10 columns.
                if "HOMO" in line or "LUMO" in line:
                    break

                # Now parse the MO coefficients, padding the list with an appropriate amount of zeros.
                coeffs = [0.0 for i in range(offset)]
                while line.strip() != "":
                    if line[:31].rstrip():
                        moenergy = float(line.split()[2])
                        moenergy = utils.convertor(moenergy, "hartree", "eV")
                        moenergies.append(moenergy)

                    # Coefficients are in 10.6f format and splitting does not work since there are not
                    # always spaces between them. If the numbers are very large, there will be stars.
                    str_coeffs = line[31:]
                    ncoeffs = len(str_coeffs) // 10
                    coeff = []
                    for ic in range(ncoeffs):
                        p = str_coeffs[ic*10:(ic+1)*10]
                        try:
                            c = float(p)
                        except ValueError as detail:
                            self.logger.warning("setting mocoeff element to zero: %s" % detail)
                            c = 0.0
                        coeff.append(c)
                    coeffs.extend(coeff)
                    line = next(inputfile)
                mocoeffs.append(coeffs)

                # The loop should keep going until there is a double blank line, and there is
                # a single line between each coefficient block.
                line = next(inputfile)
                if not line.strip():
                    line = next(inputfile)

            # If symmetry was used (offset was needed) then we will need to pad all MO vectors
            # up to nbasis for all irreps before the last one.
            if offset > 0:
                for im,m in enumerate(mocoeffs):
                    if len(m) < self.nbasis:
                        mocoeffs[im] = m + [0.0 for i in range(self.nbasis - len(m))]

            self.set_attribute('atombasis', atombasis)
            self.set_attribute('aonames', aonames)

            # Consistent with current cclib conventions, reset moenergies/mocoeffs if they have been
            # previously parsed, since we want to produce only the final values.
            if not hasattr(self, "moenergies") or spin == 0:
                self.mocoeffs = []
                self.moenergies = []
            self.moenergies.append(moenergies)
            self.mocoeffs.append(mocoeffs)

            # Check if last line begins the next ELECTRON ORBITALS section, because we already used
            # this line and need to know when this method is called next time.
            if line[1:18] == "ELECTRON ORBITALS":
                self.electronorbitals = line
            else:
                self.electronorbitals = ""

        # If the MATROP program was called appropriately,
        #   the atomic orbital overlap matrix S is printed.
        # The matrix is printed straight-out, ten elements in each row, both halves.
        # Note that if the entire matrix is not printed, then aooverlaps
        #   will not have dimensions nbasis x nbasis.
        if line[1:9] == "MATRIX S":

            if not hasattr(self, "aooverlaps"):
                self.aooverlaps = [[]]

            self.skip_lines(inputfile, ['b', 'symblocklabel'])

            line = next(inputfile)
            while line.strip() != "":
                elements = [float(s) for s in line.split()]
                if len(self.aooverlaps[-1]) + len(elements) <= self.nbasis:
                    self.aooverlaps[-1] += elements
                else:
                    n = len(self.aooverlaps[-1]) + len(elements) - self.nbasis
                    self.aooverlaps[-1] += elements[:-n]
                    self.aooverlaps.append([])
                    self.aooverlaps[-1] += elements[-n:]
                line = next(inputfile)

        # Thresholds are printed only if the defaults are changed with GTHRESH.
        # In that case, we can fill geotargets with non-default values.
        # The block should look like this as of Molpro 2006.1:
        #  THRESHOLDS:
        #  ZERO    =  1.00D-12  ONEINT  =  1.00D-12  TWOINT  =  1.00D-11  PREFAC  =  1.00D-14  LOCALI  =  1.00D-09  EORDER  =  1.00D-04
        #  ENERGY  =  0.00D+00  ETEST   =  0.00D+00  EDENS   =  0.00D+00  THRDEDEF=  1.00D-06  GRADIENT=  1.00D-02  STEP    =  1.00D-03
        #  ORBITAL =  1.00D-05  CIVEC   =  1.00D-05  COEFF   =  1.00D-04  PRINTCI =  5.00D-02  PUNCHCI =  9.90D+01  OPTGRAD =  3.00D-04
        #  OPTENERG=  1.00D-06  OPTSTEP =  3.00D-04  THRGRAD =  2.00D-04  COMPRESS=  1.00D-11  VARMIN  =  1.00D-07  VARMAX  =  1.00D-03
        #  THRDOUB =  0.00D+00  THRDIV  =  1.00D-05  THRRED  =  1.00D-07  THRPSP  =  1.00D+00  THRDC   =  1.00D-10  THRCS   =  1.00D-10
        #  THRNRM  =  1.00D-08  THREQ   =  0.00D+00  THRDE   =  1.00D+00  THRREF  =  1.00D-05  SPARFAC =  1.00D+00  THRDLP  =  1.00D-07
        #  THRDIA  =  1.00D-10  THRDLS  =  1.00D-07  THRGPS  =  0.00D+00  THRKEX  =  0.00D+00  THRDIS  =  2.00D-01  THRVAR  =  1.00D-10
        #  THRLOC  =  1.00D-06  THRGAP  =  1.00D-06  THRLOCT = -1.00D+00  THRGAPT = -1.00D+00  THRORB  =  1.00D-06  THRMLTP =  0.00D+00
        #  THRCPQCI=  1.00D-10  KEXTA   =  0.00D+00  THRCOARS=  0.00D+00  SYMTOL  =  1.00D-06  GRADTOL =  1.00D-06  THROVL  =  1.00D-08
        #  THRORTH =  1.00D-08  GRID    =  1.00D-06  GRIDMAX =  1.00D-03  DTMAX   =  0.00D+00
        if line[1:12] == "THRESHOLDS":

            # Pass the file object here; the builtin `input` was passed by mistake before.
            self.skip_line(inputfile, 'blank')

            line = next(inputfile)
            while line.strip():

                if "OPTENERG" in line:
                    start = line.find("OPTENERG")
                    optenerg = line[start+10:start+20]
                if "OPTGRAD" in line:
                    start = line.find("OPTGRAD")
                    optgrad = line[start+10:start+20]
                if "OPTSTEP" in line:
                    start = line.find("OPTSTEP")
                    optstep = line[start+10:start+20]
                line = next(inputfile)

            self.geotargets = [optenerg, optgrad, optstep]

        # The optimization history is the source for geovalues:
        #
        #   END OF GEOMETRY OPTIMIZATION.    TOTAL CPU:       246.9 SEC
        #
        #   ITER.   ENERGY(OLD)    ENERGY(NEW)      DE          GRADMAX     GRADNORM    GRADRMS     STEPMAX     STEPLEN     STEPRMS
        #     1  -382.02936898  -382.04914450    -0.01977552  0.11354875  0.20127947  0.01183997  0.12972761  0.20171740  0.01186573
        #     2  -382.04914450  -382.05059234    -0.00144784  0.03299860  0.03963339  0.00233138  0.05577169  0.06687650  0.00393391
        #     3  -382.05059234  -382.05069136    -0.00009902  0.00694359  0.01069889  0.00062935  0.01654549  0.02016307  0.00118606
        # ...
        #
        # The above is an excerpt from Molpro 2006, but it is a little bit different
        # for Molpro 2012, namely the 'END OF GEOMETRY OPTIMIZATION' line occurs after the
        # actual history list. It seems there is another consistent line before the
        # history, but this might not always be true, so this is a potential weak link.
        if line[1:30] == "END OF GEOMETRY OPTIMIZATION." or line.strip() == "Quadratic Steepest Descent - Minimum Search":

            # I think this is the trigger for convergence, and it shows up at the top in Molpro 2006.
            geometry_converged = line[1:30] == "END OF GEOMETRY OPTIMIZATION."

            self.skip_line(inputfile, 'blank')

            # Newer versions of Molpro (at least 2012) print an additional column
            # with the timing information for each step. Otherwise, the history looks the same.
            headers = next(inputfile).split()
            if not len(headers) in (10,11):
                return

            # Although criteria can be changed, the printed format should not change.
            # In case it does, retrieve the columns for each parameter.
            index_ITER = headers.index('ITER.')
            index_THRENERG = headers.index('DE')
            index_THRGRAD = headers.index('GRADMAX')
            index_THRSTEP = headers.index('STEPMAX')

            line = next(inputfile)
            self.geovalues = []
            while line.strip():

                line = line.split()
                istep = int(line[index_ITER])

                geovalues = []
                geovalues.append(float(line[index_THRENERG]))
                geovalues.append(float(line[index_THRGRAD]))
                geovalues.append(float(line[index_THRSTEP]))
                self.geovalues.append(geovalues)
                line = next(inputfile)
                if line.strip() == "Freezing grid":
                    line = next(inputfile)

            # The convergence trigger shows up somewhere at the bottom in Molpro 2012,
            # before the final stars. If convergence is not reached, there is an additional
            # line that can be checked for. This is a little tricky, though, since it is
            # not the last line... so bail out of the loop if convergence failure is detected.
            while "*****" not in line:
                line = next(inputfile)
                if line.strip() == "END OF GEOMETRY OPTIMIZATION.":
                    geometry_converged = True
                if "No convergence" in line:
                    geometry_converged = False
                    break

            # Finally, deal with optdone, append the last step to it only if we had convergence.
            if not hasattr(self, 'optdone'):
                self.optdone = []
            if geometry_converged:
                self.optdone.append(istep-1)

        # This block should look like this:
        #   Normal Modes
        #
        #                                1 Au        2 Bu        3 Ag        4 Bg        5 Ag
        #   Wavenumbers [cm-1]          151.81      190.88      271.17      299.59      407.86
        #   Intensities [km/mol]          0.33        0.28        0.00        0.00        0.00
        #   Intensities [relative]        0.34        0.28        0.00        0.00        0.00
        #             CX1              0.00000    -0.01009     0.02577     0.00000     0.06008
        #             CY1              0.00000    -0.05723    -0.06696     0.00000     0.06349
        #             CZ1             -0.02021     0.00000     0.00000     0.11848     0.00000
        #             CX2              0.00000    -0.01344     0.05582     0.00000    -0.02513
        #             CY2              0.00000    -0.06288    -0.03618     0.00000     0.00349
        #             CZ2             -0.05565     0.00000     0.00000     0.07815     0.00000
        #   ...
        # Molpro prints low frequency modes in a subsequent section with the same format,
        #   which also contains zero frequency modes, with the title:
        #   Normal Modes of low/zero frequencies
        if line[1:13] == "Normal Modes":

            if line[1:37] == "Normal Modes of low/zero frequencies":
                islow = True
            else:
                islow = False

            self.skip_line(inputfile, 'blank')

            # Each portion of five modes is followed by a single blank line.
            # The whole block is followed by an additional blank line.
            line = next(inputfile)
            while line.strip():

                if line[1:25].isspace():
                    numbers = list(map(int, line.split()[::2]))
                    vibsyms = line.split()[1::2]

                if line[1:12] == "Wavenumbers":
                    vibfreqs = list(map(float, line.strip().split()[2:]))

                if line[1:21] == "Intensities [km/mol]":
                    vibirs = list(map(float, line.strip().split()[2:]))

                # There should always be 3xnatom displacement rows.
                if line[1:11].isspace() and line[13:25].strip().isdigit():

                    # There are a maximum of 5 modes per line.
                    nmodes = len(line.split())-1

                    vibdisps = []
                    for i in range(nmodes):
                        vibdisps.append([])
                        for n in range(self.natom):
                            vibdisps[i].append([])
                    for i in range(nmodes):
                        disp = float(line.split()[i+1])
                        vibdisps[i][0].append(disp)
                    for i in range(self.natom*3 - 1):
                        line = next(inputfile)
                        iatom = (i+1)//3
                        # Use a separate index for the modes so the row counter is not shadowed.
                        for imode in range(nmodes):
                            disp = float(line.split()[imode+1])
                            vibdisps[imode][iatom].append(disp)

                line = next(inputfile)
                if not line.strip():

                    if not hasattr(self, "vibfreqs"):
                        self.vibfreqs = []
                    if not hasattr(self, "vibsyms"):
                        self.vibsyms = []
                    if not hasattr(self, "vibirs") and "vibirs" in dir():
                        self.vibirs = []
                    if not hasattr(self, "vibdisps") and "vibdisps" in dir():
                        self.vibdisps = []

                    if not islow:
                        self.vibfreqs.extend(vibfreqs)
                        self.vibsyms.extend(vibsyms)
                        if "vibirs" in dir():
                            self.vibirs.extend(vibirs)
                        if "vibdisps" in dir():
                            self.vibdisps.extend(vibdisps)
                    else:
                        nonzero = [f > 0 for f in vibfreqs]
                        vibfreqs = [f for f in vibfreqs if f > 0]
                        self.vibfreqs = vibfreqs + self.vibfreqs
                        vibsyms = [vibsyms[i] for i in range(len(vibsyms)) if nonzero[i]]
                        self.vibsyms = vibsyms + self.vibsyms
                        if "vibirs" in dir():
                            vibirs = [vibirs[i] for i in range(len(vibirs)) if nonzero[i]]
                            self.vibirs = vibirs + self.vibirs
                        if "vibdisps" in dir():
                            vibdisps = [vibdisps[i] for i in range(len(vibdisps)) if nonzero[i]]
                            self.vibdisps = vibdisps + self.vibdisps

                    line = next(inputfile)

        if line[1:16] == "Force Constants":

            self.logger.info("Creating attribute hessian")
            self.hessian = []
            line = next(inputfile)
            hess = []
            tmp = []

            while line.strip():
                # Header rows contain labels that cannot be converted to floats; skip them.
                try:
                    list(map(float, line.strip().split()[2:]))
                except ValueError:
                    line = next(inputfile)
                hess.extend([list(map(float, line.strip().split()[1:]))])
                line = next(inputfile)
            lig = 0

            while (lig==0) or (len(hess[0]) > 1):
                tmp.append(hess.pop(0))
                lig += 1
            k = 5

            while len(hess) != 0:
                tmp[k] += hess.pop(0)
                k += 1
                if (len(tmp[k-1]) == lig):
                    break
                if k >= lig:
                    k = len(tmp[-1])
            for l in tmp:
                self.hessian += l

        if line[1:14] == "Atomic Masses" and hasattr(self,"hessian"):

            line = next(inputfile)
            self.amass = list(map(float, line.strip().split()[2:]))

            while line.strip():
                line = next(inputfile)
                self.amass += list(map(float, line.strip().split()[2:]))

        #1PROGRAM * POP (Mulliken population analysis)
        #
        #
        # Density matrix read from record         2100.2  Type=RHF/CHARGE (state 1.1)
        #
        # Population analysis by basis function type
        #
        # Unique atom        s        p        d        f        g    Total    Charge
        #   2  C       3.11797  2.88497  0.00000  0.00000  0.00000  6.00294  - 0.00294
        #   3  C       3.14091  2.91892  0.00000  0.00000  0.00000  6.05984  - 0.05984
        # ...
        if line.strip() == "1PROGRAM * POP (Mulliken population analysis)":

            self.skip_lines(inputfile, ['b', 'b', 'density_source', 'b', 'func_type', 'b'])

            header = next(inputfile)
            icharge = header.split().index('Charge')

            charges = []
            line = next(inputfile)
            while line.strip():
                cols = line.split()
                charges.append(float(cols[icharge]+cols[icharge+1]))
                line = next(inputfile)

            if not hasattr(self, "atomcharges"):
                self.atomcharges = {}
            self.atomcharges['mulliken'] = charges


if __name__ == "__main__":
    import doctest, molproparser
    doctest.testmod(molproparser, verbose=False)

# cclib-1.3.1/src/cclib/parser/adfparser.py
# -*- coding: utf-8 -*-
#
# This file is part of cclib (http://cclib.github.io), a library for parsing
# and interpreting the results of computational chemistry packages.
#
# Copyright (C) 2006-2014, the cclib development team
#
# The library is free software, distributed under the terms of
# the GNU Lesser General Public version 2.1 or later. You should have
# received a copy of the license along with cclib. You can also access
# the full license online at http://www.gnu.org/copyleft/lgpl.html.

"""Parser for ADF output files"""

from __future__ import print_function

import re

import numpy

from . import logfileparser
from . import utils


class ADF(logfileparser.Logfile):
    """An ADF log file"""

    def __init__(self, *args, **kwargs):

        # Call the __init__ method of the superclass
        super(ADF, self).__init__(logname="ADF", *args, **kwargs)

    def __str__(self):
        """Return a string representation of the object."""
        return "ADF log file %s" % (self.filename)

    def __repr__(self):
        """Return a representation of the object."""
        return 'ADF("%s")' % (self.filename)

    def normalisesym(self, label):
        """Use standard symmetry labels instead of ADF labels.

        To normalise:
        (1) any periods are removed (except in the case of greek letters)
        (2) XXX is replaced by X, and a " added.
        (3) XX is replaced by X, and a ' added.
        (4) The greek letters Sigma, Pi, Delta and Phi are replaced by their lowercase equivalent.

        >>> sym = ADF("dummyfile").normalisesym
        >>> labels = ['A','s','A1','A1.g','Sigma','Pi','Delta','Phi','Sigma.g','A.g','AA','AAA','EE1','EEE1']
        >>> list(map(sym, labels))
        ['A', 's', 'A1', 'A1g', 'sigma', 'pi', 'delta', 'phi', 'sigma.g', 'Ag', "A'", 'A"', "E1'", 'E1"']
        """
        greeks = ['Sigma', 'Pi', 'Delta', 'Phi']
        for greek in greeks:
            if label.startswith(greek):
                return label.lower()

        ans = label.replace(".", "")
        if ans[1:3] == "''":
            temp = ans[0] + '"'
            ans = temp

        l = len(ans)
        if l > 1 and ans[0] == ans[1]: # Python only tests the second condition if the first is true
            if l > 2 and ans[1] == ans[2]:
                ans = ans.replace(ans[0]*3, ans[0]) + '"'
            else:
                ans = ans.replace(ans[0]*2, ans[0]) + "'"
        return ans

    def normalisedegenerates(self, label, num, ndict=None):
        """Generate a string used for matching degenerate orbital labels

        To normalise:
        (1) if label is E or T, return label:num
        (2) if label is P or D, look up in dict, and return answer
        """

        if not ndict:
            ndict = { 'P': {0:"P:x", 1:"P:y", 2:"P:z"},
                      'D': {0:"D:z2", 1:"D:x2-y2", 2:"D:xy", 3:"D:xz", 4:"D:yz"}}

        if label in ndict:
            if num in ndict[label]:
                return ndict[label][num]
            else:
                return "%s:%i" % (label, num+1)
        else:
            return "%s:%i" % (label, num+1)

    def before_parsing(self):

        # Used to avoid extracting the final geometry twice in a GeoOpt
        self.NOTFOUND, self.GETLAST, self.NOMORE = list(range(3))
        self.finalgeometry = self.NOTFOUND

        # Used for calculating the scftarget (variable names taken from the ADF manual)
        self.accint = self.SCFconv = self.sconv2 = None

        # Keep track of the nosym and unrestricted cases when parsing energies,
        # since such output doesn't have an 'all Irreps' section.
        self.nosymflag = False
        self.unrestrictedflag = False

        SCFCNV, SCFCNV2 = list(range(2)) # used to index self.scftargets[]
        maxelem, norm = list(range(2)) # used to index self.scfvalues

    def extract(self, inputfile, line):
        """Extract information from the file object inputfile."""

        # If a file contains multiple calculations, currently we want to print a warning
        # and skip to the end of the file, since cclib parses only the main system, which
        # is usually the largest. Here we test this by checking if scftargets has already
        # been parsed when another INPUT FILE segment is found, although this might
        # not always be the best indicator.
        if line.strip() == "(INPUT FILE)" and hasattr(self, "scftargets"):
            self.logger.warning("Skipping remaining calculations")
            inputfile.seek(0, 2)
            return

        # We also want to check to make sure we aren't parsing "Create" jobs,
        # which normally come before the calculation we actually want to parse.
        if line.strip() == "(INPUT FILE)":
            while True:
                self.updateprogress(inputfile, "Unsupported Information", self.fupdate)
                line = next(inputfile) if line.strip() == "(INPUT FILE)" else None
                if line and not line[:6] in ("Create", "create"):
                    break
                line = next(inputfile)

        # In ADF 2014.01, there are (INPUT FILE) messages, so we need to use just
        # the lines that start with 'Create' and run until the title or something
        # else we are sure is the calculation proper. It would be good to combine
        # this with the previous block, if possible.
        if line[:6] == "Create":
            while line[:5] != "title":
                # Use the builtin next(), which works on both Python 2 and 3 iterators.
                line = next(inputfile)

        if line[1:10] == "Symmetry:":
            info = line.split()
            if info[1] == "NOSYM":
                self.nosymflag = True

        # Use this to read the subspecies of irreducible representations.
        # It will be a list, with each element representing one irrep.
if line.strip() == "Irreducible Representations, including subspecies": self.skip_line(inputfile, 'dashes') self.irreps = [] line = next(inputfile) while line.strip() != "": self.irreps.append(line.split()) line = next(inputfile) if line[4:13] == 'Molecule:': info = line.split() if info[1] == 'UNrestricted': self.unrestrictedflag = True if line[1:6] == "ATOMS": # Find the number of atoms and their atomic numbers # Also extract the starting coordinates (for a GeoOpt anyway) # and the atommasses (previously called vibmasses) self.updateprogress(inputfile, "Attributes", self.cupdate) self.atomcoords = [] self.skip_lines(inputfile, ['header1', 'header2', 'header3']) atomnos = [] atommasses = [] atomcoords = [] coreelectrons = [] line = next(inputfile) while len(line)>2: #ensure that we are reading no blank lines info = line.split() element = info[1].split('.')[0] atomnos.append(self.table.number[element]) atomcoords.append(list(map(float, info[2:5]))) coreelectrons.append(int(float(info[5]) - float(info[6]))) atommasses.append(float(info[7])) line = next(inputfile) self.atomcoords.append(atomcoords) self.set_attribute('natom', len(atomnos)) self.set_attribute('atomnos', atomnos) self.set_attribute('atommasses', atommasses) self.set_attribute('coreelectrons', coreelectrons) if line[1:10] == "FRAGMENTS": header = next(inputfile) self.frags = [] self.fragnames = [] line = next(inputfile) while len(line) > 2: #ensure that we are reading no blank lines info = line.split() if len(info) == 7: #fragment name is listed here self.fragnames.append("%s_%s" % (info[1], info[0])) self.frags.append([]) self.frags[-1].append(int(info[2]) - 1) elif len(info) == 5: #add atoms into last fragment self.frags[-1].append(int(info[0]) - 1) line = next(inputfile) # Extract charge if line[1:11] == "Net Charge": charge = int(line.split()[2]) self.set_attribute('charge', charge) line = next(inputfile) if len(line.strip()): # Spin polar: 1 (Spin_A minus Spin_B electrons) # (Not sure about this for 
higher multiplicities) mult = int(line.split()[2]) + 1 else: mult = 1 self.set_attribute('mult', mult) if line[1:22] == "S C F U P D A T E S": # find targets for SCF convergence if not hasattr(self,"scftargets"): self.scftargets = [] self.skip_lines(inputfile, ['e', 'b', 'numbers']) line = next(inputfile) self.SCFconv = float(line.split()[-1]) line = next(inputfile) self.sconv2 = float(line.split()[-1]) # In ADF 2013, the default numerical integration method is fuzzy cells, # although it used to be Voronoi polyhedra. Both methods apparently set # the accint parameter, although the latter does so indirectly, based on # a 'grid quality' setting. This is translated into accint using a # dictionary with values taken from the documentation. if "Numerical Integration : Voronoi Polyhedra (Te Velde)" in line: self.integration_method = "voronoi_polyhedra" if line[1:27] == 'General Accuracy Parameter': # Need to know the accuracy of the integration grid to # calculate the scftarget...note that it changes with time self.accint = float(line.split()[-1]) if "Numerical Integration : Fuzzy Cells (Becke)" in line: self.integration_method = 'fuzzy_cells' if line[1:19] == "Becke grid quality": self.grid_quality = line.split()[-1] quality2accint = { 'BASIC' : 2.0, 'NORMAL' : 4.0, 'GOOD' : 6.0, 'VERYGOOD' : 8.0, 'EXCELLENT' : 10.0, } self.accint = quality2accint[self.grid_quality] if line[1:11] == "CYCLE 1": self.updateprogress(inputfile, "QM convergence", self.fupdate) newlist = [] line = next(inputfile) if not hasattr(self,"geovalues"): # This is the first SCF cycle self.scftargets.append([self.sconv2*10, self.sconv2]) elif self.finalgeometry in [self.GETLAST, self.NOMORE]: # This is the final SCF cycle self.scftargets.append([self.SCFconv*10, self.SCFconv]) else: # This is an intermediate SCF cycle in a geometry optimization, # in which case the SCF convergence target needs to be derived # from the accint parameter. For Voronoi polyhedra integration, # accint is printed and parsed. 
For fuzzy cells, it can be inferred # from the grid quality setting, as is done somewhere above. if self.accint: oldscftst = self.scftargets[-1][1] grdmax = self.geovalues[-1][1] scftst = max(self.SCFconv, min(oldscftst, grdmax/30, 10**(-self.accint))) self.scftargets.append([scftst*10, scftst]) while line.find("SCF CONVERGED") == -1 and line.find("SCF not fully converged, result acceptable") == -1 and line.find("SCF NOT CONVERGED") == -1: if line[4:12] == "SCF test": if not hasattr(self, "scfvalues"): self.scfvalues = [] info = line.split() newlist.append([float(info[4]), abs(float(info[6]))]) try: line = next(inputfile) except StopIteration: #EOF reached? self.logger.warning("SCF did not converge, so attributes may be missing") break if line.find("SCF not fully converged, result acceptable") > 0: self.logger.warning("SCF not fully converged, results acceptable") if line.find("SCF NOT CONVERGED") > 0: self.logger.warning("SCF did not converge! moenergies and mocoeffs are unreliable") if hasattr(self, "scfvalues"): self.scfvalues.append(newlist) # Parse SCF energy for SP calcs from bonding energy decomposition section. # It seems ADF does not print it earlier for SP calculations. # Geometry optimization runs also print this, and we want to parse it # for them, too, even if it repeats the last "Geometry Convergence Tests" # section (but it's usually a bit different). if line[:21] == "Total Bonding Energy:": if not hasattr(self, "scfenergies"): self.scfenergies = [] energy = utils.convertor(float(line.split()[3]), "hartree", "eV") self.scfenergies.append(energy) if line[51:65] == "Final Geometry": self.finalgeometry = self.GETLAST # Get the coordinates from each step of the GeoOpt. 
if line[1:24] == "Coordinates (Cartesian)" and self.finalgeometry in [self.NOTFOUND, self.GETLAST]: self.skip_lines(inputfile, ['e', 'b', 'title', 'title', 'd']) atomcoords = [] line = next(inputfile) while list(set(line.strip())) != ['-']: atomcoords.append(list(map(float, line.split()[5:8]))) line = next(inputfile) if not hasattr(self, "atomcoords"): self.atomcoords = [] self.atomcoords.append(atomcoords) # Don't get any more coordinates in this case. # KML: I think we could combine this with optdone (see below). if self.finalgeometry == self.GETLAST: self.finalgeometry = self.NOMORE # There have been some changes in the format of the geometry convergence information, # and this is how it is printed in older versions (2007.01 unit tests). # # ========================== # Geometry Convergence Tests # ========================== # # Energy old : -5.14170647 # new : -5.15951374 # # Convergence tests: # (Energies in hartree, Gradients in hartree/angstr or radian, Lengths in angstrom, Angles in degrees) # # Item Value Criterion Conv. Ratio # ------------------------------------------------------------------------- # change in energy -0.01780727 0.00100000 NO 0.00346330 # gradient max 0.03219530 0.01000000 NO 0.30402650 # gradient rms 0.00858685 0.00666667 NO 0.27221261 # cart. step max 0.07674971 0.01000000 NO 0.75559435 # cart. 
step rms 0.02132310 0.00666667 NO 0.55335378 # if line[1:27] == 'Geometry Convergence Tests': if not hasattr(self, "geotargets"): self.geovalues = [] self.geotargets = numpy.array([0.0, 0.0, 0.0, 0.0, 0.0], "d") if not hasattr(self, "scfenergies"): self.scfenergies = [] self.skip_lines(inputfile, ['e', 'b']) energies_old = next(inputfile) energies_new = next(inputfile) self.scfenergies.append(utils.convertor(float(energies_new.split()[-1]), "hartree", "eV")) self.skip_lines(inputfile, ['b', 'convergence', 'units', 'b', 'header', 'd']) values = [] for i in range(5): temp = next(inputfile).split() self.geotargets[i] = float(temp[-3]) values.append(float(temp[-4])) self.geovalues.append(values) # This is to make geometry optimization always have the optdone attribute, # even if it is to be empty for unconverged runs. if not hasattr(self, 'optdone'): self.optdone = [] # After the test, there is a message if the search is converged: # # *************************************************************************************************** # Geometry CONVERGED # *************************************************************************************************** # if line.strip() == "Geometry CONVERGED": self.skip_line(inputfile, 'stars') self.optdone.append(len(self.geovalues) - 1) # Here is the corresponding geometry convergence info from the 2013.01 unit test. # Note that the step number is given, which it will be prudent to use in an assertion. # #---------------------------------------------------------------------- #Geometry Convergence after Step 3 (Hartree/Angstrom,Angstrom) #---------------------------------------------------------------------- #current energy -5.16274478 Hartree #energy change -0.00237544 0.00100000 F #constrained gradient max 0.00884999 0.00100000 F #constrained gradient rms 0.00249569 0.00066667 F #gradient max 0.00884999 #gradient rms 0.00249569 #cart. step max 0.03331296 0.01000000 F #cart. 
step rms 0.00844037 0.00666667 F if line[:31] == "Geometry Convergence after Step": stepno = int(line.split()[4]) # This is to make geometry optimization always have the optdone attribute, # even if it is to be empty for unconverged runs. if not hasattr(self, 'optdone'): self.optdone = [] # The convergence message is inline in this block, not later as it was before. if "** CONVERGED **" in line: if not hasattr(self, 'optdone'): self.optdone = [] self.optdone.append(len(self.geovalues) - 1) self.skip_line(inputfile, 'dashes') current_energy = next(inputfile) energy_change = next(inputfile) constrained_gradient_max = next(inputfile) constrained_gradient_rms = next(inputfile) gradient_max = next(inputfile) gradient_rms = next(inputfile) cart_step_max = next(inputfile) cart_step_rms = next(inputfile) if not hasattr(self, "scfenergies"): self.scfenergies = [] energy = utils.convertor(float(current_energy.split()[-2]), "hartree", "eV") self.scfenergies.append(energy) if not hasattr(self, "geotargets"): self.geotargets = numpy.array([0.0, 0.0, 0.0, 0.0, 0.0], "d") self.geotargets[0] = float(energy_change.split()[-2]) self.geotargets[1] = float(constrained_gradient_max.split()[-2]) self.geotargets[2] = float(constrained_gradient_rms.split()[-2]) self.geotargets[3] = float(cart_step_max.split()[-2]) self.geotargets[4] = float(cart_step_rms.split()[-2]) if not hasattr(self, "geovalues"): self.geovalues = [] self.geovalues.append([]) self.geovalues[-1].append(float(energy_change.split()[-3])) self.geovalues[-1].append(float(constrained_gradient_max.split()[-3])) self.geovalues[-1].append(float(constrained_gradient_rms.split()[-3])) self.geovalues[-1].append(float(cart_step_max.split()[-3])) self.geovalues[-1].append(float(cart_step_rms.split()[-3])) if line.find('Orbital Energies, per Irrep and Spin') > 0 and not hasattr(self, "mosyms") and self.nosymflag and not self.unrestrictedflag: #Extracting orbital symmetries and energies, homos for nosym case #Should only be for 
restricted case because there is a better text block for unrestricted and nosym self.mosyms = [[]] self.moenergies = [[]] self.skip_lines(inputfile, ['e', 'header', 'e', 'label']) line = next(inputfile) info = line.split() if not info[0] == '1': self.logger.warning("MO info up to #%s is missing" % info[0]) #handle case where MO information up to a certain orbital are missing while int(info[0]) - 1 != len(self.moenergies[0]): self.moenergies[0].append(99999) self.mosyms[0].append('A') homoA = None while len(line) > 10: info = line.split() self.mosyms[0].append('A') self.moenergies[0].append(utils.convertor(float(info[2]), 'hartree', 'eV')) if info[1] == '0.000' and not hasattr(self, 'homos'): self.set_attribute('homos', [len(self.moenergies[0]) - 2]) line = next(inputfile) self.moenergies = [numpy.array(self.moenergies[0], "d")] if line[1:29] == 'Orbital Energies, both Spins' and not hasattr(self, "mosyms") and self.nosymflag and self.unrestrictedflag: #Extracting orbital symmetries and energies, homos for nosym case #should only be here if unrestricted and nosym self.mosyms = [[], []] moenergies = [[], []] self.skip_lines(inputfile, ['d', 'b', 'header', 'd']) homoa = 0 homob = None line = next(inputfile) while len(line) > 5: info = line.split() if info[2] == 'A': self.mosyms[0].append('A') moenergies[0].append(utils.convertor(float(info[4]), 'hartree', 'eV')) if info[3] != '0.00': homoa = len(moenergies[0]) - 1 elif info[2] == 'B': self.mosyms[1].append('A') moenergies[1].append(utils.convertor(float(info[4]), 'hartree', 'eV')) if info[3] != '0.00': homob = len(moenergies[1]) - 1 else: print(("Error reading line: %s" % line)) line = next(inputfile) self.moenergies = [numpy.array(x, "d") for x in moenergies] self.set_attribute('homos', [homoa, homob]) # Extracting orbital symmetries and energies, homos. 
if line[1:29] == 'Orbital Energies, all Irreps' and not hasattr(self, "mosyms"): self.symlist = {} self.mosyms = [[]] self.moenergies = [[]] self.skip_lines(inputfile, ['e', 'b', 'header', 'd']) homoa = None homob = None #multiple = {'E':2, 'T':3, 'P':3, 'D':5} # The above is set if there are no special irreps names = [irrep[0].split(':')[0] for irrep in self.irreps] counts = [len(irrep) for irrep in self.irreps] multiple = dict(list(zip(names, counts))) irrepspecies = {} for n in range(len(names)): indices = list(range(counts[n])) subspecies = self.irreps[n] irrepspecies[names[n]] = dict(list(zip(indices, subspecies))) line = next(inputfile) while line.strip(): info = line.split() if len(info) == 5: #this is restricted #count = multiple.get(info[0][0],1) count = multiple.get(info[0], 1) for repeat in range(count): # i.e. add E's twice, T's thrice self.mosyms[0].append(self.normalisesym(info[0])) self.moenergies[0].append(utils.convertor(float(info[3]), 'hartree', 'eV')) sym = info[0] if count > 1: # add additional sym label sym = self.normalisedegenerates(info[0], repeat, ndict=irrepspecies) try: self.symlist[sym][0].append(len(self.moenergies[0])-1) except KeyError: self.symlist[sym] = [[]] self.symlist[sym][0].append(len(self.moenergies[0])-1) if info[2] == '0.00' and not hasattr(self, 'homos'): self.homos = [len(self.moenergies[0]) - (count + 1)] #count, because need to handle degenerate cases line = next(inputfile) elif len(info) == 6: #this is unrestricted if len(self.moenergies) < 2: #if we don't have space, create it self.moenergies.append([]) self.mosyms.append([]) # count = multiple.get(info[0][0], 1) count = multiple.get(info[0], 1) if info[2] == 'A': for repeat in range(count): # i.e. 
add E's twice, T's thrice self.mosyms[0].append(self.normalisesym(info[0])) self.moenergies[0].append(utils.convertor(float(info[4]), 'hartree', 'eV')) sym = info[0] if count > 1: #add additional sym label sym = self.normalisedegenerates(info[0], repeat) try: self.symlist[sym][0].append(len(self.moenergies[0])-1) except KeyError: self.symlist[sym] = [[], []] self.symlist[sym][0].append(len(self.moenergies[0])-1) if info[3] == '0.00' and homoa == None: homoa = len(self.moenergies[0]) - (count + 1) #count because degenerate cases need to be handled if info[2] == 'B': for repeat in range(count): # i.e. add E's twice, T's thrice self.mosyms[1].append(self.normalisesym(info[0])) self.moenergies[1].append(utils.convertor(float(info[4]), 'hartree', 'eV')) sym = info[0] if count > 1: #add additional sym label sym = self.normalisedegenerates(info[0], repeat) try: self.symlist[sym][1].append(len(self.moenergies[1])-1) except KeyError: self.symlist[sym] = [[], []] self.symlist[sym][1].append(len(self.moenergies[1])-1) if info[3] == '0.00' and homob == None: homob = len(self.moenergies[1]) - (count + 1) line = next(inputfile) else: #different number of lines print(("Error", info)) if len(info) == 6: #still unrestricted, despite being out of loop self.set_attribute('homos', [homoa, homob]) self.moenergies = [numpy.array(x, "d") for x in self.moenergies] # Section on extracting vibdisps # Also contains vibfreqs, but these are extracted in the # following section (see below) if line[1:28] == "Vibrations and Normal Modes": self.vibdisps = [] self.skip_lines(inputfile, ['e', 'b', 'header', 'header', 'b', 'b']) freqs = next(inputfile) while freqs.strip()!="": minus = next(inputfile) p = [ [], [], [] ] for i in range(len(self.atomnos)): broken = list(map(float, next(inputfile).split()[1:])) for j in range(0, len(broken), 3): p[j//3].append(broken[j:j+3]) self.vibdisps.extend(p[:(len(broken)//3)]) self.skip_lines(inputfile, ['b', 'b']) freqs = next(inputfile) self.vibdisps = 
numpy.array(self.vibdisps, "d") if line[1:24] == "List of All Frequencies": # Start of the IR/Raman frequency section self.updateprogress(inputfile, "Frequency information", self.fupdate) # self.vibsyms = [] # Need to look into this a bit more self.vibirs = [] self.vibfreqs = [] for i in range(8): line = next(inputfile) line = next(inputfile).strip() while line: temp = line.split() self.vibfreqs.append(float(temp[0])) self.vibirs.append(float(temp[2])) # or is it temp[1]? line = next(inputfile).strip() self.vibfreqs = numpy.array(self.vibfreqs, "d") self.vibirs = numpy.array(self.vibirs, "d") if hasattr(self, "vibramans"): self.vibramans = numpy.array(self.vibramans, "d") #******************************************************************************************************************8 #delete this after new implementation using smat, eigvec print,eprint? # Extract the number of basis sets if line[1:49] == "Total nr. of (C)SFOs (summation over all irreps)": nbasis = int(line.split(":")[1].split()[0]) self.set_attribute('nbasis', nbasis) # now that we're here, let's extract aonames self.fonames = [] self.start_indeces = {} self.skip_line(inputfile, 'blank') note = next(inputfile) symoffset = 0 self.skip_line(inputfile, 'blank') line = next(inputfile) if len(line) > 2: #fix for ADF2006.01 as it has another note self.skip_line(inputfile, 'blank') line = next(inputfile) self.skip_line(inputfile, 'blank') self.nosymreps = [] while len(self.fonames) < self.nbasis: symline = next(inputfile) sym = symline.split()[1] line = next(inputfile) num = int(line.split(':')[1].split()[0]) self.nosymreps.append(num) #read until line "--------..." 
is found while line.find('-----') < 0: line = next(inputfile) line = next(inputfile) # the start of the first SFO while len(self.fonames) < symoffset + num: info = line.split() #index0 index1 occ2 energy3/4 fragname5 coeff6 orbnum7 orbname8 fragname9 if not sym in list(self.start_indeces.keys()): #have we already set the start index for this symmetry? self.start_indeces[sym] = int(info[1]) orbname = info[8] orbital = info[7] + orbname.replace(":", "") fragname = info[5] frag = fragname + info[9] coeff = float(info[6]) line = next(inputfile) while line.strip() and not line[:7].strip(): # while it's the same SFO # i.e. while not completely blank, but blank at the start info = line[43:].split() if len(info)>0: # len(info)==0 for the second line of dvb_ir.adfout frag += "+" + fragname + info[-1] coeff = float(info[-4]) if coeff < 0: orbital += '-' + info[-3] + info[-2].replace(":", "") else: orbital += '+' + info[-3] + info[-2].replace(":", "") line = next(inputfile) # At this point, we are either at the start of the next SFO or at # a blank line...the end self.fonames.append("%s_%s" % (frag, orbital)) symoffset += num # blankline blankline next(inputfile); next(inputfile) if line[1:32] == "S F O P O P U L A T I O N S ,": #Extract overlap matrix # self.fooverlaps = numpy.zeros((self.nbasis, self.nbasis), "d") symoffset = 0 for nosymrep in self.nosymreps: line = next(inputfile) while line.find('===') < 10: #look for the symmetry labels line = next(inputfile) self.skip_lines(inputfile, ['b', 'b']) text = next(inputfile) if text[13:20] != "Overlap": # verify this has overlap info break col = next(inputfile) row = next(inputfile) if not hasattr(self,"fooverlaps"): # make sure there is a matrix to store this self.fooverlaps = numpy.zeros((self.nbasis, self.nbasis), "d") base = 0 while base < nosymrep: #have we read all the columns? 
for i in range(nosymrep - base): self.updateprogress(inputfile, "Overlap", self.fupdate) line = next(inputfile) parts = line.split()[1:] for j in range(len(parts)): k = float(parts[j]) self.fooverlaps[base + symoffset + j, base + symoffset +i] = k self.fooverlaps[base + symoffset + i, base + symoffset + j] = k #blank, blank, column for i in range(3): next(inputfile) base += 4 symoffset += nosymrep base = 0 # The commented code below makes the atombasis attribute based on the BAS function in ADF, # but this is probably not so useful, since SFOs are used to build MOs in ADF. # if line[1:54] == "BAS: List of all Elementary Cartesian Basis Functions": # # self.atombasis = [] # # # There will be some text, followed by a line: # # (power of) X Y Z R Alpha on Atom # while not line[1:11] == "(power of)": # line = inputfile.next() # dashes = inputfile.next() # blank = inputfile.next() # line = inputfile.next() # # There will be two blank lines when there are no more atom types. # while line.strip() != "": # atoms = [int(i)-1 for i in line.split()[1:]] # for n in range(len(atoms)): # self.atombasis.append([]) # dashes = inputfile.next() # line = inputfile.next() # while line.strip() != "": # indices = [int(i)-1 for i in line.split()[5:]] # for i in range(len(indices)): # self.atombasis[atoms[i]].append(indices[i]) # line = inputfile.next() # line = inputfile.next() if line[48:67] == "SFO MO coefficients": self.mocoeffs = [numpy.zeros((self.nbasis, self.nbasis), "d")] spin = 0 symoffset = 0 lastrow = 0 # Section ends with "1" at beggining of a line. while line[0] != "1": line = next(inputfile) # If spin is specified, then there will be two coefficient matrices. if line.strip() == "***** SPIN 1 *****": self.mocoeffs = [numpy.zeros((self.nbasis, self.nbasis), "d"), numpy.zeros((self.nbasis, self.nbasis), "d")] # Bump up the spin. if line.strip() == "***** SPIN 2 *****": spin = 1 symoffset = 0 lastrow = 0 # Next symmetry. 
if line.strip()[:4] == "=== ": sym = line.split()[1] if self.nosymflag: aolist = list(range(self.nbasis)) else: aolist = self.symlist[sym][spin] # Add to the symmetry offset of AO ordering. symoffset += lastrow # Blocks with coefficient always start with "MOs :". if line[1:6] == "MOs :": # Next line has the MO index contributed to. monumbers = [int(n) for n in line[6:].split()] self.skip_lines(inputfile, ['occup', 'label']) # The table can end with a blank line or "1". row = 0 line = next(inputfile) while not line.strip() in ["", "1"]: info = line.split() if int(info[0]) < self.start_indeces[sym]: #check to make sure we aren't parsing CFs line = next(inputfile) continue self.updateprogress(inputfile, "Coefficients", self.fupdate) row += 1 coeffs = [float(x) for x in info[1:]] moindices = [aolist[n-1] for n in monumbers] # The AO index is 1 less than the row. aoindex = symoffset + row - 1 for i in range(len(monumbers)): self.mocoeffs[spin][moindices[i], aoindex] = coeffs[i] line = next(inputfile) lastrow = row # ************************************************************************** # * * # * Final excitation energies from Davidson algorithm * # * * # ************************************************************************** # # Number of loops in Davidson routine = 20 # Number of matrix-vector multiplications = 24 # Type of excitations = SINGLET-SINGLET # # Symmetry B.u # # ... several blocks ... # # Normal termination of EXCITATION program part if line[4:53] == "Final excitation energies from Davidson algorithm": while line[1:9] != "Symmetry" and "Normal termination" not in line: line = next(inputfile) symm = self.normalisesym(line.split()[1]) # Excitation energies E in a.u. and eV, dE wrt prev. cycle, # oscillator strengths f in a.u. # # no. E/a.u. E/eV f dE/a.u. # ----------------------------------------------------- # 1 0.17084 4.6488 0.16526E-01 0.28E-08 # ... 
            while line.split() != ['no.', 'E/a.u.', 'E/eV', 'f', 'dE/a.u.'] and "Normal termination" not in line:
                line = next(inputfile)

            self.skip_line(inputfile, 'dashes')

            etenergies = []
            etoscs = []
            etsyms = []
            line = next(inputfile)
            while len(line) > 2:
                info = line.split()
                etenergies.append(utils.convertor(float(info[2]), "eV", "cm-1"))
                etoscs.append(float(info[3]))
                etsyms.append(symm)
                line = next(inputfile)

            # There is another section before this, with transition dipole moments,
            # but this should just skip past it.
            while line[1:53] != "Major MO -> MO transitions for the above excitations":
                line = next(inputfile)

            # Note that here, and later, the number of blank lines can vary between
            # versions of ADF (extra lines are seen in 2013.01 unit tests, for example).
            self.skip_line(inputfile, 'blank')
            excitation_occupied = next(inputfile)
            header = next(inputfile)
            while not header.strip():
                header = next(inputfile)
            header2 = next(inputfile)
            x_y_z = next(inputfile)
            line = next(inputfile)
            while not line.strip():
                line = next(inputfile)

            # Before we start handling transitions, we need to create mosyms
            # with indices; only restricted calcs are possible in ADF.
counts = {} syms = [] for mosym in self.mosyms[0]: if list(counts.keys()).count(mosym) == 0: counts[mosym] = 1 else: counts[mosym] += 1 syms.append(str(counts[mosym]) + mosym) etsecs = [] printed_warning = False for i in range(len(etenergies)): etsec = [] info = line.split() while len(info) > 0: match = re.search('[^0-9]', info[1]) index1 = int(info[1][:match.start(0)]) text = info[1][match.start(0):] symtext = text[0].upper() + text[1:] sym1 = str(index1) + self.normalisesym(symtext) match = re.search('[^0-9]', info[3]) index2 = int(info[3][:match.start(0)]) text = info[3][match.start(0):] symtext = text[0].upper() + text[1:] sym2 = str(index2) + self.normalisesym(symtext) try: index1 = syms.index(sym1) except ValueError: if not printed_warning: self.logger.warning("Etsecs are not accurate!") printed_warning = True try: index2 = syms.index(sym2) except ValueError: if not printed_warning: self.logger.warning("Etsecs are not accurate!") printed_warning = True etsec.append([(index1, 0), (index2, 0), float(info[4])]) line = next(inputfile) info = line.split() etsecs.append(etsec) # Again, the number of blank lines between transition can vary. 
line = next(inputfile) while not line.strip(): line = next(inputfile) if not hasattr(self, "etenergies"): self.etenergies = etenergies else: self.etenergies += etenergies if not hasattr(self, "etoscs"): self.etoscs = etoscs else: self.etoscs += etoscs if not hasattr(self, "etsyms"): self.etsyms = etsyms else: self.etsyms += etsyms if not hasattr(self, "etsecs"): self.etsecs = etsecs else: self.etsecs += etsecs if "M U L L I K E N P O P U L A T I O N S" in line: if not hasattr(self, "atomcharges"): self.atomcharges = {} while line[1:5] != "Atom": line = next(inputfile) self.skip_line(inputfile, 'dashes') mulliken = [] line = next(inputfile) while line.strip(): mulliken.append(float(line.split()[2])) line = next(inputfile) self.atomcharges["mulliken"] = mulliken # Dipole moment is always printed after a point calculation, # and the reference point for this is always the origin (0,0,0) # and not necessarily the center of mass, as explained on the # ADF user mailing list (see cclib/cclib#113 for details). # # ============= # Dipole Moment *** (Debye) *** # ============= # # Vector : 0.00000000 0.00000000 0.00000000 # Magnitude: 0.00000000 # if line.strip()[:13] == "Dipole Moment": self.skip_line(inputfile, 'equals') # There is not always a blank line here, for example when the dipole and quadrupole # moments are printed after the multipole derived atomic charges. Still, to the best # of my knowledge (KML) the values are still in Debye. 
            line = next(inputfile)
            if not line.strip():
                line = next(inputfile)

            assert line.split()[0] == "Vector"
            dipole = [float(d) for d in line.split()[-3:]]

            reference = [0.0, 0.0, 0.0]
            if not hasattr(self, 'moments'):
                self.moments = [reference, dipole]
            else:
                try:
                    assert self.moments[1] == dipole
                except AssertionError:
                    self.logger.warning('Overwriting previous multipole moments with new values')
                    self.moments = [reference, dipole]


if __name__ == "__main__":
    import doctest, adfparser
    doctest.testmod(adfparser, verbose=False)
cclib-1.3.1/src/cclib/parser/ccopen.py
# -*- coding: utf-8 -*-
#
# This file is part of cclib (http://cclib.github.io), a library for parsing
# and interpreting the results of computational chemistry packages.
#
# Copyright (C) 2009-2014, the cclib development team
#
# The library is free software, distributed under the terms of
# the GNU Lesser General Public License version 2.1 or later. You should have
# received a copy of the license along with cclib. You can also access
# the full license online at http://www.gnu.org/copyleft/lgpl.html.

"""Tools for identifying and working with files and streams for any supported program"""


from __future__ import print_function

import os
import sys

from . import data

from . import logfileparser

from .adfparser import ADF
from .gamessparser import GAMESS
from .gamessukparser import GAMESSUK
from .gaussianparser import Gaussian
from .jaguarparser import Jaguar
from .molproparser import Molpro
from .nwchemparser import NWChem
from .orcaparser import ORCA
from .psiparser import Psi
from .qchemparser import QChem

try:
    from ..bridge import cclib2openbabel
except ImportError:
    print("Could not import openbabel, fallback mechanism might not work.")


# Parser choice is triggered by certain phrases occurring in the logfile. Where these
# strings are unique, we can set the parser and break. In other cases, the situation
# is a little bit more complicated.
# Here are the exceptions:
#   1. The GAMESS trigger also works for GAMESS-UK files, so we can't break
#      after finding GAMESS in case the more specific phrase is found.
#   2. Molpro log files don't have the program header, but always contain
#      the generic string 1PROGRAM, so don't break here either to be cautious.
#   3. The Psi header has two different strings with some variation.
#
# The triggers are defined by the tuples in the list below like so:
#   (parser, phrases, flag whether we should break)
triggers = [
    (ADF, ["Amsterdam Density Functional"], True),
    (GAMESS, ["GAMESS"], False),
    (GAMESS, ["GAMESS VERSION"], True),
    (GAMESSUK, ["G A M E S S - U K"], True),
    (Gaussian, ["Gaussian, Inc."], True),
    (Jaguar, ["Jaguar"], True),
    (Molpro, ["PROGRAM SYSTEM MOLPRO"], True),
    (Molpro, ["1PROGRAM"], False),
    (NWChem, ["Northwest Computational Chemistry Package"], True),
    (ORCA, ["O R C A"], True),
    (Psi, ["PSI", "Ab Initio Electronic Structure"], True),
    (QChem, ["A Quantum Leap Into The Future Of Chemistry"], True),
]


def guess_filetype(inputfile):
    """Try to guess the filetype by searching for trigger strings."""

    filetype = None
    for line in inputfile:
        for parser, phrases, do_break in triggers:
            if all([line.find(p) >= 0 for p in phrases]):
                filetype = parser
                if do_break:
                    return filetype
    return filetype


def ccread(source, *args, **kargs):
    """Attempt to open and read computational chemistry data from a file.

    If the file is not appropriate for cclib parsers, a fallback mechanism
    will try to recognize some common chemistry formats and read those using
    the appropriate bridge such as OpenBabel.

    Inputs:
        source - a single logfile, a list of logfiles, or an input stream

    Returns:
        a ccData object containing cclib data attributes
    """

    log = ccopen(source, *args, **kargs)
    # Use .get() so that calling ccread without a 'verbose' keyword
    # does not raise a KeyError.
    if log:
        if kargs.get('verbose', False):
            print('Identified logfile to be in %s format' % log.logname)
        return log.parse()
    else:
        if kargs.get('verbose', False):
            print('Attempting to use fallback mechanism to read file')
        return fallback(source)


def ccopen(source, *args, **kargs):
    """Guess the identity of a particular log file and return an instance of it.

    Inputs:
        source - a single logfile, a list of logfiles, or an input stream

    Returns:
        one of ADF, GAMESS, GAMESS UK, Gaussian, Jaguar, Molpro, NWChem, ORCA,
        Psi, QChem, or None (if it cannot figure it out or the file does not
        exist).
    """

    # Try to open the logfile(s), using openlogfile.
    if isinstance(source, str) or \
       isinstance(source, list) and all([isinstance(s, str) for s in source]):
        try:
            inputfile = logfileparser.openlogfile(source)
        except IOError as error:
            (errno, strerror) = error.args
            print("I/O error %s (%s): %s" % (errno, source, strerror))
            return None
        isstream = False
    elif hasattr(source, "read"):
        inputfile = source
        isstream = True
    else:
        raise ValueError

    # Try to guess the filetype.
    filetype = guess_filetype(inputfile)

    # Need to close the file before creating an instance.
    if not isstream:
        inputfile.close()

    # Return an instance of the logfile if one was chosen.
    if filetype:
        return filetype(source, *args, **kargs)


def fallback(source):
    """Attempt to read standard molecular formats using other libraries.

    Currently this will read XYZ files with OpenBabel, but this can easily
    be extended to other formats and libraries, too.
""" if isinstance(source, str): ext = os.path.splitext(source)[1][1:].lower() if 'cclib.bridge.cclib2openbabel' in sys.modules and ext in ('xyz', ): return cclib2openbabel.readfile(source, ext) cclib-1.3.1/src/cclib/parser/qchemparser.py0000644000175100016050000013113612467423323020507 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Parser for Q-Chem output files""" from __future__ import print_function import re import numpy from . import logfileparser from . import utils class QChem(logfileparser.Logfile): """A Q-Chem 4 log file.""" def __init__(self, *args, **kwargs): # Call the __init__ method of the superclass super(QChem, self).__init__(logname="QChem", *args, **kwargs) def __str__(self): """Return a string representation of the object.""" return "QChem log file %s" % (self.filename) def __repr__(self): """Return a representation of the object.""" return 'QChem("%s")' % (self.filename) def normalisesym(self, label): """Q-Chem does not require normalizing symmetry labels.""" def before_parsing(self): # Keep track of whether or not we're performing an # (un)restricted calculation. self.unrestricted = False # Compile the dashes-and-or-spaces-only regex. self.dashes_and_spaces = re.compile('^[\s-]+$') def after_parsing(self): # If parsing a fragment job, each of the geometries appended to # `atomcoords` may be of different lengths, which will prevent # conversion from a list to NumPy array. 
        # Take the length of the first geometry as correct, and remove
        # all others with different lengths.
        if len(self.atomcoords) > 1:
            correctlen = len(self.atomcoords[0])
            self.atomcoords[:] = [coords for coords in self.atomcoords if len(coords) == correctlen]
        # At the moment, there is no similar correction for other properties!

    def extract(self, inputfile, line):
        """Extract information from the file object inputfile."""

        # Charge and multiplicity are present in the input file, which is generally
        # printed once at the beginning. However, it is also printed for fragment
        # calculations, so make sure we parse only the first occurrence.
        if '$molecule' in line:
            line = next(inputfile)
            charge, mult = map(int, line.split())
            if not hasattr(self, 'charge'):
                self.set_attribute('charge', charge)
            if not hasattr(self, 'mult'):
                self.set_attribute('mult', mult)

        # Extract the atomic numbers and coordinates of the atoms.
        if 'Standard Nuclear Orientation (Angstroms)' in line:
            if not hasattr(self, 'atomcoords'):
                self.atomcoords = []
            self.skip_lines(inputfile, ['cols', 'dashes'])
            atomelements = []
            atomcoords = []
            line = next(inputfile)
            while list(set(line.strip())) != ['-']:
                entry = line.split()
                atomelements.append(entry[1])
                atomcoords.append(list(map(float, entry[2:])))
                line = next(inputfile)

            self.atomcoords.append(atomcoords)

            if not hasattr(self, 'atomnos'):
                self.atomnos = []
                for atomelement in atomelements:
                    if atomelement == 'GH':
                        self.atomnos.append(0)
                    else:
                        self.atomnos.append(utils.PeriodicTable().number[atomelement])
                self.natom = len(self.atomnos)

        # Number of electrons.
        # Useful for determining the number of occupied/virtual orbitals.
        if 'Nuclear Repulsion Energy' in line:
            if not hasattr(self, 'nalpha'):
                line = next(inputfile)
                # Raw string, so that \s is passed to the regex engine untouched.
                nelec_re_string = r'There are(\s+[0-9]+) alpha and(\s+[0-9]+) beta electrons'
                match = re.findall(nelec_re_string, line.strip())
                self.nalpha = int(match[0][0].strip())
                self.nbeta = int(match[0][1].strip())

        # Number of basis functions.
        # Because Q-Chem's integral recursion scheme is defined using
        # Cartesian basis functions, there is often a distinction between the
        # two in the output. We only parse for *pure* functions.
        # Examples:
        #  Only one type:
        #   There are 30 shells and 60 basis functions
        #  Both Cartesian and pure:
        #   ...
        if 'basis functions' in line:
            if not hasattr(self, 'nbasis'):
                self.set_attribute('nbasis', int(line.split()[-3]))

        # Check whether or not we're performing an
        # (un)restricted calculation.
        if 'calculation will be' in line:
            if ' restricted' in line:
                self.unrestricted = False
            if 'unrestricted' in line:
                self.unrestricted = True

        # Section with SCF iterations goes like this:
        #
        # SCF converges when DIIS error is below 1.0E-05
        # ---------------------------------------
        # Cycle Energy DIIS Error
        # ---------------------------------------
        # 1 -381.9238072190 1.39E-01
        # 2 -382.2937212775 3.10E-03
        # 3 -382.2939780242 3.37E-03
        # ...
        #
        if 'SCF converges when ' in line:
            if not hasattr(self, 'scftargets'):
                self.scftargets = []
            target = float(line.split()[-1])
            self.scftargets.append([target])

            # We should have the header between dashes now,
            # but sometimes there are lines before the first dashes.
            while 'Cycle Energy' not in line:
                line = next(inputfile)
            self.skip_lines(inputfile, ['d'])

            values = []
            iter_counter = 1
            line = next(inputfile)
            while 'energy in the final basis set' not in line:

                # Some trickery to avoid a lot of printing that can occur
                # between each SCF iteration.
                entry = line.split()
                if len(entry) > 0:
                    if entry[0] == str(iter_counter):
                        # Q-Chem only outputs one error metric.
                        error = float(entry[2])
                        values.append([error])
                        iter_counter += 1
                line = next(inputfile)

                # This is printed in regression Qchem4.2/dvb_sp_unconverged.out
                # so use it to bail out when convergence fails.
if "SCF failed to converge" in line or "Convergence failure" in line: break if not hasattr(self, 'scfvalues'): self.scfvalues = [] self.scfvalues.append(numpy.array(values)) if 'Total energy in the final basis set' in line: if not hasattr(self, 'scfenergies'): self.scfenergies = [] scfenergy = float(line.split()[-1]) self.scfenergies.append(utils.convertor(scfenergy, 'hartree', 'eV')) # Geometry optimization. if 'Maximum Tolerance Cnvgd?' in line: line_g = list(map(float, next(inputfile).split()[1:3])) line_d = list(map(float, next(inputfile).split()[1:3])) line_e = next(inputfile).split()[2:4] if not hasattr(self, 'geotargets'): self.geotargets = [line_g[1], line_d[1], self.float(line_e[1])] if not hasattr(self, 'geovalues'): self.geovalues = [] try: ediff = abs(self.float(line_e[0])) except ValueError: ediff = numpy.nan geovalues = [line_g[0], line_d[0], ediff] self.geovalues.append(geovalues) if '** OPTIMIZATION CONVERGED **' in line: if not hasattr(self, 'optdone'): self.optdone = [] self.optdone.append(len(self.atomcoords)) if '** MAXIMUM OPTIMIZATION CYCLES REACHED **' in line: if not hasattr(self, 'optdone'): self.optdone = [] # Moller-Plesset corrections. # There are multiple modules in Q-Chem for calculating MPn energies: # cdman, ccman, and ccman2, all with different output. # # MP2, RI-MP2, and local MP2 all default to cdman, which has a simple # block of output after the regular SCF iterations. # # MP3 is handled by ccman2. # # MP4 and variants are handled by ccman. # This is the MP2/cdman case. if 'MP2 total energy' in line: if not hasattr(self, 'mpenergies'): self.mpenergies = [] mp2energy = float(line.split()[4]) mp2energy = utils.convertor(mp2energy, 'hartree', 'eV') self.mpenergies.append([mp2energy]) # This is the MP3/ccman2 case. 
if line[1:11] == 'MP2 energy' and line[12:19] != 'read as': if not hasattr(self, 'mpenergies'): self.mpenergies = [] mpenergies = [] mp2energy = float(line.split()[3]) mpenergies.append(mp2energy) line = next(inputfile) line = next(inputfile) # Just a safe check. if 'MP3 energy' in line: mp3energy = float(line.split()[3]) mpenergies.append(mp3energy) mpenergies = [utils.convertor(mpe, 'hartree', 'eV') for mpe in mpenergies] self.mpenergies.append(mpenergies) # This is the MP4/ccman case. if 'EHF' in line: if not hasattr(self, 'mpenergies'): self.mpenergies = [] mpenergies = [] while list(set(line.strip())) != ['-']: if 'EMP2' in line: mp2energy = float(line.split()[2]) mpenergies.append(mp2energy) if 'EMP3' in line: mp3energy = float(line.split()[2]) mpenergies.append(mp3energy) if 'EMP4SDQ' in line: mp4sdqenergy = float(line.split()[2]) mpenergies.append(mp4sdqenergy) # This is really MP4SD(T)Q. if 'EMP4 ' in line: mp4sdtqenergy = float(line.split()[2]) mpenergies.append(mp4sdtqenergy) line = next(inputfile) mpenergies = [utils.convertor(mpe, 'hartree', 'eV') for mpe in mpenergies] self.mpenergies.append(mpenergies) # Coupled cluster corrections. # Hopefully we only have to deal with ccman2 here. if 'CCD total energy' in line: if not hasattr(self, 'ccenergies'): self.ccenergies = [] ccdenergy = float(line.split()[-1]) ccdenergy = utils.convertor(ccdenergy, 'hartree', 'eV') self.ccenergies.append(ccdenergy) if 'CCSD total energy' in line: has_triples = False if not hasattr(self, 'ccenergies'): self.ccenergies = [] ccsdenergy = float(line.split()[-1]) # Make sure we aren't actually doing CCSD(T). line = next(inputfile) line = next(inputfile) if 'CCSD(T) total energy' in line: has_triples = True ccsdtenergy = float(line.split()[-1]) ccsdtenergy = utils.convertor(ccsdtenergy, 'hartree', 'eV') self.ccenergies.append(ccsdtenergy) if not has_triples: ccsdenergy = utils.convertor(ccsdenergy, 'hartree', 'eV') self.ccenergies.append(ccsdenergy) # Electronic transitions. 
Works for both CIS and TDDFT. if 'Excitation Energies' in line: # Restricted: # --------------------------------------------------- # TDDFT/TDA Excitation Energies # --------------------------------------------------- # # Excited state 1: excitation energy (eV) = 3.6052 # Total energy for state 1: -382.167872200685 # Multiplicity: Triplet # Trans. Mom.: 0.0000 X 0.0000 Y 0.0000 Z # Strength : 0.0000 # D( 33) --> V( 3) amplitude = 0.2618 # D( 34) --> V( 2) amplitude = 0.2125 # D( 35) --> V( 1) amplitude = 0.9266 # # Unrestricted: # Excited state 2: excitation energy (eV) = 2.3156 # Total energy for state 2: -381.980177630969 # : 0.7674 # Trans. Mom.: -2.7680 X -0.1089 Y 0.0000 Z # Strength : 0.4353 # S( 1) --> V( 1) amplitude = -0.3105 alpha # D( 34) --> S( 1) amplitude = 0.9322 beta self.skip_lines(inputfile, ['dashes', 'blank']) line = next(inputfile) etenergies = [] etsyms = [] etoscs = [] etsecs = [] spinmap = {'alpha': 0, 'beta': 1} while list(set(line.strip())) != ['-']: # Take the total energy for the state and subtract from the # ground state energy, rather than just the EE; # this will be more accurate. if 'Total energy for state' in line: energy = utils.convertor(float(line.split()[-1]), 'hartree', 'cm-1') etenergy = energy - utils.convertor(self.scfenergies[-1], 'eV', 'cm-1') etenergies.append(etenergy) # if 'excitation energy' in line: # etenergy = utils.convertor(float(line.split()[-1]), 'eV', 'cm-1') # etenergies.append(etenergy) if 'Multiplicity' in line: etsym = line.split()[1] etsyms.append(etsym) if 'Strength' in line: strength = float(line.split()[-1]) etoscs.append(strength) # This is the list of transitions. if 'amplitude' in line: sec = [] while line.strip() != '': if self.unrestricted: spin = spinmap[line[42:47].strip()] else: spin = 0 # There is a subtle difference between TDA and RPA calcs, # because in the latter case each transition line is # preceeded by the type of vector: X or Y, name excitation # or deexcitation (see #154 for details). 
For deexcitations, # we will need to reverse the MO indices. Note also that Q-Chem # starts reindexing virtual orbitals at 1. if line[5] == '(': ttype = 'X' startidx = int(line[6:9]) - 1 endidx = int(line[17:20]) - 1 + self.nalpha contrib = float(line[34:41].strip()) else: assert line[5] == ":" ttype = line[4] startidx = int(line[9:12]) - 1 endidx = int(line[20:23]) - 1 + self.nalpha contrib = float(line[37:44].strip()) start = (startidx, spin) end = (endidx, spin) if ttype == 'X': sec.append([start, end, contrib]) elif ttype == 'Y': sec.append([end, start, contrib]) else: raise ValueError('Unknown transition type: %s' % ttype) line = next(inputfile) etsecs.append(sec) line = next(inputfile) self.set_attribute('etenergies', etenergies) self.set_attribute('etsyms', etsyms) self.set_attribute('etoscs', etoscs) self.set_attribute('etsecs', etsecs) # Molecular orbital energies and symmetries. if 'Orbital Energies (a.u.) and Symmetries' in line: # -------------------------------------------------------------- # Orbital Energies (a.u.) 
and Symmetries # -------------------------------------------------------------- # # Alpha MOs, Restricted # -- Occupied -- # -10.018 -10.018 -10.008 -10.008 -10.007 -10.007 -10.006 -10.005 # 1 Bu 1 Ag 2 Bu 2 Ag 3 Bu 3 Ag 4 Bu 4 Ag # -9.992 -9.992 -0.818 -0.755 -0.721 -0.704 -0.670 -0.585 # 5 Ag 5 Bu 6 Ag 6 Bu 7 Ag 7 Bu 8 Bu 8 Ag # -0.561 -0.532 -0.512 -0.462 -0.439 -0.410 -0.400 -0.397 # 9 Ag 9 Bu 10 Ag 11 Ag 10 Bu 11 Bu 12 Bu 12 Ag # -0.376 -0.358 -0.349 -0.330 -0.305 -0.295 -0.281 -0.263 # 13 Bu 14 Bu 13 Ag 1 Au 15 Bu 14 Ag 15 Ag 1 Bg # -0.216 -0.198 -0.160 # 2 Au 2 Bg 3 Bg # -- Virtual -- # 0.050 0.091 0.116 0.181 0.280 0.319 0.330 0.365 # 3 Au 4 Au 4 Bg 5 Au 5 Bg 16 Ag 16 Bu 17 Bu # 0.370 0.413 0.416 0.422 0.446 0.469 0.496 0.539 # 17 Ag 18 Bu 18 Ag 19 Bu 19 Ag 20 Bu 20 Ag 21 Ag # 0.571 0.587 0.610 0.627 0.646 0.693 0.743 0.806 # 21 Bu 22 Ag 22 Bu 23 Bu 23 Ag 24 Ag 24 Bu 25 Ag # 0.816 # 25 Bu # # Beta MOs, Restricted # -- Occupied -- # -10.018 -10.018 -10.008 -10.008 -10.007 -10.007 -10.006 -10.005 # 1 Bu 1 Ag 2 Bu 2 Ag 3 Bu 3 Ag 4 Bu 4 Ag # -9.992 -9.992 -0.818 -0.755 -0.721 -0.704 -0.670 -0.585 # 5 Ag 5 Bu 6 Ag 6 Bu 7 Ag 7 Bu 8 Bu 8 Ag # -0.561 -0.532 -0.512 -0.462 -0.439 -0.410 -0.400 -0.397 # 9 Ag 9 Bu 10 Ag 11 Ag 10 Bu 11 Bu 12 Bu 12 Ag # -0.376 -0.358 -0.349 -0.330 -0.305 -0.295 -0.281 -0.263 # 13 Bu 14 Bu 13 Ag 1 Au 15 Bu 14 Ag 15 Ag 1 Bg # -0.216 -0.198 -0.160 # 2 Au 2 Bg 3 Bg # -- Virtual -- # 0.050 0.091 0.116 0.181 0.280 0.319 0.330 0.365 # 3 Au 4 Au 4 Bg 5 Au 5 Bg 16 Ag 16 Bu 17 Bu # 0.370 0.413 0.416 0.422 0.446 0.469 0.496 0.539 # 17 Ag 18 Bu 18 Ag 19 Bu 19 Ag 20 Bu 20 Ag 21 Ag # 0.571 0.587 0.610 0.627 0.646 0.693 0.743 0.806 # 21 Bu 22 Ag 22 Bu 23 Bu 23 Ag 24 Ag 24 Bu 25 Ag # 0.816 # 25 Bu # -------------------------------------------------------------- self.skip_line(inputfile, 'dashes') line = next(inputfile) # Sometimes Q-Chem gets a little confused... 
while 'Warning : Irrep of orbital' in line: line = next(inputfile) line = next(inputfile) energies_alpha = [] symbols_alpha = [] if self.unrestricted: energies_beta = [] symbols_beta = [] line = next(inputfile) # The end of the block is either a blank line or only dashes. while not self.dashes_and_spaces.search(line): if 'Occupied' in line or 'Virtual' in line: # A nice trick to find where the HOMO is. if 'Virtual' in line: self.homos = [len(energies_alpha)-1] line = next(inputfile) # Parse the energies and symmetries in pairs of lines. # energies = [utils.convertor(energy, 'hartree', 'eV') # for energy in map(float, line.split())] # This convoluted bit handles '*******' when present. energies = [] energy_line = line.split() for e in energy_line: try: energy = utils.convertor(self.float(e), 'hartree', 'eV') except ValueError: energy = numpy.nan energies.append(energy) energies_alpha.extend(energies) line = next(inputfile) symbols = line.split()[1::2] symbols_alpha.extend(symbols) line = next(inputfile) # Only look at the second block if doing an unrestricted calculation. # This might be a problem for ROHF/ROKS. if self.unrestricted: self.skip_line(inputfile, 'header') line = next(inputfile) while not self.dashes_and_spaces.search(line): if 'Occupied' in line or 'Virtual' in line: # This will definitely exist, thanks to the above block. if 'Virtual' in line: if len(self.homos) == 1: self.homos.append(len(energies_beta)-1) line = next(inputfile) energies = [] energy_line = line.split() for e in energy_line: try: energy = utils.convertor(self.float(e), 'hartree', 'eV') except ValueError: energy = numpy.nan energies.append(energy) energies_beta.extend(energies) line = next(inputfile) symbols = line.split()[1::2] symbols_beta.extend(symbols) line = next(inputfile) # For now, only keep the last set of MO energies, even though it is # printed at every step of geometry optimizations and fragment jobs. 
self.moenergies = [[]] self.mosyms = [[]] self.moenergies[0] = numpy.array(energies_alpha) self.mosyms[0] = symbols_alpha if self.unrestricted: self.moenergies.append([]) self.mosyms.append([]) self.moenergies[1] = numpy.array(energies_beta) self.mosyms[1] = symbols_beta self.set_attribute('nmo', len(self.moenergies[0])) # Molecular orbital energies, no symmetries. if line.strip() == 'Orbital Energies (a.u.)': # In the case of no orbital symmetries, the beta spin block is not # present for restricted calculations. # -------------------------------------------------------------- # Orbital Energies (a.u.) # -------------------------------------------------------------- # # Alpha MOs # -- Occupied -- # ******* -38.595 -34.580 -34.579 -34.578 -19.372 -19.372 -19.364 # -19.363 -19.362 -19.362 -4.738 -3.252 -3.250 -3.250 -1.379 # -1.371 -1.369 -1.365 -1.364 -1.362 -0.859 -0.855 -0.849 # -0.846 -0.840 -0.836 -0.810 -0.759 -0.732 -0.729 -0.704 # -0.701 -0.621 -0.610 -0.595 -0.587 -0.584 -0.578 -0.411 # -0.403 -0.355 -0.354 -0.352 # -- Virtual -- # -0.201 -0.117 -0.099 -0.086 0.020 0.031 0.055 0.067 # 0.075 0.082 0.086 0.092 0.096 0.105 0.114 0.148 # # Beta MOs # -- Occupied -- # ******* -38.561 -34.550 -34.549 -34.549 -19.375 -19.375 -19.367 # -19.367 -19.365 -19.365 -4.605 -3.105 -3.103 -3.102 -1.385 # -1.376 -1.376 -1.371 -1.370 -1.368 -0.863 -0.858 -0.853 # -0.849 -0.843 -0.839 -0.818 -0.765 -0.738 -0.737 -0.706 # -0.702 -0.624 -0.613 -0.600 -0.591 -0.588 -0.585 -0.291 # -0.291 -0.288 -0.275 # -- Virtual -- # -0.139 -0.122 -0.103 0.003 0.014 0.049 0.049 0.059 # 0.061 0.070 0.076 0.081 0.086 0.090 0.098 0.106 # 0.138 # -------------------------------------------------------------- self.skip_lines(inputfile, ['dashes', 'blank']) line = next(inputfile) energies_alpha = [] if self.unrestricted: energies_beta = [] line = next(inputfile) # The end of the block is either a blank line or only dashes. 
while not self.dashes_and_spaces.search(line): if 'Occupied' in line or 'Virtual' in line: # A nice trick to find where the HOMO is. if 'Virtual' in line: self.homos = [len(energies_alpha)-1] line = next(inputfile) energies = [] energy_line = line.split() for e in energy_line: try: energy = utils.convertor(self.float(e), 'hartree', 'eV') except ValueError: energy = numpy.nan energies.append(energy) energies_alpha.extend(energies) line = next(inputfile) line = next(inputfile) # Only look at the second block if doing an unrestricted calculation. # This might be a problem for ROHF/ROKS. if self.unrestricted: self.skip_lines(inputfile, ['blank']) line = next(inputfile) while not self.dashes_and_spaces.search(line): if 'Occupied' in line or 'Virtual' in line: # This will definitely exist, thanks to the above block. if 'Virtual' in line: if len(self.homos) == 1: self.homos.append(len(energies_beta)-1) line = next(inputfile) energies = [] energy_line = line.split() for e in energy_line: try: energy = utils.convertor(self.float(e), 'hartree', 'eV') except ValueError: energy = numpy.nan energies.append(energy) energies_beta.extend(energies) line = next(inputfile) # For now, only keep the last set of MO energies, even though it is # printed at every step of geometry optimizations and fragment jobs. self.moenergies = [[]] self.moenergies[0] = numpy.array(energies_alpha) if self.unrestricted: self.moenergies.append([]) self.moenergies[1] = numpy.array(energies_beta) self.set_attribute('nmo', len(self.moenergies[0])) # Population analysis. if 'Ground-State Mulliken Net Atomic Charges' in line: self.parse_charge_section(inputfile, 'mulliken') if 'Hirshfeld Atomic Charges' in line: self.parse_charge_section(inputfile, 'hirshfeld') if 'Ground-State ChElPG Net Atomic Charges' in line: self.parse_charge_section(inputfile, 'chelpg') # Multipole moments are not printed in lexicographical order, # so we need to parse and sort them. 
The units seem OK, but there # is some uncertainty about the reference point and whether it # can be changed. # # Notice how the letter/coordinate labels change to coordinate ranks # after hexadecapole moments, and need to be translated. Additionally, # after 9-th order moments the ranks are not necessarily single digits # and so there are spaces between them. # # ----------------------------------------------------------------- # Cartesian Multipole Moments # LMN = < X^L Y^M Z^N > # ----------------------------------------------------------------- # Charge (ESU x 10^10) # 0.0000 # Dipole Moment (Debye) # X 0.0000 Y 0.0000 Z 0.0000 # Tot 0.0000 # Quadrupole Moments (Debye-Ang) # XX -50.9647 XY -0.1100 YY -50.1441 # XZ 0.0000 YZ 0.0000 ZZ -58.5742 # ... # 5th-Order Moments (Debye-Ang^4) # 500 0.0159 410 -0.0010 320 0.0005 # 230 0.0000 140 0.0005 050 0.0012 # ... # ----------------------------------------------------------------- # if "Cartesian Multipole Moments" in line: # This line appears not by default, but only when # `multipole_order` > 4: line = next(inputfile) if 'LMN = < X^L Y^M Z^N >' in line: line = next(inputfile) # The reference point is always the origin, although normally the molecule # is moved so that the center of charge is at the origin. self.reference = [0.0, 0.0, 0.0] self.moments = [self.reference] # Watch out! This charge is in statcoulombs without the exponent! # We should expect very good agreement, however Q-Chem prints # the charge only with 5 digits, so expect 1e-4 accuracy. charge_header = next(inputfile) assert charge_header.split()[0] == "Charge" charge = float(next(inputfile).strip()) charge = utils.convertor(charge, 'statcoulomb', 'e') * 1e-10 # Allow this to change until fragment jobs are properly implemented. # assert abs(charge - self.charge) < 1e-4 # This will make sure Debyes are used (not sure if it can be changed).
line = next(inputfile) assert line.strip() == "Dipole Moment (Debye)" while "-----" not in line: # The current multipole element will be gathered here. multipole = [] line = next(inputfile) while ("-----" not in line) and ("Moment" not in line): cols = line.split() # The total (norm) is printed for dipole but not other multipoles. if cols[0] == 'Tot': line = next(inputfile) continue # Find and replace any 'stars' with NaN before moving on. for i in range(len(cols)): if '***' in cols[i]: cols[i] = numpy.nan # The moments come in pairs (label followed by value) up to the 9-th order, # although above hexadecapoles the labels are digits representing the rank # in each coordinate. Above the 9-th order, ranks are not always single digits, # so there are spaces between them, which means moments come in quartets. if len(self.moments) < 5: for i in range(len(cols)//2): lbl = cols[2*i] m = cols[2*i + 1] multipole.append([lbl, m]) elif len(self.moments) < 10: for i in range(len(cols)//2): lbl = cols[2*i] lbl = 'X'*int(lbl[0]) + 'Y'*int(lbl[1]) + 'Z'*int(lbl[2]) m = cols[2*i + 1] multipole.append([lbl, m]) else: for i in range(len(cols)//4): lbl = 'X'*int(cols[4*i]) + 'Y'*int(cols[4*i + 1]) + 'Z'*int(cols[4*i + 2]) m = cols[4*i + 3] multipole.append([lbl, m]) line = next(inputfile) # Sort should use the first element when sorting lists, # so this should simply work, and afterwards we just need # to extract the second element in each list (the actual moment). multipole.sort() multipole = [m[1] for m in multipole] self.moments.append(multipole) # For `method = force` or geometry optimizations, # the gradient is printed. if 'Gradient of SCF Energy' in line: if not hasattr(self, 'grads'): self.grads = [] grad = numpy.empty(shape=(3, self.natom)) # A maximum of 6 columns/block.
ncols = 6 line = next(inputfile) colcounter = 0 while colcounter < self.natom: if line[:5].strip() == '': line = next(inputfile) rowcounter = 0 while rowcounter < 3: row = list(map(float, line.split()[1:])) grad[rowcounter][colcounter:colcounter+ncols] = row line = next(inputfile) rowcounter += 1 colcounter += ncols self.grads.append(grad.T) # For IR-related jobs, the Hessian is printed (dim: 3*natom, 3*natom). # Note that this is *not* the mass-weighted Hessian. if 'Hessian of the SCF Energy' in line: if not hasattr(self, 'hessian'): # A maximum of 6 columns/block. ncols = 6 dim = 3*self.natom self.hessian = numpy.empty(shape=(dim, dim)) line = next(inputfile) colcounter = 0 while colcounter < dim: if line[:5].strip() == '': line = next(inputfile) rowcounter = 0 while rowcounter < dim: row = list(map(float, line.split()[1:])) self.hessian[rowcounter][colcounter:colcounter+ncols] = row line = next(inputfile) rowcounter += 1 colcounter += ncols # Start of the IR/Raman frequency section. if 'VIBRATIONAL ANALYSIS' in line: while 'STANDARD THERMODYNAMIC QUANTITIES' not in line: ## IR, optional Raman: # ********************************************************************** # ** ** # ** VIBRATIONAL ANALYSIS ** # ** -------------------- ** # ** ** # ** VIBRATIONAL FREQUENCIES (CM**-1) AND NORMAL MODES ** # ** FORCE CONSTANTS (mDYN/ANGSTROM) AND REDUCED MASSES (AMU) ** # ** INFRARED INTENSITIES (KM/MOL) ** ##** RAMAN SCATTERING ACTIVITIES (A**4/AMU) AND DEPOLARIZATION RATIOS ** # ** ** # ********************************************************************** # Mode: 1 2 3 # Frequency: -106.88 -102.91 161.77 # Force Cnst: 0.0185 0.0178 0.0380 # Red. 
Mass: 2.7502 2.8542 2.4660 # IR Active: NO YES YES # IR Intens: 0.000 0.000 0.419 # Raman Active: YES NO NO ##Raman Intens: 2.048 0.000 0.000 ##Depolar: 0.750 0.000 0.000 # X Y Z X Y Z X Y Z # C 0.000 0.000 -0.100 -0.000 0.000 -0.070 -0.000 -0.000 -0.027 # C 0.000 0.000 0.045 -0.000 0.000 -0.074 0.000 -0.000 -0.109 # C 0.000 0.000 0.148 -0.000 -0.000 -0.074 0.000 0.000 -0.121 # C 0.000 0.000 0.100 -0.000 -0.000 -0.070 0.000 0.000 -0.027 # C 0.000 0.000 -0.045 0.000 -0.000 -0.074 -0.000 -0.000 -0.109 # C 0.000 0.000 -0.148 0.000 0.000 -0.074 -0.000 -0.000 -0.121 # H -0.000 0.000 0.086 -0.000 0.000 -0.082 0.000 -0.000 -0.102 # H 0.000 0.000 0.269 -0.000 -0.000 -0.091 0.000 0.000 -0.118 # H 0.000 0.000 -0.086 0.000 -0.000 -0.082 -0.000 0.000 -0.102 # H -0.000 0.000 -0.269 0.000 0.000 -0.091 -0.000 -0.000 -0.118 # C 0.000 -0.000 0.141 -0.000 -0.000 -0.062 -0.000 0.000 0.193 # C -0.000 -0.000 -0.160 0.000 0.000 0.254 -0.000 0.000 0.043 # H 0.000 -0.000 0.378 -0.000 0.000 -0.289 0.000 0.000 0.519 # H -0.000 -0.000 -0.140 0.000 0.000 0.261 -0.000 -0.000 0.241 # H -0.000 -0.000 -0.422 0.000 0.000 0.499 -0.000 0.000 -0.285 # C 0.000 -0.000 -0.141 0.000 0.000 -0.062 -0.000 -0.000 0.193 # C -0.000 -0.000 0.160 -0.000 -0.000 0.254 0.000 0.000 0.043 # H 0.000 -0.000 -0.378 0.000 -0.000 -0.289 -0.000 0.000 0.519 # H -0.000 -0.000 0.140 -0.000 -0.000 0.261 0.000 0.000 0.241 # H -0.000 -0.000 0.422 -0.000 -0.000 0.499 0.000 0.000 -0.285 # TransDip 0.000 -0.000 -0.000 0.000 -0.000 -0.000 -0.000 0.000 0.021 # Mode: 4 5 6 # ... # There isn't any symmetry information for normal modes present # in Q-Chem. 
# if not hasattr(self, 'vibsyms'): # self.vibsyms = [] if 'Frequency:' in line: if not hasattr(self, 'vibfreqs'): self.vibfreqs = [] vibfreqs = map(float, line.split()[1:]) self.vibfreqs.extend(vibfreqs) if 'IR Intens:' in line: if not hasattr(self, 'vibirs'): self.vibirs = [] vibirs = map(float, line.split()[2:]) self.vibirs.extend(vibirs) if 'Raman Intens:' in line: if not hasattr(self, 'vibramans'): self.vibramans = [] vibramans = map(float, line.split()[2:]) self.vibramans.extend(vibramans) # This is the start of the displacement block. if line.split()[0:3] == ['X', 'Y', 'Z']: if not hasattr(self, 'vibdisps'): self.vibdisps = [] disps = [] for k in range(self.natom): line = next(inputfile) numbers = list(map(float, line.split()[1:])) N = len(numbers) // 3 if not disps: for n in range(N): disps.append([]) for n in range(N): disps[n].append(numbers[3*n:(3*n)+3]) self.vibdisps.extend(disps) line = next(inputfile) # Anharmonic vibrational analysis. # Q-Chem includes 3 theories: VPT2, TOSH, and VCI. # For now, just take the VPT2 results. # if 'VIBRATIONAL ANHARMONIC ANALYSIS' in line: # while list(set(line.strip())) != ['=']: # if 'VPT2' in line: # if not hasattr(self, 'vibanharms'): # self.vibanharms = [] # self.vibanharms.append(float(line.split()[-1])) # line = next(inputfile) if 'STANDARD THERMODYNAMIC QUANTITIES AT' in line: if not hasattr(self, 'temperature'): self.temperature = float(line.split()[4]) # Not supported yet. if not hasattr(self, 'pressure'): self.pressure = float(line.split()[7]) self.skip_lines(inputfile, ['blank', 'Imaginary']) line = next(inputfile) # Not supported yet. if 'Zero point vibrational energy' in line: if not hasattr(self, 'zpe'): # Convert from kcal/mol to Hartree/particle. 
self.zpe = utils.convertor(float(line.split()[4]), 'kcal', 'hartree') atommasses = [] while 'Archival summary' not in line: if 'Has Mass' in line: atommass = float(line.split()[6]) atommasses.append(atommass) if 'Total Enthalpy' in line: if not hasattr(self, 'enthalpy'): enthalpy = float(line.split()[2]) self.enthalpy = utils.convertor(enthalpy, 'kcal', 'hartree') if 'Total Entropy' in line: if not hasattr(self, 'entropy'): entropy = float(line.split()[2]) * self.temperature / 1000 # This is the *temperature dependent* entropy. self.entropy = utils.convertor(entropy, 'kcal', 'hartree') if not hasattr(self, 'freeenergy'): self.freeenergy = self.enthalpy - self.entropy line = next(inputfile) if not hasattr(self, 'atommasses'): self.atommasses = numpy.array(atommasses) # TODO: # 'aonames' # 'atombasis' # 'freeenergy' # 'fonames' # 'fooverlaps' # 'fragnames' # 'frags' # 'gbasis' # 'mocoeffs' # 'nocoeffs' # 'scancoords' # 'scanenergies' # 'scannames' # 'scanparm' def parse_charge_section(self, inputfile, chargetype): """Parse the population analysis charge block.""" self.skip_line(inputfile, 'blank') line = next(inputfile) has_spins = False if 'Spin' in line: if not hasattr(self, 'atomspins'): self.atomspins = dict() has_spins = True spins = [] self.skip_line(inputfile, 'dashes') if not hasattr(self, 'atomcharges'): self.atomcharges = dict() charges = [] line = next(inputfile) while list(set(line.strip())) != ['-']: elements = line.split() charge = self.float(elements[2]) charges.append(charge) if has_spins: spin = self.float(elements[3]) spins.append(spin) line = next(inputfile) self.atomcharges[chargetype] = numpy.array(charges) if has_spins: self.atomspins[chargetype] = numpy.array(spins) if __name__ == '__main__': import sys import doctest, qchemparser if len(sys.argv) == 1: doctest.testmod(qchemparser, verbose=False) if len(sys.argv) == 2: parser = qchemparser.QChem(sys.argv[1]) data = parser.parse() if len(sys.argv) > 2: for i in range(len(sys.argv[2:])): if 
hasattr(data, sys.argv[2 + i]): print(getattr(data, sys.argv[2 + i]))
cclib-1.3.1/src/cclib/parser/logfileparser.py
# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public License version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Generic output file parser and related tools""" import bz2 import fileinput import gzip import inspect import io import logging import os import random import sys import zipfile import numpy from . import utils from .data import ccData # This seems to avoid a problem with Avogadro. logging.logMultiprocessing = 0 class myBZ2File(bz2.BZ2File): """Return string instead of bytes""" def __next__(self): line = super().__next__() return line.decode("ascii", "replace") class myGzipFile(gzip.GzipFile): """Return string instead of bytes""" def __next__(self): line = super().__next__() return line.decode("ascii", "replace") class FileWrapper(object): """Wrap a file object so that we can maintain position""" def __init__(self, file): self.file = file self.pos = 0 self.file.seek(0, 2) self.size = self.file.tell() self.file.seek(0, 0) def next(self): line = next(self.file) self.pos += len(line) return line def __next__(self): return self.next() def __iter__(self): return self def close(self): self.file.close() def seek(self, pos, ref): self.file.seek(pos, ref) def openlogfile(filename): """Return a file object given a filename. Given the filename of a log file or a gzipped, zipped, or bzipped log file, this function returns a regular Python file object.
Given an address starting with http://, this function retrieves the URL and returns a file object using a temporary file. Given a list of filenames, this function returns a FileInput object, which can be used for seamless iteration without concatenation. """ # If there is a single string argument given. if isinstance(filename, str): extension = os.path.splitext(filename)[1] if extension == ".gz": fileobject = myGzipFile(filename, "r") elif extension == ".zip": zip = zipfile.ZipFile(filename, "r") assert len(zip.namelist()) == 1, "ERROR: Zip file contains more than 1 file" fileobject = io.StringIO(zip.read(zip.namelist()[0]).decode("ascii", "ignore")) elif extension in ['.bz', '.bz2']: # Module 'bz2' is not always importable. assert bz2 is not None, "ERROR: module bz2 cannot be imported" fileobject = myBZ2File(filename, "r") else: fileobject = FileWrapper(io.open(filename, "r", errors='ignore')) return fileobject elif hasattr(filename, "__iter__"): # Compression (gzip and bzip) is supported as of Python 2.5. if sys.version_info >= (2, 5): fileobject = fileinput.input(filename, openhook=fileinput.hook_compressed) else: fileobject = fileinput.input(filename) return fileobject class Logfile(object): """Abstract class for logfile objects. Subclasses defined by cclib: ADF, GAMESS, GAMESSUK, Gaussian, Jaguar, Molpro, NWChem, ORCA, Psi, QChem """ def __init__(self, source, loglevel=logging.INFO, logname="Log", logstream=sys.stdout, datatype=ccData, **kwds): """Initialise the Logfile object. This should be called by a subclass in its own __init__ method. Inputs: source - a single logfile, a list of logfiles, or input stream """ # Set the filename to source if it is a string or a list of filenames. # In the case of an input stream, set some arbitrary name and the stream. # Otherwise, raise an Exception.
if isinstance(source, str): self.filename = source self.isstream = False elif isinstance(source, list) and all([isinstance(s, str) for s in source]): self.filename = source self.isstream = False elif hasattr(source, "read"): self.filename = "stream %s" % str(type(source)) self.isstream = True self.stream = source else: raise ValueError # Set up the logger. # Note that calling logging.getLogger() with one name always returns the same instance. # Presently in cclib, all parser instances of the same class use the same logger, # which means that care needs to be taken not to duplicate handlers. self.loglevel = loglevel self.logname = logname self.logger = logging.getLogger('%s %s' % (self.logname, self.filename)) self.logger.setLevel(self.loglevel) if len(self.logger.handlers) == 0: handler = logging.StreamHandler(logstream) handler.setFormatter(logging.Formatter("[%(name)s %(levelname)s] %(message)s")) self.logger.addHandler(handler) # Periodic table of elements. self.table = utils.PeriodicTable() # This is the class that will be used in the data object returned by parse(), and should # normally be ccData or a subclass of it. self.datatype = datatype # Change the class used if we want optdone to be a list or if the 'future' option # is used, which might have more consequences in the future. optdone_as_list = kwds.get("optdone_as_list", False) or kwds.get("future", False) optdone_as_list = optdone_as_list if isinstance(optdone_as_list, bool) else False if not optdone_as_list: from .data import ccData_optdone_bool self.datatype = ccData_optdone_bool def __setattr__(self, name, value): # Send info to logger if the attribute is in the list self._attrlist. if name in getattr(self, "_attrlist", {}) and hasattr(self, "logger"): # Call logger.info() only if the attribute is new. 
if not hasattr(self, name): if type(value) in [numpy.ndarray, list]: self.logger.info("Creating attribute %s[]" %name) else: self.logger.info("Creating attribute %s: %s" %(name, str(value))) # Set the attribute. object.__setattr__(self, name, value) def parse(self, progress=None, fupdate=0.05, cupdate=0.002): """Parse the logfile, using the assumed extract method of the child.""" # Check that the sub-class has an extract attribute, # that is callable with the proper number of arguments. if not hasattr(self, "extract"): raise AttributeError("Class %s has no extract() method." %self.__class__.__name__) if not callable(self.extract): raise AttributeError("Method %s.extract not callable." %self.__class__.__name__) if len(inspect.getfullargspec(self.extract)[0]) != 3: raise AttributeError("Method %s.extract takes wrong number of arguments." %self.__class__.__name__) # Save the current list of attributes to keep after parsing. # The dict of self should be the same after parsing. _nodelete = list(set(self.__dict__.keys())) # Initiate the FileInput object for the input files. # Remember that self.filename can be a list of files. if not self.isstream: inputfile = openlogfile(self.filename) else: inputfile = self.stream # Initialize self.progress is_compressed = isinstance(inputfile, myGzipFile) or isinstance(inputfile, myBZ2File) if progress and not (is_compressed): self.progress = progress self.progress.initialize(inputfile.size) self.progress.step = 0 self.fupdate = fupdate self.cupdate = cupdate # Maybe the sub-class has something to do before parsing. self.before_parsing() # Loop over lines in the file object and call extract(). # This is where the actual parsing is done. for line in inputfile: self.updateprogress(inputfile, "Unsupported information", cupdate) # This call should check if the line begins a section of extracted data. # If it does, it parses some lines and sets the relevant attributes (to self).
# Any attributes can be freely set and used across calls, however only those # in data._attrlist will be moved to the final data object that is returned. self.extract(inputfile, line) # Close input file object. if not self.isstream: inputfile.close() # Maybe the sub-class has something to do after parsing. self.after_parsing() # If atomcoords were not parsed, but some input coordinates were ("inputcoords"), # use those instead. This is originally from the Gaussian parser, a regression fix. if not hasattr(self, "atomcoords") and hasattr(self, "inputcoords"): self.atomcoords = numpy.array(self.inputcoords, 'd') # Set nmo if not set already - to nbasis. if not hasattr(self, "nmo") and hasattr(self, "nbasis"): self.nmo = self.nbasis # Create the default coreelectrons array. if not hasattr(self, "coreelectrons") and hasattr(self, "natom"): self.coreelectrons = numpy.zeros(self.natom, "i") # Create the data object we want to return. This is normally ccData, but can be changed # by passing the datatype argument to the constructor. All supported cclib attributes # are copied to this object, but beware that in order to be moved an attribute must be # included in the data._attrlist of ccData (or whatever else). # There is the possibility of passing additional arguments via self.data_args, but # we use this sparingly in cases where we want to limit the API with options, etc. data = self.datatype(attributes=self.__dict__) # Now make sure that the cclib attributes in the data object are all the correct type, # including arrays and lists of arrays. data.arrayify() # Delete all temporary attributes (including cclib attributes). # All attributes should have been moved to a data object, which will be returned. for attr in list(self.__dict__.keys()): if attr not in _nodelete: self.__delattr__(attr) # Update self.progress as done.
if hasattr(self, "progress"): self.progress.update(inputfile.size, "Done") return data def before_parsing(self): """Set parser-specific variables and do other initial things here.""" pass def after_parsing(self): """Correct data or do parser-specific validation after parsing is finished.""" pass def updateprogress(self, inputfile, msg, xupdate=0.05): """Update progress.""" if hasattr(self, "progress") and random.random() < xupdate: newstep = inputfile.pos if newstep != self.progress.step: self.progress.update(newstep, msg) self.progress.step = newstep def normalisesym(self, symlabel): """Standardise the symmetry labels between parsers. This method should be overridden by individual parsers, and should contain appropriate doctests. If it is not overridden, this is detected as an error by unit tests. """ return "ERROR: This should be overwritten by this subclass" def float(self, number): """Convert a string to a float. This method should perform certain checks that are specific to cclib, including avoiding the problem with Ds instead of Es in scientific notation. Another point is converting strings that signify numerical problems (*****) to something we can manage (NumPy's NaN). >>> t = Logfile("dummyfile") >>> t.float("123.2323E+02") 12323.23 >>> t.float("123.2323D+02") 12323.23 >>> t.float("*****") nan """ if list(set(number)) == ['*']: return numpy.nan return float(number.replace("D","E")) def set_attribute(self, name, value, check=True): """Set an attribute and perform a check when it already exists. Note that this can be used for scalars and lists alike, whenever we want to set a value for an attribute. By default we want to check that the value does not change if the attribute already exists, and this function is a good place to add more tests in the future.
""" if check and hasattr(self, name): try: assert getattr(self, name) == value except AssertionError: self.logger.warning("Attribute %s changed value (%s -> %s)" % (name, getattr(self, name), value)) setattr(self, name, value) def skip_lines(self, inputfile, sequence): """Read trivial line types and check they are what they are supposed to be. This function will read len(sequence) lines and do certain checks on them, when the elements of sequence have the appropriate values. Currently the following elements trigger checks: 'blank' or 'b' - the line should be blank 'dashes' or 'd' - the line should contain only dashes (or spaces) 'equals' or 'e' - the line should contain only equal signs (or spaces) 'stars' or 's' - the line should contain only stars (or spaces) """ expected_characters = { '-' : ['dashes', 'd'], '=' : ['equals', 'e'], '*' : ['stars', 's'], } lines = [] for expected in sequence: # Read the line we want to skip. line = next(inputfile) # Blank lines are perhaps the most common thing we want to check for. if expected in ["blank", "b"]: try: assert line.strip() == "" except AssertionError: frame, fname, lno, funcname, funcline, index = inspect.getouterframes(inspect.currentframe())[1] parser = fname.split('/')[-1] msg = "In %s, line %i, line not blank as expected: %s" % (parser, lno, line.strip()) self.logger.warning(msg) # All cases of heterogeneous lines can be dealt with by the same code. for character, keys in expected_characters.items(): if expected in keys: try: assert all([c == character for c in line.strip() if c != ' ']) except AssertionError: frame, fname, lno, funcname, funcline, index = inspect.getouterframes(inspect.currentframe())[1] parser = fname.split('/')[-1] msg = "In %s, line %i, line not all %s as expected: %s" % (parser, lno, keys[0], line.strip()) self.logger.warning(msg) continue # Save the skipped line, and we will return the whole list. 
lines.append(line) return lines skip_line = lambda self, inputfile, expected: self.skip_lines(inputfile, [expected]) if __name__ == "__main__": import doctest doctest.testmod() cclib-1.3.1/src/cclib/parser/gamessukparser.py0000644000175100016050000007046412467423323021237 0ustar kmlcclib00000000000000# -*- coding: utf-8 -*- # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Parser for GAMESS-UK output files""" import re import numpy from . import logfileparser from . import utils class GAMESSUK(logfileparser.Logfile): """A GAMESS UK log file""" SCFRMS, SCFMAX, SCFENERGY = list(range(3)) # Used to index self.scftargets[] def __init__(self, *args, **kwargs): # Call the __init__ method of the superclass super(GAMESSUK, self).__init__(logname="GAMESSUK", *args, **kwargs) def __str__(self): """Return a string representation of the object.""" return "GAMESS UK log file %s" % (self.filename) def __repr__(self): """Return a representation of the object.""" return 'GAMESSUK("%s")' % (self.filename) def normalisesym(self, label): """Use standard symmetry labels instead of GAMESS UK labels. >>> t = GAMESSUK("dummyfile.txt") >>> labels = ['a', 'a1', 'ag', "a'", 'a"', "a''", "a1''", 'a1"'] >>> labels.extend(["e1+", "e1-"]) >>> answer = [t.normalisesym(x) for x in labels] >>> answer ['A', 'A1', 'Ag', "A'", 'A"', 'A"', 'A1"', 'A1"', 'E1', 'E1'] """ label = label.replace("''", '"').replace("+", "").replace("-", "") ans = label[0].upper() + label[1:] return ans def before_parsing(self): # used for determining whether to add a second mosyms, etc. 
self.betamosyms = self.betamoenergies = self.betamocoeffs = False def extract(self, inputfile, line): """Extract information from the file object inputfile.""" if line[1:22] == "total number of atoms": natom = int(line.split()[-1]) self.set_attribute('natom', natom) if line[3:44] == "convergence threshold in optimization run": # Assuming that this is only found in the case of OPTXYZ # (i.e. an optimization in Cartesian coordinates) self.geotargets = [float(line.split()[-2])] if line[32:61] == "largest component of gradient": # This is the geotarget in the case of OPTXYZ if not hasattr(self, "geovalues"): self.geovalues = [] self.geovalues.append([float(line.split()[4])]) if line[37:49] == "convergence?": # Get the geovalues and geotargets for OPTIMIZE if not hasattr(self, "geovalues"): self.geovalues = [] self.geotargets = [] geotargets = [] geovalues = [] for i in range(4): temp = line.split() geovalues.append(float(temp[2])) if not self.geotargets: geotargets.append(float(temp[-2])) line = next(inputfile) self.geovalues.append(geovalues) if not self.geotargets: self.geotargets = geotargets # This is the only place coordinates are printed in single point calculations. Note that # in the following fragment, the basis set selection is not always printed: # # ****************** # molecular geometry # ****************** # # **************************************** # * basis selected is sto sto3g * # **************************************** # # ******************************************************************************* # * * # * atom atomic coordinates number of * # * charge x y z shells * # * * # ******************************************************************************* # * * # * * # * c 6.0 0.0000000 -2.6361501 0.0000000 2 * # * 1s 2sp * # * * # * * # * c 6.0 0.0000000 2.6361501 0.0000000 2 * # * 1s 2sp * # * * # ... 
# if line.strip() == "molecular geometry": self.updateprogress(inputfile, "Coordinates") self.skip_lines(inputfile, ['s', 'b', 's']) line = next(inputfile) if "basis selected is" in line: self.skip_lines(inputfile, ['s', 'b', 's', 's']) self.skip_lines(inputfile, ['header1', 'header2', 's', 's']) atomnos = [] atomcoords = [] line = next(inputfile) while line.strip(): line = next(inputfile) if line.strip()[1:10].strip() and list(set(line.strip())) != ['*']: atomcoords.append(list(map(float, line.split()[3:6]))) atomnos.append(int(round(float(line.split()[2])))) if not hasattr(self, "atomcoords"): self.atomcoords = [] self.atomcoords.append(atomcoords) self.set_attribute('atomnos', atomnos) # Each step of a geometry optimization will also print the coordinates: # # search 0 # ******************* # point 0 nuclear coordinates # ******************* # # x y z chg tag # ============================================================ # 0.0000000 -2.6361501 0.0000000 6.00 c # 0.0000000 2.6361501 0.0000000 6.00 c # .. # if line[40:59] == "nuclear coordinates": self.updateprogress(inputfile, "Coordinates") # We need not remember the first geometry in geometry optimizations, as this will # be already parsed from the "molecular geometry" section (see above). if not hasattr(self, 'firstnuccoords') or self.firstnuccoords: self.firstnuccoords = False return self.skip_lines(inputfile, ['s', 'b', 'colname', 'e']) atomcoords = [] atomnos = [] line = next(inputfile) while list(set(line.strip())) != ['=']: cols = line.split() atomcoords.append([utils.convertor(float(x), "bohr", "Angstrom") for x in cols[0:3]]) atomnos.append(int(float(cols[3]))) line = next(inputfile) if not hasattr(self, "atomcoords"): self.atomcoords = [] self.atomcoords.append(atomcoords) self.set_attribute('atomnos', atomnos) # This is printed when a geometry optimization succeeds, after the last gradient of the energy. 
if line[40:62] == "optimization converged": self.skip_line(inputfile, 's') if not hasattr(self, 'optdone'): self.optdone = [] self.optdone.append(len(self.geovalues)-1) # This is apparently printed when a geometry optimization is not converged but the job ends. if "minimisation not converging" in line: self.skip_line(inputfile, 's') self.optdone = [] if line[1:32] == "total number of basis functions": nbasis = int(line.split()[-1]) self.set_attribute('nbasis', nbasis) while line.find("charge of molecule")<0: line = next(inputfile) charge = int(line.split()[-1]) self.set_attribute('charge', charge) mult = int(next(inputfile).split()[-1]) self.set_attribute('mult', mult) alpha = int(next(inputfile).split()[-1])-1 beta = int(next(inputfile).split()[-1])-1 if self.mult == 1: self.homos = numpy.array([alpha], "i") else: self.homos = numpy.array([alpha, beta], "i") if line[37:69] == "s-matrix over gaussian basis set": self.aooverlaps = numpy.zeros((self.nbasis, self.nbasis), "d") self.skip_lines(inputfile, ['d', 'b']) i = 0 while i < self.nbasis: self.updateprogress(inputfile, "Overlap") self.skip_lines(inputfile, ['b', 'b', 'header', 'b', 'b']) for j in range(self.nbasis): temp = list(map(float, next(inputfile).split()[1:])) self.aooverlaps[j,(0+i):(len(temp)+i)] = temp i += len(temp) if line[18:43] == 'EFFECTIVE CORE POTENTIALS': self.skip_line(inputfile, 'stars') self.coreelectrons = numpy.zeros(self.natom, 'i') line = next(inputfile) while line[15:46] != "*"*31: if line.find("for atoms ...")>=0: atomindex = [] line = next(inputfile) while line.find("core charge")<0: broken = line.split() atomindex.extend([int(x.split("-")[0]) for x in broken]) line = next(inputfile) charge = float(line.split()[4]) for idx in atomindex: self.coreelectrons[idx-1] = self.atomnos[idx-1] - charge line = next(inputfile) if line[3:27] == "Wavefunction convergence": self.scftarget = float(line.split()[-2]) self.scftargets = [] if line[11:22] == "normal mode": if not hasattr(self, 
"vibfreqs"): self.vibfreqs = [] self.vibirs = [] units = next(inputfile) xyz = next(inputfile) equals = next(inputfile) line = next(inputfile) while line!=equals: temp = line.split() self.vibfreqs.append(float(temp[1])) self.vibirs.append(float(temp[-2])) line = next(inputfile) # Use the length of the vibdisps to figure out # how many rotations and translations to remove self.vibfreqs = self.vibfreqs[-len(self.vibdisps):] self.vibirs = self.vibirs[-len(self.vibdisps):] if line[44:73] == "normalised normal coordinates": self.skip_lines(inputfile, ['e', 'b', 'b']) self.vibdisps = [] freqnum = next(inputfile) while freqnum.find("=")<0: self.skip_lines(inputfile, ['b', 'e', 'freqs', 'e', 'b', 'header', 'e']) p = [ [] for x in range(9) ] for i in range(len(self.atomnos)): brokenx = list(map(float, next(inputfile)[25:].split())) brokeny = list(map(float, next(inputfile)[25:].split())) brokenz = list(map(float, next(inputfile)[25:].split())) for j,x in enumerate(list(zip(brokenx, brokeny, brokenz))): p[j].append(x) self.vibdisps.extend(p) self.skip_lines(inputfile, ['b', 'b']) freqnum = next(inputfile) if line[26:36] == "raman data": self.vibramans = [] self.skip_lines(inputfile, ['s', 'b', 'header', 'b']) line = next(inputfile) while line[1]!="*": self.vibramans.append(float(line.split()[3])) self.skip_line(inputfile, 'blank') line = next(inputfile) # Use the length of the vibdisps to figure out # how many rotations and translations to remove self.vibramans = self.vibramans[-len(self.vibdisps):] if line[3:11] == "SCF TYPE": self.scftype = line.split()[-2] assert self.scftype in ['rhf', 'uhf', 'gvb'], "%s not one of 'rhf', 'uhf' or 'gvb'" % self.scftype if line[15:31] == "convergence data": if not hasattr(self, "scfvalues"): self.scfvalues = [] self.scftargets.append([self.scftarget]) # Assuming it does not change over time while line[1:10] != "="*9: line = next(inputfile) line = next(inputfile) tester = line.find("tester") # Can be in a different place depending assert 
tester >= 0 while line[1:10] != "="*9: # May be two or three lines (unres) line = next(inputfile) scfvalues = [] line = next(inputfile) while line.strip(): if line[2:6] != "****": # e.g. **** recalulation of fock matrix on iteration 4 (examples/chap12/pyridine.out) scfvalues.append([float(line[tester-5:tester+6])]) line = next(inputfile) self.scfvalues.append(scfvalues) if line[10:22] == "total energy" and len(line.split()) == 3: if not hasattr(self, "scfenergies"): self.scfenergies = [] scfenergy = utils.convertor(float(line.split()[-1]), "hartree", "eV") self.scfenergies.append(scfenergy) # Total energies after Moller-Plesset corrections # Second order correction is always first, so its first occurrence # triggers creation of mpenergies (list of lists of energies) # Further corrections are appended as found # Note: GAMESS-UK sometimes prints only the corrections, # so they must be added to the last value of scfenergies if line[10:32] == "mp2 correlation energy" or \ line[10:42] == "second order perturbation energy": if not hasattr(self, "mpenergies"): self.mpenergies = [] self.mpenergies.append([]) self.mp2correction = self.float(line.split()[-1]) self.mp2energy = self.scfenergies[-1] + self.mp2correction self.mpenergies[-1].append(utils.convertor(self.mp2energy, "hartree", "eV")) if line[10:41] == "third order perturbation energy": self.mp3correction = self.float(line.split()[-1]) self.mp3energy = self.mp2energy + self.mp3correction self.mpenergies[-1].append(utils.convertor(self.mp3energy, "hartree", "eV")) if line[40:59] == "molecular basis set": self.gbasis = [] line = next(inputfile) while line.find("contraction coefficients")<0: line = next(inputfile) equals = next(inputfile) blank = next(inputfile) atomname = next(inputfile) basisregexp = re.compile(r"\d*(\D+)") # Get everything after any digits shellcounter = 1 while line != equals: gbasis = [] # Stores basis sets on one atom blank = next(inputfile) blank = next(inputfile) line = next(inputfile) shellno =
int(line.split()[0]) shellgap = shellno - shellcounter shellsize = 0 while len(line.split())!=1 and line!=equals: if line.split(): shellsize += 1 coeff = {} # coefficients and symmetries for a block of rows while line.strip() and line!=equals: temp = line.strip().split() # temp[1] may be either like (a) "1s" and "1sp", or (b) "s" and "sp" # See GAMESS-UK 7.0 distribution/examples/chap12/pyridine2_21m10r.out # for an example of the latter sym = basisregexp.match(temp[1]).groups()[0] assert sym in ['s', 'p', 'd', 'f', 'sp'], "'%s' not a recognized symmetry" % sym if sym == "sp": coeff.setdefault("S", []).append( (float(temp[3]), float(temp[6])) ) coeff.setdefault("P", []).append( (float(temp[3]), float(temp[10])) ) else: coeff.setdefault(sym.upper(), []).append( (float(temp[3]), float(temp[6])) ) line = next(inputfile) # either a blank or a continuation of the block if coeff: if sym == "sp": gbasis.append( ('S', coeff['S'])) gbasis.append( ('P', coeff['P'])) else: gbasis.append( (sym.upper(), coeff[sym.upper()])) if line == equals: continue line = next(inputfile) # either the start of the next block or the start of a new atom or # the end of the basis function section (signified by a line of equals) numtoadd = 1 + (shellgap // shellsize) shellcounter = shellno + shellsize for x in range(numtoadd): self.gbasis.append(gbasis) if line[50:70] == "----- beta set -----": self.betamosyms = True self.betamoenergies = True self.betamocoeffs = True # betamosyms will be turned off in the next # SYMMETRY ASSIGNMENT section if line[31:50] == "SYMMETRY ASSIGNMENT": if not hasattr(self, "mosyms"): self.mosyms = [] multiple = {'a':1, 'b':1, 'e':2, 't':3, 'g':4, 'h':5} equals = next(inputfile) line = next(inputfile) while line != equals: # There may be one or two lines of title (compare mg10.out and duhf_1.out) line = next(inputfile) mosyms = [] line = next(inputfile) while line != equals: temp = line[25:30].strip() if temp[-1] == '?': # e.g. e? or t? or g? 
(see example/chap12/na7mg_uhf.out) # for two As, an A and an E, and two Es of the same energy respectively. t = line[91:].strip().split() for i in range(1, len(t), 2): for j in range(multiple[t[i][0]]): # add twice for 'e', etc. mosyms.append(self.normalisesym(t[i])) else: for j in range(multiple[temp[0]]): mosyms.append(self.normalisesym(temp)) # add twice for 'e', etc. line = next(inputfile) assert len(mosyms) == self.nmo, "mosyms: %d but nmo: %d" % (len(mosyms), self.nmo) if self.betamosyms: # Only append if beta (otherwise with IPRINT SCF # it will add mosyms for every step of a geo opt) self.mosyms.append(mosyms) self.betamosyms = False elif self.scftype == 'gvb': # gvb has alpha and beta orbitals but they are identical self.mosyms = [mosyms, mosyms] else: self.mosyms = [mosyms] if line[50:62] == "eigenvectors": # Mocoeffs...can get evalues from here too # (only if using FORMAT HIGH though will they all be present) if not hasattr(self, "mocoeffs"): self.aonames = [] aonames = [] minus = next(inputfile) mocoeffs = numpy.zeros( (self.nmo, self.nbasis), "d") readatombasis = False if not hasattr(self, "atombasis"): self.atombasis = [] for i in range(self.natom): self.atombasis.append([]) readatombasis = True self.skip_lines(inputfile, ['b', 'b', 'evalues']) p = re.compile(r"\d+\s+(\d+)\s*(\w+) (\w+)") oldatomname = "DUMMY VALUE" mo = 0 while mo < self.nmo: self.updateprogress(inputfile, "Coefficients") self.skip_lines(inputfile, ['b', 'b', 'nums', 'b', 'b']) for basis in range(self.nbasis): line = next(inputfile) # Fill atombasis only first time around.
if readatombasis: orbno = int(line[1:5])-1 atomno = int(line[6:9])-1 self.atombasis[atomno].append(orbno) if not self.aonames: pg = p.match(line[:18].strip()).groups() atomname = "%s%s%s" % (pg[1][0].upper(), pg[1][1:], pg[0]) if atomname != oldatomname: aonum = 1 oldatomname = atomname name = "%s_%d%s" % (atomname, aonum, pg[2].upper()) if name in aonames: aonum += 1 name = "%s_%d%s" % (atomname, aonum, pg[2].upper()) aonames.append(name) temp = list(map(float, line[19:].split())) mocoeffs[mo:(mo+len(temp)), basis] = temp # Fill atombasis only first time around. readatombasis = False if not self.aonames: self.aonames = aonames line = next(inputfile) # blank line while not line.strip(): line = next(inputfile) evalues = line if evalues[:17].strip(): # i.e. if these aren't evalues break # Not all the MOs are present mo += len(temp) mocoeffs = mocoeffs[0:(mo+len(temp)), :] # In case some aren't present if self.betamocoeffs: self.mocoeffs.append(mocoeffs) else: self.mocoeffs = [mocoeffs] if line[7:12] == "irrep": ########## eigenvalues ########### # This section appears once at the start of a geo-opt and once at the end # unless IPRINT SCF is used (when it appears at every step in addition) if not hasattr(self, "moenergies"): self.moenergies = [] equals = next(inputfile) while equals[1:5] != "====": # May be one or two lines of title (compare duhf_1.out and mg10.out) equals = next(inputfile) moenergies = [] line = next(inputfile) if not line.strip(): # May be a blank line here (compare duhf_1.out and mg10.out) line = next(inputfile) while line.strip() and line != equals: # May end with a blank or equals temp = line.strip().split() moenergies.append(utils.convertor(float(temp[2]), "hartree", "eV")) line = next(inputfile) self.nmo = len(moenergies) if self.betamoenergies: self.moenergies.append(moenergies) self.betamoenergies = False elif self.scftype == 'gvb': self.moenergies = [moenergies, moenergies] else: self.moenergies = [moenergies] # The dipole moment is printed 
by default at the beginning of the wavefunction analysis, # but the value is in atomic units, so we need to convert to Debye. It seems pretty # evident that the reference point is the origin (0,0,0) which is also the center # of mass after reorientation at the beginning of the job, although this is not # stated anywhere (would be good to check). # # ********************* # wavefunction analysis # ********************* # # commence analysis at 24.61 seconds # # dipole moments # # # nuclear electronic total # # x 0.0000000 0.0000000 0.0000000 # y 0.0000000 0.0000000 0.0000000 # z 0.0000000 0.0000000 0.0000000 # if line.strip() == "dipole moments": # In older versions there is only one blank line before the header, # and in newer versions there are two. self.skip_line(inputfile, 'blank') line = next(inputfile) if not line.strip(): line = next(inputfile) self.skip_line(inputfile, 'blank') dipole = [] for i in range(3): line = next(inputfile) dipole.append(float(line.split()[-1])) reference = [0.0, 0.0, 0.0] dipole = utils.convertor(numpy.array(dipole), "ebohr", "Debye") if not hasattr(self, 'moments'): self.moments = [reference, dipole] else: assert numpy.all(self.moments[1] == dipole) # Net atomic charges are not printed at all, it seems, # but you can get at them from nuclear charges and # electron populations, which are printed like so: # # --------------------------------------- # mulliken and lowdin population analyses # --------------------------------------- # # ----- total gross population in aos ------ # # 1 1 c s 1.99066 1.98479 # 2 1 c s 1.14685 1.04816 # ... # # ----- total gross population on atoms ---- # # 1 c 6.0 6.00446 5.99625 # 2 c 6.0 6.00446 5.99625 # 3 c 6.0 6.07671 6.04399 # ...
if line[10:49] == "mulliken and lowdin population analyses": if not hasattr(self, "atomcharges"): self.atomcharges = {} while "total gross population on atoms" not in line: line = next(inputfile) self.skip_line(inputfile, 'blank') line = next(inputfile) mulliken, lowdin = [], [] while line.strip(): nuclear = float(line.split()[2]) mulliken.append(nuclear - float(line.split()[3])) lowdin.append(nuclear - float(line.split()[4])) line = next(inputfile) self.atomcharges["mulliken"] = mulliken self.atomcharges["lowdin"] = lowdin # ----- spinfree UHF natural orbital occupations ----- # # 2.0000000 2.0000000 2.0000000 2.0000000 2.0000000 2.0000000 2.0000000 # # 2.0000000 2.0000000 2.0000000 2.0000000 2.0000000 1.9999997 1.9999997 # ... if "natural orbital occupations" in line: occupations = [] self.skip_line(inputfile, "blank") line = next(inputfile) while line.strip(): occupations.extend(map(float, line.split())) self.skip_line(inputfile, "blank") line = next(inputfile) self.set_attribute('nooccnos', occupations) if __name__ == "__main__": import doctest doctest.testmod() cclib-1.3.1/src/cclib/bridge/0002755000175100016050000000000012467450677015574 5ustar kmlcclib00000000000000cclib-1.3.1/src/cclib/bridge/__init__.py0000644000175100016050000000152212425223271017661 0ustar kmlcclib00000000000000# This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html.
"""Facilities for moving parsed data to other cheminformatic libraries.""" try: import openbabel except Exception: pass else: from .cclib2openbabel import makeopenbabel try: import PyQuante except ImportError: pass else: from .cclib2pyquante import makepyquante try: from .cclib2biopython import makebiopython except ImportError: pass cclib-1.3.1/src/cclib/bridge/cclib2pyquante.py0000644000175100016050000000254312425223271021053 0ustar kmlcclib00000000000000# This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2013, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Bridge for using cclib data in PyQuante (http://pyquante.sourceforge.net).""" from __future__ import print_function import sys try: from PyQuante.Molecule import Molecule except ImportError: print("PyQuante could not be imported.") def makepyquante(atomcoords, atomnos, charge=0, mult=1): """Create a PyQuante Molecule. >>> import numpy >>> from PyQuante.hartree_fock import hf >>> atomnos = numpy.array([1,8,1],"i") >>> a = numpy.array([[-1,1,0],[0,0,0],[1,1,0]],"f") >>> pyqmol = makepyquante(a,atomnos) >>> en,orbe,orbs = hf(pyqmol) >>> print int(en * 10) / 10. # Should be around -73.8 -73.8 """ return Molecule("notitle", list(zip(atomnos, atomcoords)), units="Angstrom", charge=charge, multiplicity=mult) if __name__ == "__main__": import doctest doctest.testmod() cclib-1.3.1/src/cclib/bridge/cclib2biopython.py0000644000175100016050000000264112425223271021217 0ustar kmlcclib00000000000000# This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. 
# # Copyright (C) 2006, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Bridge for using cclib data in biopython (http://biopython.org).""" from Bio.PDB.Atom import Atom from cclib.parser.utils import PeriodicTable def makebiopython(atomcoords, atomnos): """Create a list of BioPython Atoms. This creates a list of BioPython Atoms suitable for use by Bio.PDB.Superimposer, for example. >>> import numpy >>> from Bio.PDB.Superimposer import Superimposer >>> atomnos = numpy.array([1,8,1],"i") >>> a = numpy.array([[-1,1,0],[0,0,0],[1,1,0]],"f") >>> b = numpy.array([[1.1,2,0],[1,1,0],[2,1,0]],"f") >>> si = Superimposer() >>> si.set_atoms(makebiopython(a,atomnos),makebiopython(b,atomnos)) >>> print("%.4f" % si.rms) 0.2934 """ pt = PeriodicTable() bioatoms = [] for coords, atomno in zip(atomcoords, atomnos): bioatoms.append(Atom(pt.element[atomno], coords, 0, 0, 0, 0, 0)) return bioatoms if __name__ == "__main__": import doctest doctest.testmod() cclib-1.3.1/src/cclib/bridge/cclib2openbabel.py0000644000175100016050000000523112467425265021146 0ustar kmlcclib00000000000000# This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2009-2015, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html.
"""Bridge between cclib data and openbabel (http://openbabel.org).""" import openbabel as ob from cclib.parser.data import ccData def makeopenbabel(atomcoords, atomnos, charge=0, mult=1): """Create an Open Babel molecule. >>> import numpy, openbabel >>> atomnos = numpy.array([1,8,1],"i") >>> coords = numpy.array([[-1.,1.,0.],[0.,0.,0.],[1.,1.,0.]]) >>> obmol = makeopenbabel(coords, atomnos) >>> obconversion = openbabel.OBConversion() >>> formatok = obconversion.SetOutFormat("inchi") >>> print obconversion.WriteString(obmol).strip() InChI=1/H2O/h1H2 """ obmol = ob.OBMol() for i in range(len(atomnos)): # Note that list(atomcoords[i]) is not equivalent!!! coords = atomcoords[i].tolist() atomno = int(atomnos[i]) obatom = ob.OBAtom() obatom.SetAtomicNum(atomno) obatom.SetVector(*coords) obmol.AddAtom(obatom) obmol.ConnectTheDots() obmol.PerceiveBondOrders() obmol.SetTotalSpinMultiplicity(mult) obmol.SetTotalCharge(charge) return obmol def makecclib(mol): """Create cclib attributes and return a ccData from an OpenBabel molecule. Beyond the numbers, masses and coordinates, we could also set the total charge and multiplicity, but often these are calculated from atomic formal charges so it is better to assume that would not be correct. """ attributes = { 'atomcoords': [], 'atommasses': [], 'atomnos': [], 'natom': mol.NumAtoms(), } for atom in ob.OBMolAtomIter(mol): attributes['atomcoords'].append([atom.GetX(), atom.GetY(), atom.GetZ()]) attributes['atommasses'].append(atom.GetAtomicMass()) attributes['atomnos'].append(atom.GetAtomicNum()) return ccData(attributes) def readfile(fname, format): """Read a file with OpenBabel and extract cclib attributes.""" obc = ob.OBConversion() if obc.SetInFormat(format): mol = ob.OBMol() obc.ReadFile(mol, fname) return makecclib(mol) else: print("Unable to load the %s reader from OpenBabel." 
% format) return {} if __name__ == "__main__": import doctest doctest.testmod() cclib-1.3.1/src/scripts/0002755000175100016050000000000012467450677014753 5ustar kmlcclib00000000000000cclib-1.3.1/src/scripts/cda0000644000175100016050000000427012467431146015414 0ustar kmlcclib00000000000000#!/usr/bin/env python3 # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2007-2014, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. from __future__ import print_function import os import sys import glob import getopt import logging import numpy from cclib.parser import ccopen from cclib.method import CDA if __name__ == "__main__": parser1 = ccopen(sys.argv[1], logging.ERROR) parser2 = ccopen(sys.argv[2], logging.ERROR) parser3 = ccopen(sys.argv[3], logging.ERROR) data1 = parser1.parse(); data2 = parser2.parse(); data3 = parser3.parse() fa = CDA(data1, None, logging.ERROR) retval = fa.calculate([data2, data3]) if retval: print("Charge decomposition analysis of %s\n"%(sys.argv[1])) if len(data1.homos) == 2: print("ALPHA SPIN:") print("===========") print(" MO# d b r s") print("-------------------------------------") for spin in range(len(data1.homos)): if spin == 1: print("\nBETA SPIN:") print("==========") for i in range(len(fa.donations[spin])): print("%4i: %7.3f %7.3f %7.3f %7.3f" % \ (i + 1, fa.donations[spin][i], fa.bdonations[spin][i], fa.repulsions[spin][i], fa.residuals[spin][i])) if i == data1.homos[spin]: print("------ HOMO - LUMO gap ------") print("-------------------------------------") print(" T: %7.3f %7.3f %7.3f %7.3f" % \ (fa.donations[spin].sum(), fa.bdonations[spin].sum(), fa.repulsions[spin].sum(), 
fa.residuals[spin].sum())) cclib-1.3.1/src/scripts/ccget0000644000175100016050000001436712467425265015767 0ustar kmlcclib00000000000000#!/usr/bin/env python3 # # This file is part of cclib (http://cclib.github.io), a library for parsing # and interpreting the results of computational chemistry packages. # # Copyright (C) 2006-2015, the cclib development team # # The library is free software, distributed under the terms of # the GNU Lesser General Public version 2.1 or later. You should have # received a copy of the license along with cclib. You can also access # the full license online at http://www.gnu.org/copyleft/lgpl.html. """Script for loading data from computational chemistry files.""" from __future__ import print_function import getopt import glob import logging import os import sys from cclib.parser import ccData from cclib.parser import ccread MSG_USAGE = """\ Usage: ccget <attribute> [<attribute>] <compchemlogfile> [<compchemlogfile>] Try ccget --help for more information\ """ MSG_USAGE_LONG = """\ Usage: ccget <attribute> [<attribute>] <compchemlogfile> [<compchemlogfile>] where <attribute> is one of the attributes to be parsed by cclib from each of the compchemlogfiles. For a list of attributes available in a file, use --list (or -l): ccget --list <compchemlogfile> To parse multiple files as one input stream, use --multi (or -m): ccget --multi <attribute> [<attribute>] <compchemlogfile> <compchemlogfile> [<compchemlogfile>] Additional options: -v or --verbose: more verbose parsing output (only errors by default) -u or --future: use experimental features (currently optdone_as_list)\ """ # These are the options ccget accepts and their one letter versions. OPTS_LONG = ["help", "list", "multi", "verbose", "future"] OPTS_SHORT = "hlmvu" def ccget(): """Parse files with cclib based on command line arguments.""" # Parse the arguments and pass them to ccget, but print help information # and exit if it fails.
try: optlist, arglist = getopt.getopt(sys.argv[1:], OPTS_SHORT, OPTS_LONG) except getopt.GetoptError: print(MSG_USAGE_LONG) sys.exit(1) future = False showattr = False multifile = False verbose = False for opt, arg in optlist: if opt in ("-h", "--help"): print(MSG_USAGE_LONG) sys.exit() if opt in ("-l", "--list"): showattr = True if opt in ("-m", "--multi"): multifile = True if opt in ("-v", "--verbose"): verbose = True if opt in ("-u", "--future"): future = True # We need at least one attribute and the filename, so two arguments, or # just one filename if we want to list attributes that can be extracted. # In multifile mode, we generally want at least two filenames, so the # expected number of arguments is a bit different. if not multifile: correct_number = (not showattr and len(arglist) > 1) or (showattr and len(arglist) > 0) else: correct_number = (not showattr and len(arglist) > 2) or (showattr and len(arglist) > 1) if not correct_number: print("The number of arguments does not seem to be correct.") print(MSG_USAGE) sys.exit(1) # Figure out which are the attribute names and which are the filenames. # Note that in Linux, the shell expands wild cards, but not so in Windows, # so try to do that here using glob. attrnames = [] filenames = [] for arg in arglist: if arg in ccData._attrlist: attrnames.append(arg) elif os.path.isfile(arg): filenames.append(arg) else: wildcardmatches = glob.glob(arg) if wildcardmatches: filenames.extend(wildcardmatches) else: print("%s is neither a filename nor an attribute name." % arg) print(MSG_USAGE) sys.exit(1) # Since there is some ambiguity to the correct number of arguments, check # that there is at least one filename (or two in multifile mode), and also # at least one attribute to parse if the -l option was not passed. 
if len(filenames) == 0: print("No logfiles given") sys.exit(1) if multifile and len(filenames) == 1: print("Expecting at least two logfiles in multifile mode") sys.exit(1) if not showattr and len(attrnames) == 0: print("No attributes given") sys.exit(1) # This should be sufficient to correctly handle multiple files, that is to # run the loop below only once with all logfiles in the variable `filename`. # Although, perhaps it would be clearer to abstract the contents of the loop # into another function. if multifile: filenames = [filenames] # Now parse each file and print out the requested attributes. for filename in filenames: if multifile: name = ", ".join(filename[:-1]) + " and " + filename[-1] else: name = filename # The keyword dictionary is not used much, but could be useful for # passing options downstream. For example, we might use --future for # triggering experimental or alternative behavior (as with optdone). kwargs = {} if verbose: kwargs['verbose'] = True kwargs['loglevel'] = logging.INFO else: kwargs['verbose'] = False kwargs['loglevel'] = logging.ERROR if future: kwargs['future'] = True print("Attempting to read %s" % name) data = ccread(filename, **kwargs) if data is None: print("Cannot figure out the format of '%s'" % name) print("Report this to the cclib development team if you think it is an error.") print("\n" + MSG_USAGE) sys.exit() if showattr: print("cclib can parse the following attributes from %s:" % name) for attr in data._attrlist: if hasattr(data, attr): print(" %s" % attr) else: invalid = False for attr in attrnames: if hasattr(data, attr): print("%s:\n%s" % (attr, getattr(data, attr))) else: print("Could not parse %s from this file."
                          % attr)
                    invalid = True
            if invalid:
                print(MSG_USAGE_LONG)


if __name__ == "__main__":

    ccget()

cclib-1.3.1/setup.py

#!/usr/bin/env python3
#
# This file is part of cclib (http://cclib.github.io), a library for parsing
# and interpreting the results of computational chemistry packages.
#
# Copyright (C) 2006-2015, the cclib development team
#
# The library is free software, distributed under the terms of
# the GNU Lesser General Public License version 2.1 or later. You should have
# received a copy of the license along with cclib. You can also access
# the full license online at http://www.gnu.org/copyleft/lgpl.html.

"""cclib: parsers and algorithms for computational chemistry

cclib is a Python library that provides parsers for computational
chemistry log files. It also provides a platform to implement
algorithms in a package-independent manner.
"""

doclines = __doc__.split("\n")

# Chosen from http://www.python.org/pypi?:action=list_classifiers
classifiers = """Development Status :: 5 - Production/Stable
Environment :: Console
Intended Audience :: Science/Research
Intended Audience :: Developers
License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)
Natural Language :: English
Operating System :: OS Independent
Programming Language :: Python
Topic :: Scientific/Engineering :: Chemistry
Topic :: Software Development :: Libraries :: Python Modules"""

programs = ['ADF', 'GAMESS', 'GAMESS-UK', 'Gaussian', 'Jaguar', 'Molpro', 'NWChem', 'ORCA', 'Psi', 'QChem']


def setup_cclib():

    import os
    import sys

    # Import from setuptools only if requested.
    if 'egg' in sys.argv:
        sys.argv.pop(sys.argv.index('egg'))
        from setuptools import setup
    else:
        from distutils.core import setup

    # The list of packages to be installed.
    cclib_packages = ['cclib', 'cclib.parser', 'cclib.progress', 'cclib.method', 'cclib.bridge']

    setup(
        name = "cclib",
        version = "1.3.1",
        url = "http://cclib.github.io/",
        author = "cclib development team",
        author_email = "cclib-users@lists.sourceforge.net",
        maintainer = "cclib development team",
        maintainer_email = "cclib-users@lists.sourceforge.net",
        license = "LGPL",
        description = doclines[0],
        long_description = "\n".join(doclines[2:]),
        classifiers = classifiers.split("\n"),
        platforms = ["Any."],
        packages = cclib_packages,
        package_dir = { 'cclib':'src/cclib' },
        scripts = ["src/scripts/ccget", "src/scripts/cda"],
    )


if __name__ == '__main__':
    setup_cclib()

cclib-1.3.1/CHANGELOG

Changes since cclib-1.3:

    Features:
        * New attribute nooccnos for natural orbital occupation numbers
        * Read data from XYZ files using OpenBabel bridge
        * Start basic tests for bridge functionality

    Bugfixes:
        * Better handling of ONIOM logfiles in Gaussian (Clyde Fare)
        * Fix IR intensity bug in Gaussian parser (Clyde Fare)
        * Fix QChem parser for OpenMP output
        * Fix parsing TDDFT/RPA transitions (Felix Plasser)
        * Fix encoding issues for UTF-8 symbols in parsers and bridges

Changes since cclib-1.2:

    Features:
        * New parser: cclib can now parse NWChem files
        * New parser: cclib can now parse Psi (versions 3 and 4) files
        * New parser: cclib can now parse QChem files (by Eric Berquist)
        * New method: Nuclear (currently calculates the repulsion energy)
        * Handle Gaussian basis set output with GFPRINT keyword
        * Attribute optdone reverted to single Boolean value by default
        * Add --verbose and --future options to ccget and parsers
        * Replaced PC-GAMESS test files with newer Firefly versions
        * Updated test file versions to GAMESS-UK 8.0

    Bugfixes:
        * Handle GAMESS-US file with LZ value analysis (Martin Rahm)
        * Handle Gaussian jobs with stars in output (Russell Johnson, NIST)
        * Handle ORCA singlet-only TD calculations (May A.)
        * Fix parsing of Gaussian jobs with fragments and ONIOM output
        * Use UTF-8 encodings for files that need them (Matt Ernst)

Changes since cclib-1.1:

    Features:
        * Move project to github
        * Transition to Python 3 (Python 2.7 will still work)
        * Add a multifile mode to ccget script
        * Extract vibrational displacements for ORCA
        * Extract natural atom charges for Gaussian (Fedor Zhuravlev)
        * New attribute optdone flags converged geometry optimization
        * Updated test file versions to ADF2013.01, GAMESS-US 2012,
          Gaussian09, Molpro 2012 and ORCA 3.0.1

    Bugfixes:
        * Ignore unicode errors in logfiles
        * Handle Gaussian jobs with terse output (basis set count not reported)
        * Handle Gaussian jobs using IndoGuess (Scott McKechnie)
        * Handle Gaussian file with irregular ONIOM gradients (Tamilmani S)
        * Handle ORCA file with SCF convergence issue (Melchor Sanchez)
        * Handle Gaussian file with problematic IRC output (Clyde Fare)
        * Handle ORCA file with AM1 output (Julien Idé)
        * Handle GAMESS-US output with irregular frequency format (Andrew Warden)

Changes since cclib-1.0.1:

    Features:
        * Add progress info for all parsers
        * Support ONIOM calculations in Gaussian (Karen Hemelsoet)
        * New attribute atomcharges extracts Mulliken and Lowdin atomic
          charges if present
        * New attribute atomspins extracts Mulliken and Lowdin atomic spin
          densities if present
        * New thermodynamic attributes: freeenergy, temperature, enthalpy
          (Edward Holland)
        * Extract PES information: scanenergies, scancoords, scanparm, scannames
          (Edward Holland)

    Bugfixes:
        * Handle coupled cluster energies in Gaussian 09 (Björn Dahlgren)
        * Vibrational displacement vectors missing for Gaussian 09 (Björn Dahlgren)
        * Fix problem parsing vibrational frequencies in some GAMESS-US files
        * Fix missing final scfenergy in ADF geometry optimisations
        * Fix missing final scfenergy for ORCA where a specific number of SCF
          cycles has been specified
        * ORCA scfenergies not parsed if COSMO solvent effects included
        * Allow spin unrestricted calculations to
          use the fragment MO overlaps correctly for the MPA and CDA calculations
        * Handle Gaussian MO energies that are printed as a row of asterisks
          (Jerome Kieffer)
        * Add more explicit license notices, and allow LGPL versions after 2.1
        * Support Firefly calculations where nmo != nbasis (Pavel Solntsev)
        * Fix problem parsing vibrational frequency information in recent
          GAMESS (US) files (Chengju Wang)
        * Apply patch from Chengju Wang to handle GAMESS calculations with more
          than 99 atoms
        * Handle Gaussian files with more than 99 atoms having pseudopotentials
          (Björn Baumeier)

Changes since cclib-1.0:

    Features:
        * New attribute atommasses - atomic masses in Dalton
        * Added support for Gaussian geometry optimisations that change the
          number of linearly independent basis functions over the course of
          the calculation

    Bugfixes:
        * Handle triplet PM3 calculations in Gaussian03 (Greg Magoon)
        * Some Gaussian09 calculations were missing atomnos (Marius Retegan)
        * Handle multiple pseudopotentials in Gaussian03 (Tiago Silva)
        * Handle Gaussian calculations with >999 basis functions
        * ADF versions > 2007 no longer print overlap info by default
        * Handle parsing Firefly calculations that fail
        * Fix parsing of ORCA calculation (Marius Retegan)

Changes since cclib-0.9:

    Features:
        * Handle PBC calculations from Gaussian
        * Updates to handle Gaussian09
        * Support TDDFT calculations from ADF
        * A number of improvements for GAMESS support
        * ccopen now supports any file-like object with a read() method, so it
          can parse across HTTP

    Bugfixes:
        * Many many additional files parsed thanks to bugs reported by users

Changes since cclib-0.8:

    Features:
        * New parser: cclib can now parse ORCA files
        * Added option to use setuptools instead of distutils.core for installing
        * Improved handling of CI and TD-DFT data: TD-DFT data extracted from
          GAMESS and etsecs standardised across all parsers
        * Test suite changed to include output from only the newest program
          versions

    Bugfixes:
        * A small number of parsing errors were fixed

Changes since cclib-0.7:

    Features:
        * New parser: cclib can now parse Molpro files
        * Separation of parser and data objects: Parsed data is now returned
          as a ccData object that can be pickled, and converted to and from JSON
        * Parsers: multiple files can be parsed with one parse command
        * NumPy support: Dropped Numeric support in favour of NumPy
        * API addition: 'charge' for molecular charge
        * API addition: 'mult' for spin multiplicity
        * API addition: 'atombasis' for indices of atom orbitals on each atom
        * API addition: 'nocoeffs' for Natural Orbital (NO) coefficients
        * GAMESS-US parser: added 'etoscs' (CIS calculations)
        * Jaguar parser: added 'mpenergies' (LMP2 calculations)
        * Jaguar parser: added 'etenergies' and 'etoscs' (CIS calculations)
        * New method: Lowdin Population Analysis (LPA)
        * Tests: unittests can be run from the Python interpreter, and for a
          single parser; the number of "passed" tests is also counted and shown

    Bugfixes:
        * Several parsing errors were fixed
        * Fixed some methods to work with different numbers of alpha and beta
          MO coefficients in mocoeffs (MPA, CSPA, OPA)

Changes since cclib-0.6.1:

    Features:
        * New parser: cclib can now parse Jaguar files
        * ccopen: Can handle log files which have been compressed into .zip,
          .bz2 or .gz files.
        * API addition: 'gbasis' holds the Gaussian basis set
        * API addition: 'coreelectrons' contains the number of core electrons
          in each atom's pseudopotential
        * API addition: 'mpenergies' holds the Moller-Plesset corrected
          molecular electronic energies
        * API addition: 'vibdisps' holds the Cartesian displacement vectors
        * API change: 'mocoeffs' is now a list of rank 2 arrays, rather than a
          rank 3 array
        * API change: 'moenergies' is now a list of rank 1 arrays, rather than
          a rank 2 array
        * GAMESS-UK parser: added 'vibramans'
        * New method: Charge Decomposition Analysis (CDA) for studying electron
          donation, back donation, and repulsion between fragments in a molecule
        * New method: Fragment Analysis for studying bonding interactions
          between two or more fragments in a molecule
        * New method: Ability to calculate the electron density or wavefunction

    Bugfixes:
        * GAMESS parser:
            Failed to parse frequency calculation with imaginary frequencies
            Rotations and translations now not included in frequencies
            Failed to parse a DFT calculation
        * GAMESS-UK parser:
            'atomnos' not being extracted
            Rotations and translations now not included in frequencies
        * bridge to OpenBabel: No longer dependent on pyopenbabel

Changes since cclib-0.6.0:

    Bugfixes:
        * cclib: The "import cclib.parsers" statement failed due to references
          to Molpro and Jaguar parsers which are not present
        * Gaussian parser: Failed to parse single point calculations where the
          input coords are a z-matrix, and symmetry is turned off.

Changes since cclib-0.6b:

    Features
        * ADF parser: If some MO eigenvalues are not present, the parser does
          not fail, but uses values of 99999 instead and A symmetry

    Bugfixes
        * ADF parser: The following bugs have been fixed
            P/D orbitals for single atoms not handled correctly
            Problem parsing homos in unrestricted calculations
            Problem skipping the Create sections in certain calculations
        * Gaussian parser: The following bugs have been fixed
            Parser failed if standard orientation not found
        * ccget: aooverlaps not included when using --list option

Changes since cclib-0.5:

    Features
        * New parser: GAMESS-UK parser
        * API addition: the .clean() method
          The .clean() method of a parser clears all of the parsed
          attributes. This is useful if you need to reparse during
          the course of a calculation.
        * Function rename: guesstype() has been renamed to ccopen()
        * Speed up: Calculation of Overlap Density of States has
          been sped up by two orders of magnitude

    Bugfixes
        * ccget: Passing multiple filenames now works on Windows too
        * ADF parser: The following bugs have been fixed
            Problem with parsing SFOs in certain log files
            Handling of molecules with orbitals of E symmetry
            Couldn't find the HOMO in log files from new versions of ADF
            Parser used to miss attributes if SCF not converged
            For a symmetrical molecule, mocoeffs were in the wrong order and
            the homo was not identified correctly if degenerate
        * Gaussian parser: The following bugs have been fixed
            SCF values was not extracting the dEnergy value
            Was extracting Depolar P instead of Raman activity
        * ccopen: Minor problems fixed with identification of log files

Changes since cclib-0.5b:

    Features
        * src/scripts/ccget: Added handling of multiple filenames.
          It's now possible to use ccget as follows:
              ccget *.log
          This is a good way of checking out whether cclib is able to
          parse all of the files in a given directory.
          Also possible is:
              ccget homos *.log
        * Change of license: Changed license from GPL to LGPL

    Bugfixes
        * src/cclib/parser/gamessparser.py: Bugfix: gamessparser was dying on
          GAMESS VERSION = 12 DEC 2003 gopts, as it was unable to parse the
          scftargets.
        * src/cclib/parser/gamessparser.py: Remove assertion to catch instances
          where scftargets is unset. This occurs in the case of failed
          calculations (e.g. wrong multiplicity).
        * src/cclib/parser/adfparser.py: Fixed one of the errors with the
          Mo5Obdt2-c2v-opt.adfout example, which had to do with the SFOs being
          made of more than two combinations of atoms (4, because of rotation
          in c2v point group). At least one error is still present with
          atomcoords. It looks like non-coordinate integers are being parsed
          as well, which makes some of the atomcoords list have more than the
          3 values for x,y,z.
        * src/cclib/parser/adfparser.py: Hopefully fixed the last error in
          Mo5Obdt2-c2v-opt. Problem was that it was adding line.split()[5:],
          but sometimes there were more than 3 fields left, so it was changed
          to [5:8]. Need to check actual parsed values to make sure it is
          parsed correctly.
        * data/Gaussian, logfiledist, src/cclib/parser/gaussianparser.py,
          test/regression.py: Bug fix: Mo4OSibdt2-opt.log has no atomcoords
          despite being a geo-opt. This was due to the fact that the parser
          was extracting "Input orientation" and not "Standard orientation".
          It's now changed to "Standard orientation" which works for all of
          the files in the repository.

cclib-1.3.1/INSTALL

== cclib installation instructions ==

=== Requirements ===

Before you install cclib, you need to make sure that you have the following:
 * Python (version 3.0 and up, although 2.7 will still work)
 * NumPy (at least version 1.5 is recommended).

Python is an open-source programming language available from http://www.python.org and it is included in many Linux distributions.
In Debian it is installed as follows: (as root)
    apt-get install python python-dev

NumPy (Numerical Python) adds a fast array facility to Python and is available from http://www.numpy.org. Windows users should use the most recent NumPy installation for the Python version they have (2.4, 2.5). Linux users are recommended to find a binary package for their distribution. In Debian it is installed as follows: (as root)
    apt-get install python-numpy

Note: Numeric (the old version of Numerical Python) is not supported by the Numerical Python developers and is not supported by cclib.

To test whether Python is on the PATH, open a command prompt window and type:
    python

If Python is not on the PATH and you use Windows, add the full path to the directory containing it to the end of the PATH variable under Control Panel/System/Advanced Settings/Environment Variables. If you use Linux and Python is not on the PATH, put/edit the appropriate line in your .bashrc or similar startup file.

To test, try importing NumPy at the Python prompt. You should see something similar to the following:

    $ python3
    Python 3.2.3 (default, Feb 27 2014, 21:31:18)
    [GCC 4.6.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import numpy
    >>> numpy.__version__
    '1.6.1'

(To exit, press CTRL+Z in Windows or CTRL+D in Linux)

=== Installing cclib ===

On Debian, Ubuntu and other derived Linux distributions, cclib can be quickly installed with the command:
    aptitude install cclib

The version installed from a distribution might not be the most recent one. To install the most recent version, first download the source code of cclib. Extract the cclib tar file or zip file at an appropriate location, which we will call INSTALLDIR. Open a command prompt and change directory to INSTALLDIR. Next, run the following commands:
    python setup.py build
    python setup.py install (as root)

To test, try importing cclib at the Python prompt.
You should see something similar to the following:

    $ python3
    Python 3.2.3 (default, Feb 27 2014, 21:31:18)
    [GCC 4.6.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    Press ESC for command-line completion (twice for guesses).
    History is saved to ~/.pyhistory.
    >>> import cclib
    >>> cclib.__version__
    '1.3'

To run the unit tests, change directory into INSTALLDIR/test and run the following command:
    python testall.py

This tests the program using the example data files included in the INSTALLDIR/data directory.

=== What next? ===

 * Read the tutorial at: http://cclib.github.io/tutorial.html
 * Read the list and specifications of the extracted data at: http://cclib.github.io/data.html
 * Send any questions to the cclib-users mailing list at: https://lists.sourceforge.net/lists/listinfo/cclib-users.
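
The manual checks from the Requirements section above (Python 3.0 and up or 2.7, and NumPy at least 1.5 recommended) can also be scripted. The following is only a minimal sketch and is not part of cclib: the helper names parse_version, python_ok, numpy_ok and check_requirements are made up for illustration, and the NumPy threshold is a recommendation rather than a hard requirement.

```python
import sys


def parse_version(text):
    """Turn a dotted version string such as '1.6.1' into a tuple of ints.

    Only leading numeric components are kept, so '1.9.0rc1' gives (1, 9, 0).
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
        if len(digits) != len(piece):
            # A non-numeric suffix (e.g. 'rc1') ends the numeric part.
            break
    return tuple(parts)


def python_ok(version_info):
    """Python 3.0 and up is expected, although 2.7 will still work."""
    major, minor = version_info[0], version_info[1]
    return major >= 3 or (major, minor) == (2, 7)


def numpy_ok(version_text, minimum=(1, 5)):
    """NumPy 1.5 or newer is recommended (assumed threshold from INSTALL)."""
    return parse_version(version_text) >= minimum


def check_requirements():
    """Return a list of human-readable problems; empty if all looks fine."""
    problems = []
    if not python_ok(sys.version_info):
        problems.append("Python 3.0+ (or 2.7) is expected, found %s"
                        % sys.version.split()[0])
    try:
        import numpy
    except ImportError:
        problems.append("NumPy is not installed")
    else:
        if not numpy_ok(numpy.__version__):
            problems.append("NumPy 1.5+ is recommended, found %s"
                            % numpy.__version__)
    return problems


if __name__ == "__main__":
    for message in check_requirements() or ["Requirements look fine."]:
        print(message)
```

Saving this under any name and running it with the same interpreter that will be used for cclib prints either the detected problems or a short confirmation.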