==> cclib-1.6.2/.editorconfig <==
# The full list of properties is located at
# https://github.com/editorconfig/editorconfig/wiki/EditorConfig-Properties.
root = true
[*.py]
charset = utf-8
indent_style = space
indent_size = 4
[.travis.yml]
indent_style = space
indent_size = 2
==> cclib-1.6.2/.gitignore <==
__pycache__
MANIFEST
build
dist
htmlcov
*.pyc
.cache
.coverage
.pytest_cache
cclib.egg-info
==> cclib-1.6.2/.travis.yml <==
language: python
python:
- 2.7
- 3.4
- 3.7
addons:
apt:
packages:
- swig
cache:
pip: true
before_install:
- sudo add-apt-repository "deb http://archive.ubuntu.com/ubuntu cosmic universe"
- sudo apt update
- sudo apt install libopenbabel-dev
- pip install -r requirements.txt
install:
- pip install .
before_script:
- |
export DOCS_BRANCH_NAME=master
export DOCS_REPO_NAME=cclib.github.io
export DOCS_REPO_OWNER=cclib
export DOCS_ROOT_DIR="${TRAVIS_BUILD_DIR}"/doc/sphinx
export DOCS_BUILD_DIR="${DOCS_ROOT_DIR}"/_build/html
export THEME_DIR="${DOCS_ROOT_DIR}"/_themes
- install -dm755 "${THEME_DIR}"
script:
- env | sort
- bash travis/run_pytest.bash
- bash travis/build_docs.bash
after_success:
- |
if [[ "${TRAVIS_BRANCH}" == master && "${TRAVIS_PULL_REQUEST}" == false && $TRAVIS_PYTHON_VERSION == 3.7 ]];
then
# Commits to master that are not pull requests, that is, only actual
# addition of code to master, should deploy the documentation.
bash ${TRAVIS_BUILD_DIR}/travis/deploy_docs_travis.bash
fi
==> cclib-1.6.2/ANNOUNCE <==
On behalf of the cclib development team, we are pleased to announce the release of cclib 1.6.2, which is now available for download from https://cclib.github.io. This is a minor update to version 1.6.1 that includes some new functionality and attributes, as well as bug fixes and small improvements.
cclib is an open source library, written in Python, for parsing and interpreting the results of computational chemistry packages. It currently parses output files from 15 different programs: ADF, DALTON, Firefly, GAMESS (US), GAMESS-UK, Gaussian, Jaguar, Molpro, MOLCAS, MOPAC, NWChem, ORCA, Psi, QChem and Turbomole.
Among other data, cclib extracts:
* results of SCF, post-Hartree-Fock, TD-DFT and other calculations
* coordinates, energies and geometry optimization data
* information about atomic and molecular orbitals
* vibrational modes, excited states and transitions
* charges, electrostatic moments and polarizabilities
(For a complete list see https://cclib.github.io/data.html).
cclib also provides some calculation methods for interpreting the electronic properties of molecules such as:
* Mulliken and Lowdin population analyses
* Overlap population analysis
* Mayer's bond orders
(For a complete list see https://cclib.github.io/methods.html).
For information on how to use cclib, see our documentation at https://cclib.github.io.
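As a minimal illustration (a sketch only; the logfile name below is a placeholder for any supported output file), parsing and accessing data looks like this:

>>> from cclib.io import ccread
>>> data = ccread("calculation.out")
>>> data.natom          # number of atoms
>>> data.scfenergies    # SCF energies (in eV)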
If you need help, find a bug, want new features or have any questions, please send an email to our mailing list:
https://lists.sourceforge.net/lists/listinfo/cclib-users
If your published work uses cclib, please support its development by citing the following article:
N. M. O'Boyle, A. L. Tenderholt, K. M. Langner, cclib: a library for package-independent computational chemistry algorithms, J. Comp. Chem. 29 (5), 839-845 (2008)
You can also specifically reference this version of cclib as:
Eric Berquist, Karol M. Langner, Noel M. O'Boyle, and Adam L. Tenderholt. Release of cclib version 1.6. 2018. https://dx.doi.org/10.5281/zenodo.1407790
Regards,
The cclib development team
———
Summary of changes since last version:
* New attributes nsocoeffs and nsooccnos for natural spin orbital coefficients (Shiv Upadhyay)
* New methods for alpha and beta electron counts (Jaime Rodríguez-Guerra)
* Support coreelectrons attribute in Molcas (Kunal Sharma)
* Support etoscs for response calculations in Dalton (Peter Reinholdt)
* Updated testing framework (Jaime Rodríguez-Guerra, Maxim Stolyarchuk and others)
* Many other minor improvements and bug fixes from Alessandro Genova, Felipe S. S. Schneider and others
==> cclib-1.6.2/CHANGELOG <==
Changes in cclib-1.6.2
Features:
* Molden writer now supports ghost atoms (Shiv Upadhyay)
* Handle comments in XYZ files when reading and writing
* Updated regression testing framework (Amanda Dumi, Shiv Upadhyay)
* Updated test file versions to GAMESS-US 2018 (Shiv Upadhyay)
Bugfixes:
* Fixed parsing ORCA output with user comments in coordinates (Felix Plasser)
* Fixed parsing ORCA output with embedding potentials
* Fixed parsing ORCA output with ROCIS in version 4.1
* Fixed parsing etenergies and similar attributes in ORCA for excited states
* Fixed parsing of vibfreqs for ORCA for linear molecules
* Parsing geometry optimizations in ORCA is more robust with respect to line endings
Changes in cclib-1.6.1
Features:
* New attribute nsocoeffs for natural spin orbital coefficients (Shiv Upadhyay)
* New attribute nsooccnos for natural spin orbital occupation numbers (Shiv Upadhyay)
* New methods: alpha and beta electron counts (Jaime Rodríguez-Guerra)
* Support coreelectrons attribute in Molcas (Kunal Sharma)
* Support etoscs for response calculations in Dalton (Peter Reinholdt)
* Support etenergies for TDDFT in GAMESS
* Support etrotats attribute in ORCA
* Support functional name in metadata for Psi4 (Alessandro Genova)
* Updated testing framework (Jaime Rodríguez-Guerra, Maxim Stolyarchuk and others)
* Updated test file version to QChem 5.1
Bugfixes:
* Fixed parsing GAMESS output for EOM-CC output
* Fixed parsing Gaussian output for G3 jobs
* Fixed parsing ORCA output for certain invalid inputs (Felipe S. S. Schneider)
* Fixed parsing of mocoeffs in ORCA when they are glued together (Felipe S. S. Schneider)
* Fixed parsing of mocoeffs and vibfreqs in Psi4 (Alessandro Genova)
* Fixed parsing of mocoeffs in Molcas for some files (Shiv Upadhyay)
* Fixed parsing of etsecs in Dalton
* Fixed bond atom indices in CJSON output (Alessandro Genova)
Changes in cclib-1.6
Features:
* New parser: cclib can now parse Molcas files (Kunal Sharma)
* New parser: cclib can now parse Turbomole files (Christopher Rowley, Kunal Sharma)
* New script: ccframe writes data table files from logfiles (Felipe Schneider)
* New method: stoichiometry builds the chemical formula of a system (Jaime Rodríguez-Guerra)
* Support package version in metadata for most parsers
* Support time attribute and BOMD output in Gaussian, NWChem, ORCA and QChem
* Support grads and metadata attributes in ORCA (Jonathon Vandezande)
* Experimental support for CASSCF output in ORCA (Jonathon Vandezande)
* Added entry in metadata for successful completion of jobs
* Updated test file versions to ORCA 4.0
* Updated minimum Python3 version to 3.4
Bugfixes:
* Fixed parsing ORCA output with linear molecules (Jonathon Vandezande)
* Fixed parsing NWChem output with incomplete SCF
Changes in cclib-1.5.3
Features:
* New attribute transprop for electronic transitions (Jonathon Vandezande)
* Support grads attribute in Psi4 (Adam Abbott)
* Support grads attribute in Molpro (Oskar Weser)
* Support optstatus for IRCs and in Psi4 (Emmanuel LaTruelle)
* Updated test file versions to Gaussian16 (Andrew S. Rosen)
* Add ability to write XYZ coordinates for arbitrary indices
Bugfixes:
* Fixed ccwrite script and added unit tests (Georgy Frolov)
* Fixed closed shell determination for Gaussian (Jaime Rodríguez-Guerra)
* Fixed parsing of natom for >9999 atoms in Gaussian (Jaime Rodríguez-Guerra)
* Fixed parsing of ADF jobs with no title
* Fixed parsing of charge and core electrons when using ECPs in QChem
* Fixed parsing of scfvalues for malformed output in Gaussian
Changes in cclib-1.5.2:
Features:
* Support for writing Molden and WFX files (Sagar Gaur)
* Support for thermochemistry attributes in ORCA (Jonathon Vandezande)
* Support for chelpg atomic charges in ORCA (Richard Gowers)
* Updated test file versions to GAMESS-US 2017 (Sagar Gaur)
* Added option to print full arrays with ccget (Sagar Gaur)
Bugfixes:
* Fixed polarizability parsing bug in DALTON (Maxim Stolyarchuk)
* Fixed IRC parsing in Gaussian for large trajectories (Dénes Berta, LaTruelle)
* Fixed coordinate parsing for heavy elements in ORCA (Jonathon Vandezande)
* Fixed parsing of large mocoeffs in fixed width format for QChem (srtlg)
* Fixed parsing of large polarizabilities in fixed width format for DALTON (Maxim Stolyarchuk)
* Fixed parsing molecular orbitals when there are more orbitals than basis functions in QChem
Changes in cclib-1.5.1:
Features:
* New attribute polarizabilities for static or dynamic dipole polarizability
* New attribute pressure for thermochemistry (renpj)
* Add property to detect closed shells in parsed data
* Handle RPA excited state calculation in ORCA, in addition to TDA
* Support for Python 3.6
Bugfixes:
* Restore alias cclib.parser.ccopen for backwards compatibility
* Fixed parsing thermochemistry for single atoms in QChem
* Fixed handling of URLs (Alexey Alnatanov)
* Fixed Atom object creation in Biopython bridge (Nitish Garg)
* Fixed ccopen when working with multiple files
Changes in cclib-1.5:
Features:
* Support for both reading and writing CJSON (Sanjeed Schamnad)
* New parser: cclib can now parse MOPAC files (Geoff Hutchison)
* New attribute time tracks coordinates for dynamics jobs (Ramon Crehuet)
* New attribute metadata holds miscellaneous information not in other attributes (bwang2453)
* Extract moments attribute for Gaussian (Geoff Hutchison)
* Extract atombasis for ADF in simple cases (Felix Plasser)
* License change to BSD 3-Clause License
Bugfixes:
* Correct parsing of several attributes for ROHF calculations
* Fixed precision of scfvalues in ORCA
* Fixed MO parsing from older versions of Firefly (mkrompiec)
Changes in cclib-1.4.1:
Features:
* Preliminary support for writing CJSON (Sanjeed Schamnad)
* Tentative support for BOMD trajectories in Gaussian (Ramon Crehuet)
* Support for atombasis in ADF (Felix Plasser)
* Support for nocoeffs and nooccnos in Molpro
Bugfixes:
* Fix for non-standard basis sets in DALTON
* Fix for non-standard MO coefficient printing in GAMESS
Changes in cclib-1.4:
Features:
* New parser: cclib can now parse DALTON files
* New parser: cclib can now parse ORCA files
* New attribute optstatus for status during geometry optimizations and scans
* Extract atommasses for GAMESS-US (Sagar Gaur)
* Extract atombasis, gbasis and mocoeffs for QChem
* Extract gbasis for ORCA (Felix Plasser)
* Handle multi-step jobs by parsing only the supersystem
* Improve parsing vibrational symmetries and displacements for Gaussian (mwykes)
* Improve support for compressed files (mwykes)
* Improve and update unit test and regression suites
* Support for Python 3.5
Bugfixes:
* Fix StopIteration crashes for most parsers
* Fix parsing basis section for Molpro job generated by Avogadro
* Fix parsing multi-job Gaussian output with different orbitals (Geoff Hutchison)
* Fix parsing ORCA geometry optimization with improper internal coordinates (glideht)
* Fix units in atom coordinates parsed from GAMESS-UK files (mwykes)
* Fix test for vibrational frequencies in Turbomole (mwykes)
* Fix parsing vibration symmetries for Molpro (mwykes)
* Fix parsing eigenvectors in GAMESS-US (Alexis Otero-Calvis)
* Fix duplicate parsing of symmetry labels for Gaussian (Martin Peeks)
Changes in cclib-1.3.1:
Features:
* New attribute nooccnos for natural orbital occupation numbers
* Read data from XYZ files using OpenBabel bridge
* Start basic tests for bridge functionality
Bugfixes:
* Better handling of ONIOM logfiles in Gaussian (Clyde Fare)
* Fix IR intensity bug in Gaussian parser (Clyde Fare)
* Fix QChem parser for OpenMP output
* Fix parsing TDDFT/RPA transitions (Felix Plasser)
* Fix encoding issues for UTF-8 symbols in parsers and bridges
Changes in cclib-1.3:
Features:
* New parser: cclib can now parse NWChem files
* New parser: cclib can now parse Psi (versions 3 and 4) files
* New parser: cclib can now parse QChem files (by Eric Berquist)
* New method: Nuclear (currently calculates the repulsion energy)
* Handle Gaussian basis set output with GFPRINT keyword
* Attribute optdone reverted to single Boolean value by default
* Add --verbose and --future options to ccget and parsers
* Replaced PC-GAMESS test files with newer Firefly versions
* Updated test file versions to GAMESS-UK 8.0
Bugfixes:
* Handle GAMESS-US file with LZ value analysis (Martin Rahm)
* Handle Gaussian jobs with stars in output (Russell Johnson, NIST)
* Handle ORCA singlet-only TD calculations (May A.)
* Fix parsing of Gaussian jobs with fragments and ONIOM output
* Use UTF-8 encodings for files that need them (Matt Ernst)
Changes in cclib-1.2:
Features:
* Move project to GitHub
* Transition to Python 3 (Python 2.7 will still work)
* Add a multifile mode to ccget script
* Extract vibrational displacements for ORCA
* Extract natural atom charges for Gaussian (Fedor Zhuravlev)
* New attribute optdone flags converged geometry optimization
* Updated test file versions to ADF2013.01, GAMESS-US 2012,
Gaussian09, Molpro 2012 and ORCA 3.0.1
Bugfixes:
* Ignore Unicode errors in logfiles
* Handle Gaussian jobs with terse output (basis set count not reported)
* Handle Gaussian jobs using IndoGuess (Scott McKechnie)
* Handle Gaussian file with irregular ONIOM gradients (Tamilmani S)
* Handle ORCA file with SCF convergence issue (Melchor Sanchez)
* Handle Gaussian file with problematic IRC output (Clyde Fare)
* Handle ORCA file with AM1 output (Julien Idé)
* Handle GAMESS-US output with irregular frequency format (Andrew Warden)
Changes in cclib-1.1:
Features:
* Add progress info for all parsers
* Support ONIOM calculations in Gaussian (Karen Hemelsoet)
* New attribute atomcharges extracts Mulliken and Löwdin atomic
charges if present
* New attribute atomspins extracts Mulliken and Löwdin atomic spin
densities if present
* New thermodynamic attributes: freeenergy, temperature, enthalpy
(Edward Holland)
* Extract PES information: scanenergies, scancoords, scanparm, scannames
(Edward Holland)
Bugfixes:
* Handle coupled cluster energies in Gaussian 09 (Björn Dahlgren)
* Vibrational displacement vectors missing for Gaussian 09 (Björn
Dahlgren)
* Fix problem parsing vibrational frequencies in some GAMESS-US files
* Fix missing final scfenergy in ADF geometry optimisations
* Fix missing final scfenergy for ORCA where a specific number of SCF
cycles has been specified
* ORCA scfenergies not parsed if COSMO solvent effects included
* Allow spin unrestricted calculations to use the fragment MO overlaps
correctly for the MPA and CDA calculations
* Handle Gaussian MO energies that are printed as a row of asterisks
(Jerome Kieffer)
* Add more explicit license notices, and allow LGPL versions after 2.1
* Support Firefly calculations where nmo != nbasis (Pavel Solntsev)
* Fix problem parsing vibrational frequency information in recent
GAMESS (US) files (Chengju Wang)
* Apply patch from Chengju Wang to handle GAMESS calculations with more
than 99 atoms
* Handle Gaussian files with more than 99 atoms having pseudopotentials
(Björn Baumeier)
Changes in cclib-1.0.1:
Features:
* New attribute atommasses - atomic masses in Dalton
* Added support for Gaussian geometry optimisations that change
the number of linearly independent basis functions over the
course of the calculation
Bugfixes:
* Handle triplet PM3 calculations in Gaussian03 (Greg Magoon)
* Some Gaussian09 calculations were missing atomnos (Marius Retegan)
* Handle multiple pseudopotentials in Gaussian03 (Tiago Silva)
* Handle Gaussian calculations with >999 basis functions
* ADF versions > 2007 no longer print overlap info by default
* Handle parsing Firefly calculations that fail
* Fix parsing of ORCA calculation (Marius Retegan)
Changes in cclib-1.0:
Features:
* Handle PBC calculations from Gaussian
* Updates to handle Gaussian09
* Support TDDFT calculations from ADF
* A number of improvements for GAMESS support
* ccopen now supports any file-like object with a read() method, so it
can parse across HTTP
Bugfixes:
* Many many additional files parsed thanks to bugs reported by users
Changes in cclib-0.9:
Features:
* New parser: cclib can now parse ORCA files
* Added option to use setuptools instead of distutils.core for installing
* Improved handling of CI and TD-DFT data: TD-DFT data extracted from
GAMESS and etsecs standardised across all parsers
* Test suite changed to include output from only the newest program versions
Bugfixes:
* A small number of parsing errors were fixed
Changes in cclib-0.8:
Features:
* New parser: cclib can now parse Molpro files
* Separation of parser and data objects: Parsed data is now returned
as a ccData object that can be pickled, and converted to and from JSON
* Parsers: multiple files can be parsed with one parse command
* NumPy support: Dropped Numeric support in favour of NumPy
* API addition: 'charge' for molecular charge
* API addition: 'mult' for spin multiplicity
* API addition: 'atombasis' for indices of atom orbitals on each atom
* API addition: 'nocoeffs' for Natural Orbital (NO) coefficients
* GAMESS-US parser: added 'etoscs' (CIS calculations)
* Jaguar parser: added 'mpenergies' (LMP2 calculations)
* Jaguar parser: added 'etenergies' and 'etoscs' (CIS calculations)
* New method: Löwdin Population Analysis (LPA)
* Tests: unittests can be run from the Python interpreter, and for
a single parser; the number of "passed" tests is also counted and shown
Bugfixes:
* Several parsing errors were fixed
* Fixed some methods to work with different numbers of alpha and beta
MO coefficients in mocoeffs (MPA, CSPA, OPA)
Changes in cclib-0.7:
Features:
* New parser: cclib can now parse Jaguar files
* ccopen: Can handle log files which have been compressed into .zip,
.bz2 or .gz files.
* API addition: 'gbasis' holds the Gaussian basis set
* API addition: 'coreelectrons' contains the number of core electrons
in each atom's pseudopotential
* API addition: 'mpenergies' holds the Møller-Plesset corrected
molecular electronic energies
* API addition: 'vibdisps' holds the Cartesian displacement vectors
* API change: 'mocoeffs' is now a list of rank 2 arrays, rather than a
rank 3 array
* API change: 'moenergies' is now a list of rank 1 arrays, rather than
rank 2 array
* GAMESS-UK parser: added 'vibramans'
* New method: Charge Decomposition Analysis (CDA) for studying
electron donation, back donation, and repulsion between fragments
in a molecule
* New method: Fragment Analysis for studying bonding interactions
between two or more fragments in a molecule
* New method: Ability to calculate the electron density or
wavefunction
Bugfixes:
* GAMESS parser:
Failed to parse frequency calculation with imaginary frequencies
Rotations and translations now not included in frequencies
Failed to parse a DFT calculation
* GAMESS-UK parser:
'atomnos' not being extracted
Rotations and translations now not included in frequencies
* bridge to OpenBabel: No longer dependent on pyopenbabel
Changes in cclib-0.6.1:
Bugfixes:
* cclib: The "import cclib.parsers" statement failed due to
references to Molpro and Jaguar parsers which are not present
* Gaussian parser: Failed to parse single point calculations
where the input coords are a z-matrix, and symmetry is turned off.
Changes in cclib-0.6:
Features
* ADF parser: If some MO eigenvalues are not present, the parser
does not fail, but instead uses placeholder values of 99999 and a symmetry label of A
Bugfixes
* ADF parser: The following bugs have been fixed
P/D orbitals for single atoms not handled correctly
Problem parsing homos in unrestricted calculations
Problem skipping the Create sections in certain calculations
* Gaussian parser: The following bugs have been fixed
Parser failed if standard orientation not found
* ccget: aooverlaps not included when using --list option
Changes in cclib-0.6b:
Features
* New parser: GAMESS-UK parser
* API addition: the .clean() method
The .clean() method of a parser clears all of the parsed
attributes. This is useful if you need to reparse during
the course of a calculation.
* Function rename: guesstype() has been renamed to ccopen()
* Speed up: Calculation of Overlap Density of States has
been sped up by two orders of magnitude
Bugfixes
* ccget: Passing multiple filenames now works on Windows too
* ADF parser: The following bugs have been fixed
Problem with parsing SFOs in certain log files
Handling of molecules with orbitals of E symmetry
Couldn't find the HOMO in log files from new versions of ADF
Parser used to miss attributes if SCF not converged
For a symmetrical molecule, mocoeffs were in the wrong order and
the homo was not identified correctly if degenerate
* Gaussian parser: The following bugs have been fixed
Parsing of SCF values was not extracting the dEnergy value
Was extracting Depolar P instead of Raman activity
* ccopen: Minor problems fixed with identification of log files
Changes in cclib-0.5:
Features
* src/scripts/ccget: Added handling of multiple filenames.
It's now possible to use ccget as follows:
ccget *.log
This is a good way of checking out whether cclib is able to
parse all of the files in a given directory.
Also possible is:
ccget homos *.log
* Change of license: Changed license from GPL to LGPL
Bugfixes
* src/cclib/parser/gamessparser.py: Bugfix: gamessparser was dying
on GAMESS VERSION = 12 DEC 2003 gopts, as it was unable to parse
the scftargets.
* src/cclib/parser/gamessparser.py: Remove assertion to catch
instances where scftargets is unset. This occurs in the case of
failed calculations (e.g. wrong multiplicity).
* src/cclib/parser/adfparser.py: Fixed one of the errors with the
Mo5Obdt2-c2v-opt.adfout example, which had to do with the SFOs
being made of more than two combinations of atoms (4, because of
rotation in c2v point group).
At least one error is still present with atomcoords. It looks
like non-coordinate integers are being parsed as well, which
makes some of the atomcoords list have more than the 3 values
for x,y,z.
* src/cclib/parser/adfparser.py: Hopefully fixed the last error in
Mo5Obdt2-c2v-opt. Problem was that it was adding
line.split()[5:], but sometimes there were more than 3 fields
left, so it was changed to [5:8]. Need to check actual parsed
values to make sure it is parsed correctly.
* data/Gaussian, logfiledist, src/cclib/parser/gaussianparser.py,
test/regression.py: Bug fix: Mo4OSibdt2-opt.log has no
atomcoords despite being a geo-opt. This was due to the fact
that the parser was extracting "Input orientation" and not
"Standard orientation". It's now changed to "Standard
orientation" which works for all of the files in the repository.
==> cclib-1.6.2/INSTALL <==
== cclib installation instructions ==
=== Requirements ===
Before you install cclib, you need to make sure that you have the following:
* Python (version 3.0 and up, although 2.7 will still work)
* NumPy (at least version 1.5 is recommended).
Python is an open-source programming language available from http://www.python.org and it is included in many Linux distributions. In Debian it is installed as follows: (as root)
apt-get install python python-dev
NumPy (Numerical Python) adds a fast array facility to Python and is available from http://www.numpy.org. Windows users should use the most recent NumPy installation for the Python version they have. Linux users are recommended to find a binary package for their distribution. In Debian it is installed as follows: (as root)
apt-get install python-numpy
Note: Numeric (the old version of Numerical Python) is not supported by the Numerical Python developers and is not supported by cclib.
To test whether Python is on the PATH, open a command prompt window and type:
python
If Python is not on the PATH and you use Windows, add the full path to the directory containing it to the end of the PATH variable under Control Panel/System/Advanced Settings/Environment Variables. If you use Linux and Python is not on the PATH, put/edit the appropriate line in your .bashrc or similar startup file.
To test, try importing NumPy at the Python prompt. You should see something similar to the following:
$ python3
Python 3.2.3 (default, Feb 27 2014, 21:31:18)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.__version__
'1.6.2'
(To exit, press CTRL+Z in Windows or CTRL+D in Linux)
=== Installing cclib ===
On Debian, Ubuntu and other derived Linux distributions, cclib can be quickly installed with the command:
aptitude install cclib
The version installed from a distribution might not be the most recent one. To install the most recent version, first download the source code of cclib. Extract the cclib tar file or zip file at an appropriate location, which we will call INSTALLDIR. Open a command prompt and change directory to INSTALLDIR. Next, run the following commands:
python setup.py build
python setup.py install (as root)
To test, try importing cclib at the Python prompt. You should see something similar to the following:
$ python3
Python 3.2.3 (default, Feb 27 2014, 21:31:18)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import cclib
>>> cclib.__version__
'1.6.2'
To run the unit tests, change directory into INSTALLDIR/test and run the following command:
python testall.py
This tests the program using the example data files included in the INSTALLDIR/data directory.
=== What next? ===
* Read the documentation at:
http://cclib.github.io
* Read the list and specifications of the extracted data at:
http://cclib.github.io/data.html
* Send any questions to the cclib-users mailing list at:
https://lists.sourceforge.net/lists/listinfo/cclib-users.
==> cclib-1.6.2/LICENSE <==
BSD 3-Clause License
Copyright (c) 2017, the cclib development team
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
==> cclib-1.6.2/README.md <==
### cclib
[DOI](https://dx.doi.org/10.5281/zenodo.1407790)
[PyPI](https://pypi.python.org/pypi/cclib)
[GitHub release](https://github.com/cclib/cclib/releases)
[Travis CI](https://travis-ci.org/cclib/cclib)
[License](https://github.com/cclib/cclib/blob/master/LICENSE)
cclib is a Python library that provides parsers for output files of computational chemistry packages. It also provides a platform for computational chemists to implement algorithms in a platform-independent way.
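A minimal example of reading a logfile and converting it to another format is sketched below (the file names are placeholders, not files shipped with cclib):

```python
from cclib.io import ccread, ccwrite

data = ccread("calculation.out")           # any supported logfile
print(data.natom, data.charge, data.mult)  # parsed attributes on the ccData object
ccwrite(data, outputtype="xyz", outputdest="calculation.xyz")
```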
For more information, go to [https://cclib.github.io](https://cclib.github.io). There is a mailing list for questions at cclib-users@lists.sourceforge.net.
==> cclib-1.6.2/THANKS <==
Core developers:
Eric J. Berquist
Geoff Hutchison
Karol M. Langner
Noel M. O'Boyle (retired)
Adam L. Tenderholt
Developers that have added at least ~1K SLOC:
Sagar Gaur
Sanjeed Schamnad
Kunal Sharma
Casper Steinmann
Jonathon Vandezande
We would like to thank all of the following people who have contributed to cclib:
Nuno Bandeira -- for bug reporting
Björn Baumeier -- for bug reporting
Dermot Brougham -- for bug reporting
bwang2453 -- for patches and new features
Avril Coghlan -- for designing the cclib logo
Ramon Crehuet -- for new features
Björn Dahlgren -- for bug reporting
Yafei Dai -- for bug reporting
Abhishek Dey -- for bug reporting
Matt Ernst -- for patches
Clyde Fare -- for bug reporting and patches
Christos Garoufalis -- for bug reporting
Alessandro Genova -- for patches and fixes
Sagar Gaur -- for patches
glideht -- for bug reporting
Edward Holland -- for patches
Karen Hemelsoet -- for bug reporting
Ian Hovell -- for bug reporting
Julien Idé -- for bug reporting
csjacky -- for bug reporting
Russell Johnson -- for providing CCCBDB (NIST) logfiles
Jerome Kieffer -- for bug reporting
Greg Magoon -- for bug reporting and patches
Scott McKechnie -- for bug reporting
mkrompiec -- for contributing test files
mwykes -- for bug reporting and patches
Alexis Otero-Calvis -- for bug reporting
Rob Paton -- for creating and running Jaguar test jobs
Martin Peeks -- for patches
Felix Plasser -- for fixes, patches and contributing files
Martin Rahm -- for bug reporting
Peter Reinholdt -- for patches
Marius Retegan -- for bug reporting
Jaime Rodríguez-Guerra -- for patches
Tamilmani S -- for bug reporting
Melchor Sanchez -- for bug reporting
Alex Schild -- for ideas and contributing test files
Felipe S. S. Schneider -- for fixes
Jen Schwartz -- for helping create and run Jaguar 6.0 test jobs
Tiago Silva -- for bug reporting
Pavel Solntsev -- for bug reporting
Ben Stein -- for patches
Maxim Stolyarchuk -- for patches
Adam Swanson -- for bug reporting
Joe Townsend -- for contributing multiple GAMESS files to test on
Shiv Upadhyay -- for patches and fixes
Chengju Wang -- for bug reporting
Andrew Warden -- for bug reporting
Samuel Wilson -- for bug reporting
Fedor Zhuravlev -- for patches
Please let us know if we have omitted someone from this list.
==> cclib-1.6.2/cclib/__init__.py <==
# -*- coding: utf-8 -*-
#
# Copyright (c) 2019, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""A library for parsing and interpreting results from computational chemistry packages.
The goals of cclib are centered around the reuse of data obtained from various
computational chemistry programs and typically contained in output files. Specifically,
cclib extracts (parses) data from the output files generated by multiple programs
and provides a consistent interface to access them.
Currently supported programs:
ADF, DALTON, Firefly, GAMESS(US), GAMESS-UK, Gaussian, Jaguar,
Molcas, Molpro, MOPAC, NWChem, ORCA, Psi, Q-Chem, Turbomole
Another aim is to facilitate the implementation of algorithms that are not specific
to any particular computational chemistry package and to maximise interoperability
with other open source computational chemistry and cheminformatic software libraries.
To this end, cclib provides a number of bridges to help transfer data to other libraries
as well as example methods that take parsed data as input.
"""
__version__ = "1.6.1"
from cclib import parser
from cclib import progress
from cclib import method
from cclib import bridge
from cclib import io
# The test module can be imported if it was installed with cclib.
try:
from cclib import test
except ImportError:
pass
# The objects below constitute our public API. These names will not change
# over time. Names in the sub-modules will typically also be backwards
# compatible, but may sometimes change when code is moved around.
ccopen = io.ccopen
ccwrite = io.ccwrite
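# A minimal usage sketch of this public API (the filename is a placeholder and
# not a file distributed with cclib):
#
#     data = ccopen("calculation.out").parse()
#     ccwrite(data, outputtype="cjson", outputdest="calculation.cjson")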
==> cclib-1.6.2/cclib/bridge/__init__.py <==
# -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Facilities for moving parsed data to other cheminformatic libraries."""
from cclib.parser.utils import find_package
if find_package("Bio"):
from cclib.bridge.cclib2biopython import makebiopython
if find_package("openbabel"):
from cclib.bridge.cclib2openbabel import makeopenbabel
if find_package("PyQuante"):
from cclib.bridge.cclib2pyquante import makepyquante
del find_package
==> cclib-1.6.2/cclib/bridge/cclib2biopython.py <==
# -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Bridge for using cclib data in biopython (http://biopython.org)."""
from cclib.parser.utils import PeriodicTable
from cclib.parser.utils import find_package
_found_biopython = find_package("Bio")
if _found_biopython:
from Bio.PDB.Atom import Atom
def makebiopython(atomcoords, atomnos):
"""Create a list of BioPython Atoms.
This creates a list of BioPython Atoms suitable for use by
Bio.PDB.Superimposer, for example.
"""
if not _found_biopython:
raise ImportError("You must install `biopython` to use this function")
pt = PeriodicTable()
bioatoms = []
for coords, atomno in zip(atomcoords, atomnos):
symbol = pt.element[atomno]
bioatoms.append(Atom(symbol, coords, 0, 0, 0, symbol, 0, symbol.upper()))
return bioatoms
del find_package
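# Usage sketch (assumes `data` is a ccData instance parsed elsewhere, e.g. with
# cclib.io.ccread; a single geometry is expected here, so pass the last one):
#
#     bioatoms = makebiopython(data.atomcoords[-1], data.atomnos)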
==> cclib-1.6.2/cclib/bridge/cclib2openbabel.py <==
# -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Bridge between cclib data and openbabel (http://openbabel.org)."""
from cclib.parser.data import ccData
from cclib.parser.utils import find_package
_found_openbabel = find_package("openbabel")
if _found_openbabel:
import openbabel as ob
def _check_openbabel(found_openbabel):
if not found_openbabel:
raise ImportError("You must install `openbabel` to use this function")
def makecclib(mol):
"""Create cclib attributes and return a ccData from an OpenBabel molecule.
Beyond the numbers, masses and coordinates, we could also set the total charge
and multiplicity, but often these are calculated from atomic formal charges
so it is better to assume that would not be correct.
"""
_check_openbabel(_found_openbabel)
attributes = {
'atomcoords': [],
'atommasses': [],
'atomnos': [],
'natom': mol.NumAtoms(),
}
for atom in ob.OBMolAtomIter(mol):
attributes['atomcoords'].append([atom.GetX(), atom.GetY(), atom.GetZ()])
attributes['atommasses'].append(atom.GetAtomicMass())
attributes['atomnos'].append(atom.GetAtomicNum())
return ccData(attributes)
def makeopenbabel(atomcoords, atomnos, charge=0, mult=1):
"""Create an Open Babel molecule."""
_check_openbabel(_found_openbabel)
obmol = ob.OBMol()
for i in range(len(atomnos)):
# Note that list(atomcoords[i]) is not equivalent!!!
# For now, only take the last geometry.
# TODO: option to export last geometry or all geometries?
coords = atomcoords[-1][i].tolist()
atomno = int(atomnos[i])
obatom = ob.OBAtom()
obatom.SetAtomicNum(atomno)
obatom.SetVector(*coords)
obmol.AddAtom(obatom)
obmol.ConnectTheDots()
obmol.PerceiveBondOrders()
obmol.SetTotalSpinMultiplicity(mult)
obmol.SetTotalCharge(int(charge))
return obmol
def readfile(fname, format):
"""Read a file with OpenBabel and extract cclib attributes."""
_check_openbabel(_found_openbabel)
obc = ob.OBConversion()
if obc.SetInFormat(format):
mol = ob.OBMol()
obc.ReadFile(mol, fname)
return makecclib(mol)
else:
print("Unable to load the %s reader from OpenBabel." % format)
return {}
del find_package
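# Usage sketch (assumes `data` is a ccData instance parsed elsewhere; note that
# makeopenbabel expects the full atomcoords array and uses only the last geometry):
#
#     obmol = makeopenbabel(data.atomcoords, data.atomnos,
#                           charge=data.charge, mult=data.mult)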
==> cclib-1.6.2/cclib/bridge/cclib2pyquante.py <==
# -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Bridge for using cclib data in PyQuante (http://pyquante.sourceforge.net)."""
from cclib.parser.utils import find_package
_found_pyquante = find_package("PyQuante")
if _found_pyquante:
from PyQuante.Molecule import Molecule
def _check_pyquante(found_pyquante):
if not found_pyquante:
raise ImportError("You must install `PyQuante` to use this function")
def makepyquante(atomcoords, atomnos, charge=0, mult=1):
"""Create a PyQuante Molecule."""
_check_pyquante(_found_pyquante)
return Molecule("notitle",
list(zip(atomnos, atomcoords)),
units="Angstrom",
charge=charge, multiplicity=mult)
del find_package
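# Usage sketch (assumes `data` is a ccData instance parsed elsewhere; a single
# geometry is zipped with the atomic numbers, so pass the last one):
#
#     mol = makepyquante(data.atomcoords[-1], data.atomnos,
#                        charge=data.charge, mult=data.mult)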
==> cclib-1.6.2/cclib/io/__init__.py <==
# -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Contains all writers for standard chemical representations."""
from cclib.io.cjsonreader import CJSON as CJSONReader
from cclib.io.cjsonwriter import CJSON as CJSONWriter
from cclib.io.cmlwriter import CML
from cclib.io.moldenwriter import MOLDEN
from cclib.io.wfxwriter import WFXWriter
from cclib.io.xyzreader import XYZ as XYZReader
from cclib.io.xyzwriter import XYZ as XYZWriter
# This allows users to type:
# from cclib.io import ccframe
# from cclib.io import ccopen
# from cclib.io import ccread
# from cclib.io import ccwrite
# from cclib.io import URL_PATTERN
from cclib.io.ccio import ccframe
from cclib.io.ccio import ccopen
from cclib.io.ccio import ccread
from cclib.io.ccio import ccwrite
from cclib.io.ccio import URL_PATTERN
==> cclib-1.6.2/cclib/io/ccio.py <==
# -*- coding: utf-8 -*-
#
# Copyright (c) 2019, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Tools for identifying, reading and writing files and streams."""
from __future__ import print_function
import atexit
import io
import os
import sys
import re
from tempfile import NamedTemporaryFile
# We want this as long as we need to support both Python 2 and 3.
from six import string_types
# Python 2->3 changes the default file object hierarchy.
if sys.version_info[0] == 2:
fileclass = file
from urllib2 import urlopen, URLError
else:
fileclass = io.IOBase
from urllib.request import urlopen
from urllib.error import URLError
from cclib.parser import data
from cclib.parser import logfileparser
from cclib.parser.utils import find_package
from cclib.parser.adfparser import ADF
from cclib.parser.daltonparser import DALTON
from cclib.parser.gamessparser import GAMESS
from cclib.parser.gamessukparser import GAMESSUK
from cclib.parser.gaussianparser import Gaussian
from cclib.parser.jaguarparser import Jaguar
from cclib.parser.molcasparser import Molcas
from cclib.parser.molproparser import Molpro
from cclib.parser.mopacparser import MOPAC
from cclib.parser.nwchemparser import NWChem
from cclib.parser.orcaparser import ORCA
from cclib.parser.psi3parser import Psi3
from cclib.parser.psi4parser import Psi4
from cclib.parser.qchemparser import QChem
from cclib.parser.turbomoleparser import Turbomole
from cclib.io import cjsonreader
from cclib.io import cjsonwriter
from cclib.io import cmlwriter
from cclib.io import moldenwriter
from cclib.io import wfxwriter
from cclib.io import xyzreader
from cclib.io import xyzwriter
_has_cclib2openbabel = find_package("openbabel")
if _has_cclib2openbabel:
from cclib.bridge import cclib2openbabel
_has_pandas = find_package("pandas")
if _has_pandas:
import pandas as pd
# Regular expression for validating URLs
URL_PATTERN = re.compile(
r'^(?:http|ftp)s?://' # http:// or https://
r'(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|' # domain...
r'localhost|' # localhost...
r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})' # ...or ip
r'(?::\d+)?' # optional port
r'(?:/?|[/?]\S+)$', re.IGNORECASE
)
# Parser choice is triggered by certain phrases occurring in the logfile. Where these
# strings are unique, we can set the parser and break. In other cases, the situation
# is a little bit more complicated. Here are the exceptions:
# 1. The GAMESS trigger also works for GAMESS-UK files, so we can't break
# after finding GAMESS in case the more specific phrase is found.
# 2. Molpro log files don't have the program header, but always contain
# the generic string 1PROGRAM, so don't break here either to be cautious.
# 3. "MOPAC" is used in some packages like GAMESS, so match MOPAC20##
#
# The triggers are defined by the tuples in the list below like so:
# (parser, phrases, flag whether we should break)
triggers = [
(ADF, ["Amsterdam Density Functional"], True),
(DALTON, ["Dalton - An Electronic Structure Program"], True),
(GAMESS, ["GAMESS"], False),
(GAMESS, ["GAMESS VERSION"], True),
(GAMESSUK, ["G A M E S S - U K"], True),
(Gaussian, ["Gaussian, Inc."], True),
(Jaguar, ["Jaguar"], True),
(Molcas, ["MOLCAS"], True),
(Molpro, ["PROGRAM SYSTEM MOLPRO"], True),
(Molpro, ["1PROGRAM"], False),
(MOPAC, ["MOPAC20"], True),
(NWChem, ["Northwest Computational Chemistry Package"], True),
(ORCA, ["O R C A"], True),
(Psi3, ["PSI3: An Open-Source Ab Initio Electronic Structure Package"], True),
(Psi4, ["Psi4: An Open-Source Ab Initio Electronic Structure Package"], True),
(QChem, ["A Quantum Leap Into The Future Of Chemistry"], True),
(Turbomole, ["TURBOMOLE"], True),
]
readerclasses = {
'cjson': cjsonreader.CJSON,
'json': cjsonreader.CJSON,
'xyz': xyzreader.XYZ,
}
writerclasses = {
'cjson': cjsonwriter.CJSON,
'json': cjsonwriter.CJSON,
'cml': cmlwriter.CML,
'molden': moldenwriter.MOLDEN,
'wfx': wfxwriter.WFXWriter,
'xyz': xyzwriter.XYZ,
}
class UnknownOutputFormatError(Exception):
"""Raised when an unknown output format is encountered."""
def guess_filetype(inputfile):
"""Try to guess the filetype by searching for trigger strings."""
if not inputfile:
return None
filetype = None
if isinstance(inputfile, string_types):
for line in inputfile:
for parser, phrases, do_break in triggers:
if all([line.lower().find(p.lower()) >= 0 for p in phrases]):
filetype = parser
if do_break:
return filetype
else:
for fname in inputfile:
for line in inputfile:
for parser, phrases, do_break in triggers:
if all([line.lower().find(p.lower()) >= 0 for p in phrases]):
filetype = parser
if do_break:
return filetype
return filetype
def ccread(source, *args, **kwargs):
"""Attempt to open and read computational chemistry data from a file.
If the file is not appropriate for cclib parsers, a fallback mechanism
will try to recognize some common chemistry formats and read those using
the appropriate bridge such as Open Babel.
Inputs:
source - a single logfile, a list of logfiles (for a single job),
an input stream, or an URL pointing to a log file.
*args, **kwargs - arguments and keyword arguments passed to ccopen
Returns:
a ccData object containing cclib data attributes
"""
log = ccopen(source, *args, **kwargs)
if log:
if kwargs.get('verbose', None):
print('Identified logfile to be in %s format' % log.logname)
# If the input file is a CJSON file and not a standard compchemlog file
cjson_as_input = kwargs.get("cjson", False)
if cjson_as_input:
return log.read_cjson()
else:
return log.parse()
else:
if kwargs.get('verbose', None):
print('Attempting to use fallback mechanism to read file')
return fallback(source)
def ccopen(source, *args, **kwargs):
"""Guess the identity of a particular log file and return an instance of it.
Inputs:
source - a single logfile, a list of logfiles (for a single job),
an input stream, or an URL pointing to a log file.
*args, **kwargs - arguments and keyword arguments passed to filetype
Returns:
one of ADF, DALTON, GAMESS, GAMESS UK, Gaussian, Jaguar,
Molpro, MOPAC, NWChem, ORCA, Psi3, Psi/Psi4, QChem, CJSON or None
(if it cannot figure it out or the file does not exist).
"""
inputfile = None
is_stream = False
# Check if source is a link or contains links. Retrieve their content.
# Try to open the logfile(s), using openlogfile, if the source is a string (filename)
# or list of filenames. If it can be read, assume it is an open file object/stream.
is_string = isinstance(source, str)
is_url = True if is_string and URL_PATTERN.match(source) else False
is_listofstrings = isinstance(source, list) and all([isinstance(s, str) for s in source])
if is_string or is_listofstrings:
# Process links from list (download contents into temporary location)
if is_listofstrings:
filelist = []
for filename in source:
if not URL_PATTERN.match(filename):
filelist.append(filename)
else:
try:
response = urlopen(filename)
tfile = NamedTemporaryFile(delete=False)
tfile.write(response.read())
# Close the file because Windows won't let open it second time
tfile.close()
filelist.append(tfile.name)
# Delete temporary file when the program finishes
atexit.register(os.remove, tfile.name)
except (ValueError, URLError) as error:
if not kwargs.get('quiet', False):
(errno, strerror) = error.args
return None
source = filelist
if not is_url:
try:
inputfile = logfileparser.openlogfile(source)
except IOError as error:
if not kwargs.get('quiet', False):
(errno, strerror) = error.args
return None
else:
try:
response = urlopen(source)
is_stream = True
# Retrieve filename from URL if possible
filename = re.findall("\w+\.\w+", source.split('/')[-1])
filename = filename[0] if filename else ""
inputfile = logfileparser.openlogfile(filename, object=response.read())
except (ValueError, URLError) as error:
if not kwargs.get('quiet', False):
(errno, strerror) = error.args
return None
elif hasattr(source, "read"):
inputfile = source
is_stream = True
# Streams are tricky since they don't have seek methods or seek won't work
# by design even if it is present. We solve this now by reading in the
# entire stream and using a StringIO buffer for parsing. This might be
# problematic for very large streams. Slow streams might also be an issue if
# the parsing is not instantaneous, but we'll deal with such edge cases
# as they arise. Ideally, in the future we'll create a class dedicated to
# dealing with these issues, supporting both files and streams.
if is_stream:
try:
inputfile.seek(0, 0)
except (AttributeError, IOError):
contents = inputfile.read()
try:
inputfile = io.StringIO(contents)
except:
inputfile = io.StringIO(unicode(contents))
inputfile.seek(0, 0)
# Proceed to return an instance of the logfile parser only if the filetype
# could be guessed. Need to make sure the input file is closed before creating
# an instance, because parsers will handle opening/closing on their own.
filetype = guess_filetype(inputfile)
# If the input file isn't a standard compchem log file, try one of
# the readers, falling back to Open Babel.
if not filetype:
if kwargs.get("cjson"):
filetype = readerclasses['cjson']
elif source and not is_stream:
ext = os.path.splitext(source)[1][1:].lower()
for extension in readerclasses:
if ext == extension:
filetype = readerclasses[extension]
# Proceed to return an instance of the logfile parser only if the filetype
# could be guessed. Need to make sure the input file is closed before creating
# an instance, because parsers will handle opening/closing on their own.
if filetype:
# We're going to close and reopen below anyway, so this is just to avoid
# the missing seek method for fileinput.FileInput. In the long run
# we need to refactor to support for various input types in a more
# centralized fashion.
if is_listofstrings:
pass
else:
inputfile.seek(0, 0)
if not is_stream:
if is_listofstrings:
if filetype == Turbomole:
source = sort_turbomole_outputs(source)
inputfile.close()
return filetype(source, *args, **kwargs)
return filetype(inputfile, *args, **kwargs)
def fallback(source):
"""Attempt to read standard molecular formats using other libraries.
Currently this will read XYZ files with OpenBabel, but this can easily
be extended to other formats and libraries, too.
"""
if isinstance(source, str):
ext = os.path.splitext(source)[1][1:].lower()
if _has_cclib2openbabel:
import pybel as pb
if ext in pb.informats:
return cclib2openbabel.readfile(source, ext)
else:
print("Could not import `openbabel`, fallback mechanism might not work.")
def ccwrite(ccobj, outputtype=None, outputdest=None,
indices=None, terse=False, returnstr=False,
*args, **kwargs):
"""Write the parsed data from an outputfile to a standard chemical
representation.
Inputs:
ccobj - Either a job (from ccopen) or a data (from job.parse()) object
outputtype - The output format (should be a string)
outputdest - A filename or file object for writing
indices - One or more indices for extracting specific geometries/etc. (zero-based)
terse - This option is currently limited to the cjson/json format. Whether to indent the cjson/json or not
returnstr - Whether or not to return a string representation.
The different writers may take additional arguments, which are
documented in their respective docstrings.
Returns:
the string representation of the chemical datatype
requested, or None.
"""
# Determine the correct output format.
outputclass = _determine_output_format(outputtype, outputdest)
# Is ccobj a job object (unparsed), or is it a ccdata object (parsed)?
if isinstance(ccobj, logfileparser.Logfile):
jobfilename = ccobj.filename
ccdata = ccobj.parse()
elif isinstance(ccobj, data.ccData):
jobfilename = None
ccdata = ccobj
else:
raise ValueError
# If the logfile name has been passed in through kwargs (such as
# in the ccwrite script), make sure it has precedence.
if 'jobfilename' in kwargs:
jobfilename = kwargs['jobfilename']
# Avoid passing multiple times into the main call.
del kwargs['jobfilename']
outputobj = outputclass(ccdata, jobfilename=jobfilename,
indices=indices, terse=terse,
*args, **kwargs)
output = outputobj.generate_repr()
# If outputdest isn't None, write the output to disk.
if outputdest is not None:
if isinstance(outputdest, str):
with open(outputdest, 'w') as outputobj:
outputobj.write(output)
elif isinstance(outputdest, fileclass):
outputdest.write(output)
else:
raise ValueError
# If outputdest is None, return a string representation of the output.
else:
return output
if returnstr:
return output
def _determine_output_format(outputtype, outputdest):
"""
Determine the correct output format.
Inputs:
outputtype - a string corresponding to the file type
outputdest - a filename string or file handle
Returns:
outputclass - the class corresponding to the correct output format
Raises:
UnknownOutputFormatError for unsupported file writer extensions
"""
# Priority for determining the correct output format:
# 1. outputtype
# 2. outputdest
outputclass = None
# First check outputtype.
if isinstance(outputtype, str):
extension = outputtype.lower()
if extension in writerclasses:
outputclass = writerclasses[extension]
else:
raise UnknownOutputFormatError(extension)
else:
# Then checkout outputdest.
if isinstance(outputdest, str):
extension = os.path.splitext(outputdest)[1].lower()
elif isinstance(outputdest, fileclass):
extension = os.path.splitext(outputdest.name)[1].lower()
else:
raise UnknownOutputFormatError
if extension in writerclasses:
outputclass = writerclasses[extension]
else:
raise UnknownOutputFormatError(extension)
return outputclass
def path_leaf(path):
"""
Splits the path to give the filename. Works irrespective of '\'
or '/' appearing in the path and also with path ending with '/' or '\'.
Inputs:
path - a string path of a logfile.
Returns:
tail - 'directory/subdirectory/logfilename' will return 'logfilename'.
ntpath.basename(head) - 'directory/subdirectory/logfilename/' will return 'logfilename'.
"""
head, tail = os.path.split(path)
return tail or os.path.basename(head)
def sort_turbomole_outputs(filelist):
"""
Sorts a list of inputs (or list of log files) according to the order
defined below. Just appends the unknown files in the end of the sorted list.
Inputs:
filelist - a list of Turbomole log files needed to be parsed.
Returns:
sorted_list - a sorted list of Turbomole files needed for proper parsing.
"""
sorting_order = {
'basis' : 0,
'control' : 1,
'mos' : 2,
'alpha' : 3,
'beta' : 4,
'job.last' : 5,
'coord' : 6,
'gradient' : 7,
'aoforce' : 8,
}
known_files = []
unknown_files = []
sorted_list = []
for fname in filelist:
filename = path_leaf(fname)
if filename in sorting_order:
known_files.append([fname, sorting_order[filename]])
else:
unknown_files.append(fname)
for i in sorted(known_files, key=lambda x: x[1]):
sorted_list.append(i[0])
if unknown_files:
sorted_list.extend(unknown_files)
return sorted_list
def _check_pandas(found_pandas):
if not found_pandas:
raise ImportError("You must install `pandas` to use this function")
def ccframe(ccobjs, *args, **kwargs):
"""Returns a pandas.DataFrame of data attributes parsed by cclib from one
or more logfiles.
Inputs:
ccobjs - an iterable of either cclib jobs (from ccopen) or data (from
job.parse()) objects
Returns:
a pandas.DataFrame
"""
_check_pandas(_has_pandas)
logfiles = []
for ccobj in ccobjs:
# Is ccobj a job object (unparsed), or is it a ccdata object (parsed)?
if isinstance(ccobj, logfileparser.Logfile):
jobfilename = ccobj.filename
ccdata = ccobj.parse()
elif isinstance(ccobj, data.ccData):
jobfilename = None
ccdata = ccobj
else:
raise ValueError
attributes = ccdata.getattributes()
attributes.update({
'jobfilename': jobfilename
})
logfiles.append(pd.Series(attributes))
return pd.DataFrame(logfiles)
del find_package
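# Usage sketch for ccframe (filenames are placeholders; pandas must be installed):
#
#     df = ccframe([ccopen("job1.out"), ccopen("job2.out")])
#     print(df[["natom", "charge", "mult"]])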
==> cclib-1.6.2/cclib/io/cjsonreader.py <==
# -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""A reader for chemical JSON (CJSON) files."""
import json
from cclib.io import filereader
from cclib.parser.data import ccData
class CJSON(filereader.Reader):
"""A reader for chemical JSON (CJSON) log files."""
def __init__(self, source, *args, **kwargs):
super(CJSON, self).__init__(source, *args, **kwargs)
self.representation = dict()
def parse(self):
super(CJSON, self).parse()
json_data = json.loads(self.filecontents)
self.generate_repr(json_data)
return self.representation
def generate_repr(self, json_data):
for k, v in ccData._attributes.items():
json_key = v.json_key
attribute_path = v.attribute_path.split(":")
if attribute_path[0] == 'N/A':
continue
levels = len(attribute_path)
if attribute_path[0] in json_data:
l1_data_object = json_data[attribute_path[0]]
if levels == 1:
if json_key in l1_data_object:
self.representation[k] = l1_data_object[json_key]
elif levels >= 2:
if attribute_path[1] in l1_data_object:
l2_data_object = l1_data_object[attribute_path[1]]
if json_key in l2_data_object:
self.representation[k] = l2_data_object[json_key]
if levels == 3 and attribute_path[2] in l2_data_object:
l3_data_object = l2_data_object[attribute_path[2]]
if json_key in l3_data_object:
self.representation[k] = l3_data_object[json_key]
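# In other words: an attribute whose attribute_path is "a:b" and whose json_key is
# "c" is read from json_data["a"]["b"]["c"], while attributes whose path is "N/A"
# have no CJSON representation and are skipped.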
==> cclib-1.6.2/cclib/io/cjsonwriter.py <==
# -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""A writer for chemical JSON (CJSON) files."""
import os.path
import json
import numpy as np
from cclib.io import filewriter
from cclib.parser.data import ccData
from cclib.parser.utils import find_package
_has_openbabel = find_package("openbabel")
class CJSON(filewriter.Writer):
"""A writer for chemical JSON (CJSON) files."""
def __init__(self, ccdata, terse=False, *args, **kwargs):
"""Initialize the chemical JSON writer object.
Inputs:
ccdata - An instance of ccData, parsed from a logfile.
"""
super(CJSON, self).__init__(ccdata, terse=terse, *args, **kwargs)
def pathname(self, path):
"""Return filename without extension to be used as name."""
name = os.path.basename(os.path.splitext(path)[0])
return name
def as_dict(self):
""" Build a Python dict with the CJSON data"""
cjson_dict = dict()
# Need to decide on a number format.
cjson_dict['chemical json'] = 0
if self.jobfilename is not None:
cjson_dict['name'] = self.pathname(self.jobfilename)
# These are properties that can be collected using Open Babel.
if _has_openbabel:
cjson_dict['smiles'] = self.pbmol.write('smiles')
cjson_dict['inchi'] = self.pbmol.write('inchi')
cjson_dict['inchikey'] = self.pbmol.write('inchikey')
cjson_dict['formula'] = self.pbmol.formula
# TODO Incorporate unit cell information.
# Iterate through the attribute list present in ccData. Depending on
# the availability of the attribute add it at the right 'level'.
for attribute_name, v in ccData._attributes.items():
if not hasattr(self.ccdata, attribute_name):
continue
attribute_path = v.attribute_path.split(":")
# Depth of the attribute in the CJSON.
levels = len(attribute_path)
# The attributes which haven't been included in the CJSON format.
if attribute_path[0] == 'N/A':
continue
if attribute_path[0] not in cjson_dict:
cjson_dict[attribute_path[0]] = dict()
l1_data_object = cjson_dict[attribute_path[0]]
            # The 'moments' and 'atomcoords' keys will contain processed data
            # obtained from the output file. TODO rewrite this.
if attribute_name in ('moments', 'atomcoords'):
if attribute_name == 'moments':
dipole_moment = self._calculate_total_dipole_moment()
if dipole_moment is not None:
cjson_dict['properties'][ccData._attributes['moments'].json_key] = dipole_moment
else:
cjson_dict['atoms']['coords'] = dict()
cjson_dict['atoms']['coords']['3d'] = self.ccdata.atomcoords[-1].flatten().tolist()
continue
if levels == 1:
self.set_JSON_attribute(l1_data_object, attribute_name)
elif levels >= 2:
if attribute_path[1] not in l1_data_object:
l1_data_object[attribute_path[1]] = dict()
l2_data_object = l1_data_object[attribute_path[1]]
if levels == 2:
self.set_JSON_attribute(l2_data_object, attribute_name)
elif levels == 3:
if attribute_path[2] not in l2_data_object:
l2_data_object[attribute_path[2]] = dict()
l3_data_object = l2_data_object[attribute_path[2]]
self.set_JSON_attribute(l3_data_object, attribute_name)
# Attributes which are not directly obtained from the output files.
if hasattr(self.ccdata, 'moenergies') and hasattr(self.ccdata, 'homos'):
if 'energy' not in cjson_dict['properties']:
cjson_dict['properties']['energy'] = dict()
cjson_dict['properties']['energy']['alpha'] = dict()
cjson_dict['properties']['energy']['beta'] = dict()
homo_idx_alpha = int(self.ccdata.homos[0])
homo_idx_beta = int(self.ccdata.homos[-1])
energy_alpha_homo = self.ccdata.moenergies[0][homo_idx_alpha]
energy_alpha_lumo = self.ccdata.moenergies[0][homo_idx_alpha + 1]
energy_alpha_gap = energy_alpha_lumo - energy_alpha_homo
energy_beta_homo = self.ccdata.moenergies[-1][homo_idx_beta]
energy_beta_lumo = self.ccdata.moenergies[-1][homo_idx_beta + 1]
energy_beta_gap = energy_beta_lumo - energy_beta_homo
cjson_dict['properties']['energy']['alpha']['homo'] = energy_alpha_homo
cjson_dict['properties']['energy']['alpha']['gap'] = energy_alpha_gap
cjson_dict['properties']['energy']['beta']['homo'] = energy_beta_homo
cjson_dict['properties']['energy']['beta']['gap'] = energy_beta_gap
cjson_dict['properties']['energy']['total'] = self.ccdata.scfenergies[-1]
if hasattr(self.ccdata, 'atomnos'):
cjson_dict['atoms']['elements']['atom count'] = len(self.ccdata.atomnos)
cjson_dict['atoms']['elements']['heavy atom count'] = len([x for x in self.ccdata.atomnos if x > 1])
# Bond attributes:
if _has_openbabel and (len(self.ccdata.atomnos) > 1):
cjson_dict['bonds'] = dict()
cjson_dict['bonds']['connections'] = dict()
cjson_dict['bonds']['connections']['index'] = []
for bond in self.bond_connectivities:
cjson_dict['bonds']['connections']['index'].append(bond[0])
cjson_dict['bonds']['connections']['index'].append(bond[1])
cjson_dict['bonds']['order'] = [bond[2] for bond in self.bond_connectivities]
if _has_openbabel:
cjson_dict['properties']['molecular mass'] = self.pbmol.molwt
cjson_dict['diagram'] = self.pbmol.write(format='svg')
return cjson_dict
def generate_repr(self):
"""Generate the CJSON representation of the logfile data."""
cjson_dict = self.as_dict()
if self.terse:
return json.dumps(cjson_dict, cls=NumpyAwareJSONEncoder)
else:
return json.dumps(cjson_dict, cls=JSONIndentEncoder, sort_keys=True, indent=4)
    def set_JSON_attribute(self, obj, key):
        """Set the value of a cclib attribute in the given dictionary.
        Args:
            obj: dictionary to be updated with the attribute value.
            key: cclib attribute name.
        Returns:
            None. The dictionary is modified in place, using the attribute's
            CJSON key (json_key) as the dictionary key.
        """
        if hasattr(self.ccdata, key):
            obj[ccData._attributes[key].json_key] = getattr(self.ccdata, key)
class NumpyAwareJSONEncoder(json.JSONEncoder):
"""A encoder for numpy.ndarray's obtained from the cclib attributes.
For all other types the json default encoder is called.
Do Not rename the 'default' method as it is required to be implemented
by any subclass of the json.JSONEncoder
"""
def default(self, obj):
if isinstance(obj, np.ndarray):
if obj.ndim == 1:
nan_list = obj.tolist()
return [None if np.isnan(x) else x for x in nan_list]
else:
return [self.default(obj[i]) for i in range(obj.shape[0])]
return json.JSONEncoder.default(self, obj)
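# Illustrative behaviour (not part of the original source): serializing a 2x2
# numpy array nests its rows as lists and maps NaN entries to JSON null, e.g.
# json.dumps(np.array([[1.0, float("nan")], [2.0, 3.0]]), cls=NumpyAwareJSONEncoder)
# returns "[[1.0, null], [2.0, 3.0]]".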
class JSONIndentEncoder(json.JSONEncoder):
def __init__(self, *args, **kwargs):
super(JSONIndentEncoder, self).__init__(*args, **kwargs)
self.current_indent = 0
self.current_indent_str = ""
def encode(self, o):
# Special Processing for lists
if isinstance(o, (list, tuple)):
primitives_only = True
for item in o:
if isinstance(item, (list, tuple, dict)):
primitives_only = False
break
output = []
if primitives_only:
for item in o:
output.append(json.dumps(item, cls=NumpyAwareJSONEncoder))
return "[ " + ", ".join(output) + " ]"
else:
self.current_indent += self.indent
self.current_indent_str = "".join([" " for x in range(self.current_indent)])
for item in o:
output.append(self.current_indent_str + self.encode(item))
self.current_indent -= self.indent
self.current_indent_str = "".join([" " for x in range(self.current_indent)])
return "[\n" + ",\n".join(output) + "\n" + self.current_indent_str + "]"
elif isinstance(o, dict):
output = []
self.current_indent += self.indent
self.current_indent_str = "".join([" " for x in range(self.current_indent)])
for key, value in o.items():
output.append(self.current_indent_str + json.dumps(key, cls=NumpyAwareJSONEncoder) + ": " +
str(self.encode(value)))
self.current_indent -= self.indent
self.current_indent_str = "".join([" " for x in range(self.current_indent)])
return "{\n" + ",\n".join(output) + "\n" + self.current_indent_str + "}"
elif isinstance(o, np.generic):
return json.dumps(o.item(), cls=NumpyAwareJSONEncoder)
else:
return json.dumps(o, cls=NumpyAwareJSONEncoder)
del find_package
cclib-1.6.2/cclib/io/cmlwriter.py 0000664 0000000 0000000 00000007165 13535330462 0016656 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""A writer for chemical markup language (CML) files."""
import xml.etree.cElementTree as ET
from cclib.io import filewriter
from cclib.parser.utils import find_package
_has_openbabel = find_package("openbabel")
class CML(filewriter.Writer):
"""A writer for chemical markup language (CML) files."""
def __init__(self, ccdata, *args, **kwargs):
"""Initialize the CML writer object.
Inputs:
ccdata - An instance of ccData, parsed from a logfile.
"""
# Call the __init__ method of the superclass
super(CML, self).__init__(ccdata, *args, **kwargs)
def generate_repr(self):
"""Generate the CML representation of the logfile data."""
# Create the base molecule.
molecule = ET.Element('molecule')
d = {
# Write the namespace directly.
'xmlns': 'http://www.xml-cml.org/schema',
}
if self.jobfilename is not None:
d['id'] = self.jobfilename
_set_attrs(molecule, d)
# Form the listing of all the atoms present.
atomArray = ET.SubElement(molecule, 'atomArray')
if hasattr(self.ccdata, 'atomcoords') and hasattr(self.ccdata, 'atomnos'):
elements = [self.pt.element[Z] for Z in self.ccdata.atomnos]
for atomid in range(self.ccdata.natom):
atom = ET.SubElement(atomArray, 'atom')
x, y, z = self.ccdata.atomcoords[-1][atomid].tolist()
d = {
'id': 'a{}'.format(atomid + 1),
'elementType': elements[atomid],
'x3': '{:.10f}'.format(x),
'y3': '{:.10f}'.format(y),
'z3': '{:.10f}'.format(z),
}
_set_attrs(atom, d)
# Form the listing of all the bonds present.
bondArray = ET.SubElement(molecule, 'bondArray')
if _has_openbabel:
for bc in self.bond_connectivities:
bond = ET.SubElement(bondArray, 'bond')
d = {
'atomRefs2': 'a{} a{}'.format(bc[0] + 1, bc[1] + 1),
'order': str(bc[2]),
}
_set_attrs(bond, d)
_indent(molecule)
return _tostring(molecule)
def _set_attrs(element, d):
"""Set all the key-value pairs from a dictionary as element
attributes.
"""
for (k, v) in d.items():
element.set(k, v)
return
def _indent(elem, level=0):
"""An in-place pretty-print indenter for XML."""
i = "\n" + (level * " ")
if len(elem):
if not elem.text or not elem.text.strip():
elem.text = i + " "
if not elem.tail or not elem.tail.strip():
elem.tail = i
for elem in elem:
_indent(elem, level+1)
if not elem.tail or not elem.tail.strip():
elem.tail = i
else:
if level and (not elem.tail or not elem.tail.strip()):
elem.tail = i
def _tostring(element, xml_declaration=True, encoding='utf-8', method='xml'):
"""A reimplementation of tostring() found in ElementTree."""
class dummy:
pass
data = []
file = dummy()
file.write = data.append
ET.ElementTree(element).write(file,
xml_declaration=xml_declaration,
encoding=encoding,
method=method)
return b''.join(data).decode(encoding)
del find_package
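# Minimal usage sketch (not part of the original source; "calc.out" is a
# hypothetical logfile path).
if __name__ == "__main__":
    from cclib.io import ccopen
    data = ccopen("calc.out").parse()
    print(CML(data, jobfilename="calc.out").generate_repr())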
cclib-1.6.2/cclib/io/filereader.py 0000664 0000000 0000000 00000002241 13535330462 0016736 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Generic file reader and related tools"""
from abc import ABCMeta, abstractmethod
from six import add_metaclass
@add_metaclass(ABCMeta)
class Reader:
"""Abstract class for reader objects."""
def __init__(self, source, *args, **kwargs):
"""Initialize the Reader object.
This should be called by a subclass in its own __init__ method.
Inputs:
source - A single filename, stream [TODO], or list of filenames/streams [TODO].
"""
if isinstance(source, str):
self.filename = source
else:
            raise ValueError("source must be a filename string; streams and lists are not supported yet")
def parse(self):
"""Read the raw contents of the source into the Reader."""
# TODO This cannot currently handle streams.
with open(self.filename) as handle:
self.filecontents = handle.read()
return None
@abstractmethod
def generate_repr(self):
"""Convert the raw contents of the source into the internal representation."""
cclib-1.6.2/cclib/io/filewriter.py 0000664 0000000 0000000 00000012137 13535330462 0017015 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2018, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Generic file writer and related tools"""
import logging
import sys
from abc import ABCMeta, abstractmethod
from six import add_metaclass
if sys.version_info <= (3, 3):
from collections import Iterable
else:
from collections.abc import Iterable
import numpy
from cclib.parser.utils import PeriodicTable
from cclib.parser.utils import find_package
_has_openbabel = find_package("openbabel")
if _has_openbabel:
from cclib.bridge import makeopenbabel
import openbabel as ob
import pybel as pb
class MissingAttributeError(Exception):
pass
@add_metaclass(ABCMeta)
class Writer:
"""Abstract class for writer objects."""
required_attrs = ()
def __init__(self, ccdata, jobfilename=None, indices=None, terse=False,
*args, **kwargs):
"""Initialize the Writer object.
This should be called by a subclass in its own __init__ method.
Inputs:
ccdata - An instance of ccData, parsed from a logfile.
jobfilename - The filename of the parsed logfile.
indices - One or more indices for extracting specific geometries/etc. (zero-based)
terse - Whether to print the terse version of the output file - currently limited to cjson/json formats
"""
self.ccdata = ccdata
self.jobfilename = jobfilename
self.indices = indices
self.terse = terse
self.ghost = kwargs.get("ghost")
self.pt = PeriodicTable()
self._check_required_attributes()
# Open Babel isn't necessarily present.
if _has_openbabel:
# Generate the Open Babel/Pybel representation of the molecule.
# Used for calculating SMILES/InChI, formula, MW, etc.
self.obmol, self.pbmol = self._make_openbabel_from_ccdata()
self.bond_connectivities = self._make_bond_connectivity_from_openbabel(self.obmol)
self._fix_indices()
@abstractmethod
def generate_repr(self):
"""Generate the written representation of the logfile data."""
def _calculate_total_dipole_moment(self):
"""Calculate the total dipole moment."""
# ccdata.moments may exist, but only contain center-of-mass coordinates
if len(getattr(self.ccdata, 'moments', [])) > 1:
return numpy.linalg.norm(self.ccdata.moments[1])
def _check_required_attributes(self):
"""Check if required attributes are present in ccdata."""
missing = [x for x in self.required_attrs
if not hasattr(self.ccdata, x)]
if missing:
missing = ' '.join(missing)
raise MissingAttributeError(
'Could not parse required attributes to write file: ' + missing)
def _make_openbabel_from_ccdata(self):
"""Create Open Babel and Pybel molecules from ccData."""
if not hasattr(self.ccdata, 'charge'):
logging.warning("ccdata object does not have charge, setting to 0")
_charge = 0
else:
_charge = self.ccdata.charge
if not hasattr(self.ccdata, 'mult'):
logging.warning("ccdata object does not have spin multiplicity, setting to 1")
_mult = 1
else:
_mult = self.ccdata.mult
obmol = makeopenbabel(self.ccdata.atomcoords,
self.ccdata.atomnos,
charge=_charge,
mult=_mult)
if self.jobfilename is not None:
obmol.SetTitle(self.jobfilename)
return (obmol, pb.Molecule(obmol))
def _make_bond_connectivity_from_openbabel(self, obmol):
"""Based upon the Open Babel/Pybel molecule, create a list of tuples
to represent bonding information, where the three integers are
the index of the starting atom, the index of the ending atom,
and the bond order.
"""
bond_connectivities = []
for obbond in ob.OBMolBondIter(obmol):
bond_connectivities.append((obbond.GetBeginAtom().GetIndex(),
obbond.GetEndAtom().GetIndex(),
obbond.GetBondOrder()))
return bond_connectivities
def _fix_indices(self):
"""Clean up the index container type and remove zero-based indices to
prevent duplicate structures and incorrect ordering when
indices are later sorted.
"""
if not self.indices:
self.indices = set()
elif not isinstance(self.indices, Iterable):
self.indices = set([self.indices])
# This is the most likely place to get the number of
# geometries from.
if hasattr(self.ccdata, 'atomcoords'):
lencoords = len(self.ccdata.atomcoords)
indices = set()
for i in self.indices:
if i < 0:
i += lencoords
indices.add(i)
self.indices = indices
return
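        # Illustrative example (not part of the original source): with five
        # parsed geometries, indices=-1 is normalized to {4} and
        # indices=[0, -1] to {0, 4}, so later sorting gives unique, ascending
        # zero-based indices.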
del find_package
cclib-1.6.2/cclib/io/moldenwriter.py 0000664 0000000 0000000 00000023314 13535330462 0017353 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""A writer for MOLDEN format files."""
import os.path
import math
import decimal
from cclib.parser import utils
from cclib.io import filewriter
def round_molden(num, p=6):
"""Molden style number rounding in [Atoms] section."""
# Digit at pth position after dot.
p_digit = math.floor(abs(num) * 10 ** p) % 10
    # If the digit at the pth place after the dot is greater than 5, but is
    # not 7, round the number to p places.
    # Otherwise truncate after the pth digit.
if p_digit > 5 and p_digit != 7:
return round(num, p)
if num >= 0:
return math.floor(num * 10 ** p) / 10 ** p
else:
return math.ceil(num * 10 ** p) / 10 ** p
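# Worked examples (illustrative, not part of the original source):
#   round_molden(0.1234569)  -> 0.123457   (6th digit is 6 > 5, so round)
#   round_molden(0.1234579)  -> 0.123457   (6th digit is 7, so truncate)
#   round_molden(0.1234549)  -> 0.123454   (6th digit is 4, so truncate)
#   round_molden(-0.1234549) -> -0.123454  (truncation is toward zero)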
class MOLDEN(filewriter.Writer):
"""A writer for MOLDEN files."""
required_attrs = ('atomcoords', 'atomnos', 'natom')
def _title(self, path):
"""Return filename without extension to be used as title."""
title = os.path.basename(os.path.splitext(path)[0])
return title
def _coords_from_ccdata(self, index):
"""Create [Atoms] section using geometry at the given index."""
elements = [self.pt.element[Z] for Z in self.ccdata.atomnos]
if self.ghost is not None:
elements = [self.ghost if e is None else e for e in elements]
elif None in elements:
raise ValueError('It seems that there is at least one ghost atom ' +
'in these elements. Please use the ghost flag to'+
' specify a label for the ghost atoms.')
atomcoords = self.ccdata.atomcoords[index]
atomnos = self.ccdata.atomnos
nos = range(self.ccdata.natom)
# element_name number atomic_number x y z
atom_template = '{:2s} {:5d} {:2d} {:12.6f} {:12.6f} {:12.6f}'
lines = []
for element, no, atomno, coord in zip(elements, nos, atomnos,
atomcoords):
x, y, z = map(round_molden, coord)
lines.append(atom_template.format(element, no + 1, atomno,
x, y, z))
return lines
def _gto_from_ccdata(self):
"""Create [GTO] section using gbasis.
atom_sequence_number1 0
shell_label number_of_primitives 1.00
exponent_primitive_1 contraction_coeff_1 (contraction_coeff_1)
...
empty line
atom_sequence__number2 0
"""
gbasis = self.ccdata.gbasis
label_template = '{:s} {:5d} 1.00'
basis_template = '{:15.9e} {:15.9e}'
lines = []
for no, basis in enumerate(gbasis):
lines.append('{:3d} 0'.format(no + 1))
for prims in basis:
lines.append(label_template.format(prims[0].lower(),
len(prims[1])))
for prim in prims[1]:
lines.append(basis_template.format(prim[0], prim[1]))
lines.append('')
lines.append('')
return lines
def _scfconv_from_ccdata(self):
"""Create [SCFCONV] section using gbasis.
scf-first 1 THROUGH 12
-672.634394
...
-673.590571
-673.590571
"""
lines = ["scf-first 1 THROUGH %d" % len(self.ccdata.scfenergies)]
for scfenergy in self.ccdata.scfenergies:
lines.append('{:15.6f}'.format(scfenergy))
return lines
def _rearrange_mocoeffs(self, mocoeffs):
"""Rearrange cartesian F functions in mocoeffs.
Molden's order:
xxx, yyy, zzz, xyy, xxy, xxz, xzz, yzz, yyz, xyz
cclib's order:
XXX, YYY, ZZZ, XXY, XXZ, YYX, YYZ, ZZX, ZZY, XYZ
cclib's order can be converted by:
moving YYX two indexes ahead, and
moving YYZ two indexes back.
"""
aonames = self.ccdata.aonames
mocoeffs = mocoeffs.tolist()
pos_yyx = [key for key, val in enumerate(aonames) if '_YYX' in val]
pos_yyz = [key for key, val in enumerate(aonames) if '_YYZ' in val]
if pos_yyx:
for pos in pos_yyx:
mocoeffs.insert(pos-2, mocoeffs.pop(pos))
if pos_yyz:
for pos in pos_yyz:
mocoeffs.insert(pos+2, mocoeffs.pop(pos))
return mocoeffs
def _mo_from_ccdata(self):
"""Create [MO] section.
Sym= symmetry_label_1
Ene= mo_energy_1
Spin= (Alpha|Beta)
Occup= mo_occupation_number_1
ao_number_1 mo_coefficient_1
...
ao_number_n mo_coefficient_n
...
"""
moenergies = self.ccdata.moenergies
mocoeffs = self.ccdata.mocoeffs
homos = self.ccdata.homos
mult = self.ccdata.mult
has_syms = False
lines = []
# Sym attribute is optional in [MO] section.
if hasattr(self.ccdata, 'mosyms'):
has_syms = True
syms = self.ccdata.mosyms
spin = 'Alpha'
for i in range(mult):
for j in range(len(moenergies[i])):
if has_syms:
lines.append(' Sym= %s' % syms[i][j])
moenergy = utils.convertor(moenergies[i][j], 'eV', 'hartree')
lines.append(' Ene= {:10.4f}'.format(moenergy))
lines.append(' Spin= %s' % spin)
if j <= homos[i]:
lines.append(' Occup= {:10.6f}'.format(2.0 / mult))
else:
lines.append(' Occup= {:10.6f}'.format(0.0))
# Rearrange mocoeffs according to Molden's lexicographical order.
mocoeffs[i][j] = self._rearrange_mocoeffs(mocoeffs[i][j])
for k, mocoeff in enumerate(mocoeffs[i][j]):
lines.append('{:4d} {:10.6f}'.format(k + 1, mocoeff))
spin = 'Beta'
return lines
def generate_repr(self):
"""Generate the MOLDEN representation of the logfile data."""
molden_lines = ['[Molden Format]']
# Title of file.
if self.jobfilename is not None:
molden_lines.append('[Title]')
molden_lines.append(self._title(self.jobfilename))
# Coordinates for the Electron Density/Molecular orbitals.
# [Atoms] (Angs|AU)
unit = "Angs"
molden_lines.append('[Atoms] %s' % unit)
# Last set of coordinates for geometry optimization runs.
index = -1
molden_lines.extend(self._coords_from_ccdata(index))
# Either both [GTO] and [MO] should be present or none of them.
if hasattr(self.ccdata, 'gbasis') and hasattr(self.ccdata, 'mocoeffs')\
and hasattr(self.ccdata, 'moenergies'):
molden_lines.append('[GTO]')
molden_lines.extend(self._gto_from_ccdata())
molden_lines.append('[MO]')
molden_lines.extend(self._mo_from_ccdata())
# Omitting until issue #390 is resolved.
# https://github.com/cclib/cclib/issues/390
# if hasattr(self.ccdata, 'scfenergies'):
# if len(self.ccdata.scfenergies) > 1:
# molden_lines.append('[SCFCONV]')
# molden_lines.extend(self._scfconv_from_ccdata())
# molden_lines.append('')
return '\n'.join(molden_lines)
class MoldenReformatter(object):
"""Reformat Molden output files."""
def __init__(self, filestring):
self.filestring = filestring
def scinotation(self, num):
"""Convert Molden style number formatting to scientific notation.
0.9910616900D+02 --> 9.910617e+01
"""
num = num.replace('D', 'e')
return str('%.9e' % decimal.Decimal(num))
def reformat(self):
"""Reformat Molden output file to:
- use scientific notation,
- split sp molecular orbitals to s and p, and
- replace multiple spaces with single."""
filelines = iter(self.filestring.split("\n"))
lines = []
for line in filelines:
line = line.replace('\n', '')
# Replace multiple spaces with single spaces.
line = ' '.join(line.split())
# Check for [Title] section.
if '[title]' in line.lower():
# skip the title
line = next(filelines)
line = next(filelines)
# Exclude SCFCONV section until issue #390 is resolved.
# https://github.com/cclib/cclib/issues/390
if '[scfconv]' in line.lower():
break
# Although Molden format specifies Sym in [MO] section,
# the Molden program does not print it.
if 'sym' in line.lower():
continue
# Convert D notation to scientific notation.
if 'D' in line:
vals = line.split()
vals = [self.scinotation(i) for i in vals]
lines.append(' '.join(vals))
# Convert sp to s and p orbitals.
elif 'sp' in line:
n_prim = int(line.split()[1])
new_s = ['s ' + str(n_prim) + ' 1.00']
new_p = ['p ' + str(n_prim) + ' 1.00']
while n_prim > 0:
n_prim -= 1
line = next(filelines).split()
new_s.append(self.scinotation(line[0]) + ' '
+ self.scinotation(line[1]))
new_p.append(self.scinotation(line[0]) + ' '
+ self.scinotation(line[2]))
lines.extend(new_s)
lines.extend(new_p)
else:
lines.append(line)
return '\n'.join(lines)
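# Minimal usage sketch (not part of the original source; "calc.log" is a
# hypothetical logfile path).
if __name__ == "__main__":
    from cclib.io import ccopen
    data = ccopen("calc.log").parse()
    print(MOLDEN(data, jobfilename="calc.log").generate_repr())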
cclib-1.6.2/cclib/io/wfxwriter.py 0000664 0000000 0000000 00000045332 13535330462 0016705 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""A writer for wfx format files."""
import os.path
import numpy
from cclib.io import filewriter
from cclib.parser import utils
# Number of orbitals of type key.
# There are 3 p type, 6 d type orbitals etc.
ORBITAL_COUNT = {'S':1, 'P':3, 'D':6, 'F':10, 'G':15, 'H':21}
# Index of first orbital of type key in a list of orbitals.
# The first s orbital has index 1, first p orbital has index 2, and first d
# has index 5.
ORBITAL_INDICES = {'S': 1}
ORBITAL_NAMES = 'SPDFGH'
for idx, name in enumerate(ORBITAL_NAMES[1:], start=1):
prev_orbital_name = ORBITAL_NAMES[idx - 1]
prev_orbital_count = ORBITAL_COUNT[prev_orbital_name]
prev_orbital_index = ORBITAL_INDICES[prev_orbital_name]
ORBITAL_INDICES[name] = prev_orbital_count + prev_orbital_index
PI_CUBE_INV = (2.0 / numpy.pi) ** 3
# Float formatting template.
WFX_FIELD_FMT = '%22.11E'
# Precomputed values for l+m+n to be used in MO normalization.
_L = dict(
[(prim_type, 0) for prim_type in range(1, 2)] + # s
[(prim_type, 1) for prim_type in range(2, 5)] + # p
[(prim_type, 2) for prim_type in range(5, 11)] + # d
[(prim_type, 3) for prim_type in range(11, 21)] + # f
[(prim_type, 4) for prim_type in range(21, 36)] # g
)
# Precomputed values for ((2l-1)!! * (2m-1)!! * (2n-1)!!).
_M = dict(
[(L, 1) for L in range(1, 5)] +
[(L, 9) for L in range(5, 8)] +
[(L, 1) for L in range(8, 11)] +
[(L, 225) for L in range(11, 14)] +
[(L, 9) for L in range(14, 20)] +
[(L, 1) for L in range(20, 21)] +
[(L, 11025) for L in range(21, 24)] +
[(L, 225) for L in range(24, 30)] +
[(L, 81) for L in range(30, 33)] +
[(L, 9) for L in range(33, 36)]
)
def _section(section_name, section_data):
"""Add opening/closing section_name tags to data."""
opening_tag = ['<' + section_name + '>']
    closing_tag = ['</' + section_name + '>']
section = None
if isinstance(section_data, list):
section = opening_tag + section_data + closing_tag
elif isinstance(section_data, str):
section = opening_tag + (' ' + section_data).split('\n') + closing_tag
elif isinstance(section_data, int) or isinstance(section_data, float):
section = opening_tag + [' ' + str(section_data)] + closing_tag
return section
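# For example (illustrative, not part of the original source):
#   _section('Net Charge', 0) returns ['<Net Charge>', ' 0', '</Net Charge>'],
#   ready to be joined with newlines in generate_repr().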
def _list_format(data, per_line, style=WFX_FIELD_FMT):
"""Format lists for pretty print."""
template = style * per_line
leftover = len(data) % per_line
# Template for last line.
last_template = style * leftover
pretty_list = [template%tuple(data[i:i+per_line])
for i in range(0, len(data) - leftover, per_line)]
if leftover:
return pretty_list + [last_template%tuple(data[-1*leftover:])]
return pretty_list
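# For example (illustrative, not part of the original source):
#   _list_format([1, 2, 3], 2, '%d ') returns ['1 2 ', '3 '], i.e. two values
#   per line with the remainder on a final, shorter line.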
class WFXWriter(filewriter.Writer):
"""A writer for wfx files."""
required_attrs = ('natom', 'atomcoords', 'atomnos', 'gbasis', 'charge',
'homos', 'mult', 'mocoeffs')
def _title(self):
"""Section: Title
Return filename without extension to be used as title."""
title = "Written by cclib."
if self.jobfilename is not None:
return os.path.basename(os.path.splitext(self.jobfilename)[0]) +\
'. ' + title
return title
def _keywords(self):
"""Section: Keywords.
Return one of GTO, GIAO, CSGT keyword."""
# Currently only GTO is supported.
return 'GTO'
def _no_of_nuclei(self):
"""Section: Number of Nuclei."""
return self.ccdata.natom
def _no_of_prims(self):
"""Section: Number of Primitives."""
nprims = 0
for atom in self.ccdata.gbasis:
for prims in atom:
nprims += ORBITAL_COUNT[prims[0]] * len(prims[1])
return nprims
def _no_of_mos(self):
"""Section: Number of Occupied Molecular Orbitals."""
return int(max(self.ccdata.homos)) + 1
def _no_of_perturbations(self):
"""Section: Number of Perturbation.
This is usually zero. For GIAO it should be 3
(corresponding to Lx, Ly and Lz), and
for CSGT it should be 6
(corresponding to Lx, Ly, Lz, Px, Py and Pz).
"""
if 'GIAO' in self._keywords():
return 3
elif 'CSGT' in self._keywords():
return 6
return 0
def _nuclear_names(self):
"""Section: Nuclear Names.
Names of nuclei present in the molecule.
O1
H2
H3
"""
return [self.pt.element[Z]+str(i) for i, Z in
enumerate(self.ccdata.atomnos, start=1)]
def _atomic_nos(self):
"""Section: Atomic Numbers."""
return [str(Z) for Z in self.ccdata.atomnos]
def _nuclear_charges(self):
"""Section: Nuclear Charges."""
nuclear_charge = [WFX_FIELD_FMT % Z for Z in self.ccdata.atomnos]
if hasattr(self.ccdata, 'coreelectrons'):
nuclear_charge = [WFX_FIELD_FMT % Z
for Z in self.ccdata.atomnos -
self.ccdata.coreelectrons]
return nuclear_charge
def _nuclear_coords(self):
"""Section: Nuclear Cartesian Coordinates.
Nuclear coordinates in Bohr."""
coord_template = WFX_FIELD_FMT * 3
to_bohr = lambda x: utils.convertor(x, 'Angstrom', 'bohr')
nuc_coords = [coord_template % tuple(to_bohr(coord))
for coord in self.ccdata.atomcoords[-1]]
return nuc_coords
def _net_charge(self):
"""Section: Net Charge.
Net charge on molecule."""
return WFX_FIELD_FMT % self.ccdata.charge
def _no_electrons(self):
"""Section: Number of Electrons."""
return int(self.ccdata.nelectrons)
def _no_alpha_electrons(self):
"""Section: Number of Alpha Electrons."""
return int(numpy.ceil(self.ccdata.nelectrons / 2.0))
def _no_beta_electrons(self):
"""Section: Number of Beta Electrons."""
return int(self.ccdata.nelectrons - self._no_alpha_electrons())
def _spin_mult(self):
"""Section: Electronic Spin Multiplicity"""
return self.ccdata.mult
def _prim_centers(self):
"""Section: Primitive Centers.
List of nuclear numbers upon which the primitive basis functions
are centered."""
prim_centers = []
for nuc_num, atom in enumerate(self.ccdata.gbasis, start=1):
for prims in atom:
prim_centers += [nuc_num] * ORBITAL_COUNT[prims[0]]\
* len(prims[1])
return _list_format(prim_centers, 10, '%d ')
def _rearrange_modata(self, data):
"""Rearranges MO related data according the expected order of
Cartesian gaussian primitive types in wfx format.
cclib parses mocoeffs in the order they occur in output files.
"""
prim_types = self._get_prim_types()
if isinstance(data, numpy.ndarray):
data = data.tolist()
pos_yyx = [key for key, val in enumerate(prim_types)
if val == 17]
pos_yyz = [key for key, val in enumerate(prim_types)
if val == 16]
if pos_yyx:
for pos in pos_yyx:
data.insert(pos-3, data.pop(pos))
if pos_yyz:
for pos in pos_yyz:
data.insert(pos+3, data.pop(pos + 1))
return data
def _get_prim_types(self):
"""List of primitive types.
Definition of the Cartesian Gaussian primitive types is as follows:
1 S, 2 PX, 3 PY, 4 PZ, 5 DXX, 6 DYY, 7 DZZ, 8 DXY, 9 DXZ, 10 DYZ,
11 FXXX, 12 FYYY, 13 FZZZ, 14 FXXY, 15 FXXZ, 16 FYYZ, 17 FXYY,
18 FXZZ, 19 FYZZ, 20 FXYZ,
21 GXXXX, 22 GYYYY, 23 GZZZZ, 24 GXXXY, 25 GXXXZ, 26 GXYYY,
27 GYYYZ, 28 GXZZZ,
29 GYZZZ, 30 GXXYY, 31 GXXZZ, 32 GYYZZ, 33 GXXYZ, 34 GXYYZ,
35 GXYZZ,
36 HZZZZZ, 37 HYZZZZ, 38 HYYZZZ, 39 HYYYZZ,
40 HYYYYZ, 41 HYYYYY, 42 HXZZZZ, 43 HXYZZZ, 44 HXYYZZ,
45 HXYYYZ, 46 HXYYYY, 47 HXXZZZ, 48 HXXYZZ, 49 HXXYYZ,
50 HXXYYY, 51 HXXXZZ, 52 HXXXYZ, 53 HXXXYY, 54 HXXXXZ, 55 HXXXXY,
56 HXXXXX
        Spherical basis functions are not currently supported by the writer.
"""
prim_types = []
for atom in self.ccdata.gbasis:
for prims in atom:
prim_orb = []
for i in range(ORBITAL_COUNT[prims[0]]):
prim_orb += [(ORBITAL_INDICES[prims[0]] + i)]\
* len(prims[1])
prim_types += prim_orb
return prim_types
def _prim_types(self):
"""Section: Primitive Types."""
prim_types = self._get_prim_types()
# GAMESS specific reordering.
if self.ccdata.metadata['package'] == 'GAMESS':
prim_types = self._rearrange_modata(prim_types)
return _list_format(prim_types, 10, '%d ')
def _prim_exps(self):
"""Section: Primitive Exponents.
Space-separated list of primitive exponents."""
prim_exps = []
for atom in self.ccdata.gbasis:
for prims in atom:
prim_exps += [prim[0] for prim in prims[1]]\
* ORBITAL_COUNT[prims[0]]
return _list_format(prim_exps, 5)
def _mo_occup_nos(self):
"""Section: Molecular Orbital Occupation Numbers."""
occup = []
electrons = self._no_electrons()
alpha = self._no_alpha_electrons()
beta = self._no_beta_electrons()
if len(self.ccdata.homos) == 1:
occup += [WFX_FIELD_FMT % (2)] * int(electrons / 2) +\
[WFX_FIELD_FMT % (1)] * (electrons % 2)
else:
            occup += [WFX_FIELD_FMT % (1)] * alpha +\
                     [WFX_FIELD_FMT % (1)] * beta
return occup
def _mo_energies(self):
"""Section: Molecular Orbital Energies."""
mo_energies = []
        alpha_electrons = self._no_alpha_electrons()
        beta_electrons = self._no_beta_electrons()
        for mo_energy in self.ccdata.moenergies[0][:alpha_electrons]:
mo_energies.append(WFX_FIELD_FMT % (
utils.convertor(mo_energy, 'eV', 'hartree')))
if self.ccdata.mult > 1:
for mo_energy in self.ccdata.moenergies[1][:beta_electrons]:
mo_energies.append(WFX_FIELD_FMT % (
utils.convertor(mo_energy, 'eV', 'hartree')))
return mo_energies
def _mo_spin_types(self):
"""Section: Molecular Orbital Spin Types."""
spin_types = []
electrons = self._no_electrons()
alpha = self._no_alpha_electrons()
beta = self._no_beta_electrons()
if len(self.ccdata.homos) == 1:
spin_types += ['Alpha and Beta'] * int(electrons / 2) +\
['Alpha'] * (electrons % 2)
else:
spin_types += ['Alpha'] * alpha +\
['Beta'] * beta
return spin_types
def _normalize(self, prim_type, alpha=1.0):
"""Normalization factor for Cartesian Gaussian Functions.
N**4 = (2/pi)**3 * 2**(l+m+n) * alpha**(3 + 2(l+m+n)) /
((2l-1)!! * (2m-1)!! * (2n-1)!!)**2
= (2/pi)**3 * 2**(L) * alpha**(3 + 2L) /
M**2,
L = l+m+n,
M = ((2l-1)!! * (2m-1)!! * (2n-1)!!)
"""
L = _L[prim_type]
M = _M[prim_type]
norm_four = PI_CUBE_INV * 2**(4*L) * alpha**(3+2*L) / M
norm = numpy.power(norm_four, 1/4.0)
return norm
def _rearrange_mocoeffs(self, mocoeffs):
"""Rearrange cartesian F functions in mocoeffs.
Expected order:
xxx, yyy, zzz, xyy, xxy, xxz, xzz, yzz, yyz, xyz
cclib's order for GAMESS:
XXX, YYY, ZZZ, XXY, XXZ, YYX, YYZ, ZZX, ZZY, XYZ
"""
aonames = self.ccdata.aonames
mocoeffs = mocoeffs.tolist()
pos_yyx = [key for key, val in enumerate(aonames)
if '_YYX' in val]
pos_yyz = [key for key, val in enumerate(aonames)
if '_YYZ' in val]
if pos_yyx:
for pos in pos_yyx:
mocoeffs.insert(pos-2, mocoeffs.pop(pos))
if pos_yyz:
for pos in pos_yyz:
mocoeffs.insert(pos+2, mocoeffs.pop(pos))
return mocoeffs
def _norm_mat(self):
"""Calculate normalization matrix for normalizing MOcoeffs."""
alpha = []
prim_coeff = []
mo_count = []
prim_type = self._get_prim_types()
for atom in self.ccdata.gbasis:
for prims in atom:
prim_orb = []
mo_count += [len(prims[1])] * ORBITAL_COUNT[prims[0]]
for i in range(ORBITAL_COUNT[prims[0]]):
norb = ORBITAL_INDICES[prims[0]]
prim_orb += [norb + i]
alpha += [prim[0] for prim in prims[1]]
prim_coeff += [prim[1] for prim in prims[1]]
# GAMESS specific reordering.
if self.ccdata.metadata['package'] == 'GAMESS':
prim_type = self._rearrange_modata(self._get_prim_types())
alpha = self._rearrange_modata(alpha)
prim_coeff = self._rearrange_modata(prim_coeff)
norm_mat = [self._normalize(prim_type[i], alpha[i]) * prim_coeff[i]
for i in range(len(prim_coeff))]
return (norm_mat, mo_count, prim_coeff)
def _nmos(self):
"""Return number of molecular orbitals to be printed."""
return self.ccdata.nelectrons if self.ccdata.mult > 1\
else self._no_of_mos()
def _prim_mocoeff(self, mo_count):
"""Return primitve mocoeffs array."""
prim_mocoeff = []
for i in range(self.ccdata.mult):
for j in range(self._nmos()):
mocoeffs = self.ccdata.mocoeffs[i][j]
if self.ccdata.metadata['package'] == 'GAMESS':
mocoeffs = self._rearrange_mocoeffs(self.ccdata.mocoeffs[i][j])
for k, mocoeff in enumerate(mocoeffs):
prim_mocoeff += [mocoeff] * mo_count[k]
return prim_mocoeff
def _normalized_mocoeffs(self):
"""Raw-Primitive Expansion coefficients for each normalized MO."""
# Normalization Matrix.
norm_mat, mo_count, prim_coeff = self._norm_mat()
prim_mocoeff = self._prim_mocoeff(mo_count)
norm_mocoeffs = []
for mo_num in range(self._nmos()):
norm_mocoeffs.append([norm_mat[i] *
prim_mocoeff[i + mo_num * len(prim_coeff)]
for i in range(len(prim_coeff))])
return norm_mocoeffs
def _mo_prim_coeffs(self):
"""Section: Molecular Orbital Primitive Coefficients."""
# Normalized MO Coeffs.
norm_mocoeffs = self._normalized_mocoeffs()
mocoeffs_section = []
for mo_num, mocoeffs in enumerate(norm_mocoeffs):
mocoeffs_section.extend(_section('MO Number', mo_num + 1))
mocoeffs_section.extend(_list_format
(mocoeffs, 5))
return mocoeffs_section
def _energy(self):
"""Section: Energy = T + Vne + Vee + Vnn.
The total energy of the molecule.
HF and KSDFT: SCF energy (scfenergies),
MP2 : MP2 total energy (mpenergies),
CCSD : CCSD total energy (ccenergies).
"""
energy = 0
if hasattr(self.ccdata, 'ccenergies'):
energy = self.ccdata.ccenergies[-1]
elif hasattr(self.ccdata, 'mpenergies'):
energy = self.ccdata.mpenergies[-1][-1]
elif hasattr(self.ccdata, 'scfenergies'):
energy = self.ccdata.scfenergies[-1]
else:
raise filewriter.MissingAttributeError(
'scfenergies/mpenergies/ccenergies')
return WFX_FIELD_FMT % (utils.convertor(energy, 'eV', 'hartree'))
def _virial_ratio(self):
"""Ratio of kinetic energy to potential energy."""
# Hardcoding expected value for Required Field.
return WFX_FIELD_FMT % (2.0)
def generate_repr(self):
"""Generate the wfx representation of the logfile data."""
# sections:(Function returning data for section,
# Section heading,
# Required)
sections = [
(self._title, "Title", True),
(self._keywords, "Keywords", True),
(self._no_of_nuclei, "Number of Nuclei", True),
(self._no_of_prims, "Number of Primitives", True),
(self._no_of_mos, "Number of Occupied Molecular Orbitals", True),
(self._no_of_perturbations, "Number of Perturbations", True),
(self._nuclear_names, "Nuclear Names", True),
(self._atomic_nos, "Atomic Numbers", True),
(self._nuclear_charges, "Nuclear Charges", True),
(self._nuclear_coords, "Nuclear Cartesian Coordinates", True),
(self._net_charge, "Net Charge", True),
(self._no_electrons, "Number of Electrons", True),
(self._no_alpha_electrons, "Number of Alpha Electrons", True),
(self._no_beta_electrons, "Number of Beta Electrons", True),
(self._spin_mult, "Electronic Spin Multiplicity", False),
# (self._model, "Model", False),
(self._prim_centers, "Primitive Centers", True),
(self._prim_types, "Primitive Types", True),
(self._prim_exps, "Primitive Exponents", True),
(self._mo_occup_nos,
"Molecular Orbital Occupation Numbers", True),
(self._mo_energies, "Molecular Orbital Energies", True),
(self._mo_spin_types, "Molecular Orbital Spin Types", True),
(self._mo_prim_coeffs,
"Molecular Orbital Primitive Coefficients", True),
(self._energy, "Energy = T + Vne + Vee + Vnn", True),
# (self._nuc_energy_gradients,
# "Nuclear Cartesian Energy Gradients", False),
# (self._nuc_virial,
# "Nuclear Virial of Energy-Gradient-Based Forces on Nuclei, W",
# False),
(self._virial_ratio, "Virial Ratio (-V/T)", True),
]
wfx_lines = []
for section_module, section_name, section_required in sections:
try:
section_data = section_module()
wfx_lines.extend(_section(section_name, section_data))
            except Exception:
if section_required:
raise filewriter.MissingAttributeError(
'Unable to write required wfx section: '
+ section_name)
wfx_lines.append('')
return '\n'.join(wfx_lines)
cclib-1.6.2/cclib/io/xyzreader.py 0000664 0000000 0000000 00000004626 13535330462 0016662 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""A reader for XYZ (Cartesian coordinate) files."""
from cclib.io import filereader
from cclib.parser.data import ccData
from cclib.parser.utils import PeriodicTable
class XYZ(filereader.Reader):
"""A reader for XYZ (Cartesian coordinate) files."""
def __init__(self, source, *args, **kwargs):
super(XYZ, self).__init__(source, *args, **kwargs)
self.pt = PeriodicTable()
def parse(self):
super(XYZ, self).parse()
self.generate_repr()
return self.data
def generate_repr(self):
"""Convert the raw contents of the source into the internal representation."""
assert hasattr(self, 'filecontents')
it = iter(self.filecontents.splitlines())
# Ordering of lines:
# 1. number of atoms
# 2. comment line
# 3. line of at least 4 columns: 1 is atomic symbol (str), 2-4 are atomic coordinates (float)
        # repeat for the number of atoms
# (4. optional blank line)
# repeat for multiple sets of coordinates
all_atomcoords = []
comments = []
while True:
try:
line = next(it)
if line.strip() == '':
line = next(it)
tokens = line.split()
assert len(tokens) >= 1
natom = int(tokens[0])
comments.append(next(it))
lines = []
for _ in range(natom):
line = next(it)
tokens = line.split()
assert len(tokens) >= 4
lines.append(tokens)
assert len(lines) == natom
atomsyms = [line[0] for line in lines]
atomnos = [self.pt.number[atomsym] for atomsym in atomsyms]
atomcoords = [line[1:4] for line in lines]
# Everything beyond the fourth column is ignored.
all_atomcoords.append(atomcoords)
except StopIteration:
break
attributes = {
'natom': natom,
'atomnos': atomnos,
'atomcoords': all_atomcoords,
'metadata': {"comments": comments},
}
self.data = ccData(attributes)
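# Minimal usage sketch (not part of the original source; "geometry.xyz" is a
# hypothetical multi-frame XYZ file).
if __name__ == "__main__":
    data = XYZ("geometry.xyz").parse()
    print(data.natom, len(data.atomcoords), data.metadata["comments"][0])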
cclib-1.6.2/cclib/io/xyzwriter.py 0000664 0000000 0000000 00000010102 13535330462 0016716 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""A writer for XYZ (Cartesian coordinate) files."""
from cclib.io import filewriter
class XYZ(filewriter.Writer):
"""A writer for XYZ (Cartesian coordinate) files."""
def __init__(self, ccdata, splitfiles=False,
firstgeom=False, lastgeom=False, allgeom=False,
*args, **kwargs):
"""Initialize the XYZ writer object.
Inputs:
          ccdata - An instance of ccData, parsed from a logfile.
          splitfiles - Boolean to write each requested geometry to a separate file. [TODO]
firstgeom - Boolean to write the first available geometry from the logfile.
lastgeom - Boolean to write the last available geometry from the logfile.
allgeom - Boolean to write all available geometries from the logfile.
"""
self.required_attrs = ('natom', 'atomcoords', 'atomnos')
# Call the __init__ method of the superclass
super(XYZ, self).__init__(ccdata, *args, **kwargs)
self.do_firstgeom = firstgeom
self.do_lastgeom = lastgeom
self.do_allgeom = allgeom
self.natom = str(self.ccdata.natom)
self.element_list = [self.pt.element[Z] for Z in self.ccdata.atomnos]
def generate_repr(self):
"""Generate the XYZ representation of the logfile data."""
# Options for output (to a single file):
# 1. Write all geometries from an optimization, which programs like VMD
# can read in like a trajectory.
# 2. Write the final converged geometry, which for any job other than
# a geometry optimization would be the single/only geometry.
# 3. Write the very first geometry, which for any job other than a
# geometry optimization would be the single/only geometry.
# 4. Write the first and last geometries from a geometry optimization.
# 5. Write arbitrary structures via zero-based indexing.
# TODO: Options for output (to multiple files)
xyzblock = []
lencoords = len(self.ccdata.atomcoords)
# Collect the indices.
if lencoords == 1 or self.do_firstgeom:
self.indices.add(0)
if self.do_lastgeom:
self.indices.add(lencoords - 1)
if self.do_allgeom:
for i in range(lencoords):
self.indices.add(i)
# Generate the XYZ string for each index.
indices = sorted(self.indices)
if not indices:
indices = [-1]
for i in indices:
xyzblock.append(self._xyz_from_ccdata(i))
# Ensure an extra newline at the very end.
xyzblock.append('')
return '\n'.join(xyzblock)
def _xyz_from_ccdata(self, index):
"""Create an XYZ file of the geometry at the given index."""
atomcoords = self.ccdata.atomcoords[index]
existing_comment = "" if not self.ccdata.metadata["comments"] \
else self.ccdata.metadata["comments"][index]
# Create a comment derived from the filename and the index.
if index == -1:
geometry_num = len(self.ccdata.atomcoords)
else:
geometry_num = index + 1
if self.jobfilename is not None:
comment = "{}: Geometry {}".format(self.jobfilename, geometry_num)
else:
comment = "Geometry {}".format(geometry_num)
# Wrap the geometry number part of the comment in square brackets,
# prefixing it with one previously parsed if it existed.
if existing_comment:
comment = "{} [{}]".format(existing_comment, comment)
else:
comment = "[{}]".format(comment)
atom_template = '{:3s} {:15.10f} {:15.10f} {:15.10f}'
block = []
block.append(self.natom)
block.append(comment)
for element, (x, y, z) in zip(self.element_list, atomcoords):
block.append(atom_template.format(element, x, y, z))
return '\n'.join(block)
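# Minimal round-trip sketch (not part of the original source; "geometry.xyz"
# is a hypothetical input file).
if __name__ == "__main__":
    from cclib.io.xyzreader import XYZ as XYZReader
    data = XYZReader("geometry.xyz").parse()
    print(XYZ(data, allgeom=True).generate_repr())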
cclib-1.6.2/cclib/method/ 0000775 0000000 0000000 00000000000 13535330462 0015134 5 ustar 00root root 0000000 0000000 cclib-1.6.2/cclib/method/__init__.py 0000664 0000000 0000000 00000001406 13535330462 0017246 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Example analyses and calculations based on data parsed by cclib."""
from cclib.method.cda import CDA
from cclib.method.cspa import CSPA
from cclib.method.density import Density
from cclib.method.electrons import Electrons
from cclib.method.fragments import FragmentAnalysis
from cclib.method.lpa import LPA
from cclib.method.mbo import MBO
from cclib.method.moments import Moments
from cclib.method.mpa import MPA
from cclib.method.nuclear import Nuclear
from cclib.method.opa import OPA
from cclib.method.orbitals import Orbitals
# from cclib.method.volume import Volume
cclib-1.6.2/cclib/method/calculationmethod.py 0000664 0000000 0000000 00000004312 13535330462 0021205 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Abstract based class for cclib methods."""
import logging
import sys
class MissingAttributeError(Exception):
pass
class Method(object):
"""Abstract base class for all cclib method classes.
Subclasses defined by cclib:
        CDA - charge decomposition analysis
CSPA - C-squared population analysis
Density - density matrix calculation
FragmentAnalysis - fragment analysis for ADF output
LPA - Löwdin population analysis
MBO - Mayer's bond orders
Moments - multipole moments calculations
MPA - Mulliken population analysis
Nuclear - properties of atomic nuclei
OPA - overlap population analysis
Population - base class for population analyses
Volume - volume/grid calculations
All the modules containing methods should be importable.
"""
required_attrs = ()
def __init__(self, data, progress=None, loglevel=logging.INFO, logname="Log"):
"""Initialise the Logfile object.
This constructor is typically called by the constructor of a subclass.
"""
self.data = data
self.progress = progress
self.loglevel = loglevel
self.logname = logname
self._check_required_attributes()
self.logger = logging.getLogger('%s %s' % (self.logname, self.data))
self.logger.setLevel(self.loglevel)
self.logformat = "[%(name)s %(levelname)s] %(message)s"
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter(self.logformat))
self.logger.addHandler(handler)
def _check_required_attributes(self):
"""Check if required attributes are present in data."""
missing = [x for x in self.required_attrs
if not hasattr(self.data, x)]
if missing:
missing = ' '.join(missing)
raise MissingAttributeError(
'Could not parse required attributes to use method: ' + missing)
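# Sketch of a minimal Method subclass (illustrative only, not part of the
# original source): required_attrs is checked at construction time.
#
#     class AtomCounter(Method):
#         required_attrs = ('natom',)
#         def calculate(self):
#             return self.data.natom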
cclib-1.6.2/cclib/method/cda.py 0000664 0000000 0000000 00000010555 13535330462 0016243 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Charge Decomposition Analysis (CDA)"""
from __future__ import print_function
import random
import numpy
from cclib.method.fragments import FragmentAnalysis
class CDA(FragmentAnalysis):
"""Charge Decomposition Analysis (CDA)"""
def __init__(self, *args):
# Call the __init__ method of the superclass.
super(FragmentAnalysis, self).__init__(logname="CDA", *args)
def __str__(self):
"""Return a string representation of the object."""
return "CDA of %s" % (self.data)
def __repr__(self):
"""Return a representation of the object."""
return 'CDA("%s")' % (self.data)
def calculate(self, fragments, cupdate=0.05):
"""Perform the charge decomposition analysis.
Inputs:
fragments - list of ccData data objects
"""
retval = super(CDA, self).calculate(fragments, cupdate)
if not retval:
return False
# At this point, there should be a mocoeffs and fooverlaps
# in analogy to a ccData object.
donations = []
bdonations = []
repulsions = []
residuals = []
if len(self.mocoeffs) == 2:
occs = 1
else:
occs = 2
        # Initialize progress if available.
nstep = self.data.homos[0]
if len(self.data.homos) == 2:
nstep += self.data.homos[1]
if self.progress:
self.progress.initialize(nstep)
# Begin the actual method.
step = 0
for spin in range(len(self.mocoeffs)):
size = len(self.mocoeffs[spin])
homo = self.data.homos[spin]
if len(fragments[0].homos) == 2:
homoa = fragments[0].homos[spin]
else:
homoa = fragments[0].homos[0]
if len(fragments[1].homos) == 2:
homob = fragments[1].homos[spin]
else:
homob = fragments[1].homos[0]
self.logger.info("handling spin unrestricted")
if spin == 0:
fooverlaps = self.fooverlaps
elif spin == 1 and hasattr(self, "fooverlaps2"):
fooverlaps = self.fooverlaps2
offset = fragments[0].nbasis
self.logger.info("Creating donations, bdonations, and repulsions: array[]")
donations.append(numpy.zeros(size, "d"))
bdonations.append(numpy.zeros(size, "d"))
repulsions.append(numpy.zeros(size, "d"))
residuals.append(numpy.zeros(size, "d"))
for i in range(self.data.homos[spin] + 1):
# Calculate donation for each MO.
for k in range(0, homoa + 1):
for n in range(offset + homob + 1, self.data.nbasis):
donations[spin][i] += 2 * occs * self.mocoeffs[spin][i,k] \
* self.mocoeffs[spin][i,n] * fooverlaps[k][n]
for l in range(offset, offset + homob + 1):
for m in range(homoa + 1, offset):
bdonations[spin][i] += 2 * occs * self.mocoeffs[spin][i,l] \
* self.mocoeffs[spin][i,m] * fooverlaps[l][m]
for k in range(0, homoa + 1):
for m in range(offset, offset+homob + 1):
repulsions[spin][i] += 2 * occs * self.mocoeffs[spin][i,k] \
* self.mocoeffs[spin][i, m] * fooverlaps[k][m]
for m in range(homoa + 1, offset):
for n in range(offset + homob + 1, self.data.nbasis):
residuals[spin][i] += 2 * occs * self.mocoeffs[spin][i,m] \
* self.mocoeffs[spin][i, n] * fooverlaps[m][n]
step += 1
if self.progress and random.random() < cupdate:
self.progress.update(step, "Charge Decomposition Analysis...")
if self.progress:
self.progress.update(nstep, "Done.")
self.donations = donations
self.bdonations = bdonations
self.repulsions = repulsions
self.residuals = residuals
return True
cclib-1.6.2/cclib/method/cspa.py 0000664 0000000 0000000 00000007056 13535330462 0016444 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2018, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""C-squared population analysis."""
import random
import numpy
from cclib.method.population import Population
class CSPA(Population):
"""The C-squared population analysis."""
# Overlaps are not required for CSPA.
overlap_attributes = ()
def __init__(self, *args):
# Call the __init__ method of the superclass.
super(CSPA, self).__init__(logname="CSPA", *args)
def __str__(self):
"""Return a string representation of the object."""
return "CSPA of %s" % (self.data)
def __repr__(self):
"""Return a representation of the object."""
return 'CSPA("%s")' % (self.data)
def calculate(self, indices=None, fupdate=0.05):
"""Perform the C squared population analysis.
Inputs:
indices - list of lists containing atomic orbital indices of fragments
"""
self.logger.info("Creating attribute aoresults: array[3]")
# Determine number of steps, and whether process involves beta orbitals.
unrestricted = (len(self.data.mocoeffs)==2)
nbasis = self.data.nbasis
self.aoresults = []
alpha = len(self.data.mocoeffs[0])
self.aoresults.append(numpy.zeros([alpha, nbasis], "d"))
nstep = alpha
if unrestricted:
beta = len(self.data.mocoeffs[1])
self.aoresults.append(numpy.zeros([beta, nbasis], "d"))
nstep += beta
        # Initialize progress if available.
if self.progress:
self.progress.initialize(nstep)
step = 0
for spin in range(len(self.data.mocoeffs)):
for i in range(len(self.data.mocoeffs[spin])):
if self.progress and random.random() < fupdate:
self.progress.update(step, "C^2 Population Analysis")
submocoeffs = self.data.mocoeffs[spin][i]
scale = numpy.inner(submocoeffs, submocoeffs)
tempcoeffs = numpy.multiply(submocoeffs, submocoeffs)
tempvec = tempcoeffs/scale
self.aoresults[spin][i] = numpy.divide(tempcoeffs, scale).astype("d")
step += 1
if self.progress:
self.progress.update(nstep, "Done")
retval = super(CSPA, self).partition(indices)
if not retval:
self.logger.error("Error in partitioning results")
return False
self.logger.info("Creating fragcharges: array[1]")
size = len(self.fragresults[0][0])
self.fragcharges = numpy.zeros([size], "d")
alpha = numpy.zeros([size], "d")
if unrestricted:
beta = numpy.zeros([size], "d")
for spin in range(len(self.fragresults)):
for i in range(self.data.homos[spin] + 1):
temp = numpy.reshape(self.fragresults[spin][i], (size,))
self.fragcharges = numpy.add(self.fragcharges, temp)
if spin == 0:
alpha = numpy.add(alpha, temp)
elif spin == 1:
beta = numpy.add(beta, temp)
if not unrestricted:
self.fragcharges = numpy.multiply(self.fragcharges, 2)
else:
self.logger.info("Creating fragspins: array[1]")
self.fragspins = numpy.subtract(alpha, beta)
return True
cclib-1.6.2/cclib/method/density.py 0000664 0000000 0000000 00000005626 13535330462 0017176 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Building the density matrix from data parsed by cclib."""
import logging
import random
import numpy
from cclib.method.calculationmethod import Method
class Density(Method):
"""Calculate the density matrix"""
def __init__(self, data, progress=None, loglevel=logging.INFO,
logname="Density"):
# Call the __init__ method of the superclass.
super(Density, self).__init__(data, progress, loglevel, logname)
def __str__(self):
"""Return a string representation of the object."""
return "Density matrix of %s" % (self.data)
def __repr__(self):
"""Return a representation of the object."""
return 'Density matrix("%s")' % (self.data)
def calculate(self, fupdate=0.05):
"""Calculate the density matrix."""
# Do we have the needed info in the data object?
if not hasattr(self.data, "mocoeffs"):
self.logger.error("Missing mocoeffs")
return False
if not hasattr(self.data,"nbasis"):
self.logger.error("Missing nbasis")
return False
if not hasattr(self.data,"homos"):
self.logger.error("Missing homos")
return False
self.logger.info("Creating attribute density: array[3]")
size = self.data.nbasis
unrestricted = (len(self.data.mocoeffs) == 2)
#determine number of steps, and whether process involves beta orbitals
nstep = self.data.homos[0] + 1
if unrestricted:
self.density = numpy.zeros([2, size, size], "d")
nstep += self.data.homos[1] + 1
else:
self.density = numpy.zeros([1, size, size], "d")
        #initialize progress if available
if self.progress:
self.progress.initialize(nstep)
step = 0
for spin in range(len(self.data.mocoeffs)):
for i in range(self.data.homos[spin] + 1):
if self.progress and random.random() < fupdate:
self.progress.update(step, "Density Matrix")
col = numpy.reshape(self.data.mocoeffs[spin][i], (size, 1))
colt = numpy.reshape(col, (1, size))
tempdensity = numpy.dot(col, colt)
self.density[spin] = numpy.add(self.density[spin],
tempdensity)
step += 1
if not unrestricted: #multiply by two to account for second electron
self.density[0] = numpy.add(self.density[0], self.density[0])
if self.progress:
self.progress.update(nstep, "Done")
return True #let caller know we finished density
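# Minimal usage sketch (not part of the original source; "calc.out" is a
# hypothetical logfile providing mocoeffs, nbasis and homos).
if __name__ == "__main__":
    from cclib.io import ccopen
    data = ccopen("calc.out").parse()
    density = Density(data)
    if density.calculate():
        print(density.density.shape)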
cclib-1.6.2/cclib/method/electrons.py 0000664 0000000 0000000 00000002662 13535330462 0017512 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Calculate properties for electrons."""
import logging
import numpy
from cclib.method.calculationmethod import Method
class Electrons(Method):
"""A container for methods pertaining to electrons."""
def __init__(self, data, progress=None, loglevel=logging.INFO, logname="Log"):
self.required_attrs = ('atomnos','charge','coreelectrons','homos')
super(Electrons, self).__init__(data, progress, loglevel, logname)
def __str__(self):
"""Returns a string representation of the object."""
return "Electrons"
def __repr__(self):
"""Returns a representation of the object."""
return "Electrons"
def alpha(self):
"""Number of alpha electrons"""
return self.data.homos[0] + 1
def beta(self):
"""Number of beta electrons"""
return self.data.homos[-1] + 1
def count(self, core=False):
"""Returns the electron count in system.
Normally returns electrons used in calculation, but will include
core electrons in pseudopotentials if core is True.
"""
nelectrons = sum(self.data.atomnos) - self.data.charge
if core:
nelectrons += sum(self.data.coreelectrons)
return nelectrons
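# A minimal usage sketch, assuming an output file (placeholder name "calc.out")
# that cclib can parse; it reports the electron counts for the parsed system.
if __name__ == "__main__":
    from cclib.io.ccio import ccopen
    data = ccopen("calc.out").parse()
    el = Electrons(data)
    print("alpha: %d, beta: %d, total: %d" % (el.alpha(), el.beta(), el.count()))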
cclib-1.6.2/cclib/method/fragments.py 0000664 0000000 0000000 00000012514 13535330462 0017477 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Fragment analysis based on parsed ADF data."""
import logging
import random
import numpy
from cclib.method.calculationmethod import Method
class FragmentAnalysis(Method):
"""Convert a molecule's basis functions from atomic-based to fragment MO-based"""
def __init__(self, data, progress=None, loglevel=logging.INFO,
logname="FragmentAnalysis of"):
# Call the __init__ method of the superclass.
super(FragmentAnalysis, self).__init__(data, progress, loglevel, logname)
self.parsed = False
def __str__(self):
"""Return a string representation of the object."""
return "Fragment molecule basis of %s" % (self.data)
def __repr__(self):
"""Return a representation of the object."""
return 'Fragment molecular basis("%s")' % (self.data)
def calculate(self, fragments, cupdate=0.05):
nFragBasis = 0
nFragAlpha = 0
nFragBeta = 0
self.fonames = []
unrestricted = ( len(self.data.mocoeffs) == 2 )
self.logger.info("Creating attribute fonames[]")
# Collect basis info on the fragments.
for j in range(len(fragments)):
nFragBasis += fragments[j].nbasis
nFragAlpha += fragments[j].homos[0] + 1
if unrestricted and len(fragments[j].homos) == 1:
nFragBeta += fragments[j].homos[0] + 1 #assume restricted fragment
elif unrestricted and len(fragments[j].homos) == 2:
nFragBeta += fragments[j].homos[1] + 1 #assume unrestricted fragment
#assign fonames based on fragment name and MO number
for i in range(fragments[j].nbasis):
if hasattr(fragments[j],"name"):
self.fonames.append("%s_%i"%(fragments[j].name,i+1))
else:
self.fonames.append("noname%i_%i"%(j,i+1))
nBasis = self.data.nbasis
nAlpha = self.data.homos[0] + 1
if unrestricted:
nBeta = self.data.homos[1] + 1
# Check to make sure calcs have the right properties.
if nBasis != nFragBasis:
self.logger.error("Basis functions don't match")
return False
if nAlpha != nFragAlpha:
self.logger.error("Alpha electrons don't match")
return False
if unrestricted and nBeta != nFragBeta:
self.logger.error("Beta electrons don't match")
return False
if len(self.data.atomcoords) != 1:
self.logger.warning("Molecule calc appears to be an optimization")
for frag in fragments:
if len(frag.atomcoords) != 1:
msg = "One or more fragment appears to be an optimization"
self.logger.warning(msg)
break
last = 0
for frag in fragments:
size = frag.natom
if self.data.atomcoords[0][last:last+size].tolist() != \
frag.atomcoords[0].tolist():
self.logger.error("Atom coordinates aren't aligned")
return False
if self.data.atomnos[last:last+size].tolist() != \
frag.atomnos.tolist():
self.logger.error("Elements don't match")
return False
last += size
# And let's begin!
self.mocoeffs = []
self.logger.info("Creating mocoeffs in new fragment MO basis: mocoeffs[]")
for spin in range(len(self.data.mocoeffs)):
blockMatrix = numpy.zeros((nBasis,nBasis), "d")
pos = 0
# Build up block-diagonal matrix from fragment mocoeffs.
# Need to switch ordering from [mo,ao] to [ao,mo].
for i in range(len(fragments)):
size = fragments[i].nbasis
if len(fragments[i].mocoeffs) == 1:
temp = numpy.transpose(fragments[i].mocoeffs[0])
blockMatrix[pos:pos+size, pos:pos+size] = temp
else:
temp = numpy.transpose(fragments[i].mocoeffs[spin])
blockMatrix[pos:pos+size, pos:pos+size] = temp
pos += size
            # Invert and multiply to express the molecular orbitals in the fragment MO basis.
            iBlockMatrix = numpy.linalg.inv(blockMatrix)
temp = numpy.transpose(self.data.mocoeffs[spin])
results = numpy.transpose(numpy.dot(iBlockMatrix, temp))
self.mocoeffs.append(results)
if hasattr(self.data, "aooverlaps"):
tempMatrix = numpy.dot(self.data.aooverlaps, blockMatrix)
tBlockMatrix = numpy.transpose(blockMatrix)
if spin == 0:
self.fooverlaps = numpy.dot(tBlockMatrix, tempMatrix)
self.logger.info("Creating fooverlaps: array[x,y]")
elif spin == 1:
self.fooverlaps2 = numpy.dot(tBlockMatrix, tempMatrix)
self.logger.info("Creating fooverlaps (beta): array[x,y]")
else:
self.logger.warning("Overlap matrix missing")
self.parsed = True
self.nbasis = nBasis
self.homos = self.data.homos
return True
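# A minimal usage sketch, assuming parsed ADF output for the whole molecule and
# for each fragment, run on the same geometry (all file names are placeholders).
if __name__ == "__main__":
    from cclib.io.ccio import ccopen
    molecule = ccopen("molecule.out").parse()
    frag1 = ccopen("fragment1.out").parse()
    frag2 = ccopen("fragment2.out").parse()
    analysis = FragmentAnalysis(molecule)
    if analysis.calculate([frag1, frag2]):
        # mocoeffs are now expressed in the fragment MO basis, labelled by fonames.
        print("%i fragment orbitals" % len(analysis.fonames))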
cclib-1.6.2/cclib/method/lpa.py 0000664 0000000 0000000 00000010450 13535330462 0016262 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2018, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Löwdin population analysis."""
import random
import numpy
from cclib.method.population import Population
class LPA(Population):
"""The Löwdin population analysis"""
def __init__(self, *args):
# Call the __init__ method of the superclass.
super(LPA, self).__init__(logname="LPA", *args)
def __str__(self):
"""Return a string representation of the object."""
return "LPA of %s" % (self.data)
def __repr__(self):
"""Return a representation of the object."""
return 'LPA("%s")' % (self.data)
def calculate(self, indices=None, x=0.5, fupdate=0.05):
"""Perform a calculation of Löwdin population analysis.
Inputs:
indices - list of lists containing atomic orbital indices of fragments
          x - overlap matrix exponent in the wavefunction projection (x=0.5 for Löwdin)
"""
unrestricted = (len(self.data.mocoeffs) == 2)
nbasis = self.data.nbasis
# Determine number of steps, and whether process involves beta orbitals.
self.logger.info("Creating attribute aoresults: [array[2]]")
alpha = len(self.data.mocoeffs[0])
self.aoresults = [ numpy.zeros([alpha, nbasis], "d") ]
nstep = alpha
if unrestricted:
beta = len(self.data.mocoeffs[1])
self.aoresults.append(numpy.zeros([beta, nbasis], "d"))
nstep += beta
        # Initialize progress if available
if self.progress:
self.progress.initialize(nstep)
if hasattr(self.data, "aooverlaps"):
S = self.data.aooverlaps
elif hasattr(self.data, "fooverlaps"):
S = self.data.fooverlaps
# Get eigenvalues and matrix of eigenvectors for transformation decomposition (U).
        # Find roots of diagonal elements, and transform backwards using eigenvectors.
# We need two matrices here, one for S^x, another for S^(1-x).
# We don't need to invert U, since S is symmetrical.
eigenvalues, U = numpy.linalg.eig(S)
UI = U.transpose()
Sdiagroot1 = numpy.identity(len(S))*numpy.power(eigenvalues, x)
Sdiagroot2 = numpy.identity(len(S))*numpy.power(eigenvalues, 1-x)
Sroot1 = numpy.dot(U, numpy.dot(Sdiagroot1, UI))
Sroot2 = numpy.dot(U, numpy.dot(Sdiagroot2, UI))
step = 0
for spin in range(len(self.data.mocoeffs)):
for i in range(len(self.data.mocoeffs[spin])):
if self.progress and random.random() < fupdate:
self.progress.update(step, "Lowdin Population Analysis")
ci = self.data.mocoeffs[spin][i]
temp1 = numpy.dot(ci, Sroot1)
temp2 = numpy.dot(ci, Sroot2)
self.aoresults[spin][i] = numpy.multiply(temp1, temp2).astype("d")
step += 1
if self.progress:
self.progress.update(nstep, "Done")
retval = super(LPA, self).partition(indices)
if not retval:
self.logger.error("Error in partitioning results")
return False
# Create array for charges.
self.logger.info("Creating fragcharges: array[1]")
size = len(self.fragresults[0][0])
self.fragcharges = numpy.zeros([size], "d")
alpha = numpy.zeros([size], "d")
if unrestricted:
beta = numpy.zeros([size], "d")
for spin in range(len(self.fragresults)):
for i in range(self.data.homos[spin] + 1):
temp = numpy.reshape(self.fragresults[spin][i], (size,))
self.fragcharges = numpy.add(self.fragcharges, temp)
if spin == 0:
alpha = numpy.add(alpha, temp)
elif spin == 1:
beta = numpy.add(beta, temp)
if not unrestricted:
self.fragcharges = numpy.multiply(self.fragcharges, 2)
else:
self.logger.info("Creating fragspins: array[1]")
self.fragspins = numpy.subtract(alpha, beta)
return True
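# A minimal usage sketch, assuming an output file (placeholder name "calc.out")
# that provides MO coefficients and an overlap matrix.
if __name__ == "__main__":
    from cclib.io.ccio import ccopen
    data = ccopen("calc.out").parse()
    lpa = LPA(data)
    if lpa.calculate():
        # Gross Löwdin populations summed over occupied orbitals, per atom.
        print(lpa.fragcharges)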
cclib-1.6.2/cclib/method/mbo.py 0000664 0000000 0000000 00000010110 13535330462 0016254 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Calculation of Mayer's bond orders based on data parsed by cclib."""
import random
import numpy
from cclib.method.density import Density
class MBO(Density):
"""Mayer's bond orders"""
def __init__(self, *args):
# Call the __init__ method of the superclass.
super(MBO, self).__init__(logname="MBO", *args)
def __str__(self):
"""Return a string representation of the object."""
return "Mayer's bond order of %s" % (self.data)
def __repr__(self):
"""Return a representation of the object."""
return 'Mayer\'s bond order("%s")' % (self.data)
def calculate(self, indices=None, fupdate=0.05):
"""Calculate Mayer's bond orders."""
retval = super(MBO, self).calculate(fupdate)
if not retval: #making density didn't work
return False
# Do we have the needed info in the ccData object?
if not (hasattr(self.data, "aooverlaps")
or hasattr(self.data, "fooverlaps")):
self.logger.error("Missing overlap matrix")
return False #let the caller of function know we didn't finish
if not indices:
# Build list of groups of orbitals in each atom for atomresults.
if hasattr(self.data, "aonames"):
names = self.data.aonames
overlaps = self.data.aooverlaps
elif hasattr(self.data, "fonames"):
names = self.data.fonames
overlaps = self.data.fooverlaps
else:
self.logger.error("Missing aonames or fonames")
return False
atoms = []
indices = []
name = names[0].split('_')[0]
atoms.append(name)
indices.append([0])
for i in range(1, len(names)):
name = names[i].split('_')[0]
try:
index = atoms.index(name)
except ValueError: #not found in atom list
atoms.append(name)
indices.append([i])
else:
indices[index].append(i)
self.logger.info("Creating attribute fragresults: array[3]")
size = len(indices)
# Determine number of steps, and whether process involves beta orbitals.
PS = []
PS.append(numpy.dot(self.density[0], overlaps))
nstep = size**2 #approximately quadratic in size
unrestricted = (len(self.data.mocoeffs) == 2)
if unrestricted:
self.fragresults = numpy.zeros([2, size, size], "d")
PS.append(numpy.dot(self.density[1], overlaps))
else:
self.fragresults = numpy.zeros([1, size, size], "d")
        # Initialize progress if available.
if self.progress:
self.progress.initialize(nstep)
step = 0
for i in range(len(indices)):
if self.progress and random.random() < fupdate:
self.progress.update(step, "Mayer's Bond Order")
for j in range(i+1, len(indices)):
tempsumA = 0
tempsumB = 0
for a in indices[i]:
for b in indices[j]:
if unrestricted:
tempsumA += 2 * PS[0][a][b] * PS[0][b][a]
tempsumB += 2 * PS[1][a][b] * PS[1][b][a]
else:
tempsumA += PS[0][a][b] * PS[0][b][a]
self.fragresults[0][i, j] = tempsumA
self.fragresults[0][j, i] = tempsumA
if unrestricted:
self.fragresults[1][i, j] = tempsumB
self.fragresults[1][j, i] = tempsumB
if self.progress:
self.progress.update(nstep, "Done")
return True
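# A minimal usage sketch, assuming an output file (placeholder name "calc.out")
# that provides MO coefficients and an overlap matrix.
if __name__ == "__main__":
    from cclib.io.ccio import ccopen
    data = ccopen("calc.out").parse()
    mbo = MBO(data)
    if mbo.calculate():
        # fragresults holds bond orders between pairs of atoms, one block per
        # spin; sum the blocks for unrestricted calculations.
        print(mbo.fragresults[0])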
cclib-1.6.2/cclib/method/moments.py 0000664 0000000 0000000 00000013022 13535330462 0017166 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2018, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Calculation of electric multipole moments based on data parsed by cclib."""
import sys
if sys.version_info <= (3, 3):
from collections import Iterable
else:
from collections.abc import Iterable
import numpy
from cclib.parser.utils import convertor
from cclib.method.calculationmethod import Method
class Moments(Method):
"""A class used to calculate electric multipole moments.
    The results are stored in the `results` attribute, a dictionary whose
    keys denote the charge population scheme that was used.
"""
def __init__(self, data):
self.required_attrs = ('atomcoords', 'atomcharges')
self.results = {}
super(Moments, self).__init__(data)
def __str__(self):
"""Returns a string representation of the object."""
return "Multipole moments of %s" % (self.data)
def __repr__(self):
"""Returns a representation of the object."""
return 'Moments("%s")' % (self.data)
def _calculate_dipole(self, charges, coords, origin):
"""Calculate the dipole moment from the given atomic charges
and their coordinates with respect to the origin.
"""
transl_coords_au = convertor(coords - origin, 'Angstrom', 'bohr')
dipole = numpy.dot(charges, transl_coords_au)
return convertor(dipole, 'ebohr', 'Debye')
def _calculate_quadrupole(self, charges, coords, origin):
"""Calculate the traceless quadrupole moment from the given
atomic charges and their coordinates with respect to the origin.
"""
transl_coords_au = convertor(coords - origin, 'Angstrom', 'bohr')
delta = numpy.eye(3)
Q = numpy.zeros([3, 3])
for i in range(3):
for j in range(3):
for q, r in zip(charges, transl_coords_au):
                    Q[i,j] += 0.5 * q * (3 * r[i] * r[j] - \
                        numpy.linalg.norm(r)**2 * delta[i,j])
triu_idxs = numpy.triu_indices_from(Q)
raveled_idxs = numpy.ravel_multi_index(triu_idxs, Q.shape)
quadrupole = numpy.take(Q.flatten(), raveled_idxs)
return convertor(quadrupole, 'ebohr2', 'Buckingham')
def calculate(self, origin='nuccharge', population='mulliken',
masses=None):
"""Calculate electric dipole and quadrupole moments using parsed
partial atomic charges.
Inputs:
origin - a choice of the origin of coordinate system. Can be
either a three-element iterable or a string. If
iterable, then it explicitly defines the origin (in
Angstrom). If string, then the value can be any one of
the following and it describes what is used as the
origin:
* 'nuccharge' -- center of positive nuclear charge
* 'mass' -- center of mass
population - a type of population analysis used to extract
corresponding atomic charges from the output file.
masses - if None, then use default atomic masses. Otherwise,
                  the user-provided masses will be used.
Returns:
A list where the first element is the origin of coordinates,
while other elements are dipole and quadrupole moments
expressed in terms of Debye and Buckingham units
respectively.
Raises:
ValueError when an argument with incorrect value or of
inappropriate type is passed to a method.
Notes:
To calculate the quadrupole moment the Buckingham definition
          [1]_ is chosen. Hirschfelder et al. [2]_ define it as twice
          this value.
References:
.. [1] Buckingham, A. D. (1959). Molecular quadrupole moments.
Quarterly Reviews, Chemical Society, 13(3), 183.
             https://doi.org/10.1039/qr9591300183.
.. [2] Hirschfelder J. O., Curtiss C. F. and Bird R. B. (1954).
The Molecular Theory of Gases and Liquids. New York: Wiley.
"""
coords = self.data.atomcoords[-1]
try:
charges = self.data.atomcharges[population]
except KeyError as e:
msg = ("charges coming from requested population analysis"
"scheme are not parsed")
raise ValueError(msg, e)
if isinstance(origin, Iterable) and not isinstance(origin, str):
origin_pos = numpy.asarray(origin)
elif origin == 'nuccharge':
origin_pos = numpy.average(coords, weights=self.data.atomnos, axis=0)
elif origin == 'mass':
if masses:
atommasses = numpy.asarray(masses)
else:
try:
atommasses = self.data.atommasses
except AttributeError as e:
msg = ("atomic masses were not parsed, consider provide "
"'masses' argument instead")
raise ValueError(msg, e)
origin_pos = numpy.average(coords, weights=atommasses, axis=0)
else:
raise ValueError("{} is invalid value for 'origin'".format(origin))
dipole = self._calculate_dipole(charges, coords, origin_pos)
quadrupole = self._calculate_quadrupole(charges, coords, origin_pos)
rv = [origin_pos, dipole, quadrupole]
self.results.update({population: rv})
return rv
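# A minimal usage sketch, assuming an output file (placeholder name "calc.out")
# from which Mulliken charges were parsed into atomcharges.
if __name__ == "__main__":
    from cclib.io.ccio import ccopen
    data = ccopen("calc.out").parse()
    origin_pos, dipole, quadrupole = Moments(data).calculate(population='mulliken')
    print(dipole)      # dipole moment in Debye
    print(quadrupole)  # upper triangle of the traceless quadrupole, in Buckingham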
cclib-1.6.2/cclib/method/mpa.py 0000664 0000000 0000000 00000007674 13535330462 0016301 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2018, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Calculation of Mulliken population analysis (MPA) based on data parsed by cclib."""
import random
import numpy
from cclib.method.population import Population
class MPA(Population):
"""Mulliken population analysis."""
def __init__(self, *args):
# Call the __init__ method of the superclass.
super(MPA, self).__init__(logname="MPA", *args)
def __str__(self):
"""Return a string representation of the object."""
return "MPA of %s" % (self.data)
def __repr__(self):
"""Return a representation of the object."""
return 'MPA("%s")' % (self.data)
def calculate(self, indices=None, fupdate=0.05):
"""Perform a Mulliken population analysis."""
# Determine number of steps, and whether process involves beta orbitals.
self.logger.info("Creating attribute aoresults: [array[2]]")
nbasis = self.data.nbasis
alpha = len(self.data.mocoeffs[0])
self.aoresults = [ numpy.zeros([alpha, nbasis], "d") ]
nstep = alpha
unrestricted = (len(self.data.mocoeffs) == 2)
if unrestricted:
beta = len(self.data.mocoeffs[1])
self.aoresults.append(numpy.zeros([beta, nbasis], "d"))
nstep += beta
        # Initialize progress if available.
if self.progress:
self.progress.initialize(nstep)
step = 0
for spin in range(len(self.data.mocoeffs)):
for i in range(len(self.data.mocoeffs[spin])):
if self.progress and random.random() < fupdate:
self.progress.update(step, "Mulliken Population Analysis")
# X_{ai} = \sum_b c_{ai} c_{bi} S_{ab}
# = c_{ai} \sum_b c_{bi} S_{ab}
# = c_{ai} C(i) \cdot S(a)
# X = C(i) * [C(i) \cdot S]
# C(i) is 1xn and S is nxn, result of matrix mult is 1xn
ci = self.data.mocoeffs[spin][i]
if hasattr(self.data, "aooverlaps"):
temp = numpy.dot(ci, self.data.aooverlaps)
# handle spin-unrestricted beta case
elif hasattr(self.data, "fooverlaps2") and spin == 1:
temp = numpy.dot(ci, self.data.fooverlaps2)
elif hasattr(self.data, "fooverlaps"):
temp = numpy.dot(ci, self.data.fooverlaps)
self.aoresults[spin][i] = numpy.multiply(ci, temp).astype("d")
step += 1
if self.progress:
self.progress.update(nstep, "Done")
retval = super(MPA, self).partition(indices)
if not retval:
self.logger.error("Error in partitioning results")
return False
# Create array for Mulliken charges.
self.logger.info("Creating fragcharges: array[1]")
size = len(self.fragresults[0][0])
self.fragcharges = numpy.zeros([size], "d")
alpha = numpy.zeros([size], "d")
if unrestricted:
beta = numpy.zeros([size], "d")
for spin in range(len(self.fragresults)):
for i in range(self.data.homos[spin] + 1):
temp = numpy.reshape(self.fragresults[spin][i], (size,))
self.fragcharges = numpy.add(self.fragcharges, temp)
if spin == 0:
alpha = numpy.add(alpha, temp)
elif spin == 1:
beta = numpy.add(beta, temp)
if not unrestricted:
self.fragcharges = numpy.multiply(self.fragcharges, 2)
else:
self.logger.info("Creating fragspins: array[1]")
self.fragspins = numpy.subtract(alpha, beta)
return True
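# A minimal usage sketch, assuming an output file (placeholder name "calc.out")
# that provides MO coefficients and an overlap matrix.
if __name__ == "__main__":
    from cclib.io.ccio import ccopen
    data = ccopen("calc.out").parse()
    mpa = MPA(data)
    if mpa.calculate():
        # Gross Mulliken populations summed over occupied orbitals, per atom.
        print(mpa.fragcharges)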
cclib-1.6.2/cclib/method/nuclear.py 0000664 0000000 0000000 00000015334 13535330462 0017145 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Calculate properties of nuclei based on data parsed by cclib."""
import logging
import numpy as np
from cclib.method.calculationmethod import Method
from cclib.parser.utils import PeriodicTable
from cclib.parser.utils import find_package
_found_periodictable = find_package("periodictable")
if _found_periodictable:
import periodictable as pt
_found_scipy = find_package("scipy")
if _found_scipy:
import scipy.constants as spc
def _check_periodictable(found_periodictable):
    # Check the flag that was passed in, not the module-level global.
    if not found_periodictable:
        raise ImportError("You must install `periodictable` to use this function")
def _check_scipy(found_scipy):
    if not found_scipy:
        raise ImportError("You must install `scipy` to use this function")
def get_most_abundant_isotope(element):
"""Given a `periodictable` element, return the most abundant
isotope.
"""
most_abundant_isotope = element.isotopes[0]
abundance = 0
for iso in element:
if iso.abundance > abundance:
most_abundant_isotope = iso
abundance = iso.abundance
return most_abundant_isotope
def get_isotopic_masses(charges):
"""Return the masses for the given nuclei, respresented by their
nuclear charges.
"""
_check_periodictable(_found_periodictable)
masses = []
for charge in charges:
el = pt.elements[charge]
isotope = get_most_abundant_isotope(el)
mass = isotope.mass
masses.append(mass)
return np.array(masses)
class Nuclear(Method):
"""A container for methods pertaining to atomic nuclei."""
def __init__(self, data, progress=None, loglevel=logging.INFO, logname="Log"):
self.required_attrs = ('natom','atomcoords','atomnos','charge')
super(Nuclear, self).__init__(data, progress, loglevel, logname)
def __str__(self):
"""Return a string representation of the object."""
return "Nuclear"
def __repr__(self):
"""Return a representation of the object."""
return "Nuclear"
def stoichiometry(self):
"""Return the stoichemistry of the object according to the Hill system"""
cclib_pt = PeriodicTable()
elements = [cclib_pt.element[ano] for ano in self.data.atomnos]
counts = {el: elements.count(el) for el in set(elements)}
formula = ""
elcount = lambda el, c: "%s%i" % (el, c) if c > 1 else el
if 'C' in elements:
formula += elcount('C', counts['C'])
counts.pop('C')
if 'H' in elements:
formula += elcount('H', counts['H'])
counts.pop('H')
for el, c in sorted(counts.items()):
formula += elcount(el, c)
if getattr(self.data, 'charge', 0):
magnitude = abs(self.data.charge)
sign = "+" if self.data.charge > 0 else "-"
formula += "(%s%i)" % (sign, magnitude)
return formula
def repulsion_energy(self, atomcoords_index=-1):
"""Return the nuclear repulsion energy."""
nre = 0.0
for i in range(self.data.natom):
ri = self.data.atomcoords[atomcoords_index][i]
zi = self.data.atomnos[i]
for j in range(i+1, self.data.natom):
                rj = self.data.atomcoords[atomcoords_index][j]
zj = self.data.atomnos[j]
d = np.linalg.norm(ri-rj)
nre += zi*zj/d
return nre
def center_of_mass(self, atomcoords_index=-1):
"""Return the center of mass."""
charges = self.data.atomnos
coords = self.data.atomcoords[atomcoords_index]
masses = get_isotopic_masses(charges)
mwc = coords * masses[:, np.newaxis]
numerator = np.sum(mwc, axis=0)
denominator = np.sum(masses)
return numerator / denominator
def moment_of_inertia_tensor(self, atomcoords_index=-1):
"""Return the moment of inertia tensor."""
charges = self.data.atomnos
coords = self.data.atomcoords[atomcoords_index]
masses = get_isotopic_masses(charges)
moi_tensor = np.empty((3, 3))
moi_tensor[0][0] = np.sum(masses * (coords[:, 1]**2 + coords[:, 2]**2))
moi_tensor[1][1] = np.sum(masses * (coords[:, 0]**2 + coords[:, 2]**2))
moi_tensor[2][2] = np.sum(masses * (coords[:, 0]**2 + coords[:, 1]**2))
moi_tensor[0][1] = np.sum(masses * coords[:, 0] * coords[:, 1])
moi_tensor[0][2] = np.sum(masses * coords[:, 0] * coords[:, 2])
moi_tensor[1][2] = np.sum(masses * coords[:, 1] * coords[:, 2])
moi_tensor[1][0] = moi_tensor[0][1]
moi_tensor[2][0] = moi_tensor[0][2]
moi_tensor[2][1] = moi_tensor[1][2]
return moi_tensor
def principal_moments_of_inertia(self, units='amu_bohr_2'):
"""Return the principal moments of inertia in 3 kinds of units:
1. [amu][bohr]^2
2. [amu][angstrom]^2
3. [g][cm]^2
and the principal axes.
"""
choices = ('amu_bohr_2', 'amu_angstrom_2', 'g_cm_2')
units = units.lower()
if units not in choices:
raise ValueError("Invalid units, pick one of {}".format(choices))
moi_tensor = self.moment_of_inertia_tensor()
principal_moments, principal_axes = np.linalg.eigh(moi_tensor)
if units == 'amu_bohr_2':
conv = 1
if units == 'amu_angstrom_2':
_check_scipy(_found_scipy)
bohr2ang = spc.value('atomic unit of length') / spc.angstrom
conv = bohr2ang ** 2
if units == 'g_cm_2':
_check_scipy(_found_scipy)
amu2g = spc.value('unified atomic mass unit') * spc.kilo
conv = amu2g * (spc.value('atomic unit of length') * spc.centi) ** 2
return conv * principal_moments, principal_axes
def rotational_constants(self, units='ghz'):
"""Compute the rotational constants in 1/cm or GHz."""
choices = ('invcm', 'ghz')
units = units.lower()
if units not in choices:
raise ValueError("Invalid units, pick one of {}".format(choices))
principal_moments = self.principal_moments_of_inertia()[0]
_check_scipy(_found_scipy)
bohr2ang = spc.value('atomic unit of length') / spc.angstrom
xfamu = 1 / spc.value('electron mass in u')
xthz = spc.value('hartree-hertz relationship')
rotghz = xthz * (bohr2ang ** 2) / (2 * xfamu * spc.giga)
if units == 'ghz':
conv = rotghz
if units == 'invcm':
ghz2invcm = spc.giga * spc.centi / spc.c
conv = rotghz * ghz2invcm
return conv / principal_moments
del find_package
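# A minimal usage sketch, assuming an output file (placeholder name "calc.out")
# that cclib can parse; the isotopic masses require `periodictable` and the
# rotational constants additionally require `scipy`.
if __name__ == "__main__":
    from cclib.io.ccio import ccopen
    data = ccopen("calc.out").parse()
    nuclear = Nuclear(data)
    print(nuclear.stoichiometry())
    print(nuclear.repulsion_energy())
    print(nuclear.rotational_constants(units='ghz'))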
cclib-1.6.2/cclib/method/opa.py 0000664 0000000 0000000 00000010221 13535330462 0016261 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2018, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Calculation of overlap population analysis based on cclib data."""
import random
import numpy
from cclib.method.calculationmethod import Method
from cclib.method.population import Population
def func(x):
    """Return the sum x + (x - 1) + ... + 1 (the xth triangular number)."""
    if x == 1:
        return 1
    else:
        return x + func(x - 1)
class OPA(Population):
"""Overlap population analysis."""
def __init__(self, *args):
# Call the __init__ method of the superclass.
super(OPA, self).__init__(logname="OPA", *args)
def __str__(self):
"""Return a string representation of the object."""
return "OPA of %s" % (self.data)
def __repr__(self):
"""Return a representation of the object."""
return 'OPA("%s")' % (self.data)
def calculate(self, indices=None, fupdate=0.05):
"""Perform an overlap population analysis given the results of a parser"""
if not indices:
# Build list of groups of orbitals in each atom for atomresults.
if hasattr(self.data, "aonames"):
names = self.data.aonames
            elif hasattr(self.data, "fonames"):
names = self.data.fonames
atoms = []
indices = []
name = names[0].split('_')[0]
atoms.append(name)
indices.append([0])
for i in range(1, len(names)):
name = names[i].split('_')[0]
try:
index = atoms.index(name)
except ValueError: #not found in atom list
atoms.append(name)
indices.append([i])
else:
indices[index].append(i)
# Determine number of steps, and whether process involves beta orbitals.
nfrag = len(indices) #nfrag
nstep = func(nfrag - 1)
unrestricted = (len(self.data.mocoeffs) == 2)
alpha = len(self.data.mocoeffs[0])
nbasis = self.data.nbasis
self.logger.info("Creating attribute results: array[4]")
results= [ numpy.zeros([nfrag, nfrag, alpha], "d") ]
if unrestricted:
beta = len(self.data.mocoeffs[1])
results.append(numpy.zeros([nfrag, nfrag, beta], "d"))
nstep *= 2
if hasattr(self.data, "aooverlaps"):
overlap = self.data.aooverlaps
elif hasattr(self.data,"fooverlaps"):
overlap = self.data.fooverlaps
        # Initialize progress if available
if self.progress:
self.progress.initialize(nstep)
size = len(self.data.mocoeffs[0])
step = 0
preresults = []
for spin in range(len(self.data.mocoeffs)):
two = numpy.array([2.0]*len(self.data.mocoeffs[spin]),"d")
# OP_{AB,i} = \sum_{a in A} \sum_{b in B} 2 c_{ai} c_{bi} S_{ab}
for A in range(len(indices)-1):
for B in range(A+1, len(indices)):
if self.progress: #usually only a handful of updates, so remove random part
self.progress.update(step, "Overlap Population Analysis")
for a in indices[A]:
ca = self.data.mocoeffs[spin][:,a]
for b in indices[B]:
cb = self.data.mocoeffs[spin][:,b]
temp = ca * cb * two *overlap[a,b]
results[spin][A,B] = numpy.add(results[spin][A,B],temp)
results[spin][B,A] = numpy.add(results[spin][B,A],temp)
step += 1
temparray2 = numpy.swapaxes(results[0],1,2)
self.results = [ numpy.swapaxes(temparray2,0,1) ]
if unrestricted:
temparray2 = numpy.swapaxes(results[1],1,2)
self.results.append(numpy.swapaxes(temparray2, 0, 1))
if self.progress:
self.progress.update(nstep, "Done")
return True
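# A minimal usage sketch, assuming an output file (placeholder name "calc.out")
# that provides MO coefficients and an overlap matrix.
if __name__ == "__main__":
    from cclib.io.ccio import ccopen
    data = ccopen("calc.out").parse()
    opa = OPA(data)
    if opa.calculate():
        # results[spin] has shape (nmo, natom, natom): the overlap population
        # between two atoms for each molecular orbital.
        print(opa.results[0].shape)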
cclib-1.6.2/cclib/method/orbitals.py 0000664 0000000 0000000 00000003574 13535330462 0017336 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# This file is part of cclib (http://cclib.github.io), a library for parsing
# and interpreting the results of computational chemistry packages.
#
# Copyright (C) 2017, the cclib development team
#
# The library is free software, distributed under the terms of
# the GNU Lesser General Public version 2.1 or later. You should have
# received a copy of the license along with cclib. You can also access
# the full license online at http://www.gnu.org/copyleft/lgpl.html.
"""Analyses related to orbitals."""
import logging
import numpy
from cclib.method.calculationmethod import Method
class Orbitals(Method):
"""A class for orbital related methods."""
def __init__(self, data, progress=None, \
loglevel=logging.INFO, logname="Log"):
self.required_attrs = ('mocoeffs','moenergies','homos')
# Call the __init__ method of the superclass.
super(Orbitals, self).__init__(data, progress, loglevel, logname)
self.fragresults = None
def __str__(self):
"""Return a string representation of the object."""
return "Orbitals"
def __repr__(self):
"""Return a representation of the object."""
return "Orbitals"
def closed_shell(self):
"""Return Boolean indicating if system is closed shell."""
# If there are beta orbitals, we can assume the system is closed
# shell if the orbital energies are identical within numerical accuracy.
if len(self.data.mocoeffs) == 2:
precision = 10e-6
return numpy.allclose(*self.data.moenergies, atol=precision)
# Restricted open shell will have one set of MOs but two HOMO indices,
# and the indices should be different (otherwise it's still closed shell).
if len(self.data.homos) == 2 and self.data.homos[0] != self.data.homos[1]:
return False
return True
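# A minimal usage sketch, assuming an output file (placeholder name "calc.out")
# that cclib can parse.
if __name__ == "__main__":
    from cclib.io.ccio import ccopen
    data = ccopen("calc.out").parse()
    print(Orbitals(data).closed_shell())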
cclib-1.6.2/cclib/method/population.py 0000664 0000000 0000000 00000006506 13535330462 0017707 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2018, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Population analyses based on cclib data."""
import logging
import numpy
from cclib.method.calculationmethod import Method, MissingAttributeError
class Population(Method):
"""An abstract base class for population-type methods."""
# All of these are typically required for population analyses.
required_attrs = ('homos', 'mocoeffs', 'nbasis')
    # At least one of these is typically required.
overlap_attributes = ('aooverlaps', 'fooverlaps')
def __init__(self, data, progress=None, \
loglevel=logging.INFO, logname="Log"):
super(Population, self).__init__(data, progress, loglevel, logname)
self.fragresults = None
def __str__(self):
"""Return a string representation of the object."""
return "Population"
def __repr__(self):
"""Return a representation of the object."""
return "Population"
def _check_required_attributes(self):
super(Population, self)._check_required_attributes()
if self.overlap_attributes and not any(hasattr(self.data, a) for a in self.overlap_attributes):
raise MissingAttributeError(
'Need overlap matrix (aooverlaps or fooverlaps attribute) for Population methods')
def partition(self, indices=None):
if not hasattr(self, "aoresults"):
self.calculate()
if not indices:
# Build list of groups of orbitals in each atom for atomresults.
if hasattr(self.data, "aonames"):
names = self.data.aonames
elif hasattr(self.data, "fonames"):
names = self.data.fonames
atoms = []
indices = []
name = names[0].split('_')[0]
atoms.append(name)
indices.append([0])
for i in range(1, len(names)):
name = names[i].split('_')[0]
try:
index = atoms.index(name)
except ValueError: #not found in atom list
atoms.append(name)
indices.append([i])
else:
indices[index].append(i)
natoms = len(indices)
nmocoeffs = len(self.aoresults[0])
# Build results numpy array[3].
alpha = len(self.aoresults[0])
results = []
results.append(numpy.zeros([alpha, natoms], "d"))
if len(self.aoresults) == 2:
beta = len(self.aoresults[1])
results.append(numpy.zeros([beta, natoms], "d"))
# For each spin, splice numpy array at ao index,
# and add to correct result row.
for spin in range(len(results)):
for i in range(natoms): # Number of groups.
for j in range(len(indices[i])): # For each group.
temp = self.aoresults[spin][:, indices[i][j]]
results[spin][:, i] = numpy.add(results[spin][:, i], temp)
self.logger.info("Saving partitioned results in fragresults: [array[2]]")
self.fragresults = results
return True
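# A minimal usage sketch, assuming an output file (placeholder name "calc.out")
# and using the MPA subclass: the optional `indices` argument of partition(),
# which is also accepted by the subclasses' calculate() methods, groups basis
# functions into user-defined fragments instead of atoms.
if __name__ == "__main__":
    from cclib.io.ccio import ccopen
    from cclib.method.mpa import MPA
    data = ccopen("calc.out").parse()
    mpa = MPA(data)
    # Group the first three basis functions and the next two into two fragments.
    if mpa.calculate(indices=[[0, 1, 2], [3, 4]]):
        print(mpa.fragresults[0].shape)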
cclib-1.6.2/cclib/method/volume.py 0000664 0000000 0000000 00000021125 13535330462 0017016 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Calculation methods related to volume based on cclib data."""
from __future__ import print_function
import copy
import numpy
from cclib.parser.utils import convertor
from cclib.parser.utils import find_package
_found_pyquante = find_package("PyQuante")
if _found_pyquante:
from PyQuante.CGBF import CGBF
from cclib.bridge import cclib2pyquante
_found_pyvtk = find_package("pyvtk")
if _found_pyvtk:
from pyvtk import *
from pyvtk.DataSetAttr import *
def _check_pyvtk(found_pyvtk):
if not found_pyvtk:
raise ImportError("You must install `pyvtk` to use this function")
class Volume(object):
"""Represent a volume in space.
Required parameters:
origin -- the bottom left hand corner of the volume
topcorner -- the top right hand corner
spacing -- the distance between the points in the cube
Attributes:
data -- a NumPy array of values for each point in the volume
(set to zero at initialisation)
numpts -- the numbers of points in the (x,y,z) directions
"""
def __init__(self, origin, topcorner, spacing):
self.origin = numpy.asarray(origin, dtype=float)
self.topcorner = numpy.asarray(topcorner, dtype=float)
self.spacing = numpy.asarray(spacing, dtype=float)
self.numpts = []
for i in range(3):
self.numpts.append(int((self.topcorner[i] - self.origin[i]) / self.spacing[i] + 1))
self.data = numpy.zeros(tuple(self.numpts), "d")
def __str__(self):
"""Return a string representation."""
return "Volume %s to %s (density: %s)" % (self.origin, self.topcorner,
self.spacing)
def write(self, filename, fformat="Cube"):
"""Write the volume to a file."""
fformat = fformat.lower()
writers = {
"vtk": self.writeasvtk,
"cube": self.writeascube,
}
if fformat not in writers:
raise RuntimeError("File format must be either VTK or Cube")
writers[fformat](filename)
def writeasvtk(self, filename):
_check_pyvtk(_found_pyvtk)
ranges = (numpy.arange(self.data.shape[2]),
numpy.arange(self.data.shape[1]),
numpy.arange(self.data.shape[0]))
v = VtkData(RectilinearGrid(*ranges), "Test",
PointData(Scalars(self.data.ravel(), "from cclib", "default")))
v.tofile(filename)
def integrate(self):
boxvol = (self.spacing[0] * self.spacing[1] * self.spacing[2] *
convertor(1, "Angstrom", "bohr") ** 3)
return sum(self.data.ravel()) * boxvol
def integrate_square(self):
boxvol = (self.spacing[0] * self.spacing[1] * self.spacing[2] *
convertor(1, "Angstrom", "bohr") ** 3)
return sum(self.data.ravel() ** 2) * boxvol
def writeascube(self, filename):
# Remember that the units are bohr, not Angstroms
def convert(x):
return convertor(x, "Angstrom", "bohr")
ans = []
ans.append("Cube file generated by cclib")
ans.append("")
format = "%4d%12.6f%12.6f%12.6f"
origin = [convert(x) for x in self.origin]
ans.append(format % (0, origin[0], origin[1], origin[2]))
ans.append(format % (self.data.shape[0], convert(self.spacing[0]), 0.0, 0.0))
ans.append(format % (self.data.shape[1], 0.0, convert(self.spacing[1]), 0.0))
ans.append(format % (self.data.shape[2], 0.0, 0.0, convert(self.spacing[2])))
line = []
for i in range(self.data.shape[0]):
for j in range(self.data.shape[1]):
for k in range(self.data.shape[2]):
line.append(scinotation(self.data[i, j, k]))
if len(line) == 6:
ans.append(" ".join(line))
line = []
if line:
ans.append(" ".join(line))
line = []
with open(filename, "w") as outputfile:
outputfile.write("\n".join(ans))
def scinotation(num):
"""Write in scientific notation."""
ans = "%10.5E" % num
broken = ans.split("E")
exponent = int(broken[1])
if exponent < -99:
return " 0.000E+00"
if exponent < 0:
sign = "-"
else:
sign = "+"
return ("%sE%s%s" % (broken[0], sign, broken[1][-2:])).rjust(12)
def getbfs(coords, gbasis):
"""Convenience function for both wavefunction and density based on PyQuante Ints.py."""
mymol = cclib2pyquante.makepyquante(coords, [0 for _ in coords])
sym2powerlist = {
'S' : [(0, 0, 0)],
'P' : [(1, 0, 0), (0, 1, 0), (0, 0, 1)],
'D' : [(2, 0, 0), (0, 2, 0), (0, 0, 2), (1, 1, 0), (0, 1, 1), (1, 0, 1)],
'F' : [(3, 0, 0), (2, 1, 0), (2, 0, 1), (1, 2, 0), (1, 1, 1), (1, 0, 2),
(0, 3, 0), (0, 2, 1), (0, 1, 2), (0, 0, 3)]
}
bfs = []
for i, atom in enumerate(mymol):
bs = gbasis[i]
for sym, prims in bs:
for power in sym2powerlist[sym]:
bf = CGBF(atom.pos(), power)
for expnt, coef in prims:
bf.add_primitive(expnt, coef)
bf.normalize()
bfs.append(bf)
return bfs
def wavefunction(coords, mocoeffs, gbasis, volume):
"""Calculate the magnitude of the wavefunction at every point in a volume.
    Inputs:
      coords -- the coordinates of the atoms
      mocoeffs -- MO coefficients for a single molecular orbital
gbasis -- gbasis from a parser object
volume -- a template Volume object (will not be altered)
"""
bfs = getbfs(coords, gbasis)
wavefn = copy.copy(volume)
wavefn.data = numpy.zeros(wavefn.data.shape, "d")
conversion = convertor(1, "bohr", "Angstrom")
x = numpy.arange(wavefn.origin[0], wavefn.topcorner[0] + wavefn.spacing[0], wavefn.spacing[0]) / conversion
y = numpy.arange(wavefn.origin[1], wavefn.topcorner[1] + wavefn.spacing[1], wavefn.spacing[1]) / conversion
z = numpy.arange(wavefn.origin[2], wavefn.topcorner[2] + wavefn.spacing[2], wavefn.spacing[2]) / conversion
for bs in range(len(bfs)):
data = numpy.zeros(wavefn.data.shape, "d")
for i, xval in enumerate(x):
for j, yval in enumerate(y):
for k, zval in enumerate(z):
data[i, j, k] = bfs[bs].amp(xval, yval, zval)
data *= mocoeffs[bs]
wavefn.data += data
return wavefn
def electrondensity(coords, mocoeffslist, gbasis, volume):
"""Calculate the magnitude of the electron density at every point in a volume.
    Inputs:
      coords -- the coordinates of the atoms
      mocoeffslist -- list of MO coefficient arrays for the occupied orbitals
      gbasis -- gbasis from a parser object
      volume -- a template Volume object (will not be altered)
    Note: mocoeffslist is a list of NumPy arrays, of length 1 for restricted
    calculations and length 2 for unrestricted ones.
"""
bfs = getbfs(coords, gbasis)
density = copy.copy(volume)
density.data = numpy.zeros(density.data.shape, "d")
conversion = convertor(1, "bohr", "Angstrom")
x = numpy.arange(density.origin[0], density.topcorner[0] + density.spacing[0], density.spacing[0]) / conversion
y = numpy.arange(density.origin[1], density.topcorner[1] + density.spacing[1], density.spacing[1]) / conversion
z = numpy.arange(density.origin[2], density.topcorner[2] + density.spacing[2], density.spacing[2]) / conversion
for mocoeffs in mocoeffslist:
for mocoeff in mocoeffs:
wavefn = numpy.zeros(density.data.shape, "d")
for bs in range(len(bfs)):
data = numpy.zeros(density.data.shape, "d")
for i, xval in enumerate(x):
for j, yval in enumerate(y):
tmp = []
for zval in z:
tmp.append(bfs[bs].amp(xval, yval, zval))
data[i, j, :] = tmp
data *= mocoeff[bs]
wavefn += data
density.data += wavefn ** 2
# TODO ROHF
if len(mocoeffslist) == 1:
density.data *= 2.0
return density
del find_package
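# A minimal usage sketch, assuming PyQuante and the cclib PyQuante bridge are
# available and that gbasis was parsed from a placeholder file "calc.out";
# it evaluates the HOMO on a grid and writes a Gaussian cube file.
if __name__ == "__main__":
    from cclib.io.ccio import ccopen
    data = ccopen("calc.out").parse()
    # An 8 x 8 x 8 Angstrom box around the origin with 0.25 Angstrom spacing.
    vol = Volume((-4.0, -4.0, -4.0), (4.0, 4.0, 4.0), (0.25, 0.25, 0.25))
    homo = data.homos[0]
    psi = wavefunction(data.atomcoords[-1], data.mocoeffs[0][homo], data.gbasis, vol)
    psi.write("homo.cube", fformat="Cube")
    print(psi.integrate_square())  # should be close to 1 for a normalized orbital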
cclib-1.6.2/cclib/parser/ 0000775 0000000 0000000 00000000000 13535330462 0015150 5 ustar 00root root 0000000 0000000 cclib-1.6.2/cclib/parser/__init__.py 0000664 0000000 0000000 00000002376 13535330462 0017271 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Contains parsers for all supported programs"""
# These import statements are added for the convenience of users...
# Rather than having to type:
# from cclib.parser.gaussianparser import Gaussian
# they can use:
# from cclib.parser import Gaussian
from cclib.parser.adfparser import ADF
from cclib.parser.daltonparser import DALTON
from cclib.parser.gamessparser import GAMESS
from cclib.parser.gamessukparser import GAMESSUK
from cclib.parser.gaussianparser import Gaussian
from cclib.parser.jaguarparser import Jaguar
from cclib.parser.molcasparser import Molcas
from cclib.parser.molproparser import Molpro
from cclib.parser.mopacparser import MOPAC
from cclib.parser.nwchemparser import NWChem
from cclib.parser.orcaparser import ORCA
from cclib.parser.psi3parser import Psi3
from cclib.parser.psi4parser import Psi4
from cclib.parser.qchemparser import QChem
from cclib.parser.turbomoleparser import Turbomole
from cclib.parser.data import ccData
# This allows users to type:
# from cclib.parser import ccopen
from cclib.io.ccio import ccopen
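# For example (the file name is a placeholder):
#   from cclib.parser import ccopen
#   data = ccopen("calc.out").parse()
#   print(data.natom)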
cclib-1.6.2/cclib/parser/adfparser.py 0000664 0000000 0000000 00000143557 13535330462 0017510 0 ustar 00root root 0000000 0000000 ## -*- coding: utf-8 -*-
#
# Copyright (c) 2018, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Parser for ADF output files"""
from __future__ import print_function
import itertools
import re
import numpy
from cclib.parser import logfileparser
from cclib.parser import utils
class ADF(logfileparser.Logfile):
"""An ADF log file"""
def __init__(self, *args, **kwargs):
# Call the __init__ method of the superclass
super(ADF, self).__init__(logname="ADF", *args, **kwargs)
def __str__(self):
"""Return a string representation of the object."""
return "ADF log file %s" % (self.filename)
def __repr__(self):
"""Return a representation of the object."""
return 'ADF("%s")' % (self.filename)
def normalisesym(self, label):
"""Use standard symmetry labels instead of ADF labels.
To normalise:
(1) any periods are removed (except in the case of greek letters)
(2) XXX is replaced by X, and a " added.
(3) XX is replaced by X, and a ' added.
(4) The greek letters Sigma, Pi, Delta and Phi are replaced by
their lowercase equivalent.
"""
greeks = ['Sigma', 'Pi', 'Delta', 'Phi']
for greek in greeks:
if label.startswith(greek):
return label.lower()
ans = label.replace(".", "")
if ans[1:3] == "''":
temp = ans[0] + '"'
ans = temp
l = len(ans)
if l > 1 and ans[0] == ans[1]: # Python only tests the second condition if the first is true
if l > 2 and ans[1] == ans[2]:
ans = ans.replace(ans[0]*3, ans[0]) + '"'
else:
ans = ans.replace(ans[0]*2, ans[0]) + "'"
return ans
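        # For example, under the rules above: "Sigma" -> "sigma",
        # "B1.g" -> "B1g", "AA" -> "A'" and "AAA" -> 'A"'.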
def normalisedegenerates(self, label, num, ndict=None):
"""Generate a string used for matching degenerate orbital labels
To normalise:
(1) if label is E or T, return label:num
(2) if label is P or D, look up in dict, and return answer
"""
if not ndict:
ndict = {
'P': {0: "P:x", 1: "P:y", 2: "P:z"},
'D': {0: "D:z2", 1: "D:x2-y2", 2: "D:xy", 3: "D:xz", 4: "D:yz"}
}
if label in ndict:
if num in ndict[label]:
return ndict[label][num]
else:
return "%s:%i" % (label, num+1)
else:
return "%s:%i" % (label, num+1)
def before_parsing(self):
# Used to avoid extracting the final geometry twice in a GeoOpt
self.NOTFOUND, self.GETLAST, self.NOMORE = list(range(3))
self.finalgeometry = self.NOTFOUND
# Used for calculating the scftarget (variables names taken from the ADF manual)
self.accint = self.SCFconv = self.sconv2 = None
        # Keep track of the nosym and unrestricted cases for parsing energies, since such runs don't have an all Irreps section.
self.nosymflag = False
self.unrestrictedflag = False
SCFCNV, SCFCNV2 = list(range(2)) # used to index self.scftargets[]
maxelem, norm = list(range(2)) # used to index scf.values
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
# If a file contains multiple calculations, currently we want to print a warning
# and skip to the end of the file, since cclib parses only the main system, which
# is usually the largest. Here we test this by checking if scftargets has already
# been parsed when another INPUT FILE segment is found, although this might
# not always be the best indicator.
if line.strip() == "(INPUT FILE)" and hasattr(self, "scftargets"):
self.logger.warning("Skipping remaining calculations")
inputfile.seek(0, 2)
return
# We also want to check to make sure we aren't parsing "Create" jobs,
# which normally come before the calculation we actually want to parse.
if line.strip() == "(INPUT FILE)":
while True:
self.updateprogress(inputfile, "Unsupported Information", self.fupdate)
line = next(inputfile) if line.strip() == "(INPUT FILE)" else None
if line and not line[:6] in ("Create", "create"):
break
line = next(inputfile)
version_searchstr = "Amsterdam Density Functional (ADF)"
if version_searchstr in line:
startidx = line.index(version_searchstr) + len(version_searchstr)
trimmed_line = line[startidx:].strip()[:-1]
# The package version is normally a year with revision
# number (such as 2013.01), but it may also be a random
# string (such as a version control branch name).
match = re.search(r"([\d\.]{4,7})", trimmed_line)
if match:
package_version = match.groups()[0]
self.metadata["package_version"] = package_version
else:
# This isn't as well-defined, but the field shouldn't
# be left empty.
self.metadata["package_version"] = trimmed_line.strip()
# In ADF 2014.01, there are (INPUT FILE) messages, so we need to use just
# the lines that start with 'Create' and run until the title or something
        # else we are sure is the calculation proper. It would be good to combine
# this with the previous block, if possible.
if line[:6] == "Create":
while line[:5] != "title" and "NO TITLE" not in line:
line = inputfile.next()
if line[1:10] == "Symmetry:":
info = line.split()
if info[1] == "NOSYM":
self.nosymflag = True
# Use this to read the subspecies of irreducible representations.
# It will be a list, with each element representing one irrep.
if line.strip() == "Irreducible Representations, including subspecies":
self.skip_line(inputfile, 'dashes')
self.irreps = []
line = next(inputfile)
while line.strip() != "":
self.irreps.append(line.split())
line = next(inputfile)
if line[4:13] == 'Molecule:':
info = line.split()
if info[1] == 'UNrestricted':
self.unrestrictedflag = True
if line[1:6] == "ATOMS":
# Find the number of atoms and their atomic numbers
# Also extract the starting coordinates (for a GeoOpt anyway)
# and the atommasses (previously called vibmasses)
self.updateprogress(inputfile, "Attributes", self.cupdate)
self.atomcoords = []
self.skip_lines(inputfile, ['header1', 'header2', 'header3'])
atomnos = []
atommasses = []
atomcoords = []
coreelectrons = []
line = next(inputfile)
while len(line) > 2: # ensure that we are reading no blank lines
info = line.split()
element = info[1].split('.')[0]
atomnos.append(self.table.number[element])
atomcoords.append(list(map(float, info[2:5])))
coreelectrons.append(int(float(info[5]) - float(info[6])))
atommasses.append(float(info[7]))
line = next(inputfile)
self.atomcoords.append(atomcoords)
self.set_attribute('natom', len(atomnos))
self.set_attribute('atomnos', atomnos)
self.set_attribute('atommasses', atommasses)
self.set_attribute('coreelectrons', coreelectrons)
if line[1:10] == "FRAGMENTS":
header = next(inputfile)
self.frags = []
self.fragnames = []
line = next(inputfile)
while len(line) > 2: # ensure that we are reading no blank lines
info = line.split()
if len(info) == 7: # fragment name is listed here
self.fragnames.append("%s_%s" % (info[1], info[0]))
self.frags.append([])
self.frags[-1].append(int(info[2]) - 1)
elif len(info) == 5: # add atoms into last fragment
self.frags[-1].append(int(info[0]) - 1)
line = next(inputfile)
# Extract charge
if line[1:11] == "Net Charge":
charge = int(line.split()[2])
self.set_attribute('charge', charge)
line = next(inputfile)
if len(line.strip()):
# Spin polar: 1 (Spin_A minus Spin_B electrons)
# (Not sure about this for higher multiplicities)
mult = int(line.split()[2]) + 1
else:
mult = 1
self.set_attribute('mult', mult)
if line[1:22] == "S C F U P D A T E S":
# find targets for SCF convergence
if not hasattr(self, "scftargets"):
self.scftargets = []
self.skip_lines(inputfile, ['e', 'b', 'numbers'])
line = next(inputfile)
self.SCFconv = float(line.split()[-1])
line = next(inputfile)
self.sconv2 = float(line.split()[-1])
# In ADF 2013, the default numerical integration method is fuzzy cells,
# although it used to be Voronoi polyhedra. Both methods apparently set
# the accint parameter, although the latter does so indirectly, based on
# a 'grid quality' setting. This is translated into accint using a
# dictionary with values taken from the documentation.
if "Numerical Integration : Voronoi Polyhedra (Te Velde)" in line:
self.integration_method = "voronoi_polyhedra"
if line[1:27] == 'General Accuracy Parameter':
# Need to know the accuracy of the integration grid to
# calculate the scftarget...note that it changes with time
self.accint = float(line.split()[-1])
if "Numerical Integration : Fuzzy Cells (Becke)" in line:
self.integration_method = 'fuzzy_cells'
if line[1:19] == "Becke grid quality":
self.grid_quality = line.split()[-1]
quality2accint = {
'BASIC': 2.0,
'NORMAL': 4.0,
'GOOD': 6.0,
'VERYGOOD': 8.0,
'EXCELLENT': 10.0,
}
self.accint = quality2accint[self.grid_quality]
# Half of the atomic orbital overlap matrix is printed since it is symmetric,
# but this requires "PRINT Smat" to be in the input. There are extra blank lines
# at the end of the block, which are used to terminate the parsing.
#
# ====== smat
#
# column 1 2 3 4
# row
# 1 1.00000000000000E+00
# 2 2.43370854175315E-01 1.00000000000000E+00
# 3 0.00000000000000E+00 0.00000000000000E+00 1.00000000000000E+00
# ...
#
if "====== smat" in line:
# Initialize the matrix with Nones so we can easily check all has been parsed.
overlaps = [[None] * self.nbasis for i in range(self.nbasis)]
self.skip_line(inputfile, 'blank')
line = inputfile.next()
while line.strip():
colline = line
assert colline.split()[0] == "column"
columns = [int(i) for i in colline.split()[1:]]
rowline = inputfile.next()
assert rowline.strip() == "row"
line = inputfile.next()
while line.strip():
i = int(line.split()[0])
vals = [float(col) for col in line.split()[1:]]
for j, o in enumerate(vals):
k = columns[j]
overlaps[k-1][i-1] = o
overlaps[i-1][k-1] = o
line = inputfile.next()
line = inputfile.next()
# Now all values should be parsed, and so no Nones remaining.
assert all([all([x is not None for x in ao]) for ao in overlaps])
self.set_attribute('aooverlaps', overlaps)
if line[1:11] == "CYCLE 1":
self.updateprogress(inputfile, "QM convergence", self.fupdate)
newlist = []
line = next(inputfile)
if not hasattr(self, "geovalues"):
# This is the first SCF cycle
self.scftargets.append([self.sconv2*10, self.sconv2])
elif self.finalgeometry in [self.GETLAST, self.NOMORE]:
# This is the final SCF cycle
self.scftargets.append([self.SCFconv*10, self.SCFconv])
else:
# This is an intermediate SCF cycle in a geometry optimization,
# in which case the SCF convergence target needs to be derived
# from the accint parameter. For Voronoi polyhedra integration,
# accint is printed and parsed. For fuzzy cells, it can be inferred
# from the grid quality setting, as is done somewhere above.
if self.accint:
oldscftst = self.scftargets[-1][1]
grdmax = self.geovalues[-1][1]
scftst = max(self.SCFconv, min(oldscftst, grdmax/30, 10**(-self.accint)))
self.scftargets.append([scftst*10, scftst])
while line.find("SCF CONVERGED") == -1 and line.find("SCF not fully converged, result acceptable") == -1 and line.find("SCF NOT CONVERGED") == -1:
if line[4:12] == "SCF test":
if not hasattr(self, "scfvalues"):
self.scfvalues = []
info = line.split()
newlist.append([float(info[4]), abs(float(info[6]))])
try:
line = next(inputfile)
except StopIteration: # EOF reached?
self.logger.warning("SCF did not converge, so attributes may be missing")
break
if line.find("SCF not fully converged, result acceptable") > 0:
self.logger.warning("SCF not fully converged, results acceptable")
if line.find("SCF NOT CONVERGED") > 0:
self.logger.warning("SCF did not converge! moenergies and mocoeffs are unreliable")
if hasattr(self, "scfvalues"):
self.scfvalues.append(newlist)
# Parse SCF energy for SP calcs from bonding energy decomposition section.
# It seems ADF does not print it earlier for SP calculations.
# Geometry optimization runs also print this, and we want to parse it
# for them, too, even if it repeats the last "Geometry Convergence Tests"
# section (but it's usually a bit different).
if line[:21] == "Total Bonding Energy:":
if not hasattr(self, "scfenergies"):
self.scfenergies = []
energy = utils.convertor(float(line.split()[3]), "hartree", "eV")
self.scfenergies.append(energy)
if line[51:65] == "Final Geometry":
self.finalgeometry = self.GETLAST
# Get the coordinates from each step of the GeoOpt.
if line[1:24] == "Coordinates (Cartesian)" and self.finalgeometry in [self.NOTFOUND, self.GETLAST]:
self.skip_lines(inputfile, ['e', 'b', 'title', 'title', 'd'])
atomcoords = []
line = next(inputfile)
while list(set(line.strip())) != ['-']:
atomcoords.append(list(map(float, line.split()[5:8])))
line = next(inputfile)
if not hasattr(self, "atomcoords"):
self.atomcoords = []
self.atomcoords.append(atomcoords)
# Don't get any more coordinates in this case.
# KML: I think we could combine this with optdone (see below).
if self.finalgeometry == self.GETLAST:
self.finalgeometry = self.NOMORE
# There have been some changes in the format of the geometry convergence information,
# and this is how it is printed in older versions (2007.01 unit tests).
#
# ==========================
# Geometry Convergence Tests
# ==========================
#
# Energy old : -5.14170647
# new : -5.15951374
#
# Convergence tests:
# (Energies in hartree, Gradients in hartree/angstr or radian, Lengths in angstrom, Angles in degrees)
#
# Item Value Criterion Conv. Ratio
# -------------------------------------------------------------------------
# change in energy -0.01780727 0.00100000 NO 0.00346330
# gradient max 0.03219530 0.01000000 NO 0.30402650
# gradient rms 0.00858685 0.00666667 NO 0.27221261
# cart. step max 0.07674971 0.01000000 NO 0.75559435
# cart. step rms 0.02132310 0.00666667 NO 0.55335378
#
if line[1:27] == 'Geometry Convergence Tests':
if not hasattr(self, "geotargets"):
self.geovalues = []
self.geotargets = numpy.array([0.0, 0.0, 0.0, 0.0, 0.0], "d")
if not hasattr(self, "scfenergies"):
self.scfenergies = []
self.skip_lines(inputfile, ['e', 'b'])
energies_old = next(inputfile)
energies_new = next(inputfile)
self.scfenergies.append(utils.convertor(float(energies_new.split()[-1]), "hartree", "eV"))
self.skip_lines(inputfile, ['b', 'convergence', 'units', 'b', 'header', 'd'])
values = []
for i in range(5):
temp = next(inputfile).split()
self.geotargets[i] = float(temp[-3])
values.append(float(temp[-4]))
self.geovalues.append(values)
# This is to make geometry optimization always have the optdone attribute,
# even if it is to be empty for unconverged runs.
if not hasattr(self, 'optdone'):
self.optdone = []
# After the test, there is a message if the search is converged:
#
# ***************************************************************************************************
# Geometry CONVERGED
# ***************************************************************************************************
#
if line.strip() == "Geometry CONVERGED":
self.skip_line(inputfile, 'stars')
self.optdone.append(len(self.geovalues) - 1)
# Here is the corresponding geometry convergence info from the 2013.01 unit test.
# Note that the step number is given, which it will be prudent to use in an assertion.
#
#----------------------------------------------------------------------
#Geometry Convergence after Step 3 (Hartree/Angstrom,Angstrom)
#----------------------------------------------------------------------
#current energy -5.16274478 Hartree
#energy change -0.00237544 0.00100000 F
#constrained gradient max 0.00884999 0.00100000 F
#constrained gradient rms 0.00249569 0.00066667 F
#gradient max 0.00884999
#gradient rms 0.00249569
#cart. step max 0.03331296 0.01000000 F
#cart. step rms 0.00844037 0.00666667 F
if line[:31] == "Geometry Convergence after Step":
stepno = int(line.split()[4])
# This is to make geometry optimization always have the optdone attribute,
# even if it is to be empty for unconverged runs.
if not hasattr(self, 'optdone'):
self.optdone = []
# The convergence message is inline in this block, not later as it was before.
if "** CONVERGED **" in line:
if not hasattr(self, 'optdone'):
self.optdone = []
self.optdone.append(len(self.geovalues) - 1)
self.skip_line(inputfile, 'dashes')
current_energy = next(inputfile)
energy_change = next(inputfile)
constrained_gradient_max = next(inputfile)
constrained_gradient_rms = next(inputfile)
gradient_max = next(inputfile)
gradient_rms = next(inputfile)
cart_step_max = next(inputfile)
cart_step_rms = next(inputfile)
if not hasattr(self, "scfenergies"):
self.scfenergies = []
energy = utils.convertor(float(current_energy.split()[-2]), "hartree", "eV")
self.scfenergies.append(energy)
if not hasattr(self, "geotargets"):
self.geotargets = numpy.array([0.0, 0.0, 0.0, 0.0, 0.0], "d")
self.geotargets[0] = float(energy_change.split()[-2])
self.geotargets[1] = float(constrained_gradient_max.split()[-2])
self.geotargets[2] = float(constrained_gradient_rms.split()[-2])
self.geotargets[3] = float(cart_step_max.split()[-2])
self.geotargets[4] = float(cart_step_rms.split()[-2])
if not hasattr(self, "geovalues"):
self.geovalues = []
self.geovalues.append([])
self.geovalues[-1].append(float(energy_change.split()[-3]))
self.geovalues[-1].append(float(constrained_gradient_max.split()[-3]))
self.geovalues[-1].append(float(constrained_gradient_rms.split()[-3]))
self.geovalues[-1].append(float(cart_step_max.split()[-3]))
self.geovalues[-1].append(float(cart_step_rms.split()[-3]))
if line.find('Orbital Energies, per Irrep and Spin') > 0 and not hasattr(self, "mosyms") and self.nosymflag and not self.unrestrictedflag:
# Extract orbital symmetries and energies, and homos, for the nosym case.
# This should only happen for the restricted case, because there is a better text block for unrestricted nosym runs.
self.mosyms = [[]]
self.moenergies = [[]]
self.skip_lines(inputfile, ['e', 'header', 'd', 'label'])
line = next(inputfile)
info = line.split()
if not info[0] == '1':
self.logger.warning("MO info up to #%s is missing" % info[0])
# Handle the case where MO information up to a certain orbital is missing.
while int(info[0]) - 1 != len(self.moenergies[0]):
self.moenergies[0].append(99999)
self.mosyms[0].append('A')
homoA = None
while len(line) > 10:
info = line.split()
self.mosyms[0].append('A')
self.moenergies[0].append(utils.convertor(float(info[2]), 'hartree', 'eV'))
if info[1] == '0.000' and not hasattr(self, 'homos'):
self.set_attribute('homos', [len(self.moenergies[0]) - 2])
line = next(inputfile)
self.moenergies = [numpy.array(self.moenergies[0], "d")]
if line[1:29] == 'Orbital Energies, both Spins' and not hasattr(self, "mosyms") and self.nosymflag and self.unrestrictedflag:
# Extract orbital symmetries and energies, and homos, for the nosym case.
# We should only be here if the run is unrestricted and nosym.
self.mosyms = [[], []]
moenergies = [[], []]
self.skip_lines(inputfile, ['d', 'b', 'header', 'd'])
homoa = 0
homob = None
line = next(inputfile)
while len(line) > 5:
info = line.split()
if info[2] == 'A':
self.mosyms[0].append('A')
moenergies[0].append(utils.convertor(float(info[4]), 'hartree', 'eV'))
if info[3] != '0.00':
homoa = len(moenergies[0]) - 1
elif info[2] == 'B':
self.mosyms[1].append('A')
moenergies[1].append(utils.convertor(float(info[4]), 'hartree', 'eV'))
if info[3] != '0.00':
homob = len(moenergies[1]) - 1
else:
print("Error reading line: %s" % line)
line = next(inputfile)
self.moenergies = [numpy.array(x, "d") for x in moenergies]
self.set_attribute('homos', [homoa, homob])
# Extracting orbital symmetries and energies, homos.
if line[1:29] == 'Orbital Energies, all Irreps' and not hasattr(self, "mosyms"):
self.symlist = {}
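# symlist maps each symmetry label to per-spin lists of MO indices (one list for
# restricted runs, two for unrestricted); it is used later when assigning SFO MO coefficients.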
self.mosyms = [[]]
self.moenergies = [[]]
self.skip_lines(inputfile, ['e', 'b', 'header', 'd'])
homoa = None
homob = None
#multiple = {'E':2, 'T':3, 'P':3, 'D':5}
# The above is set if there are no special irreps
names = [irrep[0].split(':')[0] for irrep in self.irreps]
counts = [len(irrep) for irrep in self.irreps]
multiple = dict(list(zip(names, counts)))
irrepspecies = {}
for n in range(len(names)):
indices = list(range(counts[n]))
subspecies = self.irreps[n]
irrepspecies[names[n]] = dict(list(zip(indices, subspecies)))
line = next(inputfile)
while line.strip():
info = line.split()
if len(info) == 5: # this is restricted
#count = multiple.get(info[0][0],1)
count = multiple.get(info[0], 1)
for repeat in range(count): # i.e. add E's twice, T's thrice
self.mosyms[0].append(self.normalisesym(info[0]))
self.moenergies[0].append(utils.convertor(float(info[3]), 'hartree', 'eV'))
sym = info[0]
if count > 1: # add additional sym label
sym = self.normalisedegenerates(info[0], repeat, ndict=irrepspecies)
try:
self.symlist[sym][0].append(len(self.moenergies[0])-1)
except KeyError:
self.symlist[sym] = [[]]
self.symlist[sym][0].append(len(self.moenergies[0])-1)
if info[2] == '0.00' and not hasattr(self, 'homos'):
self.homos = [len(self.moenergies[0]) - (count + 1)] # count, because need to handle degenerate cases
line = next(inputfile)
elif len(info) == 6: # this is unrestricted
if len(self.moenergies) < 2: # if we don't have space, create it
self.moenergies.append([])
self.mosyms.append([])
# count = multiple.get(info[0][0], 1)
count = multiple.get(info[0], 1)
if info[2] == 'A':
for repeat in range(count): # i.e. add E's twice, T's thrice
self.mosyms[0].append(self.normalisesym(info[0]))
self.moenergies[0].append(utils.convertor(float(info[4]), 'hartree', 'eV'))
sym = info[0]
if count > 1: # add additional sym label
sym = self.normalisedegenerates(info[0], repeat)
try:
self.symlist[sym][0].append(len(self.moenergies[0])-1)
except KeyError:
self.symlist[sym] = [[], []]
self.symlist[sym][0].append(len(self.moenergies[0])-1)
if info[3] == '0.00' and homoa is None:
homoa = len(self.moenergies[0]) - (count + 1) # count because degenerate cases need to be handled
if info[2] == 'B':
for repeat in range(count): # i.e. add E's twice, T's thrice
self.mosyms[1].append(self.normalisesym(info[0]))
self.moenergies[1].append(utils.convertor(float(info[4]), 'hartree', 'eV'))
sym = info[0]
if count > 1: # add additional sym label
sym = self.normalisedegenerates(info[0], repeat)
try:
self.symlist[sym][1].append(len(self.moenergies[1])-1)
except KeyError:
self.symlist[sym] = [[], []]
self.symlist[sym][1].append(len(self.moenergies[1])-1)
if info[3] == '0.00' and homob is None:
homob = len(self.moenergies[1]) - (count + 1)
line = next(inputfile)
else: # different number of lines
print(("Error", info))
if len(info) == 6: # still unrestricted, despite being out of loop
self.set_attribute('homos', [homoa, homob])
self.moenergies = [numpy.array(x, "d") for x in self.moenergies]
# Section on extracting vibdisps
# Also contains vibfreqs, but these are extracted in the
# following section (see below)
if line[1:28] == "Vibrations and Normal Modes":
self.vibdisps = []
self.skip_lines(inputfile, ['e', 'b', 'header', 'header', 'b', 'b'])
freqs = next(inputfile)
while freqs.strip() != "":
minus = next(inputfile)
p = [[], [], []]
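# Each displacement line holds one atom's x, y, z components for up to three modes
# printed side by side; p[k] collects the rows belonging to the k-th mode of this block.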
for i in range(len(self.atomnos)):
broken = list(map(float, next(inputfile).split()[1:]))
for j in range(0, len(broken), 3):
p[j//3].append(broken[j:j+3])
self.vibdisps.extend(p[:(len(broken)//3)])
self.skip_lines(inputfile, ['b', 'b'])
freqs = next(inputfile)
self.vibdisps = numpy.array(self.vibdisps, "d")
if line[1:24] == "List of All Frequencies":
# Start of the IR/Raman frequency section
self.updateprogress(inputfile, "Frequency information", self.fupdate)
# self.vibsyms = [] # Need to look into this a bit more
self.vibirs = []
self.vibfreqs = []
for i in range(8):
line = next(inputfile)
line = next(inputfile).strip()
while line:
temp = line.split()
self.vibfreqs.append(float(temp[0]))
self.vibirs.append(float(temp[2])) # or is it temp[1]?
line = next(inputfile).strip()
self.vibfreqs = numpy.array(self.vibfreqs, "d")
self.vibirs = numpy.array(self.vibirs, "d")
if hasattr(self, "vibramans"):
self.vibramans = numpy.array(self.vibramans, "d")
#******************************************************************************************************************
# delete this after new implementation using smat, eigvec print, eprint?
# Extract the number of basis sets
if line[1:49] == "Total nr. of (C)SFOs (summation over all irreps)":
nbasis = int(line.split(":")[1].split()[0])
self.set_attribute('nbasis', nbasis)
# now that we're here, let's extract the fragment orbital names (fonames)
self.fonames = []
self.start_indeces = {}
self.atombasis = [[] for frag in self.frags] # parse atombasis in the case of trivial SFOs
self.skip_line(inputfile, 'blank')
note = next(inputfile)
symoffset = 0
self.skip_line(inputfile, 'blank')
line = next(inputfile)
if len(line) > 2: # fix for ADF2006.01 as it has another note
self.skip_line(inputfile, 'blank')
line = next(inputfile)
self.skip_line(inputfile, 'blank')
self.nosymreps = []
while len(self.fonames) < self.nbasis:
symline = next(inputfile)
sym = symline.split()[1]
line = next(inputfile)
num = int(line.split(':')[1].split()[0])
self.nosymreps.append(num)
#read until line "--------..." is found
while line.find('-----') < 0:
line = next(inputfile)
line = next(inputfile) # the start of the first SFO
while len(self.fonames) < symoffset + num:
info = line.split()
#index0 index1 occ2 energy3/4 fragname5 coeff6 orbnum7 orbname8 fragname9
if sym not in self.start_indeces:
# Have we already set the start index for this symmetry?
self.start_indeces[sym] = int(info[1])
orbname = info[8]
orbital = info[7] + orbname.replace(":", "")
fragname = info[5]
frag = fragname + info[9]
coeff = float(info[6])
# parse atombasis only in the case that all coefficients are 1
# and delete it otherwise
if hasattr(self, 'atombasis'):
if coeff == 1.:
ibas = int(info[0]) - 1
ifrag = int(info[9]) - 1
iat = self.frags[ifrag][0]
self.atombasis[iat].append(ibas)
else:
del self.atombasis
line = next(inputfile)
while line.strip() and not line[:7].strip(): # while it's the same SFO
# i.e. while not completely blank, but blank at the start
info = line[43:].split()
if len(info) > 0: # len(info)==0 for the second line of dvb_ir.adfout
frag += "+" + fragname + info[-1]
coeff = float(info[-4])
if coeff < 0:
orbital += '-' + info[-3] + info[-2].replace(":", "")
else:
orbital += '+' + info[-3] + info[-2].replace(":", "")
line = next(inputfile)
# At this point, we are either at the start of the next SFO or at
# a blank line...the end
self.fonames.append("%s_%s" % (frag, orbital))
symoffset += num
# blankline blankline
next(inputfile)
next(inputfile)
if line[1:32] == "S F O P O P U L A T I O N S ,":
#Extract overlap matrix
# self.fooverlaps = numpy.zeros((self.nbasis, self.nbasis), "d")
symoffset = 0
for nosymrep in self.nosymreps:
line = next(inputfile)
while line.find('===') < 10: # look for the symmetry labels
line = next(inputfile)
self.skip_lines(inputfile, ['b', 'b'])
text = next(inputfile)
if text[13:20] != "Overlap": # verify this has overlap info
break
self.skip_lines(inputfile, ['b', 'col', 'row'])
if not hasattr(self, "fooverlaps"): # make sure there is a matrix to store this
self.fooverlaps = numpy.zeros((self.nbasis, self.nbasis), "d")
base = 0
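# The overlap values for this irrep appear to be printed in blocks of (up to) four
# columns, hence the base += 4 below and the three lines (blank, blank, column header)
# skipped between blocks.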
while base < nosymrep: # have we read all the columns?
for i in range(nosymrep - base):
self.updateprogress(inputfile, "Overlap", self.fupdate)
line = next(inputfile)
parts = line.split()[1:]
for j in range(len(parts)):
k = float(parts[j])
self.fooverlaps[base + symoffset + j, base + symoffset + i] = k
self.fooverlaps[base + symoffset + i, base + symoffset + j] = k
#blank, blank, column
for i in range(3):
next(inputfile)
base += 4
symoffset += nosymrep
base = 0
# The commented code below makes the atombasis attribute based on the BAS function in ADF,
# but this is probably not so useful, since SFOs are used to build MOs in ADF.
# if line[1:54] == "BAS: List of all Elementary Cartesian Basis Functions":
#
# self.atombasis = []
#
# # There will be some text, followed by a line:
# # (power of) X Y Z R Alpha on Atom
# while not line[1:11] == "(power of)":
# line = inputfile.next()
# dashes = inputfile.next()
# blank = inputfile.next()
# line = inputfile.next()
# # There will be two blank lines when there are no more atom types.
# while line.strip() != "":
# atoms = [int(i)-1 for i in line.split()[1:]]
# for n in range(len(atoms)):
# self.atombasis.append([])
# dashes = inputfile.next()
# line = inputfile.next()
# while line.strip() != "":
# indices = [int(i)-1 for i in line.split()[5:]]
# for i in range(len(indices)):
# self.atombasis[atoms[i]].append(indices[i])
# line = inputfile.next()
# line = inputfile.next()
if line[48:67] == "SFO MO coefficients":
self.mocoeffs = [numpy.zeros((self.nbasis, self.nbasis), "d")]
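# mocoeffs is indexed as [spin][MO, AO]; the square (nbasis x nbasis) allocation
# presumably assumes the number of MOs equals the number of SFOs here.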
spin = 0
symoffset = 0
lastrow = 0
# Section ends with "1" at the beginning of a line.
while line[0] != "1":
line = next(inputfile)
# If spin is specified, then there will be two coefficient matrices.
if line.strip() == "***** SPIN 1 *****":
self.mocoeffs = [numpy.zeros((self.nbasis, self.nbasis), "d"),
numpy.zeros((self.nbasis, self.nbasis), "d")]
# Bump up the spin.
if line.strip() == "***** SPIN 2 *****":
spin = 1
symoffset = 0
lastrow = 0
# Next symmetry.
if line.strip()[:4] == "=== ":
sym = line.split()[1]
if self.nosymflag:
aolist = list(range(self.nbasis))
else:
aolist = self.symlist[sym][spin]
# Add to the symmetry offset of AO ordering.
symoffset += lastrow
# Blocks with coefficient always start with "MOs :".
if line[1:6] == "MOs :":
# Next line has the MO index contributed to.
monumbers = [int(n) for n in line[6:].split()]
self.skip_lines(inputfile, ['occup', 'label'])
# The table can end with a blank line or "1".
row = 0
line = next(inputfile)
while not line.strip() in ["", "1"]:
info = line.split()
if int(info[0]) < self.start_indeces[sym]:
#check to make sure we aren't parsing CFs
line = next(inputfile)
continue
self.updateprogress(inputfile, "Coefficients", self.fupdate)
row += 1
coeffs = [float(x) for x in info[1:]]
moindices = [aolist[n-1] for n in monumbers]
# The AO index is 1 less than the row.
aoindex = symoffset + row - 1
for i in range(len(monumbers)):
self.mocoeffs[spin][moindices[i], aoindex] = coeffs[i]
line = next(inputfile)
lastrow = row
# **************************************************************************
# * *
# * Final excitation energies from Davidson algorithm *
# * *
# **************************************************************************
#
# Number of loops in Davidson routine = 20
# Number of matrix-vector multiplications = 24
# Type of excitations = SINGLET-SINGLET
#
# Symmetry B.u
#
# ... several blocks ...
#
# Normal termination of EXCITATION program part
if line[4:53] == "Final excitation energies from Davidson algorithm":
while line[1:9] != "Symmetry" and "Normal termination" not in line:
line = next(inputfile)
symm = self.normalisesym(line.split()[1])
# Excitation energies E in a.u. and eV, dE wrt prev. cycle,
# oscillator strengths f in a.u.
#
# no. E/a.u. E/eV f dE/a.u.
# -----------------------------------------------------
# 1 0.17084 4.6488 0.16526E-01 0.28E-08
# ...
while line.split() != ['no.', 'E/a.u.', 'E/eV', 'f', 'dE/a.u.'] and "Normal termination" not in line:
line = next(inputfile)
self.skip_line(inputfile, 'dashes')
etenergies = []
etoscs = []
etsyms = []
line = next(inputfile)
while len(line) > 2:
info = line.split()
etenergies.append(utils.convertor(float(info[2]), "eV", "wavenumber"))
etoscs.append(float(info[3]))
etsyms.append(symm)
line = next(inputfile)
# There is another section before this, with transition dipole moments,
# but this should just skip past it.
while line[1:53] != "Major MO -> MO transitions for the above excitations":
line = next(inputfile)
# Note that here, and later, the number of blank lines can vary between
# versions of ADF (extra lines are seen in 2013.01 unit tests, for example).
self.skip_line(inputfile, 'blank')
excitation_occupied = next(inputfile)
header = next(inputfile)
while not header.strip():
header = next(inputfile)
header2 = next(inputfile)
x_y_z = next(inputfile)
line = next(inputfile)
while not line.strip():
line = next(inputfile)
# Before we start handling transitions, we need to create mosyms
# with indices; only restricted calcs are possible in ADF.
counts = {}
syms = []
for mosym in self.mosyms[0]:
if list(counts.keys()).count(mosym) == 0:
counts[mosym] = 1
else:
counts[mosym] += 1
syms.append(str(counts[mosym]) + mosym)
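# The loop above numbers the orbitals within each symmetry, so e.g. the first three
# orbitals of a symmetry A1 would become "1A1", "2A1" and "3A1"; these labels are
# matched against the transition labels parsed below.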
etsecs = []
printed_warning = False
for i in range(len(etenergies)):
etsec = []
info = line.split()
while len(info) > 0:
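# The occupied and virtual orbitals appear as tokens in which the orbital index is
# fused with its symmetry label (digits immediately followed by the label), so we
# split each token at the first non-digit character.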
match = re.search('[^0-9]', info[1])
index1 = int(info[1][:match.start(0)])
text = info[1][match.start(0):]
symtext = text[0].upper() + text[1:]
sym1 = str(index1) + self.normalisesym(symtext)
match = re.search('[^0-9]', info[3])
index2 = int(info[3][:match.start(0)])
text = info[3][match.start(0):]
symtext = text[0].upper() + text[1:]
sym2 = str(index2) + self.normalisesym(symtext)
try:
index1 = syms.index(sym1)
except ValueError:
if not printed_warning:
self.logger.warning("Etsecs are not accurate!")
printed_warning = True
try:
index2 = syms.index(sym2)
except ValueError:
if not printed_warning:
self.logger.warning("Etsecs are not accurate!")
printed_warning = True
etsec.append([(index1, 0), (index2, 0), float(info[4])])
line = next(inputfile)
info = line.split()
etsecs.append(etsec)
# Again, the number of blank lines between transitions can vary.
line = next(inputfile)
while not line.strip():
line = next(inputfile)
if not hasattr(self, "etenergies"):
self.etenergies = etenergies
else:
self.etenergies += etenergies
if not hasattr(self, "etoscs"):
self.etoscs = etoscs
else:
self.etoscs += etoscs
if not hasattr(self, "etsyms"):
self.etsyms = etsyms
else:
self.etsyms += etsyms
if not hasattr(self, "etsecs"):
self.etsecs = etsecs
else:
self.etsecs += etsecs
if "M U L L I K E N P O P U L A T I O N S" in line:
if not hasattr(self, "atomcharges"):
self.atomcharges = {}
while line[1:5] != "Atom":
line = next(inputfile)
self.skip_line(inputfile, 'dashes')
mulliken = []
line = next(inputfile)
while line.strip():
mulliken.append(float(line.split()[2]))
line = next(inputfile)
self.atomcharges["mulliken"] = mulliken
# Dipole moment is always printed after a point calculation,
# and the reference point for this is always the origin (0,0,0)
# and not necessarily the center of mass, as explained on the
# ADF user mailing list (see cclib/cclib#113 for details).
#
# =============
# Dipole Moment *** (Debye) ***
# =============
#
# Vector : 0.00000000 0.00000000 0.00000000
# Magnitude: 0.00000000
#
if line.strip()[:13] == "Dipole Moment":
self.skip_line(inputfile, 'equals')
# There is not always a blank line here, for example when the dipole and quadrupole
# moments are printed after the multipole derived atomic charges. Still, to the best
# of my knowledge (KML) the values are still in Debye.
line = next(inputfile)
if not line.strip():
line = next(inputfile)
assert line.split()[0] == "Vector"
dipole = [float(d) for d in line.split()[-3:]]
reference = [0.0, 0.0, 0.0]
if not hasattr(self, 'moments'):
self.moments = [reference, dipole]
else:
try:
assert self.moments[1] == dipole
except AssertionError:
self.logger.warning('Overwriting previous multipole moments with new values')
self.moments = [reference, dipole]
# Molecular response properties.
if line.strip()[1:-1].strip() == "RESPONSE program part":
while line.strip() != "Normal termination of RESPONSE program part":
if "THE DIPOLE-DIPOLE POLARIZABILITY TENSOR:" in line:
if not hasattr(self, 'polarizabilities'):
self.polarizabilities = []
polarizability = numpy.empty(shape=(3, 3))
self.skip_lines(inputfile, ['b', 'FREQUENCY', 'coordinates'])
# Ordering of rows/columns is Y, Z, X.
ordering = [1, 2, 0]
indices = list(itertools.product(ordering, ordering))
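# e.g. the first printed element (row Y, column Y) is stored at polarizability[1, 1],
# and the last (row X, column X) at polarizability[0, 0].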
for i in range(3):
tokens = next(inputfile).split()
for j in range(3):
polarizability[indices[(i*3)+j]] = tokens[j]
self.polarizabilities.append(polarizability)
line = next(inputfile)
if line[:24] == ' Buffered I/O statistics':
self.metadata['success'] = True
cclib-1.6.2/cclib/parser/daltonparser.py 0000664 0000000 0000000 00000157063 13535330462 0020234 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Parser for DALTON output files"""
from __future__ import print_function
import re
import numpy
from cclib.parser import logfileparser
from cclib.parser import utils
class DALTON(logfileparser.Logfile):
"""A DALTON log file."""
def __init__(self, *args, **kwargs):
# Call the __init__ method of the superclass
super(DALTON, self).__init__(logname="DALTON", *args, **kwargs)
def __str__(self):
"""Return a string representation of the object."""
return "DALTON log file %s" % (self.filename)
def __repr__(self):
"""Return a representation of the object."""
return 'DALTON("%s")' % (self.filename)
def normalisesym(self, label):
"""DALTON does not require normalizing symmetry labels."""
return label
def before_parsing(self):
# Used to decide whether to wipe the atomcoords clean.
self.firststdorient = True
# Use to track which section/program output we are parsing,
# since some programs print out the same headers, which we
# would like to use as triggers.
self.section = None
# If there is no symmetry, assume this.
self.symlabels = ['Ag']
# Is the basis set from a single library file? This is true
# when the first line is BASIS, false for INTGRL/ATOMBASIS.
self.basislibrary = True
def parse_geometry(self, lines):
"""Parse DALTON geometry lines into an atomcoords array."""
coords = []
for lin in lines:
# Without symmetry there are simply four columns, and with symmetry
# an extra label is printed after the atom type.
cols = lin.split()
if cols[1][0] == "_":
xyz = cols[2:]
else:
xyz = cols[1:]
# The assumption is that DALTON always prints in atomic units.
xyz = [utils.convertor(float(x), 'bohr', 'Angstrom') for x in xyz]
coords.append(xyz)
return coords
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
# Extract the version number and optionally the Git revision
# number.
#
# Example strings that at least the major version is parsed from:
#
# This is output from DALTON 2013.2
# 2013.4
# 2014.0
# 2014.alpha
# 2015.0
# 2016.alpha
# (Release 1.1, September 2000)
# Release 2011 (DEVELOPMENT VERSION)
# Release 2011 (Rev. 0, Dec. 2010)
# Release 2011 (Rev. 0, Mar. 2011)
# (Release 2.0 rev. 0, Mar. 2005)
# (Release Dalton2013 patch 0)
# release Dalton2017.alpha (2016)
# release Dalton2017.alpha (2017)
# release Dalton2018.0 (2018)
# release Dalton2018.0-rc (2018)
# release Dalton2018.alpha (2018)
# release Dalton2019.alpha (2018)
if line[4:30] == "This is output from DALTON":
rs = (r"from DALTON \(?(?:Release|release)?\s?(?:Dalton)?"
r"(\d+\.?[\w\d\-]*)"
r"(?:[\s,]\(?)?")
match = re.search(rs, line)
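# As an illustration, a line containing "This is output from DALTON 2013.2" yields
# "2013.2", and one containing "... release Dalton2018.0 (2018)" yields "2018.0".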
if match:
self.metadata["package_version"] = match.groups()[0]
# Don't add revision information to the main package version for now.
if "Last Git revision" in line:
revision = line.split()[4]
# Is the basis set from a single library file, or is it
# manually specified? See before_parsing().
if line[:6] == 'INTGRL' or line[:9] == 'ATOMBASIS':
self.basislibrary = False
# This section at the start of geometry optimization jobs gives us information
# about optimization targets (geotargets) and possibly other things as well.
# Notice how the number of criteria required to converge is set to 2 here,
# but this parameter can (probably) be tweaked in the input.
#
# Chosen parameters for *OPTIMI :
# -------------------------------
#
# Default 1st order method will be used: BFGS update.
# Optimization will be performed in redundant internal coordinates (by default).
# Model Hessian will be used as initial Hessian.
# The model Hessian parameters of Roland Lindh will be used.
#
#
# Trust region method will be used to control step (default).
#
# Convergence threshold for gradient set to : 1.00D-04
# Convergence threshold for energy set to : 1.00D-06
# Convergence threshold for step set to : 1.00D-04
# Number of convergence criteria set to : 2
#
if line.strip()[:25] == "Convergence threshold for":
if not hasattr(self, 'geotargets'):
self.geotargets = []
self.geotargets_names = []
target = self.float(line.split()[-1])
name = line.strip()[25:].split()[0]
self.geotargets.append(target)
self.geotargets_names.append(name)
# This is probably the first place where atomic symmetry labels are printed,
# somewhere after the SYMGRP point group information section. We need to know
# which atom is in which symmetry, since this influences how some things are
# printed later on. We can also get some generic attributes along the way.
#
# Isotopic Masses
# ---------------
#
# C _1 12.000000
# C _2 12.000000
# C _1 12.000000
# C _2 12.000000
# ...
#
# Note that when there is no symmetry there are only two columns here.
#
# It is also a good idea to keep in mind that DALTON, with symmetry on, operates
# in a specific point group, so symmetry atoms have no internal representation.
# Therefore only atoms marked as "_1" or "#1" in other places are actually
# represented in the model. The symmetry atoms (higher symmetry indices) are
# generated on the fly when writing the output. We will save the symmetry indices
# here for later use.
#
# Additional note: the symmetry labels are printed only for atoms that have
# symmetry images... so assume "_1" if a label is missing. For example, there will
# be no label for atoms on an axis, such as the oxygen in water in C2v:
#
# O 15.994915
# H _1 1.007825
# H _2 1.007825
#
if line.strip() == "Isotopic Masses":
self.skip_lines(inputfile, ['d', 'b'])
# Since some symmetry labels may be missing, read in all lines first.
lines = []
line = next(inputfile)
while line.strip():
lines.append(line)
line = next(inputfile)
# Split lines into columns and add any missing symmetry labels, if needed.
lines = [l.split() for l in lines]
if any([len(l) == 3 for l in lines]):
for il, l in enumerate(lines):
if len(l) == 2:
lines[il] = [l[0], "_1", l[1]]
atomnos = []
symmetry_atoms = []
atommasses = []
for cols in lines:
cols0 = ''.join([i for i in cols[0] if not i.isdigit()]) #remove numbers
atomnos.append(self.table.number[cols0])
if len(cols) == 3:
symmetry_atoms.append(int(cols[1][1]))
atommasses.append(float(cols[2]))
else:
atommasses.append(float(cols[1]))
self.set_attribute('atomnos', atomnos)
self.set_attribute('atommasses', atommasses)
self.set_attribute('natom', len(atomnos))
self.set_attribute('natom', len(atommasses))
# Save this for later if there were any labels.
self.symmetry_atoms = symmetry_atoms or None
# This section is close to the beginning of the file, and can be used
# to parse natom, nbasis and atomnos. We also construct atombasis here,
# although that is symmetry-dependent (see inline comments). Note that
# DALTON operates on the idea of atom types, which are not necessarily
# unique element-wise.
#
# Atoms and basis sets
# --------------------
#
# Number of atom types : 6
# Total number of atoms: 20
#
# Basis set used is "STO-3G" from the basis set library.
#
# label atoms charge prim cont basis
# ----------------------------------------------------------------------
# C 6 6.0000 15 5 [6s3p|2s1p]
# H 4 1.0000 3 1 [3s|1s]
# C 2 6.0000 15 5 [6s3p|2s1p]
# H 2 1.0000 3 1 [3s|1s]
# C 2 6.0000 15 5 [6s3p|2s1p]
# H 4 1.0000 3 1 [3s|1s]
# ----------------------------------------------------------------------
# total: 20 70.0000 180 60
# ----------------------------------------------------------------------
#
# Threshold for neglecting AO integrals: 1.00D-12
#
if line.strip() == "Atoms and basis sets":
self.skip_lines(inputfile, ['d', 'b'])
line = next(inputfile)
assert "Number of atom types" in line
self.ntypes = int(line.split()[-1])
line = next(inputfile)
assert "Total number of atoms:" in line
self.set_attribute("natom", int(line.split()[-1]))
# When using the INTGRL keyword and not pulling from the
# basis set library, the "Basis set used" line doesn't
# appear.
if not self.basislibrary:
self.skip_line(inputfile, 'b')
else:
#self.skip_lines(inputfile, ['b', 'basisname', 'b'])
line = next(inputfile)
line = next(inputfile)
self.metadata["basis_set"] = line.split()[4].strip('\"')
line = next(inputfile)
line = next(inputfile)
cols = line.split()
# Detecting which columns things are in will be somewhat more robust
# to formatting changes in the future.
iatoms = cols.index('atoms')
icharge = cols.index('charge')
icont = cols.index('cont')
self.skip_line(inputfile, 'dashes')
atomnos = []
atombasis = []
nbasis = 0
for itype in range(self.ntypes):
line = next(inputfile)
cols = line.split()
atoms = int(cols[iatoms])
charge = float(cols[icharge])
assert int(charge) == charge
charge = int(charge)
cont = int(cols[icont])
for at in range(atoms):
atomnos.append(charge)
# If symmetry atoms are present, these will have basis functions
# printed immediately after the one unique atom, so for all
# practical purposes cclib can assume the ordering in atombasis
# follows this out-of-order scheme to match the output.
if self.symmetry_atoms:
# So we extend atombasis only for the unique atoms (with a
# symmetry index of 1), interleaving the basis functions
# for this atom with the basis functions of all its symmetry atoms.
if self.symmetry_atoms[at] == 1:
nsyms = 1
while (at + nsyms < self.natom) and self.symmetry_atoms[at + nsyms] == nsyms + 1:
nsyms += 1
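# Worked example (assumed numbers): with nbasis = 10, cont = 5 and nsyms = 2, the
# unique atom gets basis functions [10, 12, 14, 16, 18] and its symmetry image
# gets [11, 13, 15, 17, 19].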
for isym in range(nsyms):
istart = nbasis + isym
iend = nbasis + cont*nsyms + isym
atombasis.append(list(range(istart, iend, nsyms)))
nbasis += cont*nsyms
else:
atombasis.append(list(range(nbasis, nbasis + cont)))
nbasis += cont
self.set_attribute('atomnos', atomnos)
self.set_attribute('atombasis', atombasis)
self.set_attribute('nbasis', nbasis)
self.skip_line(inputfile, 'dashes')
line = next(inputfile)
self.set_attribute('natom', int(line.split()[iatoms]))
self.set_attribute('nbasis', int(line.split()[icont]))
self.skip_line(inputfile, 'dashes')
# The Gaussian exponents and contraction coefficients are printed for each primitive
# and then the contraction information is printed separately (see below). Both segmented
# and general contractions are used, but we can parse them the same way since zeros are
# inserted for primitives that are not used. However, no atom index is printed here
# so we don't really know when a new atom is started without using information
# from other sections (we should already have atombasis parsed at this point).
#
# Orbital exponents and contraction coefficients
# ----------------------------------------------
#
#
# C #1 1s 1 71.616837 0.1543 0.0000
# seg. cont. 2 13.045096 0.5353 0.0000
# 3 3.530512 0.4446 0.0000
# 4 2.941249 0.0000 -0.1000
# ...
#
# Here is a corresponding fragment for general contractions:
#
# C 1s 1 33980.000000 0.0001 -0.0000 0.0000 0.0000 0.0000
# 0.0000 0.0000 0.0000 0.0000
# gen. cont. 2 5089.000000 0.0007 -0.0002 0.0000 0.0000 0.0000
# 0.0000 0.0000 0.0000 0.0000
# 3 1157.000000 0.0037 -0.0008 0.0000 0.0000 0.0000
# 0.0000 0.0000 0.0000 0.0000
# 4 326.600000 0.0154 -0.0033 0.0000 0.0000 0.0000
# ...
#
if line.strip() == "Orbital exponents and contraction coefficients":
self.skip_lines(inputfile, ['d', 'b', 'b'])
# Here we simply want to save the numbers defining each primitive for later use,
# where the first number is the exponent, and the rest are coefficients which
# should be zero if the primitive is not used in a contraction. This list is
# symmetry agnostic, although primitives/contractions are not generally.
self.primitives = []
prims = []
line = next(inputfile)
while line.strip():
# Each contraction/section is separated by a blank line, and at the very
# end there is an extra blank line.
while line.strip():
# For generalized contraction it is typical to see the coefficients wrapped
# to new lines, so we must collect them until we are sure a primitive starts.
if line[:30].strip():
if prims:
self.primitives.append(prims)
prims = []
prims += [float(x) for x in line[20:].split()]
line = next(inputfile)
line = next(inputfile)
# At the end we have the final primitive to save.
self.primitives.append(prims)
# This is the corresponding section to the primitive definitions parsed above, so we
# assume those numbers are available in the variable 'primitives'. Here we read in the
# indices of primitives, which we use to construct gbasis.
#
# Contracted Orbitals
# -------------------
#
# 1 C 1s 1 2 3 4 5 6 7 8 9 10 11 12
# 2 C 1s 1 2 3 4 5 6 7 8 9 10 11 12
# 3 C 1s 10
# 4 C 1s 11
# ...
#
# Here is a fragment with symmetry labels:
#
# ...
# 1 C #1 1s 1 2 3
# 2 C #2 1s 7 8 9
# 3 C #1 1s 4 5 6
# ...
#
if line.strip() == "Contracted Orbitals":
self.skip_lines(inputfile, ['d', 'b'])
# This is the reverse of atombasis, so that we can easily map from a basis function
# to the corresponding atom for use in the loop below.
basisatoms = [None for i in range(self.nbasis)]
for iatom in range(self.natom):
for ibasis in self.atombasis[iatom]:
basisatoms[ibasis] = iatom
# Since contractions are not generally given in order (when there is symmetry),
# start with an empty list for gbasis.
gbasis = [[] for i in range(self.natom)]
# This will hold the number of contractions already printed for each orbital,
# counting symmetry orbitals separately.
orbitalcount = {}
for ibasis in range(self.nbasis):
line = next(inputfile)
cols = line.split()
# The first column is always the basis function index, which we can assert.
assert int(cols[0]) == ibasis + 1
# The number of columns is different when symmetry is used. If there are further
# complications, it may be necessary to use exact slicing, since the formatting
# of this section seems to be fixed (although columns can be missing). Notice how
# we subtract one from the primitive indices here already to match cclib's
# way of counting from zero in atombasis.
if '#' in line:
sym = cols[2]
orbital = cols[3]
prims = [int(i) - 1 for i in cols[4:]]
else:
sym = None
orbital = cols[2]
prims = [int(i) - 1 for i in cols[3:]]
shell = orbital[0]
subshell = orbital[1].upper()
iatom = basisatoms[ibasis]
# We want to count the number of contractions already parsed for each orbital,
# but need to make sure to differentiate between atoms and symmetry atoms.
orblabel = str(iatom) + '.' + orbital + (sym or "")
orbitalcount[orblabel] = orbitalcount.get(orblabel, 0) + 1
# Here we construct the actual primitives for gbasis, which should be a list
# of 2-tuples containing an exponent and a coefficient. Note how we are indexing
# self.primitives from zero although the printed numbering starts from one.
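# For example, the first contraction of a given orbital label takes column 1 of each
# primitive entry (column 0 is the exponent), the second contraction takes column 2, and so on.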
primitives = []
for ip in prims:
p = self.primitives[ip]
exponent = p[0]
coefficient = p[orbitalcount[orblabel]]
primitives.append((exponent, coefficient))
contraction = (subshell, primitives)
if contraction not in gbasis[iatom]:
gbasis[iatom].append(contraction)
self.skip_line(inputfile, 'blank')
self.set_attribute('gbasis', gbasis)
# Since DALTON sometimes uses symmetry labels (Ag, Au, etc.) and sometimes
# just the symmetry group index, we need to parse and keep a mapping between
# these two for later use.
#
# Symmetry Orbitals
# -----------------
#
# Number of orbitals in each symmetry: 25 5 25 5
#
#
# Symmetry Ag ( 1)
#
# 1 C 1s 1 + 2
# 2 C 1s 3 + 4
# ...
#
if line.strip() == "Symmetry Orbitals":
self.skip_lines(inputfile, ['d', 'b'])
line = next(inputfile)
self.symcounts = [int(c) for c in line.split(':')[1].split()]
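# e.g. "Number of orbitals in each symmetry:      25    5   25    5" gives symcounts = [25, 5, 25, 5].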
self.symlabels = []
for sc in self.symcounts:
self.skip_lines(inputfile, ['b', 'b'])
# If the number of orbitals for a symmetry is zero, the printout
# is different (see MP2 unittest logfile for an example).
line = next(inputfile)
if sc == 0:
assert "No orbitals in symmetry" in line
else:
assert line.split()[0] == "Symmetry"
self.symlabels.append(line.split()[1])
self.skip_line(inputfile, 'blank')
for i in range(sc):
orbital = next(inputfile)
if "Starting in Wave Function Section (SIRIUS)" in line:
self.section = "SIRIUS"
# Orbital specifications
# ======================
# Abelian symmetry species All | 1 2 3 4
# | Ag Au Bu Bg
# --- | --- --- --- ---
# Total number of orbitals 60 | 25 5 25 5
# Number of basis functions 60 | 25 5 25 5
#
# ** Automatic occupation of RKS orbitals **
#
# -- Initial occupation of symmetries is determined from extended Huckel guess.
# -- Initial occupation of symmetries is :
# @ Occupied SCF orbitals 35 | 15 2 15 3
#
# Maximum number of Fock iterations 0
# Maximum number of DIIS iterations 60
# Maximum number of QC-SCF iterations 60
# Threshold for SCF convergence 1.00D-05
# This is a DFT calculation of type: B3LYP
# ...
#
if "Total number of orbitals" in line:
# DALTON 2015 adds a @ in front of number of orbitals
chomp = line.split()
index = 4
if "@" in chomp:
index = 5
self.set_attribute("nbasis", int(chomp[index]))
self.nmo_per_symmetry = list(map(int, chomp[index+2:]))
assert self.nbasis == sum(self.nmo_per_symmetry)
if "Threshold for SCF convergence" in line:
if not hasattr(self, "scftargets"):
self.scftargets = []
scftarget = self.float(line.split()[-1])
self.scftargets.append([scftarget])
# Wave function specification
# ============================
# @ Wave function type >>> KS-DFT <<<
# @ Number of closed shell electrons 70
# @ Number of electrons in active shells 0
# @ Total charge of the molecule 0
#
# @ Spin multiplicity and 2 M_S 1 0
# @ Total number of symmetries 4 (point group: C2h)
# @ Reference state symmetry 1 (irrep name : Ag )
#
# This is a DFT calculation of type: B3LYP
# ...
#
if line.strip() == "Wave function specification":
self.skip_line(inputfile, 'e')
line = next(inputfile)
# Must be a coupled cluster calculation.
if line.strip() == '':
self.skip_lines(inputfile, ['b', 'Coupled Cluster', 'b'])
else:
assert "wave function" in line.lower()
line = next(inputfile)
assert "Number of closed shell electrons" in line
self.paired_electrons = int(line.split()[-1])
line = next(inputfile)
assert "Number of electrons in active shells" in line
self.unpaired_electrons = int(line.split()[-1])
line = next(inputfile)
assert "Total charge of the molecule" in line
self.set_attribute("charge", int(line.split()[-1]))
self.skip_line(inputfile, 'b')
line = next(inputfile)
assert "Spin multiplicity and 2 M_S" in line
self.set_attribute("mult", int(line.split()[-2]))
# Dalton only has ROHF, no UHF
if self.mult != 1:
self.metadata["unrestricted"] = True
if not hasattr(self, 'homos'):
self.set_attribute('homos', [(self.paired_electrons // 2) - 1])
if self.unpaired_electrons > 0:
self.homos.append(self.homos[0])
self.homos[0] += self.unpaired_electrons
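# Worked example (assumed numbers): 70 closed-shell electrons and 2 electrons in
# active shells would give homos = [36, 34].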
# *********************************************
# ***** DIIS optimization of Hartree-Fock *****
# *********************************************
#
# C1-DIIS algorithm; max error vectors = 8
#
# Automatic occupation of symmetries with 70 electrons.
#
# Iter Total energy Error norm Delta(E) SCF occupation
# -----------------------------------------------------------------------------
# K-S energy, electrons, error : -46.547567739269 69.9999799123 -2.01D-05
# @ 1 -381.645762476 4.00D+00 -3.82D+02 15 2 15 3
# Virial theorem: -V/T = 2.008993
# @ MULPOP C _1 0.15; C _2 0.15; C _1 0.12; C _2 0.12; C _1 0.11; C _2 0.11; H _1 -0.15; H _2 -0.15; H _1 -0.14; H _2 -0.14;
# @ C _1 0.23; C _2 0.23; H _1 -0.15; H _2 -0.15; C _1 0.08; C _2 0.08; H _1 -0.12; H _2 -0.12; H _1 -0.13; H _2 -0.13;
# -----------------------------------------------------------------------------
# K-S energy, electrons, error : -46.647668038900 69.9999810430 -1.90D-05
# @ 2 -381.949410128 1.05D+00 -3.04D-01 15 2 15 3
# Virial theorem: -V/T = 2.013393
# ...
#
# With and without symmetry, the "Total energy" line is shifted a little.
if self.section == "SIRIUS" and "Iter" in line and "Total energy" in line:
iteration = 0
converged = False
values = []
if not hasattr(self, "scfvalues"):
self.scfvalues = []
while not converged:
try:
line = next(inputfile)
except StopIteration:
self.logger.warning('File terminated before end of last SCF!')
break
# each iteration is bracketed by "-------------"
if "-------------------" in line:
iteration += 1
continue
# the first hit of @ n where n is the current iteration
strcompare = "@{0:>3d}".format(iteration)
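# e.g. iteration 1 gives the marker "@  1", which is searched for anywhere in the line.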
if strcompare in line:
temp = line.split()
error_norm = self.float(temp[3])
values.append([error_norm])
if line[0] == "@" and "converged in" in line:
converged = True
# It seems DALTON does change the SCF convergence criteria during a
# geometry optimization, but also does not print them. So, assume they
# are unchanged and copy the initial values after the first step. However,
# it would be good to check up on this - perhaps it is possible to print.
self.scfvalues.append(values)
if len(self.scfvalues) > 1:
self.scftargets.append(self.scftargets[-1])
# DALTON organizes the energies by symmetry, so we need to parse first,
# and then sort the energies (and labels) before we store them.
#
# The formatting varies depending on RHF/DFT and/or version. Here is
# an example from a DFT job:
#
# *** SCF orbital energy analysis ***
#
# Only the five lowest virtual orbital energies printed in each symmetry.
#
# Number of electrons : 70
# Orbital occupations : 15 2 15 3
#
# Sym Kohn-Sham orbital energies
#
# 1 Ag -10.01616533 -10.00394288 -10.00288640 -10.00209612 -9.98818062
# -0.80583154 -0.71422407 -0.58487249 -0.55551093 -0.50630125
# ...
#
# Here is an example from an RHF job that only has symmetry group indices:
#
# *** SCF orbital energy analysis ***
#
# Only the five lowest virtual orbital energies printed in each symmetry.
#
# Number of electrons : 70
# Orbital occupations : 15 2 15 3
#
# Sym Hartree-Fock orbital energies
#
# 1 -11.04052518 -11.03158921 -11.02882211 -11.02858563 -11.01747921
# -1.09029777 -0.97492511 -0.79988247 -0.76282547 -0.69677619
# ...
#
if self.section == "SIRIUS" and "*** SCF orbital energy analysis ***" in line:
# to get ALL orbital energies, the .PRINTLEVELS keyword needs
# to be at least 0,10 (up from 0,5). I know, obvious, right?
# this, however, will conflict with the scfvalues output that
# changes into some weird form of DIIS debug output.
mosyms = []
moenergies = []
self.skip_line(inputfile, 'blank')
line = next(inputfile)
# There is some extra text between the section header and
# the number of electrons for open-shell calculations.
while "Number of electrons" not in line:
line = next(inputfile)
nelectrons = int(line.split()[-1])
line = next(inputfile)
occupations = [int(o) for o in line.split()[3:]]
nsym = len(occupations)
self.skip_lines(inputfile, ['b', 'header', 'b'])
# now parse nsym symmetries
for isym in range(nsym):
# For unoccupied symmetries, nothing is printed here.
if occupations[isym] == 0:
continue
# When there are exactly five energies printed (on just one line), it seems
# an extra blank line is printed after a block.
line = next(inputfile)
if not line.strip():
line = next(inputfile)
cols = line.split()
# The first line has the orbital symmetry information, but sometimes
# it's the label and sometimes it's the index. There are always five
# energies per line, though, so we can deduce whether we have the labels or
# just the index. In the latter case, we depend on the labels
# being read earlier into the list `symlabels`. Finally, if no symlabels
# were read that implies there is only one symmetry, namely Ag.
if 'A' in cols[1] or 'B' in cols[1]:
sym = self.normalisesym(cols[1])
energies = [float(t) for t in cols[2:]]
else:
if hasattr(self, 'symlabels'):
sym = self.normalisesym(self.symlabels[int(cols[0]) - 1])
else:
assert cols[0] == '1'
sym = "Ag"
energies = [float(t) for t in cols[1:]]
while len(energies) > 0:
moenergies.extend(energies)
mosyms.extend(len(energies)*[sym])
line = next(inputfile)
energies = [float(col) for col in line.split()]
# now sort the data about energies and symmetries. see the following post for the magic
# http://stackoverflow.com/questions/19339/a-transpose-unzip-function-in-python-inverse-of-zip
sdata = sorted(zip(moenergies, mosyms), key=lambda x: x[0])
moenergies, mosyms = zip(*sdata)
self.moenergies = [[]]
self.moenergies[0] = [utils.convertor(moenergy, 'hartree', 'eV') for moenergy in moenergies]
self.mosyms = [[]]
self.mosyms[0] = mosyms
if not hasattr(self, "nmo"):
self.nmo = self.nbasis
if len(self.moenergies[0]) != self.nmo:
self.set_attribute('nmo', len(self.moenergies[0]))
# .-----------------------------------.
# | >>> Final results from SIRIUS <<< |
# `-----------------------------------'
#
#
# @ Spin multiplicity: 1
# @ Spatial symmetry: 1 ( irrep Ag in C2h )
# @ Total charge of molecule: 0
#
# @ Final DFT energy: -382.050716652387
# @ Nuclear repulsion: 445.936979976608
# @ Electronic energy: -827.987696628995
#
# @ Final gradient norm: 0.000003746706
# ...
#
if "Final HF energy" in line and not (hasattr(self, "mpenergies") or hasattr(self, "ccenergies")):
self.metadata["methods"].append("HF")
if "Final DFT energy" in line:
self.metadata["methods"].append("DFT")
if "This is a DFT calculation of type" in line:
self.metadata["functional"] = line.split()[-1]
if "Final DFT energy" in line or "Final HF energy" in line:
if not hasattr(self, "scfenergies"):
self.scfenergies = []
temp = line.split()
self.scfenergies.append(utils.convertor(float(temp[-1]), "hartree", "eV"))
if "@ = MP2 second order energy" in line:
self.metadata["methods"].append("MP2")
energ = utils.convertor(float(line.split()[-1]), 'hartree', 'eV')
if not hasattr(self, "mpenergies"):
self.mpenergies = []
self.mpenergies.append([])
self.mpenergies[-1].append(energ)
if "Total CCSD energy:" in line:
self.metadata["methods"].append("CCSD")
energ = utils.convertor(float(line.split()[-1]), 'hartree', 'eV')
if not hasattr(self, "ccenergies"):
self.ccenergies = []
self.ccenergies.append(energ)
if "Total energy CCSD(T)" in line:
self.metadata["methods"].append("CCSD(T)")
energ = utils.convertor(float(line.split()[-1]), 'hartree', 'eV')
if not hasattr(self, "ccenergies"):
self.ccenergies = []
self.ccenergies.append(energ)
# The molecular geometry requires the use of .RUN PROPERTIES in the input.
# Note that the second column is not the nuclear charge, but the atom type
# index used internally by DALTON.
#
# Molecular geometry (au)
# -----------------------
#
# C _1 1.3498778652 2.3494125195 0.0000000000
# C _2 -1.3498778652 -2.3494125195 0.0000000000
# C _1 2.6543517307 0.0000000000 0.0000000000
# ...
#
if "Molecular geometry (au)" in line:
if not hasattr(self, "atomcoords"):
self.atomcoords = []
if self.firststdorient:
self.firststdorient = False
self.skip_lines(inputfile, ['d', 'b'])
lines = [next(inputfile) for i in range(self.natom)]
atomcoords = self.parse_geometry(lines)
self.atomcoords.append(atomcoords)
if "Optimization Control Center" in line:
self.section = "OPT"
assert set(next(inputfile).strip()) == set(":")
# During geometry optimizations the geometry is printed in the section
# that is titled "Optimization Control Center". Note that after an optimization
# finishes, DALTON normally runs another "static property section (ABACUS)",
# so the final geometry will be repeated in atomcoords.
#
# Next geometry (au)
# ------------------
#
# C _1 1.3203201560 2.3174808341 0.0000000000
# C _2 -1.3203201560 -2.3174808341 0.0000000000
# ...
if self.section == "OPT" and line.strip() == "Next geometry (au)":
self.skip_lines(inputfile, ['d', 'b'])
lines = [next(inputfile) for i in range(self.natom)]
coords = self.parse_geometry(lines)
self.atomcoords.append(coords)
# This section contains data for optdone and geovalues, although we could use
# it to double check some attributes that were parsed before.
#
# Optimization information
# ------------------------
#
# Iteration number : 4
# End of optimization : T
# Energy at this geometry is : -379.777956
# Energy change from last geom. : -0.000000
# Predicted change : -0.000000
# Ratio, actual/predicted change : 0.952994
# Norm of gradient : 0.000058
# Norm of step : 0.000643
# Updated trust radius : 0.714097
# Total Hessian index : 0
#
if self.section == "OPT" and line.strip() == "Optimization information":
self.skip_lines(inputfile, ['d', 'b'])
line = next(inputfile)
assert 'Iteration number' in line
iteration = int(line.split()[-1])
line = next(inputfile)
assert 'End of optimization' in line
if not hasattr(self, 'optdone'):
self.optdone = []
self.optdone.append(line.split()[-1] == 'T')
# We need a way to map between lines here and the targets stated at the
# beginning of the file in 'Chosen parameters for *OPTIMI' (see above),
# and this dictionary facilitates that. The keys are target names parsed
# in that initial section after input processing, and the values are
# substrings that should appear in the lines in this section. Make an
# exception for the energy at iteration zero where there is no gradient,
# and take the total energy for geovalues.
targets_labels = {
'gradient': 'Norm of gradient',
'energy': 'Energy change from last',
'step': 'Norm of step',
}
values = [numpy.nan] * len(self.geotargets)
while line.strip():
if iteration == 0 and "Energy at this geometry" in line:
index = self.geotargets_names.index('energy')
values[index] = self.float(line.split()[-1])
for tgt, lbl in targets_labels.items():
if lbl in line and tgt in self.geotargets_names:
index = self.geotargets_names.index(tgt)
values[index] = self.float(line.split()[-1])
line = next(inputfile)
# If we're missing something above, throw away the partial geovalues since
# we don't want artificial NaNs getting into cclib. Instead, fix the dictionary
# to make things work.
if numpy.nan not in values:
if not hasattr(self, 'geovalues'):
self.geovalues = []
self.geovalues.append(values)
# -------------------------------------------------
# extract the center of mass line
if "Center-of-mass coordinates (a.u.):" in line:
temp = line.split()
reference = [utils.convertor(float(temp[i]), "bohr", "Angstrom") for i in [3, 4, 5]]
if not hasattr(self, 'moments'):
self.moments = [reference]
# -------------------------------------------------
# Extract the dipole moment
if "Dipole moment components" in line:
dipole = numpy.zeros(3)
line = next(inputfile)
line = next(inputfile)
line = next(inputfile)
if not "zero by symmetry" in line:
line = next(inputfile)
line = next(inputfile)
temp = line.split()
for i in range(3):
dipole[i] = float(temp[2]) # store the Debye value
if hasattr(self, 'moments'):
self.moments.append(dipole)
## 'vibfreqs', 'vibirs', and 'vibsyms' appear in ABACUS.
# Vibrational Frequencies and IR Intensities
# ------------------------------------------
#
# mode irrep frequency IR intensity
# ============================================================
# cm-1 hartrees km/mol (D/A)**2/amu
# ------------------------------------------------------------
# 1 A 3546.72 0.016160 0.000 0.0000
# 2 A 3546.67 0.016160 0.024 0.0006
# ...
if "Vibrational Frequencies and IR Intensities" in line:
self.skip_lines(inputfile, ['dashes', 'blank'])
line = next(inputfile)
assert line.strip() == "mode irrep frequency IR intensity"
self.skip_line(inputfile, 'equals')
line = next(inputfile)
assert line.strip() == "cm-1 hartrees km/mol (D/A)**2/amu"
self.skip_line(inputfile, 'dashes')
line = next(inputfile)
# The normal modes are in order of decreasing IR
# frequency, so they can't be added directly to
# attributes; they must be grouped together first, sorted
# in order of increasing frequency, then added to their
# respective attributes.
vibdata = []
while line.strip():
sline = line.split()
vibsym = sline[1]
vibfreq = float(sline[2])
vibir = float(sline[4])
vibdata.append((vibfreq, vibir, vibsym))
line = next(inputfile)
vibdata.sort(key=lambda normalmode: normalmode[0])
self.vibfreqs = [normalmode[0] for normalmode in vibdata]
self.vibirs = [normalmode[1] for normalmode in vibdata]
self.vibsyms = [normalmode[2] for normalmode in vibdata]
# Now extract the normal mode displacements.
self.skip_lines(inputfile, ['b', 'b'])
line = next(inputfile)
assert line.strip() == "Normal Coordinates (bohrs*amu**(1/2)):"
# Normal Coordinates (bohrs*amu**(1/2)):
# --------------------------------------
#
#
# 1 3547 2 3547 3 3474 4 3471 5 3451
# ----------------------------------------------------------------------
#
# C x -0.000319 -0.000314 0.002038 0.000003 -0.001599
# C y -0.000158 -0.000150 -0.001446 0.003719 -0.002576
# C z 0.000000 -0.000000 -0.000000 0.000000 -0.000000
#
# C x 0.000319 -0.000315 -0.002038 0.000003 0.001600
# C y 0.000157 -0.000150 0.001448 0.003717 0.002577
# ...
self.skip_line(inputfile, 'd')
line = next(inputfile)
vibdisps = numpy.empty(shape=(len(self.vibirs), self.natom, 3))
ndisps = 0
while ndisps < len(self.vibirs):
# Skip two blank lines.
line = next(inputfile)
line = next(inputfile)
# Use the header with the normal mode indices and
# frequencies to update where we are.
ndisps_block = (len(line.split()) // 2)
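# e.g. the header "1 3547   2 3547   3 3474   4 3471   5 3451" shown above has
# 10 tokens, i.e. 5 modes in this block.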
mode_min, mode_max = ndisps, ndisps + ndisps_block
# Skip a line of dashes and a blank line.
line = next(inputfile)
line = next(inputfile)
for w in range(self.natom):
for coord in range(3):
line = next(inputfile)
vibdisps[mode_min:mode_max, w, coord] = [float(i) for i in line.split()[2:]]
# Skip a blank line.
line = next(inputfile)
ndisps += ndisps_block
# The vibrational displacements are in the wrong order;
# reverse them.
self.vibdisps = vibdisps[::-1, :, :]
## 'vibramans'
# Raman related properties for freq. 0.000000 au = Infinity nm
# ---------------------------------------------------------------
#
# Mode Freq. Alpha**2 Beta(a)**2 Pol.Int. Depol.Int. Dep. Ratio
#
# 1 3546.72 0.379364 16.900089 84.671721 50.700268 0.598786
# 2 3546.67 0.000000 0.000000 0.000000 0.000000 0.599550
if "Raman related properties for freq." in line:
self.skip_lines(inputfile, ['d', 'b'])
line = next(inputfile)
assert line[1:76] == "Mode Freq. Alpha**2 Beta(a)**2 Pol.Int. Depol.Int. Dep. Ratio"
self.skip_line(inputfile, 'b')
line = next(inputfile)
vibramans = []
# The Raman intensities appear under the "Pol.Int."
# (polarization intensity) column.
for m in range(len(self.vibfreqs)):
vibramans.append(float(line.split()[4]))
line = next(inputfile)
# All vibrational properties in DALTON appear in reverse
# order.
self.vibramans = vibramans[::-1]
# Static polarizability from **PROPERTIES/.POLARI.
if line.strip() == "Static polarizabilities (au)":
if not hasattr(self, 'polarizabilities'):
self.polarizabilities = []
polarizability = []
self.skip_lines(inputfile, ['d', 'b', 'directions', 'b'])
for _ in range(3):
line = next(inputfile)
# Separate possibly unspaced huge negative polarizability tensor
# element and the left adjacent column from each other.
line = line.replace('-', ' -')
polarizability.append(line.split()[1:])
self.polarizabilities.append(numpy.array(polarizability))
# Static and dynamic polarizability from **PROPERTIES/.ALPHA/*ABALNR.
if "Polarizability tensor for frequency" in line:
if not hasattr(self, 'polarizabilities'):
self.polarizabilities = []
polarizability = []
self.skip_lines(inputfile, ['d', 'directions', 'b'])
for _ in range(3):
line = next(inputfile)
polarizability.append(line.split()[1:])
self.polarizabilities.append(numpy.array(polarizability))
if "Starting in Dynamic Property Section (RESPONS)" in line:
self.section = "RESPONSE"
# Static and dynamic polarizability from **RESPONSE/*LINEAR.
# This section is *very* general and will need to be expanded later.
# For now, only form the matrix from dipole (length gauge) values.
if "@ FREQUENCY INDEPENDENT SECOND ORDER PROPERTIES" in line:
coord_to_idx = {'X': 0, 'Y': 1, 'Z': 2}
self.skip_line(inputfile, 'b')
line = next(inputfile)
polarizability_diplen = numpy.empty(shape=(3, 3)) * numpy.nan
while "Time used in linear response calculation is" not in line:
tokens = line.split()
if line.count("DIPLEN") == 2:
assert len(tokens) == 8
if not hasattr(self, 'polarizabilities'):
self.polarizabilities = []
i, j = coord_to_idx[tokens[2][0]], coord_to_idx[tokens[4][0]]
polarizability_diplen[i, j] = self.float(tokens[7])
line = next(inputfile)
polarizability_diplen = utils.symmetrize(polarizability_diplen, use_triangle='upper')
if hasattr(self, 'polarizabilities'):
self.polarizabilities.append(polarizability_diplen)
## Electronic excitations: single residues of the linear response
## equations.
#
#
# @ Excited state no: 1 in symmetry 3 ( Bu )
# ----------------------------------------------
#
# @ Excitation energy : 0.19609400 au
# @ 5.3359892 eV; 43037.658 cm-1; 514.84472 kJ / mol
#
# @ Total energy : -381.85462 au
#
# @ Operator type: XDIPLEN
# @ Oscillator strength (LENGTH) : 8.93558787E-03 (Transition moment : 0.26144181 )
#
# @ Operator type: YDIPLEN
# @ Oscillator strength (LENGTH) : 0.15204812 (Transition moment : 1.0784599 )
#
# Eigenvector for state no. 1
#
# Response orbital operator symmetry = 3
# (only scaled elements abs greater than 10.00 % of max abs value)
#
# Index(r,s) r s (r s) operator (s r) operator (r s) scaled (s r) scaled
# ---------- ----- ----- -------------- -------------- -------------- --------------
# 308 57(4) 28(2) 0.4829593728 -0.0024872024 0.6830076950 -0.0035174354
# ...
if "Linear Response single residue calculation" in line:
etsyms = []
etenergies = []
etsecs = []
etoscs = dict()
etoscs_keys = set()
symmap = {"T": "Triplet", "F": "Singlet"}
while "End of Dynamic Property Section (RESPONS)" not in line:
line = next(inputfile)
if "Operator symmetry" in line:
do_triplet = line[-2]
# @ Excited state no: 4 in symmetry 2 ( Au )
if line.startswith(" @ Excited state no:"):
tokens = line.split()
excited_state_num_in_sym = int(tokens[4])
sym_num = int(tokens[7])
etosc_key = (sym_num, excited_state_num_in_sym)
etoscs_keys.add(etosc_key)
etsym = tokens[9]
etsyms.append(symmap[do_triplet] + "-" + etsym)
self.skip_lines(inputfile, ["d", "b", "Excitation energy in a.u."])
line = next(inputfile)
etenergies.append(self.float(line.split()[3]))
self.skip_lines(inputfile, ["b", "@ Total energy", "b"])
if line.startswith("@ Operator type:"):
line = next(inputfile)
assert line.startswith("@ Oscillator strength")
if etosc_key not in etoscs:
etoscs[etosc_key] = 0.0
etoscs[etosc_key] += self.float(line.split()[5])
self.skip_line(inputfile, "b")
# To understand why the "PBHT MO Overlap Diagnostic" section
# cannot be used, see
# `test/regression.py/testDALTON_DALTON_2013_dvb_td_normalprint_out`.
if "Eigenvector for state no." in line:
assert int(line.split()[4]) == excited_state_num_in_sym
self.skip_lines(inputfile, [
"b",
"Response orbital operator symmetry",
"only scaled elements",
"b",
"Index(r,s)",
"d"
])
line = next(inputfile)
etsec = []
while line.strip():
tokens = line.split()
startidx = int(tokens[1].split("(")[0]) - 1
endidx = int(tokens[2].split("(")[0]) - 1
# `(r s) scaled`; to handle anything other than
# CIS/TDA properly, the deexcitation coefficient `(s
# r) scaled` should also be considered, but this
# requires a rework of the attribute structure.
contrib = float(tokens[5])
# Since DALTON is restricted open-shell only, there is
# no distinction between alpha and beta spin.
etsec.append([(startidx, 0), (endidx, 0), contrib])
line = next(inputfile)
etsecs.append(etsec)
self.set_attribute("etsyms", etsyms)
self.set_attribute("etenergies", etenergies)
if etsecs:
self.set_attribute("etsecs", etsecs)
if etoscs:
for k in etoscs_keys:
# If the oscillator strength of a transition is known to
                    # be zero for symmetry reasons, it isn't printed; however,
                    # we need it for consistency, so add it if it wasn't found.
if k not in etoscs:
etoscs[k] = 0.0
# `.keys()` is not strictly necessary, but make it obvious
# that this is being sorted in order of excitation and
# symmetry, not oscillator strength.
self.set_attribute("etoscs", [etoscs[k] for k in sorted(etoscs.keys())])
if line[:37] == ' >>>> Total wall time used in DALTON:':
self.metadata['success'] = True
# TODO:
# aonames
# aooverlaps
# atomcharges
# atomspins
# coreelectrons
# enthalpy
# entropy
# etrotats
# freeenergy
# grads
# hessian
# mocoeffs
# nocoeffs
# nooccnos
# scancoords
# scanenergies
# scannames
# scanparm
# temperature
# vibanharms
# N/A:
# fonames
# fooverlaps
# fragnames
# frags
cclib-1.6.2/cclib/parser/data.py 0000664 0000000 0000000 00000053610 13535330462 0016440 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2019, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Classes and tools for storing and handling parsed data"""
import logging
from collections import namedtuple
import numpy
from cclib.method import Electrons
from cclib.method import orbitals
Attribute = namedtuple('Attribute', ['type', 'json_key', 'attribute_path'])
class ccData(object):
"""Stores data extracted by cclib parsers
Description of cclib attributes:
aonames -- atomic orbital names (list of strings)
aooverlaps -- atomic orbital overlap matrix (array[2])
atombasis -- indices of atomic orbitals on each atom (list of lists)
atomcharges -- atomic partial charges (dict of arrays[1])
atomcoords -- atom coordinates (array[3], angstroms)
atommasses -- atom masses (array[1], daltons)
atomnos -- atomic numbers (array[1])
atomspins -- atomic spin densities (dict of arrays[1])
ccenergies -- molecular energies with Coupled-Cluster corrections (array[2], eV)
charge -- net charge of the system (integer)
coreelectrons -- number of core electrons in atom pseudopotentials (array[1])
enthalpy -- sum of electronic and thermal enthalpies (float, hartree/particle)
entropy -- entropy (float, hartree/particle)
etenergies -- energies of electronic transitions (array[1], 1/cm)
etoscs -- oscillator strengths of electronic transitions (array[1])
etrotats -- rotatory strengths of electronic transitions (array[1], ??)
etsecs -- singly-excited configurations for electronic transitions (list of lists)
etsyms -- symmetries of electronic transitions (list of string)
freeenergy -- sum of electronic and thermal free energies (float, hartree/particle)
fonames -- fragment orbital names (list of strings)
fooverlaps -- fragment orbital overlap matrix (array[2])
fragnames -- names of fragments (list of strings)
frags -- indices of atoms in a fragment (list of lists)
gbasis -- coefficients and exponents of Gaussian basis functions (PyQuante format)
geotargets -- targets for convergence of geometry optimization (array[1])
        geovalues -- current values for convergence of geometry optimization (array[1])
grads -- current values of forces (gradients) in geometry optimization (array[3])
hessian -- elements of the force constant matrix (array[1])
homos -- molecular orbital indices of HOMO(s) (array[1])
metadata -- various metadata about the package and computation (dict)
mocoeffs -- molecular orbital coefficients (list of arrays[2])
moenergies -- molecular orbital energies (list of arrays[1], eV)
moments -- molecular multipole moments (list of arrays[], a.u.)
mosyms -- orbital symmetries (list of lists)
mpenergies -- molecular electronic energies with Møller-Plesset corrections (array[2], eV)
mult -- multiplicity of the system (integer)
natom -- number of atoms (integer)
nbasis -- number of basis functions (integer)
nmo -- number of molecular orbitals (integer)
nocoeffs -- natural orbital coefficients (array[2])
nooccnos -- natural orbital occupation numbers (array[1])
nsocoeffs -- natural spin orbital coefficients (list of array[2])
nsooccnos -- natural spin orbital occupation numbers (list of array[1])
optdone -- flags whether an optimization has converged (Boolean)
optstatus -- optimization status for each set of atomic coordinates (array[1])
polarizabilities -- (dipole) polarizabilities, static or dynamic (list of arrays[2])
        pressure -- pressure used for Thermochemistry (float, atm)
scancoords -- geometries of each scan step (array[3], angstroms)
scanenergies -- energies of potential energy surface (list)
        scannames -- names of variables scanned (list of strings)
scanparm -- values of parameters in potential energy surface (list of tuples)
scfenergies -- molecular electronic energies after SCF (Hartree-Fock, DFT) (array[1], eV)
scftargets -- targets for convergence of the SCF (array[2])
scfvalues -- current values for convergence of the SCF (list of arrays[2])
temperature -- temperature used for Thermochemistry (float, kelvin)
time -- time in molecular dynamics and other trajectories (array[1], fs)
transprop -- all absorption and emission spectra (dictionary {name:(etenergies, etoscs)})
WARNING: this attribute is not standardized and is liable to change in cclib 2.0
vibanharms -- vibrational anharmonicity constants (array[2], 1/cm)
vibdisps -- cartesian displacement vectors (array[3], delta angstrom)
vibfreqs -- vibrational frequencies (array[1], 1/cm)
vibirs -- IR intensities (array[1], km/mol)
vibramans -- Raman intensities (array[1], A^4/Da)
vibsyms -- symmetries of vibrations (list of strings)
(1) The term 'array' refers to a numpy array
(2) The number of dimensions of an array is given in square brackets
(3) Python indexes arrays/lists starting at zero, so if homos==[10], then
the 11th molecular orbital is the HOMO
"""
# The expected types for all supported attributes.
# The json_key is the key name used for attributes in the CJSON/JSON format
    # 'TBD' (To Be Decided) is used for attributes that have not yet been assigned a key in the CJSON format
_attributes = {
"aonames": Attribute(list, 'names', 'atoms:orbitals'),
"aooverlaps": Attribute(numpy.ndarray, 'overlaps', 'properties:orbitals'),
"atombasis": Attribute(list, 'indices', 'atoms:orbitals'),
"atomcharges": Attribute(dict, 'partial charges', 'properties'),
"atomcoords": Attribute(numpy.ndarray, 'coords', 'atoms:coords:3d'),
"atommasses": Attribute(numpy.ndarray, 'mass', 'atoms'),
"atomnos": Attribute(numpy.ndarray, 'number', 'atoms:elements'),
"atomspins": Attribute(dict, 'spins', 'atoms'),
"ccenergies": Attribute(numpy.ndarray, 'coupled cluster', 'properties:energy'),
"charge": Attribute(int, 'charge', 'properties'),
"coreelectrons": Attribute(numpy.ndarray, 'core electrons', 'atoms'),
"enthalpy": Attribute(float, 'enthalpy', 'properties'),
"entropy": Attribute(float, 'entropy', 'properties'),
"etenergies": Attribute(numpy.ndarray, 'electronic transitions', 'transitions'),
"etoscs": Attribute(numpy.ndarray, 'oscillator strength', 'transitions'),
"etrotats": Attribute(numpy.ndarray, 'rotatory strength', 'transitions'),
"etsecs": Attribute(list, 'one excited config', 'transitions'),
"etsyms": Attribute(list, 'symmetry', 'transitions'),
"freeenergy": Attribute(float, 'free energy', 'properties:energy'),
"fonames": Attribute(list, 'orbital names', 'fragments'),
"fooverlaps": Attribute(numpy.ndarray, 'orbital overlap', 'fragments'),
"fragnames": Attribute(list, 'fragment names', 'fragments'),
"frags": Attribute(list, 'atom indices', 'fragments'),
"gbasis": Attribute(list, 'basis functions', 'atoms:orbitals'),
"geotargets": Attribute(numpy.ndarray, 'geometric targets', 'optimization'),
"geovalues": Attribute(numpy.ndarray, 'geometric values', 'optimization'),
"grads": Attribute(numpy.ndarray, 'TBD', 'N/A'),
"hessian": Attribute(numpy.ndarray, 'hessian matrix', 'vibrations'),
"homos": Attribute(numpy.ndarray, 'homos', 'properties:orbitals'),
"metadata": Attribute(dict, 'TBD', 'N/A'),
"mocoeffs": Attribute(list, 'coeffs', 'properties:orbitals'),
"moenergies": Attribute(list, 'energies', 'properties:orbitals'),
"moments": Attribute(list, 'total dipole moment', 'properties'),
"mosyms": Attribute(list, 'molecular orbital symmetry', 'properties:orbitals'),
"mpenergies": Attribute(numpy.ndarray, 'moller plesset', 'properties:energy'),
"mult": Attribute(int, 'multiplicity', 'properties'),
"natom": Attribute(int, 'number of atoms', 'properties'),
"nbasis": Attribute(int, 'basis number', 'properties:orbitals'),
"nmo": Attribute(int, 'MO number', 'properties:orbitals'),
"nocoeffs": Attribute(numpy.ndarray, 'TBD', 'N/A'),
"nooccnos": Attribute(numpy.ndarray, 'TBD', 'N/A'),
"nsocoeffs": Attribute(list, 'TBD', 'N/A'),
"nsooccnos": Attribute(list, 'TBD', 'N/A'),
"optdone": Attribute(list, 'done', 'optimization'),
"optstatus": Attribute(numpy.ndarray, 'status', 'optimization'),
"polarizabilities": Attribute(list, 'polarizabilities', 'N/A'),
"pressure": Attribute(float, 'pressure', 'properties'),
"scancoords": Attribute(numpy.ndarray, 'step geometry', 'optimization:scan'),
"scanenergies": Attribute(list, 'PES energies', 'optimization:scan'),
"scannames": Attribute(list, 'variable names', 'optimization:scan'),
"scanparm": Attribute(list, 'PES parameter values', 'optimization:scan'),
"scfenergies": Attribute(numpy.ndarray, 'scf energies', 'optimization:scf'),
"scftargets": Attribute(numpy.ndarray, 'targets', 'optimization:scf'),
"scfvalues": Attribute(list, 'values', 'optimization:scf'),
"temperature": Attribute(float, 'temperature', 'properties'),
"time": Attribute(numpy.ndarray, 'time', 'N/A'),
"transprop": Attribute(dict, 'electronic transitions', 'transitions'),
"vibanharms": Attribute(numpy.ndarray, 'anharmonicity constants', 'vibrations'),
"vibdisps": Attribute(numpy.ndarray, 'displacement', 'vibrations'),
"vibfreqs": Attribute(numpy.ndarray, 'frequencies', 'vibrations'),
"vibirs": Attribute(numpy.ndarray, 'IR', 'vibrations:intensities'),
"vibramans": Attribute(numpy.ndarray, 'raman', 'vibrations:intensities'),
"vibsyms": Attribute(list, 'vibration symmetry', 'vibrations')
}
# The name of all attributes can be generated from the dictionary above.
_attrlist = sorted(_attributes.keys())
# Arrays are double precision by default, but these will be integer arrays.
_intarrays = ['atomnos', 'coreelectrons', 'homos', 'optstatus']
# Attributes that should be lists of arrays (double precision).
_listsofarrays = ['mocoeffs', 'moenergies', 'moments', 'polarizabilities', 'scfvalues']
# Attributes that should be dictionaries of arrays (double precision).
_dictsofarrays = ["atomcharges", "atomspins"]
# Possible statuses for optimization steps.
# OPT_UNKNOWN is the default and means optimization is in progress.
# OPT_NEW is set for every new optimization (e.g. PES, IRCs, etc.)
# OPT_DONE is set for the last step of an optimisation that converged.
    # OPT_UNCONVERGED is set for every unconverged step (i.e. it should be mutually exclusive with OPT_DONE).
    # The bit-value notation allows encoding multiple states at once, e.g. OPT_NEW and OPT_UNCONVERGED, or OPT_NEW and OPT_DONE.
OPT_UNKNOWN = 0b000
OPT_NEW = 0b001
OPT_UNCONVERGED = 0b010
OPT_DONE = 0b100
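    # A short illustration of how these flags combine (hypothetical values):
    #   status = ccData.OPT_NEW | ccData.OPT_UNCONVERGED   # 0b011
    #   bool(status & ccData.OPT_NEW)    # True
    #   bool(status & ccData.OPT_DONE)   # False
    # The *_geometries properties below test optstatus with the same bitwise checks.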
def __init__(self, attributes={}):
"""Initialize the cclibData object.
Normally called in the parse() method of a Logfile subclass.
Inputs:
attributes - optional dictionary of attributes to load as data
"""
if attributes:
self.setattributes(attributes)
def listify(self):
"""Converts all attributes that are arrays or lists/dicts of arrays to lists."""
attrlist = [k for k in self._attrlist if hasattr(self, k)]
for k in attrlist:
v = self._attributes[k].type
if v == numpy.ndarray:
setattr(self, k, getattr(self, k).tolist())
elif v == list and k in self._listsofarrays:
setattr(self, k, [x.tolist() for x in getattr(self, k)])
elif v == dict and k in self._dictsofarrays:
items = getattr(self, k).items()
pairs = [(key, val.tolist()) for key, val in items]
setattr(self, k, dict(pairs))
def arrayify(self):
"""Converts appropriate attributes to arrays or lists/dicts of arrays."""
attrlist = [k for k in self._attrlist if hasattr(self, k)]
for k in attrlist:
v = self._attributes[k].type
precision = 'd'
if k in self._intarrays:
precision = 'i'
if v == numpy.ndarray:
setattr(self, k, numpy.array(getattr(self, k), precision))
elif v == list and k in self._listsofarrays:
setattr(self, k, [numpy.array(x, precision) for x in getattr(self, k)])
elif v == dict and k in self._dictsofarrays:
items = getattr(self, k).items()
pairs = [(key, numpy.array(val, precision)) for key, val in items]
setattr(self, k, dict(pairs))
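    # For illustration (assuming a populated instance named data): after
    # arrayify(), data.atomnos is an integer numpy array and data.moenergies
    # is a list of double-precision arrays, while listify() converts them
    # back to plain Python lists, e.g. for serialization.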
def getattributes(self, tolists=False):
"""Returns a dictionary of existing data attributes.
Inputs:
tolists - flag to convert attributes to lists where applicable
"""
if tolists:
self.listify()
attributes = {}
for attr in self._attrlist:
if hasattr(self, attr):
attributes[attr] = getattr(self, attr)
if tolists:
self.arrayify()
return attributes
def setattributes(self, attributes):
"""Sets data attributes given in a dictionary.
Inputs:
attributes - dictionary of attributes to set
Outputs:
invalid - list of attributes names that were not set, which
means they are not specified in self._attrlist
"""
if type(attributes) is not dict:
raise TypeError("attributes must be in a dictionary")
valid = [a for a in attributes if a in self._attrlist]
invalid = [a for a in attributes if a not in self._attrlist]
for attr in valid:
setattr(self, attr, attributes[attr])
self.arrayify()
self.typecheck()
return invalid
def typecheck(self):
"""Check the types of all attributes.
If an attribute does not match the expected type, then attempt to
convert; if that fails, only then raise a TypeError.
"""
self.arrayify()
for attr in [a for a in self._attrlist if hasattr(self, a)]:
val = getattr(self, attr)
if type(val) == self._attributes[attr].type:
continue
try:
val = self._attributes[attr].type(val)
except ValueError:
args = (attr, type(val), self._attributes[attr].type)
raise TypeError("attribute %s is %s instead of %s and could not be converted" % args)
def check_values(self, logger=logging):
"""Perform custom checks on the values of attributes."""
if hasattr(self, "etenergies") and any(e < 0 for e in self.etenergies):
negative_values = [e for e in self.etenergies if e < 0]
msg = ("At least one excitation energy is negative. "
"\nNegative values: %s\nFull etenergies: %s"
% (negative_values, self.etenergies))
logger.error(msg)
def write(self, filename=None, indices=None, *args, **kwargs):
"""Write parsed attributes to a file.
Possible extensions:
.cjson or .json - output a chemical JSON file
.cml - output a chemical markup language (CML) file
.xyz - output a Cartesian XYZ file of the last coordinates available
"""
from cclib.io import ccwrite
outputstr = ccwrite(self, outputdest=filename, indices=indices,
*args, **kwargs)
return outputstr
def writejson(self, filename=None, indices=None):
"""Write parsed attributes to a JSON file."""
return self.write(filename=filename, indices=indices,
outputtype='cjson')
def writecml(self, filename=None, indices=None):
"""Write parsed attributes to a CML file."""
return self.write(filename=filename, indices=indices,
outputtype='cml')
def writexyz(self, filename=None, indices=None):
"""Write parsed attributes to an XML file."""
return self.write(filename=filename, indices=indices,
outputtype='xyz')
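    # A minimal usage sketch (assuming a parsed ccData instance named data and
    # a writable working directory):
    #   data.writexyz("molecule.xyz")     # last geometry as Cartesian XYZ
    #   data.writejson("molecule.cjson")  # chemical JSON
    # Both are thin wrappers that delegate to write() and hence ccwrite().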
@property
def converged_geometries(self):
"""
Return all converged geometries.
An array containing only the converged geometries, e.g.:
- For PES or IRCs, return all geometries for which optstatus matches OPT_DONE
- The converged geometry for simple optimisations
- The input geometry for single points
"""
if hasattr(self, 'optstatus'):
converged_indexes = [x for x, y in enumerate(self.optstatus) if y & self.OPT_DONE > 0]
return self.atomcoords[converged_indexes]
else:
return self.atomcoords
@property
def new_geometries(self):
"""
Return all starting geometries.
An array containing only the starting geometries, e.g.:
- For PES or IRCs, return all geometries for which optstatus matches OPT_NEW
- The input geometry for simple optimisations or single points
"""
if hasattr(self, 'optstatus'):
new_indexes = [x for x, y in enumerate(self.optstatus) if y & self.OPT_NEW > 0]
return self.atomcoords[new_indexes]
else:
return self.atomcoords
@property
def unknown_geometries(self):
"""
Return all OPT_UNKNOWN geometries.
        An array containing only the OPT_UNKNOWN geometries, e.g.:
- For PES or IRCs, return all geometries for which optstatus matches OPT_UNKNOWN
- The input geometry for simple optimisations or single points
"""
if hasattr(self, 'optstatus'):
unknown_indexes = [x for x, y in enumerate(self.optstatus) if y == self.OPT_UNKNOWN]
return self.atomcoords[unknown_indexes]
else:
return self.atomcoords
@property
def unconverged_geometries(self):
"""
Return all unconverged geometries.
        An array containing only the unconverged geometries, e.g.:
- For PES or IRCs, return all geometries for which optstatus matches OPT_UNCONVERGED
- The input geometry for simple optimisations or single points
"""
if hasattr(self, 'optstatus'):
unconverged_indexes = [x for x, y in enumerate(self.optstatus) if y & self.OPT_UNCONVERGED > 0]
return self.atomcoords[unconverged_indexes]
else:
return self.atomcoords
@property
def nelectrons(self):
return Electrons(self).count()
@property
def closed_shell(self):
return orbitals.Orbitals(self).closed_shell()
class ccData_optdone_bool(ccData):
"""This is the version of ccData where optdone is a Boolean."""
def __init__(self, *args, **kwargs):
super(ccData_optdone_bool, self).__init__(*args, **kwargs)
self._attributes["optdone"] = Attribute(bool, 'done', 'optimization')
def setattributes(self, *args, **kwargs):
invalid = super(ccData_optdone_bool, self).setattributes(*args, **kwargs)
# Reduce optdone to a Boolean, because it will be parsed as a list. If this list has any element,
# it means that there was an optimized structure and optdone should be True.
if hasattr(self, 'optdone'):
self.optdone = len(self.optdone) > 0
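        # For example (hypothetical parse): a parser that stored optdone = [12]
        # (the index of the converged step) leaves optdone == True here, while
        # an empty list collapses to False.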
cclib-1.6.2/cclib/parser/gamessparser.py 0000664 0000000 0000000 00000202512 13535330462 0020220 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2018, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Parser for GAMESS(US) output files"""
from __future__ import print_function
import re
import numpy
from cclib.parser import logfileparser
from cclib.parser import utils
class GAMESS(logfileparser.Logfile):
"""A GAMESS/Firefly log file."""
# Used to index self.scftargets[].
SCFRMS, SCFMAX, SCFENERGY = list(range(3))
    # Used to extract Dunning basis set names.
dunningbas = {'CCD': 'cc-pVDZ', \
'CCT': 'cc-pVTZ', \
'CCQ': 'cc-pVQZ', \
'CC5': 'cc-pV5Z', \
'CC6': 'cc-pV6Z', \
'ACCD': 'aug-cc-pVDZ', \
'ACCT': 'aug-cc-pVTZ', \
'ACCQ': 'aug-cc-pVQZ', \
'ACC5': 'aug-cc-pV5Z', \
'ACC6': 'aug-cc-pV6Z', \
'CCDC': 'cc-pCVDZ', \
'CCTC': 'cc-pCVTZ', \
'CCQC': 'cc-pCVQZ', \
'CC5C': 'cc-pCV5Z', \
'CC6C': 'cc-pCV6Z', \
'ACCDC': 'aug-cc-pCVDZ', \
'ACCTC': 'aug-cc-pCVTZ', \
'ACCQC': 'aug-cc-pCVQZ', \
'ACC5C': 'aug-cc-pCV5Z', \
'ACC6C': 'aug-cc-pCV6Z'}
def __init__(self, *args, **kwargs):
# Call the __init__ method of the superclass
super(GAMESS, self).__init__(logname="GAMESS", *args, **kwargs)
def __str__(self):
"""Return a string representation of the object."""
return "GAMESS log file %s" % (self.filename)
def __repr__(self):
"""Return a representation of the object."""
return 'GAMESS("%s")' % (self.filename)
def normalisesym(self, label):
"""Normalise the symmetries used by GAMESS.
To normalise, two rules need to be applied:
        (1) Occurrences of U/G in the 2/3 position of the label
must be lower-cased
(2) Two single quotation marks must be replaced by a double
"""
if label[1:] == "''":
end = '"'
else:
end = label[1:].replace("U", "u").replace("G", "g")
return label[0] + end
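    # A few illustrative inputs and outputs for normalisesym:
    #   "AG"  -> "Ag"    (rule 1: U/G past the first character are lower-cased)
    #   "BU"  -> "Bu"
    #   "A''" -> 'A"'    (rule 2: two single quotes become one double quote)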
def before_parsing(self):
self.firststdorient = True # Used to decide whether to wipe the atomcoords clean
self.cihamtyp = "none" # Type of CI Hamiltonian: saps or dets.
self.scftype = "none" # Type of SCF calculation: BLYP, RHF, ROHF, etc.
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
# Extract the version number. If the calculation is from
# Firefly, its version number comes before a line that looks
# like the normal GAMESS version number...
if "Firefly version" in line:
match = re.search(r"Firefly version\s([\d.]*)\D*(\d*)\s*\*", line)
if match:
version, build = match.groups()
package_version = "{}.b{}".format(version, build)
self.metadata["package_version"] = package_version
if "GAMESS VERSION" in line:
# ...so avoid overwriting it if Firefly already set this field.
if "package_version" not in self.metadata:
tokens = line.split()
self.metadata["package_version"] = ' '.join(tokens[4:-1])
if line[1:12] == "INPUT CARD>":
return
# extract the methods
if line[1:7] == "SCFTYP":
method = line.split()[0][7:]
if len(self.metadata["methods"]) == 0:
self.metadata["methods"].append(method)
# extract the basis set name
if line[5:11] == "GBASIS":
basnm1 = line.split()[0][7:]
if basnm1 in self.dunningbas:
self.metadata["basis_set"] = self.dunningbas[basnm1]
else:
if basnm1 == "PM3" or basnm1 == "AM1":
self.metadata["methods"].append(basnm1)
if basnm1 == "STO" :
if line.split()[2] == "2":
self.metadata["basis_set"] = "STO-2G"
elif line.split()[2] == "3":
self.metadata["basis_set"] = "STO-3G"
elif line.split()[2] == "4":
self.metadata["basis_set"] = "STO-4G"
elif line.split()[2] == "5":
self.metadata["basis_set"] = "STO-5G"
if basnm1 == "N21" :
if line.split()[2] == "3" and line.split()[3] == "POLAR=COMMON":
self.metadata["basis_set"] = "3-21G*"
if line.split()[2] == "3" and line.split()[3] == "POLAR=NONE":
self.metadata["basis_set"] = "3-21G"
if line.split()[2] == "4" and line.split()[3] == "POLAR=NONE":
self.metadata["basis_set"] = "4-21G"
if line.split()[2] == "6" and line.split()[3] == "POLAR=NONE":
self.metadata["basis_set"] = "6-21G"
if basnm1 == "N31" :
if line.split()[2] == "6" and (line.split()[3] == "POLAR=POPN31" \
or line.split()[3] == "POLAR=POPLE"):
self.metadata["basis_set"] = "6-31G*"
line = next(inputfile)
if line.split()[-1] == "T":
self.metadata["basis_set"] = "6-31+G*"
line = next(inputfile)
if line.split()[1] == "0" and line.split()[3] == "T":
self.metadata["basis_set"] = "6-31++G*"
if line.split()[1] == "1" and line.split()[3] == "T":
self.metadata["basis_set"] = "6-31++G**"
else:
line = next(inputfile)
if line.split()[1] == "1": #NPFUNC = 1
self.metadata["basis_set"] = "6-31G**"
if line.split()[2] == "6" and line.split()[3] == "POLAR=NONE":
self.metadata["basis_set"] = "6-31G"
if line.split()[2] == "4" and line.split()[3] == "POLAR=NONE":
self.metadata["basis_set"] = "4-31G"
if line.split()[2] == "4" and line.split()[3] == "POLAR=POPN31":
self.metadata["basis_set"] = "4-31G*"
if basnm1 == "N311" :
if line.split()[2] == "6" and line.split()[3] == "POLAR=POPN311":
self.metadata["basis_set"] = "6-311G*"
line = next(inputfile)
if line.split()[-1] == "T":
self.metadata["basis_set"] = "6-311+G*"
line = next(inputfile)
if line.split()[1] == "0" and line.split()[3] == "T":
self.metadata["basis_set"] = "6-311++G*"
if line.split()[1] == "1" and line.split()[3] == "T":
self.metadata["basis_set"] = "6-311++G**"
else:
line = next(inputfile)
if line.split()[1] == "1": #NPFUNC = 1
self.metadata["basis_set"] = "6-311G**"
if line.split()[2] == "6" and line.split()[3] == "POLAR=NONE":
self.metadata["basis_set"] = "6-311G"
# We are looking for this line:
# PARAMETERS CONTROLLING GEOMETRY SEARCH ARE
# ...
# OPTTOL = 1.000E-04 RMIN = 1.500E-03
if line[10:18] == "OPTTOL =":
if not hasattr(self, "geotargets"):
opttol = float(line.split()[2])
self.geotargets = numpy.array([opttol, 3. / opttol], "d")
# Has to deal with such lines as:
# FINAL R-B3LYP ENERGY IS -382.0507446475 AFTER 10 ITERATIONS
# FINAL ENERGY IS -379.7594673378 AFTER 9 ITERATIONS
# ...so take the number after the "IS"
if line.find("FINAL") == 1:
if not hasattr(self, "scfenergies"):
self.scfenergies = []
temp = line.split()
self.scfenergies.append(utils.convertor(float(temp[temp.index("IS") + 1]), "hartree", "eV"))
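            # As a rough illustration of the conversion: one hartree is about
            # 27.2114 eV, so e.g. utils.convertor(-382.0507, "hartree", "eV")
            # returns approximately -10396.1.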
# For total energies after Moller-Plesset corrections, the output looks something like this:
#
# RESULTS OF MOLLER-PLESSET 2ND ORDER CORRECTION ARE
# E(0)= -285.7568061536
# E(1)= 0.0
# E(2)= -0.9679419329
# E(MP2)= -286.7247480864
# where E(MP2) = E(0) + E(2)
#
# With GAMESS-US 12 Jan 2009 (R3), the preceding text is different:
# DIRECT 4-INDEX TRANSFORMATION
# SCHWARZ INEQUALITY TEST SKIPPED 0 INTEGRAL BLOCKS
# E(SCF)= -76.0088477471
# E(2)= -0.1403745370
# E(MP2)= -76.1492222841
#
# With GAMESS-US 20 APR 2017 (R1), the following block may be present:
# SCHWARZ INEQUALITY TEST SKIPPED 0 INTEGRAL BLOCKS
# ... END OF INTEGRAL TRANSFORMATION ...
if line.find("RESULTS OF MOLLER-PLESSET") >= 0 or line[6:37] == "SCHWARZ INEQUALITY TEST SKIPPED":
if not hasattr(self, "mpenergies"):
self.mpenergies = []
line = next(inputfile)
# Each iteration has a new print-out
if "END OF INTEGRAL TRANSFORMATION" not in line:
self.mpenergies.append([])
# GAMESS-US presently supports only second order corrections (MP2)
# PC GAMESS also has higher levels (3rd and 4th), with different output
# Only the highest level MP4 energy is gathered (SDQ or SDTQ)
# Loop breaks when substring "DONE WITH MPn ENERGY" is encountered,
# where n=2, 3 or 4.
while "DONE WITH MP" not in line:
if len(line.split()) > 0:
# Only up to MP2 correction
if line.split()[0] == "E(MP2)=":
self.metadata["methods"].append("MP2")
mp2energy = float(line.split()[1])
self.mpenergies[-1].append(utils.convertor(mp2energy, "hartree", "eV"))
# MP2 before higher order calculations
if line.split()[0] == "E(MP2)":
mp2energy = float(line.split()[2])
self.mpenergies[-1].append(utils.convertor(mp2energy, "hartree", "eV"))
if line.split()[0] == "E(MP3)":
self.metadata["methods"].append("MP3")
mp3energy = float(line.split()[2])
self.mpenergies[-1].append(utils.convertor(mp3energy, "hartree", "eV"))
if line.split()[0] in ["E(MP4-SDQ)", "E(MP4-SDTQ)"]:
self.metadata["methods"].append("MP4")
mp4energy = float(line.split()[2])
self.mpenergies[-1].append(utils.convertor(mp4energy, "hartree", "eV"))
line = next(inputfile)
# Total energies after Coupled Cluster calculations
# Only the highest Coupled Cluster level result is gathered
if line[12:23] == "CCD ENERGY:":
self.metadata["methods"].append("CCD")
if not hasattr(self, "ccenergies"):
self.ccenergies = []
ccenergy = float(line.split()[2])
self.ccenergies.append(utils.convertor(ccenergy, "hartree", "eV"))
if line.find("CCSD") >= 0 and line.split()[0:2] == ["CCSD", "ENERGY:"]:
self.metadata["methods"].append("CCSD")
if not hasattr(self, "ccenergies"):
self.ccenergies = []
ccenergy = float(line.split()[2])
line = next(inputfile)
if line[8:23] == "CCSD[T] ENERGY:":
self.metadata["methods"].append("CCSD[T]")
ccenergy = float(line.split()[2])
line = next(inputfile)
if line[8:23] == "CCSD(T) ENERGY:":
self.metadata["methods"].append("CCSD(T)")
ccenergy = float(line.split()[2])
self.ccenergies.append(utils.convertor(ccenergy, "hartree", "eV"))
# Also collect MP2 energies, which are always calculated before CC
if line[8:23] == "MBPT(2) ENERGY:":
if not hasattr(self, "mpenergies"):
self.mpenergies = []
self.mpenergies.append([])
mp2energy = float(line.split()[2])
self.mpenergies[-1].append(utils.convertor(mp2energy, "hartree", "eV"))
# Extract charge and multiplicity
if line[1:19] == "CHARGE OF MOLECULE":
charge = int(round(float(line.split()[-1])))
self.set_attribute('charge', charge)
line = next(inputfile)
mult = int(line.split()[-1])
self.set_attribute('mult', mult)
# Electronic transitions (etenergies) for CIS runs and TD-DFT, which
        # have very similar outputs. The outputs for EOM look very different, though.
#
# ---------------------------------------------------------------------
# CI-SINGLES EXCITATION ENERGIES
# STATE HARTREE EV KCAL/MOL CM-1 NM
# ---------------------------------------------------------------------
# 1A'' 0.1677341781 4.5643 105.2548 36813.40 271.64
# ...
if re.match("(CI-SINGLES|TDDFT) EXCITATION ENERGIES", line.strip()):
if not hasattr(self, "etenergies"):
self.etenergies = []
get_etosc = False
header = next(inputfile).rstrip()
if header.endswith("OSC. STR."):
# water_cis_dets.out does not have the oscillator strength
# in this table...it is extracted from a different section below
get_etosc = True
self.etoscs = []
self.skip_line(inputfile, 'dashes')
line = next(inputfile)
broken = line.split()
while len(broken) > 0:
# Take hartree value with more numbers, and convert.
# Note that the values listed after this are also less exact!
etenergy = float(broken[1])
self.etenergies.append(utils.convertor(etenergy, "hartree", "wavenumber"))
if get_etosc:
etosc = float(broken[-1])
self.etoscs.append(etosc)
broken = next(inputfile).split()
# Detect the CI hamiltonian type, if applicable.
# Should always be detected if CIS is done.
if line[8:64] == "RESULTS FROM SPIN-ADAPTED ANTISYMMETRIZED PRODUCT (SAPS)":
self.cihamtyp = "saps"
if line[8:64] == "RESULTS FROM DETERMINANT BASED ATOMIC ORBITAL CI-SINGLES":
self.cihamtyp = "dets"
# etsecs (used only for CIS runs for now)
if line[1:14] == "EXCITED STATE":
if not hasattr(self, 'etsecs'):
self.etsecs = []
if not hasattr(self, 'etsyms'):
self.etsyms = []
statenumber = int(line.split()[2])
spin = int(float(line.split()[7]))
if spin == 0:
sym = "Singlet"
if spin == 1:
sym = "Triplet"
sym += '-' + line.split()[-1]
self.etsyms.append(sym)
# skip 5 lines
for i in range(5):
line = next(inputfile)
line = next(inputfile)
CIScontribs = []
while line.strip()[0] != "-":
MOtype = 0
# alpha/beta are specified for hamtyp=dets
if self.cihamtyp == "dets":
if line.split()[0] == "BETA":
MOtype = 1
fromMO = int(line.split()[-3])-1
toMO = int(line.split()[-2])-1
coeff = float(line.split()[-1])
# With the SAPS hamiltonian, the coefficients are multiplied
# by sqrt(2) so that they normalize to 1.
# With DETS, both alpha and beta excitations are printed.
# if self.cihamtyp == "saps":
# coeff /= numpy.sqrt(2.0)
CIScontribs.append([(fromMO, MOtype), (toMO, MOtype), coeff])
line = next(inputfile)
self.etsecs.append(CIScontribs)
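        # For illustration (hypothetical values): each entry appended to etsecs
        # above is a list of [(from_MO, spin), (to_MO, spin), coefficient]
        # triples with 0-based MO indices, e.g.
        #   [[(34, 0), (36, 0), 0.97], [(33, 0), (37, 0), -0.12]]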
# etoscs (used only for CIS runs now)
if line[1:50] == "TRANSITION FROM THE GROUND STATE TO EXCITED STATE":
if not hasattr(self, "etoscs"):
self.etoscs = []
            # This was suggested as a fix in issue #61, and it does allow
# the parser to finish without crashing. However, it seems that
# etoscs is shorter in this case than the other transition attributes,
# so that should be somehow corrected and tested for.
if "OPTICALLY" in line:
pass
else:
statenumber = int(line.split()[-1])
# skip 7 lines
for i in range(8):
line = next(inputfile)
strength = float(line.split()[3])
self.etoscs.append(strength)
# TD-DFT for GAMESS-US.
# The format for excitations has changed a bit between 2007 and 2012.
# Original format parser was written for:
#
# -------------------
# TRIPLET EXCITATIONS
# -------------------
#
# STATE # 1 ENERGY = 3.027228 EV
# OSCILLATOR STRENGTH = 0.000000
# DRF COEF OCC VIR
# --- ---- --- ---
# 35 -1.105383 35 -> 36
# 69 -0.389181 34 -> 37
# 103 -0.405078 33 -> 38
# 137 0.252485 32 -> 39
# 168 -0.158406 28 -> 40
#
# STATE # 2 ENERGY = 4.227763 EV
# ...
#
# Here is the corresponding 2012 version:
#
# -------------------
# TRIPLET EXCITATIONS
# -------------------
#
# STATE # 1 ENERGY = 3.027297 EV
# OSCILLATOR STRENGTH = 0.000000
# LAMBDA DIAGNOSTIC = 0.925 (RYDBERG/CHARGE TRANSFER CHARACTER)
# SYMMETRY OF STATE = A
# EXCITATION DE-EXCITATION
# OCC VIR AMPLITUDE AMPLITUDE
# I A X(I->A) Y(A->I)
# --- --- -------- --------
# 35 36 -0.929190 -0.176167
# 34 37 -0.279823 -0.109414
# ...
#
# We discern these two by the presence of the arrow in the old version.
#
# The "LET EXCITATIONS" pattern used below catches both
# singlet and triplet excitations output.
if line[14:29] == "LET EXCITATIONS":
self.etenergies = []
self.etoscs = []
self.etsecs = []
etsyms = []
self.skip_lines(inputfile, ['d', 'b'])
# Loop while states are still being printed.
line = next(inputfile)
while line[1:6] == "STATE":
self.updateprogress(inputfile, "Excited States")
etenergy = utils.convertor(float(line.split()[-2]), "eV", "wavenumber")
etoscs = float(next(inputfile).split()[-1])
self.etenergies.append(etenergy)
self.etoscs.append(etoscs)
# Symmetry is not always present, especially in old versions.
# Newer versions, on the other hand, can also provide a line
# with lambda diagnostic and some extra headers.
line = next(inputfile)
if "LAMBDA DIAGNOSTIC" in line:
line = next(inputfile)
if "SYMMETRY" in line:
etsyms.append(line.split()[-1])
line = next(inputfile)
if "EXCITATION" in line and "DE-EXCITATION" in line:
line = next(inputfile)
if line.count("AMPLITUDE") == 2:
line = next(inputfile)
self.skip_line(inputfile, 'dashes')
CIScontribs = []
line = next(inputfile)
while line.strip():
cols = line.split()
if "->" in line:
i_occ_vir = [2, 4]
i_coeff = 1
else:
i_occ_vir = [0, 1]
i_coeff = 2
fromMO, toMO = [int(cols[i]) - 1 for i in i_occ_vir]
coeff = float(cols[i_coeff])
CIScontribs.append([(fromMO, 0), (toMO, 0), coeff])
line = next(inputfile)
self.etsecs.append(CIScontribs)
line = next(inputfile)
# The symmetries are not always present.
if etsyms:
self.etsyms = etsyms
# Maximum and RMS gradients.
if "MAXIMUM GRADIENT" in line or "RMS GRADIENT" in line:
parts = line.split()
# Avoid parsing the following...
## YOU SHOULD RESTART "OPTIMIZE" RUNS WITH THE COORDINATES
## WHOSE ENERGY IS LOWEST. RESTART "SADPOINT" RUNS WITH THE
## COORDINATES WHOSE RMS GRADIENT IS SMALLEST. THESE ARE NOT
## ALWAYS THE LAST POINT COMPUTED!
if parts[0] not in ["MAXIMUM", "RMS", "(1)"]:
return
if not hasattr(self, "geovalues"):
self.geovalues = []
# Newer versions (around 2006) have both maximum and RMS on one line:
# MAXIMUM GRADIENT = 0.0531540 RMS GRADIENT = 0.0189223
if len(parts) == 8:
maximum = float(parts[3])
rms = float(parts[7])
# In older versions of GAMESS, this spanned two lines, like this:
# MAXIMUM GRADIENT = 0.057578167
# RMS GRADIENT = 0.027589766
if len(parts) == 4:
maximum = float(parts[3])
line = next(inputfile)
parts = line.split()
rms = float(parts[3])
# FMO also prints two final one- and two-body gradients (see exam37):
# (1) MAXIMUM GRADIENT = 0.0531540 RMS GRADIENT = 0.0189223
if len(parts) == 9:
maximum = float(parts[4])
rms = float(parts[8])
self.geovalues.append([maximum, rms])
# This is the input orientation, which is the only data available for
# SP calcs, but which should be overwritten by the standard orientation
# values, which is the only information available for all geoopt cycles.
if line[11:50] == "ATOMIC COORDINATES":
if not hasattr(self, "atomcoords"):
self.atomcoords = []
line = next(inputfile)
atomcoords = []
atomnos = []
line = next(inputfile)
while line.strip():
temp = line.strip().split()
atomcoords.append([utils.convertor(float(x), "bohr", "Angstrom") for x in temp[2:5]])
                atomnos.append(int(round(float(temp[1])))) # Don't use the atom name as this is arbitrary
line = next(inputfile)
self.set_attribute('atomnos', atomnos)
self.atomcoords.append(atomcoords)
if line[12:40] == "EQUILIBRIUM GEOMETRY LOCATED":
# Prevent extraction of the final geometry twice
if not hasattr(self, 'optdone'):
self.optdone = []
self.optdone.append(len(self.geovalues) - 1)
        # Make sure we always have optdone for geometry optimization, even if not converged.
if "GEOMETRY SEARCH IS NOT CONVERGED" in line:
if not hasattr(self, 'optdone'):
self.optdone = []
# This is the standard orientation, which is the only coordinate
# information available for all geometry optimisation cycles.
# The input orientation will be overwritten if this is a geometry optimisation
# We assume that a previous Input Orientation has been found and
# used to extract the atomnos
if line[1:29] == "COORDINATES OF ALL ATOMS ARE" and (not hasattr(self, "optdone") or self.optdone == []):
self.updateprogress(inputfile, "Coordinates")
if self.firststdorient:
self.firststdorient = False
# Wipes out the single input coordinate at the start of the file
self.atomcoords = []
self.skip_lines(inputfile, ['line', '-'])
atomcoords = []
line = next(inputfile)
for i in range(self.natom):
temp = line.strip().split()
atomcoords.append(list(map(float, temp[2:5])))
line = next(inputfile)
self.atomcoords.append(atomcoords)
# Section with SCF information.
#
# The space at the start of the search string is to differentiate from MCSCF.
# Everything before the search string is stored as the type of SCF.
# SCF types may include: BLYP, RHF, ROHF, UHF, etc.
#
# For example, in exam17 the section looks like this (note that this is GVB):
# ------------------------
# ROHF-GVB SCF CALCULATION
# ------------------------
# GVB STEP WILL USE 119875 WORDS OF MEMORY.
#
# MAXIT= 30 NPUNCH= 2 SQCDF TOL=1.0000E-05
# NUCLEAR ENERGY= 6.1597411978
# EXTRAP=T DAMP=F SHIFT=F RSTRCT=F DIIS=F SOSCF=F
#
# ITER EX TOTAL ENERGY E CHANGE SQCDF DIIS ERROR
# 0 0 -38.298939963 -38.298939963 0.131784454 0.000000000
# 1 1 -38.332044339 -0.033104376 0.026019716 0.000000000
# ... and will be terminated by a blank line.
if line.rstrip()[-16:] == " SCF CALCULATION":
# Remember the type of SCF.
self.scftype = line.strip()[:-16]
self.skip_line(inputfile, 'dashes')
while line[:5] != " ITER":
self.updateprogress(inputfile, "Attributes")
# GVB uses SQCDF for checking convergence (for example in exam17).
if "GVB" in self.scftype and "SQCDF TOL=" in line:
scftarget = float(line.split("=")[-1])
                # Normally, however, the density is used as the convergence criterion.
# Deal with various versions:
# (GAMESS VERSION = 12 DEC 2003)
# DENSITY MATRIX CONV= 2.00E-05 DFT GRID SWITCH THRESHOLD= 3.00E-04
# (GAMESS VERSION = 22 FEB 2006)
# DENSITY MATRIX CONV= 1.00E-05
# (PC GAMESS version 6.2, Not DFT?)
# DENSITY CONV= 1.00E-05
elif "DENSITY CONV" in line or "DENSITY MATRIX CONV" in line:
scftarget = float(line.split()[-1])
line = next(inputfile)
if not hasattr(self, "scftargets"):
self.scftargets = []
self.scftargets.append([scftarget])
if not hasattr(self, "scfvalues"):
self.scfvalues = []
# Normally the iterations print in 6 columns.
# For ROHF, however, it is 5 columns, thus this extra parameter.
if "ROHF" in self.scftype:
self.scf_valcol = 4
else:
self.scf_valcol = 5
line = next(inputfile)
# SCF iterations are terminated by a blank line.
            # The first four characters usually contain the step number.
# However, lines can also contain messages, including:
# * * * INITIATING DIIS PROCEDURE * * *
# CONVERGED TO SWOFF, SO DFT CALCULATION IS NOW SWITCHED ON
# DFT CODE IS SWITCHING BACK TO THE FINER GRID
values = []
while line.strip():
try:
temp = int(line[0:4])
except ValueError:
pass
else:
values.append([float(line.split()[self.scf_valcol])])
try:
line = next(inputfile)
except StopIteration:
self.logger.warning('File terminated before end of last SCF!')
break
self.scfvalues.append(values)
        # Sometimes only the first SCF cycle prints the banner parsed for above,
        # so later cycles must be identified from the header before the SCF iterations.
# The example we have for this is the GeoOpt unittest for Firefly8.
if line[1:8] == "ITER EX":
# In this case, the convergence targets are not printed, so we assume
# they do not change.
self.scftargets.append(self.scftargets[-1])
values = []
line = next(inputfile)
while line.strip():
try:
temp = int(line[0:4])
except ValueError:
pass
else:
values.append([float(line.split()[self.scf_valcol])])
line = next(inputfile)
self.scfvalues.append(values)
# Extract normal coordinate analysis, including vibrational frequencies (vibfreq),
        # IR intensities (vibirs) and displacements (vibdisps).
#
# This section typically looks like the following in GAMESS-US:
#
# MODES 1 TO 6 ARE TAKEN AS ROTATIONS AND TRANSLATIONS.
#
# FREQUENCIES IN CM**-1, IR INTENSITIES IN DEBYE**2/AMU-ANGSTROM**2,
# REDUCED MASSES IN AMU.
#
# 1 2 3 4 5
# FREQUENCY: 52.49 41.45 17.61 9.23 10.61
# REDUCED MASS: 3.92418 3.77048 5.43419 6.44636 5.50693
# IR INTENSITY: 0.00013 0.00001 0.00004 0.00000 0.00003
#
# ...or in the case of a numerical Hessian job...
#
# MODES 1 TO 5 ARE TAKEN AS ROTATIONS AND TRANSLATIONS.
#
# FREQUENCIES IN CM**-1, IR INTENSITIES IN DEBYE**2/AMU-ANGSTROM**2,
# REDUCED MASSES IN AMU.
#
# 1 2 3 4 5
# FREQUENCY: 0.05 0.03 0.03 30.89 30.94
# REDUCED MASS: 8.50125 8.50137 8.50136 1.06709 1.06709
#
# ...whereas PC-GAMESS has...
#
# MODES 1 TO 6 ARE TAKEN AS ROTATIONS AND TRANSLATIONS.
#
# FREQUENCIES IN CM**-1, IR INTENSITIES IN DEBYE**2/AMU-ANGSTROM**2
#
# 1 2 3 4 5
# FREQUENCY: 5.89 1.46 0.01 0.01 0.01
# IR INTENSITY: 0.00000 0.00000 0.00000 0.00000 0.00000
#
# If Raman is present we have (for PC-GAMESS)...
#
# MODES 1 TO 6 ARE TAKEN AS ROTATIONS AND TRANSLATIONS.
#
# FREQUENCIES IN CM**-1, IR INTENSITIES IN DEBYE**2/AMU-ANGSTROM**2
# RAMAN INTENSITIES IN ANGSTROM**4/AMU, DEPOLARIZATIONS ARE DIMENSIONLESS
#
# 1 2 3 4 5
# FREQUENCY: 5.89 1.46 0.04 0.03 0.01
# IR INTENSITY: 0.00000 0.00000 0.00000 0.00000 0.00000
# RAMAN INTENSITY: 12.675 1.828 0.000 0.000 0.000
# DEPOLARIZATION: 0.750 0.750 0.124 0.009 0.750
#
# If GAMESS-US or PC-GAMESS has not reached the stationary point we have
        # an additional warning, repeated twice, like so (see n_water.log for an example):
#
# *******************************************************
# * THIS IS NOT A STATIONARY POINT ON THE MOLECULAR PES *
# * THE VIBRATIONAL ANALYSIS IS NOT VALID !!! *
# *******************************************************
#
# There can also be additional warnings about the selection of modes, for example:
#
# * * * WARNING, MODE 6 HAS BEEN CHOSEN AS A VIBRATION
# WHILE MODE12 IS ASSUMED TO BE A TRANSLATION/ROTATION.
# PLEASE VERIFY THE PROGRAM'S DECISION MANUALLY!
#
if "NORMAL COORDINATE ANALYSIS IN THE HARMONIC APPROXIMATION" in line:
self.vibfreqs = []
self.vibirs = []
self.vibdisps = []
            # Need to get to the modes line, which is often preceded by
# a list of atomic weights and some possible warnings.
# Pass the warnings to the logger if they are there.
while not "MODES" in line:
self.updateprogress(inputfile, "Frequency Information")
line = next(inputfile)
# Typical Atomic Masses section printed in GAMESS
# ATOMIC WEIGHTS (AMU)
#
# 1 O 15.99491
# 2 H 1.00782
# 3 H 1.00782
if "ATOMIC WEIGHTS" in line:
atommasses = []
self.skip_line(inputfile,['b'])
# There is a blank line after ATOMIC WEIGHTS
line = next(inputfile)
while line.strip():
temp = line.strip().split()
atommasses.append(float(temp[2]))
line = next(inputfile)
self.set_attribute('atommasses', atommasses)
if "THIS IS NOT A STATIONARY POINT" in line:
msg = "\n This is not a stationary point on the molecular PES"
msg += "\n The vibrational analysis is not valid!!!"
self.logger.warning(msg)
if "* * * WARNING, MODE" in line:
line1 = line.strip()
line2 = next(inputfile).strip()
line3 = next(inputfile).strip()
self.logger.warning("\n " + "\n ".join((line1, line2, line3)))
            # In at least one case (regression zolm_dft3a.log) for an older version of GAMESS-US,
            # the header giving the range of modes is formatted incorrectly and can look like this:
# MODES 9 TO14 ARE TAKEN AS ROTATIONS AND TRANSLATIONS.
# ... although it's unclear whether this happens for all two-digit values.
startrot = int(line.split()[1])
if len(line.split()[2]) == 2:
endrot = int(line.split()[3])
else:
endrot = int(line.split()[2][2:])
self.skip_line(inputfile, 'blank')
# Continue down to the first frequencies
line = next(inputfile)
# With GAMESS-US 20 APR 2017 (R1), there are 28 blank spaces,
# in earlier versions there used to be 26.
while not line.strip() or not re.search(' {26,}1', line) is not None:
line = next(inputfile)
while not "SAYVETZ" in line:
self.updateprogress(inputfile, "Frequency Information")
# Note: there may be imaginary frequencies like this (which we make negative):
# FREQUENCY: 825.18 I 111.53 12.62 10.70 0.89
#
# A note for debuggers: some of these frequencies will be removed later,
# assumed to be translations or rotations (see startrot/endrot above).
for col in next(inputfile).split()[1:]:
if col == "I":
self.vibfreqs[-1] *= -1
else:
self.vibfreqs.append(float(col))
line = next(inputfile)
# Skip the symmetry (appears in newer versions), fixes bug #3476063.
if line.find("SYMMETRY") >= 0:
line = next(inputfile)
# Skip the reduced mass (not always present).
if line.find("REDUCED") >= 0:
line = next(inputfile)
# Not present in numerical Hessian calculations.
if line.find("IR INTENSITY") >= 0:
irIntensity = map(float, line.strip().split()[2:])
self.vibirs.extend([utils.convertor(x, "Debye^2/amu-Angstrom^2", "km/mol") for x in irIntensity])
line = next(inputfile)
# Read in Raman vibrational intensities if present.
if line.find("RAMAN") >= 0:
if not hasattr(self, "vibramans"):
self.vibramans = []
ramanIntensity = line.strip().split()
self.vibramans.extend(list(map(float, ramanIntensity[2:])))
depolar = next(inputfile)
line = next(inputfile)
# This line seems always to be blank.
assert line.strip() == ''
# Extract the Cartesian displacement vectors.
p = [[], [], [], [], []]
for j in range(self.natom):
q = [[], [], [], [], []]
for coord in "xyz":
line = next(inputfile)[21:]
cols = list(map(float, line.split()))
for i, val in enumerate(cols):
q[i].append(val)
for k in range(len(cols)):
p[k].append(q[k])
self.vibdisps.extend(p[:len(cols)])
# Skip the Sayvetz stuff at the end.
for j in range(10):
line = next(inputfile)
self.skip_line(inputfile, 'blank')
line = next(inputfile)
# Exclude rotations and translations.
self.vibfreqs = numpy.array(self.vibfreqs[:startrot-1]+self.vibfreqs[endrot:], "d")
self.vibirs = numpy.array(self.vibirs[:startrot-1]+self.vibirs[endrot:], "d")
self.vibdisps = numpy.array(self.vibdisps[:startrot-1]+self.vibdisps[endrot:], "d")
if hasattr(self, "vibramans"):
self.vibramans = numpy.array(self.vibramans[:startrot-1]+self.vibramans[endrot:], "d")
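        # A worked example of the slicing above (hypothetical case): with
        # startrot == 1 and endrot == 6, vibfreqs[:0] + vibfreqs[6:] simply
        # drops the first six entries, i.e. the translations and rotations,
        # leaving only the genuine vibrational modes.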
if line[5:21] == "ATOMIC BASIS SET":
self.gbasis = []
line = next(inputfile)
while line.find("SHELL") < 0:
line = next(inputfile)
self.skip_lines(inputfile, ['blank', 'atomname'])
# shellcounter stores the shell no of the last shell
# in the previous set of primitives
shellcounter = 1
while line.find("TOTAL NUMBER") < 0:
self.skip_line(inputfile, 'blank')
line = next(inputfile)
shellno = int(line.split()[0])
shellgap = shellno - shellcounter
gbasis = [] # Stores basis sets on one atom
shellsize = 0
while len(line.split()) != 1 and line.find("TOTAL NUMBER") < 0:
shellsize += 1
coeff = {}
# coefficients and symmetries for a block of rows
while line.strip():
temp = line.strip().split()
sym = temp[1]
assert sym in ['S', 'P', 'D', 'F', 'G', 'L']
if sym == "L": # L refers to SP
if len(temp) == 6: # GAMESS US
coeff.setdefault("S", []).append((float(temp[3]), float(temp[4])))
coeff.setdefault("P", []).append((float(temp[3]), float(temp[5])))
else: # PC GAMESS
assert temp[6][-1] == temp[9][-1] == ')'
coeff.setdefault("S", []).append((float(temp[3]), float(temp[6][:-1])))
coeff.setdefault("P", []).append((float(temp[3]), float(temp[9][:-1])))
else:
if len(temp) == 5: # GAMESS US
coeff.setdefault(sym, []).append((float(temp[3]), float(temp[4])))
else: # PC GAMESS
assert temp[6][-1] == ')'
coeff.setdefault(sym, []).append((float(temp[3]), float(temp[6][:-1])))
line = next(inputfile)
# either a blank or a continuation of the block
if sym == "L":
gbasis.append(('S', coeff['S']))
gbasis.append(('P', coeff['P']))
else:
gbasis.append((sym, coeff[sym]))
line = next(inputfile)
# either the start of the next block or the start of a new atom or
# the end of the basis function section
numtoadd = 1 + (shellgap // shellsize)
shellcounter = shellno + shellsize
for x in range(numtoadd):
self.gbasis.append(gbasis)
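        # A sketch of the resulting structure (generic placeholder values):
        # self.gbasis holds one list per atom, each a list of
        # (shell type, primitives) tuples, e.g.
        #   [('S', [(exp1, c1), (exp2, c2)]), ('P', [(exp3, c3)])]
        # where every primitive is a (Gaussian exponent, contraction coefficient)
        # pair, matching the PyQuante-style format noted in data.py.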
# The eigenvectors, which also include MO energies and symmetries, follow
# the *final* report of evalues and the last list of symmetries in the log file:
#
# ------------
# EIGENVECTORS
# ------------
#
# 1 2 3 4 5
# -10.0162 -10.0161 -10.0039 -10.0039 -10.0029
# BU AG BU AG AG
# 1 C 1 S 0.699293 0.699290 -0.027566 0.027799 0.002412
# 2 C 1 S 0.031569 0.031361 0.004097 -0.004054 -0.000605
# 3 C 1 X 0.000908 0.000632 -0.004163 0.004132 0.000619
# 4 C 1 Y -0.000019 0.000033 0.000668 -0.000651 0.005256
# 5 C 1 Z 0.000000 0.000000 0.000000 0.000000 0.000000
# 6 C 2 S -0.699293 0.699290 0.027566 0.027799 0.002412
# 7 C 2 S -0.031569 0.031361 -0.004097 -0.004054 -0.000605
# 8 C 2 X 0.000908 -0.000632 -0.004163 -0.004132 -0.000619
# 9 C 2 Y -0.000019 -0.000033 0.000668 0.000651 -0.005256
# 10 C 2 Z 0.000000 0.000000 0.000000 0.000000 0.000000
# 11 C 3 S -0.018967 -0.019439 0.011799 -0.014884 -0.452328
# 12 C 3 S -0.007748 -0.006932 0.000680 -0.000695 -0.024917
# 13 C 3 X 0.002628 0.002997 0.000018 0.000061 -0.003608
# ...
#
        # There are blank lines between each block.
#
        # Warning! There are subtle differences between GAMESS-US and PC-GAMESS
# in the formatting of the first four columns. In particular, for F orbitals,
# PC GAMESS:
# 19 C 1 YZ 0.000000 0.000000 0.000000 0.000000 0.000000
# 20 C XXX 0.000000 0.000000 0.000000 0.000000 0.002249
# 21 C YYY 0.000000 0.000000 -0.025555 0.000000 0.000000
# 22 C ZZZ 0.000000 0.000000 0.000000 0.002249 0.000000
# 23 C XXY 0.000000 0.000000 0.001343 0.000000 0.000000
# GAMESS US
# 55 C 1 XYZ 0.000000 0.000000 0.000000 0.000000 0.000000
# 56 C 1XXXX -0.000014 -0.000067 0.000000 0.000000 0.000000
#
if line.find("EIGENVECTORS") == 10 or line.find("MOLECULAR ORBITALS") == 10:
# This is the stuff that we can read from these blocks.
self.moenergies = [[]]
self.mosyms = [[]]
if not hasattr(self, "nmo"):
self.nmo = self.nbasis
self.mocoeffs = [numpy.zeros((self.nmo, self.nbasis), "d")]
readatombasis = False
if not hasattr(self, "atombasis"):
self.atombasis = []
self.aonames = []
for i in range(self.natom):
self.atombasis.append([])
self.aonames = []
readatombasis = True
self.skip_line(inputfile, 'dashes')
for base in range(0, self.nmo, 5):
self.updateprogress(inputfile, "Coefficients")
line = next(inputfile)
# This makes sure that this section does not end prematurely,
# which happens in regression 2CO.ccsd.aug-cc-pVDZ.out.
if line.strip() != "":
break
numbers = next(inputfile) # Eigenvector numbers.
# This is for regression CdtetraM1B3LYP.
if "ALPHA SET" in numbers:
blank = next(inputfile)
numbers = next(inputfile)
# If not all coefficients are printed, the logfile will go right to
# the beta section if there is one, so break out in that case.
if "BETA SET" in numbers:
line = numbers
break
# Sometimes there are some blank lines here.
while not line.strip():
line = next(inputfile)
# Geometry optimizations don't have END OF RHF/DFT
# CALCULATION, they head right into the next section.
if "--------" in line:
break
# Eigenvalues for these orbitals (in hartrees).
try:
self.moenergies[0].extend([utils.convertor(float(x), "hartree", "eV") for x in line.split()])
except:
self.logger.warning('MO section found but could not be parsed!')
break
# Orbital symmetries.
line = next(inputfile)
if line.strip():
self.mosyms[0].extend(list(map(self.normalisesym, line.split())))
# Now we have nbasis lines. We will use the same method as in normalise_aonames() before.
p = re.compile("(\d+)\s*([A-Z][A-Z]?)\s*(\d+)\s*([A-Z]+)")
oldatom = '0'
i_atom = 0 # counter to keep track of n_atoms > 99
flag_w = True # flag necessary to keep from adding 100's at wrong time
for i in range(self.nbasis):
line = next(inputfile)
# If line is empty, break (ex. for FMO in exam37 which is a regression).
if not line.strip():
break
# Fill atombasis and aonames only first time around
if readatombasis and base == 0:
aonames = []
start = line[:17].strip()
m = p.search(start)
if m:
g = m.groups()
g2 = int(g[2]) # atom index in GAMESS file; changes to 0 after 99
# Check if we have moved to a hundred
# if so, increment the counter and add it to the parsed value
                            # There will be subsequent 0's as that atom's AOs are parsed
# so wait until the next atom is parsed before resetting flag
if g2 == 0 and flag_w:
i_atom = i_atom + 100
flag_w = False # handle subsequent AO's
if g2 != 0:
flag_w = True # reset flag
g2 = g2 + i_atom
aoname = "%s%i_%s" % (g[1].capitalize(), g2, g[3])
oldatom = str(g2)
atomno = g2-1
orbno = int(g[0])-1
else: # For F orbitals, as shown above
g = [x.strip() for x in line.split()]
aoname = "%s%s_%s" % (g[1].capitalize(), oldatom, g[2])
atomno = int(oldatom)-1
orbno = int(g[0])-1
self.atombasis[atomno].append(orbno)
self.aonames.append(aoname)
coeffs = line[15:] # Strip off the crud at the start.
j = 0
while j*11+4 < len(coeffs):
self.mocoeffs[0][base+j, i] = float(coeffs[j * 11:(j + 1) * 11])
j += 1
# If it's a restricted calc and no more properties, we have:
#
# ...... END OF RHF/DFT CALCULATION ......
#
# If there are more properties (such as the density matrix):
# --------------
#
# If it's an unrestricted calculation, however, we now get the beta orbitals:
#
# ----- BETA SET -----
#
# ------------
# EIGENVECTORS
# ------------
#
# 1 2 3 4 5
# ...
#
if "BETA SET" not in line:
line = next(inputfile)
line = next(inputfile)
# This can come in between the alpha and beta orbitals (see #130).
if line.strip() == "LZ VALUE ANALYSIS FOR THE MOS":
while line.strip():
line = next(inputfile)
line = next(inputfile)
# Covers label with both dashes and stars (like regression CdtetraM1B3LYP).
if "BETA SET" in line:
self.mocoeffs.append(numpy.zeros((self.nmo, self.nbasis), "d"))
self.moenergies.append([])
self.mosyms.append([])
blank = next(inputfile)
line = next(inputfile)
# Sometimes EIGENVECTORS is missing, so rely on dashes to signal it.
if set(line.strip()) == {'-'}:
self.skip_lines(inputfile, ['EIGENVECTORS', 'd', 'b'])
line = next(inputfile)
for base in range(0, self.nmo, 5):
self.updateprogress(inputfile, "Coefficients")
if base != 0:
line = next(inputfile)
line = next(inputfile)
line = next(inputfile)
if "properties" in line.lower():
break
self.moenergies[1].extend([utils.convertor(float(x), "hartree", "eV") for x in line.split()])
line = next(inputfile)
self.mosyms[1].extend(list(map(self.normalisesym, line.split())))
for i in range(self.nbasis):
line = next(inputfile)
temp = line[15:] # Strip off the crud at the start
j = 0
while j * 11 + 4 < len(temp):
self.mocoeffs[1][base+j, i] = float(temp[j * 11:(j + 1) * 11])
j += 1
line = next(inputfile)
self.moenergies = [numpy.array(x, "d") for x in self.moenergies]
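        # A summary sketch of what this block leaves behind (hedged): mocoeffs
        # is a list of (nmo, nbasis) arrays (a second array is appended when a
        # BETA SET is present), moenergies is a matching list of eV arrays, and
        # aonames entries look like 'C1_S' or 'C1_X' as constructed above.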
# Natural orbital coefficients and occupation numbers, presently supported only
# for CIS calculations. Looks the same as eigenvectors, without symmetry labels.
#
# --------------------
# CIS NATURAL ORBITALS
# --------------------
#
# 1 2 3 4 5
#
# 2.0158 2.0036 2.0000 2.0000 1.0000
#
# 1 O 1 S 0.000000 -0.157316 0.999428 0.164938 0.000000
# 2 O 1 S 0.000000 0.754402 0.004472 -0.581970 0.000000
# ...
#
if line[10:30] == "CIS NATURAL ORBITALS":
self.nocoeffs = numpy.zeros((self.nmo, self.nbasis), "d")
self.nooccnos = []
self.skip_line(inputfile, 'dashes')
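# The occupation numbers take the place of the MO energies above each block
# of five columns, and the coefficients use the same fixed-width layout as
# the eigenvectors parsed above.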
for base in range(0, self.nmo, 5):
self.skip_lines(inputfile, ['blank', 'numbers'])
# The eigenvalues that go along with these natural orbitals are
# their occupation numbers. Sometimes there are blank lines before them.
line = next(inputfile)
while not line.strip():
line = next(inputfile)
eigenvalues = map(float, line.split())
self.nooccnos.extend(eigenvalues)
# Orbital symmetry labels are normally here for MO coefficients.
line = next(inputfile)
# Now we have nbasis lines with the coefficients.
for i in range(self.nbasis):
line = next(inputfile)
coeffs = line[15:]
j = 0
while j*11+4 < len(coeffs):
self.nocoeffs[base+j, i] = float(coeffs[j * 11:(j + 1) * 11])
j += 1
# We cannot trust this self.homos until we come to the phrase:
# SYMMETRIES FOR INITIAL GUESS ORBITALS FOLLOW
# which is followed by either "ALPHA" or "BOTH", at which point we can say
# for certain whether the calculation is unrestricted or restricted.
# Note that MCSCF calcs also print this search string, so make sure
# that self.homos does not exist yet.
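# homos holds zero-based indices of the highest occupied orbitals, hence the
# subtraction of 1 from the (one-based) counts printed by GAMESS.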
if line[1:28] == "NUMBER OF OCCUPIED ORBITALS" and not hasattr(self, 'homos'):
homos = [int(line.split()[-1])-1]
line = next(inputfile)
homos.append(int(line.split()[-1])-1)
self.set_attribute('homos', homos)
if line.find("SYMMETRIES FOR INITIAL GUESS ORBITALS FOLLOW") >= 0:
# Not unrestricted, so lop off the second index.
# In case the search string above was not used (ex. FMO in exam38),
# we can try to use the next line which should also contain the
# number of occupied orbitals.
if line.find("BOTH SET(S)") >= 0:
nextline = next(inputfile)
if "ORBITALS ARE OCCUPIED" in nextline:
homos = int(nextline.split()[0])-1
if hasattr(self, "homos"):
try:
assert self.homos[0] == homos
except AssertionError:
self.logger.warning("Number of occupied orbitals not consistent. This is normal for ECP and FMO jobs.")
else:
self.homos = [homos]
self.homos = numpy.resize(self.homos, [1])
# Set the total number of atoms, only once.
# Normally GAMESS prints TOTAL NUMBER OF ATOMS, however in some cases
# this is slightly different (ex. lower case for FMO in exam37).
if not hasattr(self, "natom") and "NUMBER OF ATOMS" in line.upper():
natom = int(line.split()[-1])
self.set_attribute('natom', natom)
# The first form is from Julien's example and the second from Alexander's;
# it seems to happen when a polar basis function is used instead of a Cartesian one.
if line.find("NUMBER OF CARTESIAN GAUSSIAN BASIS") == 1 or line.find("TOTAL NUMBER OF BASIS FUNCTIONS") == 1:
nbasis = int(line.strip().split()[-1])
self.set_attribute('nbasis', nbasis)
elif line.find("TOTAL NUMBER OF CONTAMINANTS DROPPED") >= 0:
nmos_dropped = int(line.split()[-1])
if hasattr(self, "nmo"):
self.set_attribute('nmo', self.nmo - nmos_dropped)
else:
self.set_attribute('nmo', self.nbasis - nmos_dropped)
# Note that this line is present if ISPHER=1, e.g. for C_bigbasis
elif line.find("SPHERICAL HARMONICS KEPT IN THE VARIATION SPACE") >= 0:
nmo = int(line.strip().split()[-1])
self.set_attribute('nmo', nmo)
# Note that this line is not always present, so by default
# NBsUse is set equal to NBasis (see below).
elif line.find("TOTAL NUMBER OF MOS IN VARIATION SPACE") == 1:
nmo = int(line.split()[-1])
self.set_attribute('nmo', nmo)
elif line.find("OVERLAP MATRIX") == 0 or line.find("OVERLAP MATRIX") == 1:
# The first is for PC-GAMESS, the second for GAMESS
# Read 1-electron overlap matrix
if not hasattr(self, "aooverlaps"):
self.aooverlaps = numpy.zeros((self.nbasis, self.nbasis), "d")
else:
self.logger.info("Reading additional aooverlaps...")
base = 0
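# The matrix is printed as a lower triangle in column blocks of five; the
# first four tokens of each row are labels, and each value is mirrored into
# both (row, column) and (column, row) to build the full symmetric matrix.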
while base < self.nbasis:
self.updateprogress(inputfile, "Overlap")
self.skip_lines(inputfile, ['b', 'basis_fn_number', 'b'])
for i in range(self.nbasis - base): # Fewer lines each time
line = next(inputfile)
temp = line.split()
for j in range(4, len(temp)):
self.aooverlaps[base+j-4, i+base] = float(temp[j])
self.aooverlaps[i+base, base+j-4] = float(temp[j])
base += 5
# ECP Pseudopotential information
if "ECP POTENTIALS" in line:
if not hasattr(self, "coreelectrons"):
self.coreelectrons = [0]*self.natom
self.skip_lines(inputfile, ['d', 'b'])
header = next(inputfile)
while header.split()[0] == "PARAMETERS":
name = header[17:25]
atomnum = int(header[34:40])
# The pseudopotential is given explicitly
if header[40:50] == "WITH ZCORE":
zcore = int(header[50:55])
lmax = int(header[63:67])
self.coreelectrons[atomnum-1] = zcore
# The pseudopotential is copied from another atom
if header[40:55] == "ARE THE SAME AS":
atomcopy = int(header[60:])
self.coreelectrons[atomnum-1] = self.coreelectrons[atomcopy-1]
line = next(inputfile)
while line.split() != []:
line = next(inputfile)
header = next(inputfile)
# This was used before refactoring the parser, geotargets was set here after parsing.
#if not hasattr(self, "geotargets"):
# opttol = 1e-4
# self.geotargets = numpy.array([opttol, 3. / opttol], "d")
#if hasattr(self,"geovalues"): self.geovalues = numpy.array(self.geovalues, "d")
# This is quite simple to parse, but some files seem to print certain lines twice,
# repeating the populations without charges, but not in proper order.
# The unrestricted calculations are a bit tricky, since GAMESS-US prints populations
# for both alpha and beta orbitals in the same format and with the same title,
# but it still prints the charges only at the very end.
if "TOTAL MULLIKEN AND LOWDIN ATOMIC POPULATIONS" in line:
if not hasattr(self, "atomcharges"):
self.atomcharges = {}
header = next(inputfile)
line = next(inputfile)
# It seems that when populations are printed twice (without charges),
# there is a blank line along the way (after the first header),
# so use that as a flag.
doubles_printed = line.strip() == ""
if doubles_printed:
title = next(inputfile)
header = next(inputfile)
line = next(inputfile)
# Only go further if the header had five columns, which should
# be the case when both populations and charges are printed.
# This is pertinent for both double printing and unrestricted output.
if not len(header.split()) == 5:
return
mulliken, lowdin = [], []
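# The Mulliken charge is taken from the fourth field of each line and the
# Lowdin charge from the sixth (the populations come right before each charge).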
while line.strip():
if line.strip() and doubles_printed:
line = next(inputfile)
mulliken.append(float(line.split()[3]))
lowdin.append(float(line.split()[5]))
line = next(inputfile)
self.atomcharges["mulliken"] = mulliken
self.atomcharges["lowdin"] = lowdin
# ---------------------
# ELECTROSTATIC MOMENTS
# ---------------------
#
# POINT 1 X Y Z (BOHR) CHARGE
# -0.000000 0.000000 0.000000 -0.00 (A.U.)
# DX DY DZ /D/ (DEBYE)
# 0.000000 -0.000000 0.000000 0.000000
#
if line.strip() == "ELECTROSTATIC MOMENTS":
self.skip_lines(inputfile, ['d', 'b'])
line = next(inputfile)
# The old PC-GAMESS prints memory assignment information here.
if "MEMORY ASSIGNMENT" in line:
memory_assignment = next(inputfile)
line = next(inputfile)
# If something else ever comes up, we should get a signal from this assert.
assert line.split()[0] == "POINT"
# We can get the reference point from here, as well as
# check here that the net charge of the molecule is correct.
coords_and_charge = next(inputfile)
assert coords_and_charge.split()[-1] == '(A.U.)'
reference = numpy.array([float(x) for x in coords_and_charge.split()[:3]])
reference = utils.convertor(reference, 'bohr', 'Angstrom')
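# The net charge is the second-to-last field on the same line; it is printed
# as a float (e.g. -0.00 above), so round it to the nearest integer.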
charge = int(round(float(coords_and_charge.split()[-2])))
self.set_attribute('charge', charge)
dipoleheader = next(inputfile)
assert dipoleheader.split()[:3] == ['DX', 'DY', 'DZ']
assert dipoleheader.split()[-1] == "(DEBYE)"
dipoleline = next(inputfile)
dipole = [float(d) for d in dipoleline.split()[:3]]
# The dipole is always the first multipole moment to be printed,
# so if it already exists, we will overwrite all moments since we want
# to leave just the last printed value (could change in the future).
if not hasattr(self, 'moments'):
self.moments = [reference, dipole]
else:
try:
assert self.moments[1] == dipole
except AssertionError:
self.logger.warning('Overwriting previous multipole moments with new values')
self.logger.warning('This could be from post-HF properties or geometry optimization')
self.moments = [reference, dipole]
# Static polarizability from a harmonic frequency calculation
# with $CPHF/POLAR=.TRUE.
if line.strip() == 'ALPHA POLARIZABILITY TENSOR (ANGSTROMS**3)':
if not hasattr(self, 'polarizabilities'):
self.polarizabilities = []
polarizability = numpy.zeros(shape=(3, 3))
self.skip_lines(inputfile, ['d', 'b', 'directions'])
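# Only the lower triangle of the symmetric tensor is printed, one row per
# line; utils.symmetrize then fills in the upper triangle.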
for i in range(3):
line = next(inputfile)
polarizability[i, :i+1] = [float(x) for x in line.split()[1:]]
polarizability = utils.symmetrize(polarizability, use_triangle='lower')
# Convert from Angstrom**3 to bohr**3 (a.u.**3).
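# (1 Angstrom is roughly 1.8897 bohr, so each element is scaled by about 6.75.)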
volume_convert = numpy.vectorize(lambda x: x * utils.convertor(1, 'Angstrom', 'bohr') ** 3)
polarizability = volume_convert(polarizability)
self.polarizabilities.append(polarizability)
# Static and dynamic polarizability from RUNTYP=TDHF.
if line.strip() == 'TIME-DEPENDENT HARTREE-FOCK NLO PROPERTIES':
if not hasattr(self, 'polarizabilities'):
self.polarizabilities = []
polarizability = numpy.empty(shape=(3, 3))
coord_to_idx = {'X': 0, 'Y': 1, 'Z': 2}
self.skip_lines(inputfile, ['d', 'b', 'dots'])
line = next(inputfile)
assert 'ALPHA AT' in line
self.skip_lines(inputfile, ['dots', 'b'])
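# Each of the nine lines that follow names a Cartesian pair (XX, XY, ...)
# in its second token and carries the tensor element as its fourth token.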
for a in range(3):
for b in range(3):
line = next(inputfile)
tokens = line.split()
i, j = coord_to_idx[tokens[1][0]], coord_to_idx[tokens[1][1]]
polarizability[i, j] = tokens[3]
self.polarizabilities.append(polarizability)
if line[:30] == ' ddikick.x: exited gracefully.'\
or line[:41] == ' EXECUTION OF FIREFLY TERMINATED NORMALLY'\
or line[:40] == ' EXECUTION OF GAMESS TERMINATED NORMALLY':
self.metadata['success'] = True
cclib-1.6.2/cclib/parser/gamessukparser.py 0000664 0000000 0000000 00000070732 13535330462 0020567 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Parser for GAMESS-UK output files"""
import re
import numpy
from cclib.parser import logfileparser
from cclib.parser import utils
class GAMESSUK(logfileparser.Logfile):
"""A GAMESS UK log file"""
SCFRMS, SCFMAX, SCFENERGY = list(range(3)) # Used to index self.scftargets[]
def __init__(self, *args, **kwargs):
# Call the __init__ method of the superclass
super(GAMESSUK, self).__init__(logname="GAMESSUK", *args, **kwargs)
def __str__(self):
"""Return a string representation of the object."""
return "GAMESS UK log file %s" % (self.filename)
def __repr__(self):
"""Return a representation of the object."""
return 'GAMESSUK("%s")' % (self.filename)
def normalisesym(self, label):
"""Use standard symmetry labels instead of GAMESS UK labels."""
label = label.replace("''", '"').replace("+", "").replace("-", "")
ans = label[0].upper() + label[1:]
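# e.g. "a''" becomes 'A"' and "b1+" becomes "B1"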
return ans
def before_parsing(self):
# used for determining whether to add a second mosyms, etc.
self.betamosyms = self.betamoenergies = self.betamocoeffs = False
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
# Extract the version number and optionally the revision number.
if "version" in line:
search = re.search(r"\sversion\s*(\d\.\d)", line)
if search:
self.metadata["package_version"] = search.groups()[0]
if "Revision" in line:
revision = line.split()[1]
# Don't add revision information to the main package version for now.
# if "package_version" in self.metadata:
# package_version = "{}.r{}".format(self.metadata["package_version"],
# revision)
# self.metadata["package_version"] = package_version
if line[1:22] == "total number of atoms":
natom = int(line.split()[-1])
self.set_attribute('natom', natom)
if line[3:44] == "convergence threshold in optimization run":
# Assuming that this is only found in the case of OPTXYZ
# (i.e. an optimization in Cartesian coordinates)
self.geotargets = [float(line.split()[-2])]
if line[32:61] == "largest component of gradient":
# This is the geotarget in the case of OPTXYZ
if not hasattr(self, "geovalues"):
self.geovalues = []
self.geovalues.append([float(line.split()[4])])
if line[37:49] == "convergence?":
# Get the geovalues and geotargets for OPTIMIZE
if not hasattr(self, "geovalues"):
self.geovalues = []
self.geotargets = []
geotargets = []
geovalues = []
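# Four criteria are printed on consecutive lines; the current value is the
# third field and the target the second-to-last field of each line.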
for i in range(4):
temp = line.split()
geovalues.append(float(temp[2]))
if not self.geotargets:
geotargets.append(float(temp[-2]))
line = next(inputfile)
self.geovalues.append(geovalues)
if not self.geotargets:
self.geotargets = geotargets
# This is the only place coordinates are printed in single point calculations. Note that
# in the following fragment, the basis set selection is not always printed:
#
# ******************
# molecular geometry
# ******************
#
# ****************************************
# * basis selected is sto sto3g *
# ****************************************
#
# *******************************************************************************
# * *
# * atom atomic coordinates number of *
# * charge x y z shells *
# * *
# *******************************************************************************
# * *
# * *
# * c 6.0 0.0000000 -2.6361501 0.0000000 2 *
# * 1s 2sp *
# * *
# * *
# * c 6.0 0.0000000 2.6361501 0.0000000 2 *
# * 1s 2sp *
# * *
# ...
#
if line.strip() == "molecular geometry":
self.updateprogress(inputfile, "Coordinates")
self.skip_lines(inputfile, ['s', 'b', 's'])
line = next(inputfile)
if "basis selected is" in line:
self.skip_lines(inputfile, ['s', 'b', 's', 's'])
self.skip_lines(inputfile, ['header1', 'header2', 's', 's'])
atomnos = []
atomcoords = []
line = next(inputfile)
while line.strip():
line = next(inputfile)
if line.strip()[1:10].strip() and list(set(line.strip())) != ['*']:
atomcoords.append([utils.convertor(float(x), "bohr", "Angstrom") for x in line.split()[3:6]])
atomnos.append(int(round(float(line.split()[2]))))
if not hasattr(self, "atomcoords"):
self.atomcoords = []
self.atomcoords.append(atomcoords)
self.set_attribute('atomnos', atomnos)
# Each step of a geometry optimization will also print the coordinates:
#
# search 0
# *******************
# point 0 nuclear coordinates
# *******************
#
# x y z chg tag
# ============================================================
# 0.0000000 -2.6361501 0.0000000 6.00 c
# 0.0000000 2.6361501 0.0000000 6.00 c
# ..
#
if line[40:59] == "nuclear coordinates":
self.updateprogress(inputfile, "Coordinates")
# We need not remember the first geometry in geometry optimizations, as this will
# be already parsed from the "molecular geometry" section (see above).
if not hasattr(self, 'firstnuccoords') or self.firstnuccoords:
self.firstnuccoords = False
return
self.skip_lines(inputfile, ['s', 'b', 'colname', 'e'])
atomcoords = []
atomnos = []
line = next(inputfile)
while list(set(line.strip())) != ['=']:
cols = line.split()
atomcoords.append([utils.convertor(float(x), "bohr", "Angstrom") for x in cols[0:3]])
atomnos.append(int(float(cols[3])))
line = next(inputfile)
if not hasattr(self, "atomcoords"):
self.atomcoords = []
self.atomcoords.append(atomcoords)
self.set_attribute('atomnos', atomnos)
# This is printed when a geometry optimization succeeds, after the last gradient of the energy.
if line[40:62] == "optimization converged":
self.skip_line(inputfile, 's')
if not hasattr(self, 'optdone'):
self.optdone = []
self.optdone.append(len(self.geovalues)-1)
# This is apparently printed when a geometry optimization is not converged but the job ends.
if "minimisation not converging" in line:
self.skip_line(inputfile, 's')
self.optdone = []
if line[1:32] == "total number of basis functions":
nbasis = int(line.split()[-1])
self.set_attribute('nbasis', nbasis)
while line.find("charge of molecule") < 0:
line = next(inputfile)
charge = int(line.split()[-1])
self.set_attribute('charge', charge)
mult = int(next(inputfile).split()[-1])
self.set_attribute('mult', mult)
alpha = int(next(inputfile).split()[-1])-1
beta = int(next(inputfile).split()[-1])-1
if self.mult == 1:
self.homos = numpy.array([alpha], "i")
else:
self.homos = numpy.array([alpha, beta], "i")
if line[37:69] == "s-matrix over gaussian basis set":
self.aooverlaps = numpy.zeros((self.nbasis, self.nbasis), "d")
self.skip_lines(inputfile, ['d', 'b'])
i = 0
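# The matrix is printed in column blocks; each pass reads one full set of
# nbasis rows for the current block of columns, so i advances by the block width.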
while i < self.nbasis:
self.updateprogress(inputfile, "Overlap")
self.skip_lines(inputfile, ['b', 'b', 'header', 'b', 'b'])
for j in range(self.nbasis):
temp = list(map(float, next(inputfile).split()[1:]))
self.aooverlaps[j, (0+i):(len(temp)+i)] = temp
i += len(temp)
if line[18:43] == 'EFFECTIVE CORE POTENTIALS':
self.skip_line(inputfile, 'stars')
self.coreelectrons = numpy.zeros(self.natom, 'i')
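# For every atom index listed in the "for atoms ..." block, the number of
# core electrons is the atomic number minus the charge on the "core charge" line.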
line = next(inputfile)
while line[15:46] != "*"*31:
if line.find("for atoms ...") >= 0:
atomindex = []
line = next(inputfile)
while line.find("core charge") < 0:
broken = line.split()
atomindex.extend([int(x.split("-")[0]) for x in broken])
line = next(inputfile)
charge = float(line.split()[4])
for idx in atomindex:
self.coreelectrons[idx-1] = self.atomnos[idx-1] - charge
line = next(inputfile)
if line[3:27] == "Wavefunction convergence":
self.scftarget = float(line.split()[-2])
self.scftargets = []
if line[11:22] == "normal mode":
if not hasattr(self, "vibfreqs"):
self.vibfreqs = []
self.vibirs = []
units = next(inputfile)
xyz = next(inputfile)
equals = next(inputfile)
line = next(inputfile)
while line != equals:
temp = line.split()
self.vibfreqs.append(float(temp[1]))
self.vibirs.append(float(temp[-2]))
line = next(inputfile)
# Use the length of the vibdisps to figure out
# how many rotations and translations to remove
self.vibfreqs = self.vibfreqs[-len(self.vibdisps):]
self.vibirs = self.vibirs[-len(self.vibdisps):]
if line[44:73] == "normalised normal coordinates":
self.skip_lines(inputfile, ['e', 'b', 'b'])
self.vibdisps = []
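# Displacements are printed in blocks of up to nine modes, with the x, y and z
# rows for each atom on consecutive lines; p collects the per-atom (x, y, z)
# tuples for each mode before being added to vibdisps.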
freqnum = next(inputfile)
while freqnum.find("=") < 0:
self.skip_lines(inputfile, ['b', 'e', 'freqs', 'e', 'b', 'header', 'e'])
p = [[] for x in range(9)]
for i in range(len(self.atomnos)):
brokenx = list(map(float, next(inputfile)[25:].split()))
brokeny = list(map(float, next(inputfile)[25:].split()))
brokenz = list(map(float, next(inputfile)[25:].split()))
for j, x in enumerate(list(zip(brokenx, brokeny, brokenz))):
p[j].append(x)
self.vibdisps.extend(p)
self.skip_lines(inputfile, ['b', 'b'])
freqnum = next(inputfile)
if line[26:36] == "raman data":
self.vibramans = []
self.skip_lines(inputfile, ['s', 'b', 'header', 'b'])
line = next(inputfile)
while line[1] != "*":
self.vibramans.append(float(line.split()[3]))
self.skip_line(inputfile, 'blank')
line = next(inputfile)
# Use the length of the vibdisps to figure out
# how many rotations and translations to remove
self.vibramans = self.vibramans[-len(self.vibdisps):]
if line[3:11] == "SCF TYPE":
self.scftype = line.split()[-2]
assert self.scftype in ['rhf', 'uhf', 'gvb'], "%s not one of 'rhf', 'uhf' or 'gvb'" % self.scftype
if line[15:31] == "convergence data":
if not hasattr(self, "scfvalues"):
self.scfvalues = []
self.scftargets.append([self.scftarget]) # Assuming it does not change over time
while line[1:10] != "="*9:
line = next(inputfile)
line = next(inputfile)
tester = line.find("tester") # Can be in a different place depending on the file
assert tester >= 0
while line[1:10] != "="*9: # May be two or three lines (unrestricted)
line = next(inputfile)
scfvalues = []
line = next(inputfile)
while line.strip():
# e.g. **** recalulation of fock matrix on iteration 4 (examples/chap12/pyridine.out)
if line[2:6] != "****":
scfvalues.append([float(line[tester-5:tester+6])])
try:
line = next(inputfile)
except StopIteration:
self.logger.warning('File terminated before end of last SCF! Last tester: {}'.format(line.split()[5]))
break
self.scfvalues.append(scfvalues)
if line[10:22] == "total energy" and len(line.split()) == 3:
if not hasattr(self, "scfenergies"):
self.scfenergies = []
scfenergy = utils.convertor(float(line.split()[-1]), "hartree", "eV")
self.scfenergies.append(scfenergy)
# Total energies after Moller-Plesset corrections
# Second order correction is always first, so its first occurrence
# triggers creation of mpenergies (a list of lists of energies).
# Further corrections are appended as found.
# Note: GAMESS-UK sometimes prints only the corrections,
# so they must be added to the last value of scfenergies
if line[10:32] == "mp2 correlation energy" or \
line[10:42] == "second order perturbation energy":
if not hasattr(self, "mpenergies"):
self.mpenergies = []
self.mpenergies.append([])
self.mp2correction = self.float(line.split()[-1])
self.mp2energy = self.scfenergies[-1] + self.mp2correction
self.mpenergies[-1].append(utils.convertor(self.mp2energy, "hartree", "eV"))
if line[10:41] == "third order perturbation energy":
self.mp3correction = self.float(line.split()[-1])
self.mp3energy = self.mp2energy + self.mp3correction
self.mpenergies[-1].append(utils.convertor(self.mp3energy, "hartree", "eV"))
if line[40:59] == "molecular basis set":
self.gbasis = []
line = next(inputfile)
while line.find("contraction coefficients") < 0:
line = next(inputfile)
equals = next(inputfile)
blank = next(inputfile)
atomname = next(inputfile)
basisregexp = re.compile("\d*(\D+)") # Get everything after any digits
shellcounter = 1
while line != equals:
gbasis = [] # Stores basis sets on one atom
blank = next(inputfile)
blank = next(inputfile)
line = next(inputfile)
shellno = int(line.split()[0])
shellgap = shellno - shellcounter
shellsize = 0
while len(line.split()) != 1 and line != equals:
if line.split():
shellsize += 1
coeff = {}
# coefficients and symmetries for a block of rows
while line.strip() and line != equals:
temp = line.strip().split()
# temp[1] may be either like (a) "1s" and "1sp", or (b) "s" and "sp"
# See GAMESS-UK 7.0 distribution/examples/chap12/pyridine2_21m10r.out
# for an example of the latter
sym = basisregexp.match(temp[1]).groups()[0]
assert sym in ['s', 'p', 'd', 'f', 'sp'], "'%s' not a recognized symmetry" % sym
if sym == "sp":
coeff.setdefault("S", []).append((float(temp[3]), float(temp[6])))
coeff.setdefault("P", []).append((float(temp[3]), float(temp[10])))
else:
coeff.setdefault(sym.upper(), []).append((float(temp[3]), float(temp[6])))
line = next(inputfile)
# either a blank or a continuation of the block
if coeff:
if sym == "sp":
gbasis.append(('S', coeff['S']))
gbasis.append(('P', coeff['P']))
else:
gbasis.append((sym.upper(), coeff[sym.upper()]))
if line == equals:
continue
line = next(inputfile)
# either the start of the next block or the start of a new atom or
# the end of the basis function section (signified by a line of equals)
numtoadd = 1 + (shellgap // shellsize)
shellcounter = shellno + shellsize
for x in range(numtoadd):
self.gbasis.append(gbasis)
if line[50:70] == "----- beta set -----":
self.betamosyms = True
self.betamoenergies = True
self.betamocoeffs = True
# betamosyms will be turned off in the next
# SYMMETRY ASSIGNMENT section
if line[31:50] == "SYMMETRY ASSIGNMENT":
if not hasattr(self, "mosyms"):
self.mosyms = []
multiple = {'a': 1, 'b': 1, 'e': 2, 't': 3, 'g': 4, 'h': 5}
equals = next(inputfile)
line = next(inputfile)
while line != equals: # There may be one or two lines of title (compare mg10.out and duhf_1.out)
line = next(inputfile)
mosyms = []
line = next(inputfile)
while line != equals:
temp = line[25:30].strip()
if temp[-1] == '?':
# e.g. e? or t? or g? (see example/chap12/na7mg_uhf.out)
# for two As, an A and an E, and two Es of the same energy respectively.
t = line[91:].strip().split()
for i in range(1, len(t), 2):
for j in range(multiple[t[i][0]]): # add twice for 'e', etc.
mosyms.append(self.normalisesym(t[i]))
else:
for j in range(multiple[temp[0]]):
mosyms.append(self.normalisesym(temp)) # add twice for 'e', etc.
line = next(inputfile)
assert len(mosyms) == self.nmo, "mosyms: %d but nmo: %d" % (len(mosyms), self.nmo)
if self.betamosyms:
# Only append if beta (otherwise with IPRINT SCF
# it will add mosyms for every step of a geo opt)
self.mosyms.append(mosyms)
self.betamosyms = False
elif self.scftype == 'gvb':
# gvb has alpha and beta orbitals but they are identical
self.mosyms = [mosyms, mosyms]
else:
self.mosyms = [mosyms]
if line[50:62] == "eigenvectors":
# MO coefficients... we can also get the eigenvalues from here
# (though only with FORMAT HIGH will they all be present)
if not hasattr(self, "mocoeffs"):
self.aonames = []
aonames = []
minus = next(inputfile)
mocoeffs = numpy.zeros((self.nmo, self.nbasis), "d")
readatombasis = False
if not hasattr(self, "atombasis"):
self.atombasis = []
for i in range(self.natom):
self.atombasis.append([])
readatombasis = True
self.skip_lines(inputfile, ['b', 'b', 'evalues'])
p = re.compile(r"\d+\s+(\d+)\s*(\w+) (\w+)")
oldatomname = "DUMMY VALUE"
mo = 0
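# Coefficients come in blocks of columns (MOs); temp holds the values for one
# basis function across the current block, and mo advances by the block width
# once all nbasis rows have been read.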
while mo < self.nmo:
self.updateprogress(inputfile, "Coefficients")
self.skip_lines(inputfile, ['b', 'b', 'nums', 'b', 'b'])
for basis in range(self.nbasis):
line = next(inputfile)
# Fill atombasis only first time around.
if readatombasis:
orbno = int(line[1:5])-1
atomno = int(line[6:9])-1
self.atombasis[atomno].append(orbno)
if not self.aonames:
pg = p.match(line[:18].strip()).groups()
atomname = "%s%s%s" % (pg[1][0].upper(), pg[1][1:], pg[0])
if atomname != oldatomname:
aonum = 1
oldatomname = atomname
name = "%s_%d%s" % (atomname, aonum, pg[2].upper())
if name in aonames:
aonum += 1
name = "%s_%d%s" % (atomname, aonum, pg[2].upper())
aonames.append(name)
temp = list(map(float, line[19:].split()))
mocoeffs[mo:(mo+len(temp)), basis] = temp
# Fill atombasis only first time around.
readatombasis = False
if not self.aonames:
self.aonames = aonames
line = next(inputfile) # blank line
while not line.strip():
line = next(inputfile)
evalues = line
if evalues[:17].strip(): # i.e. if these aren't evalues
break # Not all the MOs are present
mo += len(temp)
mocoeffs = mocoeffs[0:(mo+len(temp)), :] # In case some aren't present
if self.betamocoeffs:
self.mocoeffs.append(mocoeffs)
else:
self.mocoeffs = [mocoeffs]
if line[7:12] == "irrep":
########## eigenvalues ###########
# This section appears once at the start of a geo-opt and once at the end
# unless IPRINT SCF is used (when it appears at every step in addition)
if not hasattr(self, "moenergies"):
self.moenergies = []
equals = next(inputfile)
while equals[1:5] != "====": # May be one or two lines of title (compare duhf_1.out and mg10.out)
equals = next(inputfile)
moenergies = []
line = next(inputfile)
if not line.strip(): # May be a blank line here (compare duhf_1.out and mg10.out)
line = next(inputfile)
while line.strip() and line != equals: # May end with a blank or equals
temp = line.strip().split()
moenergies.append(utils.convertor(float(temp[2]), "hartree", "eV"))
line = next(inputfile)
self.nmo = len(moenergies)
if self.betamoenergies:
self.moenergies.append(moenergies)
self.betamoenergies = False
elif self.scftype == 'gvb':
self.moenergies = [moenergies, moenergies]
else:
self.moenergies = [moenergies]
# The dipole moment is printed by default at the beginning of the wavefunction analysis,
# but the value is in atomic units, so we need to convert to Debye. It seems pretty
# evident that the reference point is the origin (0,0,0) which is also the center
# of mass after reorientation at the beginning of the job, although this is not
# stated anywhere (would be good to check).
#
# *********************
# wavefunction analysis
# *********************
#
# commence analysis at 24.61 seconds
#
# dipole moments
#
#
# nuclear electronic total
#
# x 0.0000000 0.0000000 0.0000000
# y 0.0000000 0.0000000 0.0000000
# z 0.0000000 0.0000000 0.0000000
#
if line.strip() == "dipole moments":
# In older versions there is only one blank line before the header,
# and in newer versions there are two.
self.skip_line(inputfile, 'blank')
line = next(inputfile)
if not line.strip():
line = next(inputfile)
self.skip_line(inputfile, 'blank')
dipole = []
for i in range(3):
line = next(inputfile)
dipole.append(float(line.split()[-1]))
reference = [0.0, 0.0, 0.0]
dipole = utils.convertor(numpy.array(dipole), "ebohr", "Debye")
if not hasattr(self, 'moments'):
self.moments = [reference, dipole]
else:
assert numpy.all(self.moments[1] == dipole)
# Net atomic charges are not printed at all, it seems,
# but you can get at them from nuclear charges and
# electron populations, which are printed like so:
#
# ---------------------------------------
# mulliken and lowdin population analyses
# ---------------------------------------
#
# ----- total gross population in aos ------
#
# 1 1 c s 1.99066 1.98479
# 2 1 c s 1.14685 1.04816
# ...
#
# ----- total gross population on atoms ----
#
# 1 c 6.0 6.00446 5.99625
# 2 c 6.0 6.00446 5.99625
# 3 c 6.0 6.07671 6.04399
# ...
if line[10:49] == "mulliken and lowdin population analyses":
if not hasattr(self, "atomcharges"):
self.atomcharges = {}
while not "total gross population on atoms" in line:
line = next(inputfile)
self.skip_line(inputfile, 'blank')
line = next(inputfile)
mulliken, lowdin = [], []
while line.strip():
nuclear = float(line.split()[2])
mulliken.append(nuclear - float(line.split()[3]))
lowdin.append(nuclear - float(line.split()[4]))
line = next(inputfile)
self.atomcharges["mulliken"] = mulliken
self.atomcharges["lowdin"] = lowdin
# ----- spinfree UHF natural orbital occupations -----
#
# 2.0000000 2.0000000 2.0000000 2.0000000 2.0000000 2.0000000 2.0000000
#
# 2.0000000 2.0000000 2.0000000 2.0000000 2.0000000 1.9999997 1.9999997
# ...
if "natural orbital occupations" in line:
occupations = []
self.skip_line(inputfile, "blank")
line = inputfile.next()
while line.strip():
occupations += map(float, line.split())
self.skip_line(inputfile, "blank")
line = inputfile.next()
self.set_attribute('nooccnos', occupations)
if line[:33] == ' end of G A M E S S program at':
self.metadata['success'] = True
cclib-1.6.2/cclib/parser/gaussianparser.py 0000664 0000000 0000000 00000246723 13535330462 0020567 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2018, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Parser for Gaussian output files"""
from __future__ import print_function
import re
import numpy
from cclib.parser import data
from cclib.parser import logfileparser
from cclib.parser import utils
class Gaussian(logfileparser.Logfile):
"""A Gaussian 98/03 log file."""
def __init__(self, *args, **kwargs):
# Call the __init__ method of the superclass
super(Gaussian, self).__init__(logname="Gaussian", *args, **kwargs)
def __str__(self):
"""Return a string representation of the object."""
return "Gaussian log file %s" % (self.filename)
def __repr__(self):
"""Return a representation of the object."""
return 'Gaussian("%s")' % (self.filename)
def normalisesym(self, label):
"""Use standard symmetry labels instead of Gaussian labels.
To normalise:
(1) If label is one of [SG, PI, PHI, DLTA], replace by [sigma, pi, phi, delta]
(2) replace any G or U by their lowercase equivalent
"""
# note: DLT must come after DLTA
greek = [('SG', 'sigma'), ('PI', 'pi'), ('PHI', 'phi'),
('DLTA', 'delta'), ('DLT', 'delta')]
for k, v in greek:
if label.startswith(k):
tmp = label[len(k):]
label = v
if tmp:
label = v + "." + tmp
ans = label.replace("U", "u").replace("G", "g")
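# e.g. "SGG" becomes "sigma.g" and "PIU" becomes "pi.u"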
return ans
def before_parsing(self):
# Used to index self.scftargets[].
SCFRMS, SCFMAX, SCFENERGY = list(range(3))
# Extract only well-formed numbers in scientific notation.
self.re_scinot = re.compile('(\w*)=\s*(-?\d\.\d{2}D[+-]\d{2})')
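# e.g. "RMSDP=3.74D-06" is captured as ('RMSDP', '3.74D-06').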
# Extract only well-formed numbers in traditional
# floating-point format.
self.re_float = re.compile('(\w*-?\w*)=\s*(-?\d+\.\d{10,})')
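# e.g. "Energy=   -0.077520562724" is captured as ('Energy', '-0.077520562724').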
# Flag for identifying Coupled Cluster runs.
self.coupledcluster = False
# Fragment number for counterpoise or fragment guess calculations
# (normally zero).
self.counterpoise = 0
# Flag for identifying ONIOM calculations.
self.oniom = False
# Flag for identifying BOMD calculations.
# These calculations have a back-integration algorithm so that not all
# geometries should be kept.
# We also add a "time" attribute to the parser.
self.BOMD = False
# Do we have high-precision polarizabilities printed from a
# dedicated `polar` job? If so, avoid duplicate parsing.
self.hp_polarizabilities = False
def after_parsing(self):
# Correct the percent values in the etsecs in the case of
# a restricted calculation. The following has the
# effect of including each transition twice.
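# (Scaling each coefficient by sqrt(2) doubles its squared weight, which is
# equivalent to counting the alpha and beta excitations together.)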
if hasattr(self, "etsecs") and len(self.homos) == 1:
new_etsecs = [[(x[0], x[1], x[2] * numpy.sqrt(2)) for x in etsec]
for etsec in self.etsecs]
self.etsecs = new_etsecs
if hasattr(self, "scanenergies"):
self.scancoords = []
self.scancoords = self.atomcoords
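# The entropy follows from G = H - T*S, i.e. S = (H - G) / T.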
if (hasattr(self, 'enthalpy') and hasattr(self, 'temperature')
and hasattr(self, 'freeenergy')):
self.set_attribute('entropy', (self.enthalpy - self.freeenergy) / self.temperature)
# This bit is needed in order to trim coordinates that are printed a second time
# at the end of geometry optimizations. Note that we need to do this for both atomcoords
# and inputcoords. The reason is that normally a standard orientation is printed and that
# is what we parse into atomcoords, but inputcoords stores the input (unmodified) coordinates
# and that is copied over to atomcoords if no standard orientation was printed, which happens
# for example for jobs with no symmetry. This last step, however, is now generic for all parsers.
# Perhaps then this part should also be generic code...
# Regression that tests this: Gaussian03/cyclopropenyl.rhf.g03.cut.log
if hasattr(self, 'optstatus') and len(self.optstatus) > 0:
last_point = len(self.optstatus) - 1
if hasattr(self, 'atomcoords'):
self.atomcoords = self.atomcoords[:last_point + 1]
if hasattr(self, 'inputcoords'):
self.inputcoords = self.inputcoords[:last_point + 1]
# If we parsed high-precision vibrational displacements, overwrite
# lower-precision displacements in self.vibdisps
if hasattr(self, 'vibdispshp'):
self.vibdisps = self.vibdispshp
del self.vibdispshp
if hasattr(self, 'time'):
self.time = [self.time[i] for i in sorted(self.time.keys())]
if hasattr(self, 'energies_BOMD'):
self.set_attribute('scfenergies',
[self.energies_BOMD[i] for i in sorted(self.energies_BOMD.keys())])
if hasattr(self, 'atomcoords_BOMD'):
self.atomcoords= \
[self.atomcoords_BOMD[i] for i in sorted(self.atomcoords_BOMD.keys())]
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
# Extract the version number: "Gaussian 09, Revision D.01"
# becomes "09revisionD.01".
if line.strip() == "Cite this work as:":
line = inputfile.next()
tokens = line.split()
self.metadata["package_version"] = ''.join([
tokens[1][:-1],
'revision',
tokens[-1][:-1],
])
if line.strip().startswith("Link1: Proceeding to internal job step number"):
self.new_internal_job()
# This block contains some general information as well as coordinates,
# which could be parsed in the future:
#
# Symbolic Z-matrix:
# Charge = 0 Multiplicity = 1
# C 0.73465 0. 0.
# C 1.93465 0. 0.
# C
# ...
#
# It also lists fragments, if there are any, which is potentially valuable:
#
# Symbolic Z-matrix:
# Charge = 0 Multiplicity = 1 in supermolecule
# Charge = 0 Multiplicity = 1 in fragment 1.
# Charge = 0 Multiplicity = 1 in fragment 2.
# B(Fragment=1) 0.06457 -0.0279 0.01364
# H(Fragment=1) 0.03117 -0.02317 1.21604
# ...
#
# Note, however, that currently we only parse information for the whole system
# or supermolecule as Gaussian calls it.
if line.strip() == "Symbolic Z-matrix:":
self.updateprogress(inputfile, "Symbolic Z-matrix", self.fupdate)
line = inputfile.next()
while line.split()[0] == 'Charge':
# For the supermolecule, we can parse the charge and multiplicity.
regex = ".*=(.*)Mul.*=\s*-?(\d+).*"
match = re.match(regex, line)
assert match, "Something unusual about the line: '%s'" % line
self.set_attribute('charge', int(match.groups()[0]))
self.set_attribute('mult', int(match.groups()[1]))
if line.split()[-2] == "fragment":
self.nfragments = int(line.split()[-1].strip('.'))
if line.strip()[-13:] == "model system.":
self.nmodels = getattr(self, 'nmodels', 0) + 1
line = inputfile.next()
# The remaining part will allow us to get the atom count.
# When coordinates are given, there is a blank line at the end, but if
# there is a Z-matrix here, there will also be variables and we need to
# stop at those to get the right atom count.
# Also, in older versions there is no blank line (G98 regressions),
# so we need to watch out for leaving the link.
natom = 0
while line.split() and not "Variables" in line and not "Leave Link" in line:
natom += 1
line = inputfile.next()
self.set_attribute('natom', natom)
# Continuing from above, there is not always a symbolic matrix, for example
# if the Z-matrix was in the input file. In such cases, try to match the
# line and get at the charge and multiplicity.
#
# Charge = 0 Multiplicity = 1 in supermolecule
# Charge = 0 Multiplicity = 1 in fragment 1.
# Charge = 0 Multiplicity = 1 in fragment 2.
if line[1:7] == 'Charge' and line.find("Multiplicity") >= 0:
self.updateprogress(inputfile, "Charge and Multiplicity", self.fupdate)
if line.split()[-1] == "supermolecule" or not "fragment" in line:
regex = ".*=(.*)Mul.*=\s*-?(\d+).*"
match = re.match(regex, line)
assert match, "Something unusual about the line: '%s'" % line
self.set_attribute('charge', int(match.groups()[0]))
self.set_attribute('mult', int(match.groups()[1]))
if line.split()[-2] == "fragment":
self.nfragments = int(line.split()[-1].strip('.'))
# Number of atoms is also explicitly printed after the above.
if line[1:8] == "NAtoms=":
self.updateprogress(inputfile, "Attributes", self.fupdate)
natom = int(re.search('NAtoms=\s*(\d+)', line).group(1))
self.set_attribute('natom', natom)
# Basis set name
if line[1:15] == "Standard basis":
self.metadata["basis_set"] = line.split()[2]
# Dipole moment
# e.g. from G09
# Dipole moment (field-independent basis, Debye):
# X= 0.0000 Y= 0.0000 Z= 0.0930
# e.g. from G03
# X= 0.0000 Y= 0.0000 Z= -1.6735 Tot= 1.6735
# We need the "field-independent" part, since ONIOM and other calculations use different formats.
if line[1:39] == "Dipole moment (field-independent basis":
self.updateprogress(inputfile, "Dipole and Higher Moments", self.fupdate)
self.reference = [0.0, 0.0, 0.0]
self.moments = [self.reference]
tokens = inputfile.next().split()
# Use split, since the dipole would need to be *huge* to break it,
# and G03 and G09 use different spacing.
if len(tokens) >= 6:
dipole = (float(tokens[1]), float(tokens[3]), float(tokens[5]))
if not hasattr(self, 'moments'):
self.moments = [self.reference, dipole]
else:
self.moments.append(dipole)
if line[1:43] == "Quadrupole moment (field-independent basis":
# e.g. (g09)
# Quadrupole moment (field-independent basis, Debye-Ang):
# XX= -6.1213 YY= -4.2950 ZZ= -5.4175
# XY= 0.0000 XZ= 0.0000 YZ= 0.0000
# or from g03
# XX= -6.1213 YY= -4.2950 ZZ= -5.4175
quadrupole = {}
for j in range(2): # two rows
line = inputfile.next()
if line[22] == '=': # g03 file
for i in (1, 18, 35):
quadrupole[line[i:i+4]] = float(line[i+5:i+16])
else:
for i in (1, 27, 53):
quadrupole[line[i:i+4]] = float(line[i+5:i+25])
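# Store the components sorted by their label slices so the ordering is deterministic.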
lex = sorted(quadrupole.keys())
quadrupole = [quadrupole[key] for key in lex]
if not hasattr(self, 'moments') or len(self.moments) < 2:
self.logger.warning("Found quadrupole moments but no previous dipole")
self.reference = [0.0, 0.0, 0.0]
self.moments = [self.reference, None, quadrupole]
else:
if len(self.moments) == 2:
self.moments.append(quadrupole)
else:
assert self.moments[2] == quadrupole
if line[1:41] == "Octapole moment (field-independent basis":
# e.g.
# Octapole moment (field-independent basis, Debye-Ang**2):
# XXX= 0.0000 YYY= 0.0000 ZZZ= -0.1457 XYY= 0.0000
# XXY= 0.0000 XXZ= 0.0136 XZZ= 0.0000 YZZ= 0.0000
# YYZ= -0.5848 XYZ= 0.0000
octapole = {}
for j in range(2): # two rows
line = inputfile.next()
if line[22] == '=': # g03 file
for i in (1, 18, 35, 52):
octapole[line[i:i+4]] = float(line[i+5:i+16])
else:
for i in (1, 27, 53, 79):
octapole[line[i:i+4]] = float(line[i+5:i+25])
# last line only 2 moments
line = inputfile.next()
if line[22] == '=': # g03 file
for i in (1, 18):
octapole[line[i:i+4]] = float(line[i+5:i+16])
else:
for i in (1, 27):
octapole[line[i:i+4]] = float(line[i+5:i+25])
lex = sorted(octapole.keys())
octapole = [octapole[key] for key in lex]
if not hasattr(self, 'moments') or len(self.moments) < 3:
self.logger.warning("Found octapole moments but no previous dipole or quadrupole")
self.reference = [0.0, 0.0, 0.0]
self.moments = [self.reference, None, None, octapole]
else:
if len(self.moments) == 3:
self.moments.append(octapole)
else:
assert self.moments[3] == octapole
if line[1:20] == "Hexadecapole moment":
# e.g.
# Hexadecapole moment (field-independent basis, Debye-Ang**3):
# XXXX= -3.2614 YYYY= -6.8264 ZZZZ= -4.9965 XXXY= 0.0000
# XXXZ= 0.0000 YYYX= 0.0000 YYYZ= 0.0000 ZZZX= 0.0000
# ZZZY= 0.0000 XXYY= -1.8585 XXZZ= -1.4123 YYZZ= -1.7504
# XXYZ= 0.0000 YYXZ= 0.0000 ZZXY= 0.0000
hexadecapole = {}
# read three lines worth of 4 moments per line
for j in range(3):
line = inputfile.next()
if line[22] == '=': # g03 file
for i in (1, 18, 35, 52):
hexadecapole[line[i:i+4]] = float(line[i+5:i+16])
else:
for i in (1, 27, 53, 79):
hexadecapole[line[i:i+4]] = float(line[i+5:i+25])
# last line only 3 moments
line = inputfile.next()
if line[22] == '=': # g03 file
for i in (1, 18, 35):
hexadecapole[line[i:i+4]] = float(line[i+5:i+16])
else:
for i in (1, 27, 53):
hexadecapole[line[i:i+4]] = float(line[i+5:i+25])
lex = sorted(hexadecapole.keys())
hexadecapole = [hexadecapole[key] for key in lex]
if not hasattr(self, 'moments') or len(self.moments) < 4:
self.reference = [0.0, 0.0, 0.0]
self.moments = [self.reference, None, None, None, hexadecapole]
else:
if len(self.moments) == 4:
self.append_attribute("moments", hexadecapole)
else:
try:
numpy.testing.assert_equal(self.moments[4], hexadecapole)
except AssertionError:
self.logger.warning("Attribute hexadecapole changed value (%s -> %s)" % (self.moments[4], hexadecapole))
self.append_attribute("moments", hexadecapole)
# Catch message about completed optimization.
if line[1:23] == "Optimization completed":
if not hasattr(self, 'optdone'):
self.optdone = []
self.optdone.append(len(self.geovalues) - 1)
assert hasattr(self, "optstatus") and len(self.optstatus) > 0
self.optstatus[-1] += data.ccData.OPT_DONE
# Catch message about stopped optimization (not converged).
if line[1:21] == "Optimization stopped":
if not hasattr(self, "optdone"):
self.optdone = []
assert hasattr(self, "optstatus") and len(self.optstatus) > 0
self.optstatus[-1] += data.ccData.OPT_UNCONVERGED
# Extract the atomic numbers and coordinates from the input orientation,
# in the event the standard orientation isn't available.
# Don't extract coordinates from the Input or Z-matrix orientation in a BOMD run,
# as only the final geometry should be kept, but do extract inputatoms.
if line.find("Input orientation") > -1 or line.find("Z-Matrix orientation") > -1:
# If this is a counterpoise calculation, this output means that
# the supermolecule is now being considered, so we can set:
self.counterpoise = 0
self.updateprogress(inputfile, "Attributes", self.cupdate)
if not self.BOMD and not hasattr(self, "inputcoords"):
self.inputcoords = []
self.inputatoms = []
self.skip_lines(inputfile, ['d', 'cols', 'cols', 'd'])
atomcoords = []
line = next(inputfile)
while list(set(line.strip())) != ["-"]:
broken = line.split()
self.inputatoms.append(int(broken[1]))
atomcoords.append(list(map(float, broken[3:6])))
line = next(inputfile)
if not self.BOMD: self.inputcoords.append(atomcoords)
self.set_attribute('atomnos', self.inputatoms)
self.set_attribute('natom', len(self.inputatoms))
if self.BOMD and line.startswith(' Summary information for step'):
# We keep time and energies_BOMD and coordinates in a dictionary
# because steps can be recalculated, and we need to overwrite the
# previous data
broken = line.split()
step = int(broken[-1])
line = next(inputfile)
broken = line.split()
if not hasattr(self, "time"):
self.set_attribute('time', {step:float(broken[-1])})
else:
self.time[step] = float(broken[-1])
line = next(inputfile)
broken = line.split(';')[1].split()
ene = utils.convertor(self.float(broken[-1]), "hartree", "eV")
if not hasattr(self, "energies_BOMD"):
self.set_attribute('energies_BOMD', {step:ene})
else:
self.energies_BOMD[step] = ene
self.updateprogress(inputfile, "Attributes", self.cupdate)
if not hasattr(self, "atomcoords_BOMD"):
self.atomcoords_BOMD = {}
#self.inputatoms = []
self.skip_lines(inputfile, ['EKin', 'Angular', 'JX', 'Total', 'Total', 'Cartesian'])
atomcoords = []
line = next(inputfile)
while not "MW cartesian" in line:
broken = line.split()
atomcoords.append(list(map(self.float, (broken[3], broken[5], broken[7]))))
# self.inputatoms.append(int(broken[1]))
line = next(inputfile)
self.atomcoords_BOMD[step] = atomcoords
#self.set_attribute('atomnos', self.inputatoms)
#self.set_attribute('natom', len(self.inputatoms))
# Extract the atomic masses.
# Typical section:
# Isotopes and Nuclear Properties:
#(Nuclear quadrupole moments (NQMom) in fm**2, nuclear magnetic moments (NMagM)
# in nuclear magnetons)
#
# Atom 1 2 3 4 5 6 7 8 9 10
# IAtWgt= 12 12 12 12 12 1 1 1 12 12
# AtmWgt= 12.0000000 12.0000000 12.0000000 12.0000000 12.0000000 1.0078250 1.0078250 1.0078250 12.0000000 12.0000000
# NucSpn= 0 0 0 0 0 1 1 1 0 0
# AtZEff= -3.6000000 -3.6000000 -3.6000000 -3.6000000 -3.6000000 -1.0000000 -1.0000000 -1.0000000 -3.6000000 -3.6000000
# NQMom= 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
# NMagM= 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 2.7928460 2.7928460 2.7928460 0.0000000 0.0000000
# ... with blank lines dividing blocks of ten, and Leave Link 101 at the end.
# This is generally parsed before coordinates, so atomnos is not defined.
# Note that in Gaussian03 the comments are not there yet and the labels are different.
if line.strip() == "Isotopes and Nuclear Properties:":
if not hasattr(self, "atommasses"):
self.atommasses = []
line = next(inputfile)
while line[1:16] != "Leave Link 101":
if line[1:8] == "AtmWgt=":
self.atommasses.extend(list(map(float, line.split()[1:])))
line = next(inputfile)
# Extract the atomic numbers and coordinates of the atoms.
if line.strip() == "Standard orientation:":
self.updateprogress(inputfile, "Attributes", self.cupdate)
# If this is a counterpoise calculation, this output means that
# the supermolecule is now being considered, so we can set:
self.counterpoise = 0
if not hasattr(self, "atomcoords"):
self.atomcoords = []
self.skip_lines(inputfile, ['d', 'cols', 'cols', 'd'])
atomnos = []
atomcoords = []
line = next(inputfile)
while list(set(line.strip())) != ["-"]:
broken = line.split()
atomnos.append(int(broken[1]))
atomcoords.append(list(map(float, broken[-3:])))
line = next(inputfile)
self.atomcoords.append(atomcoords)
self.set_attribute('natom', len(atomnos))
self.set_attribute('atomnos', atomnos)
# This is a bit of a hack for regression Gaussian09/BH3_fragment_guess.pop_minimal.log
# to skip output for all fragments, assuming the supermolecule is always printed first.
# Eventually we want to make this more general, or even better parse the output for
# all fragments, but that will happen in a newer version of cclib.
if line[1:16] == "Fragment guess:" and getattr(self, 'nfragments', 0) > 1:
if not "full" in line:
inputfile.seek(0, 2)
# Another hack for regression Gaussian03/ortho_prod_prod_freq.log, which is an ONIOM job.
# Basically for now we stop parsing after the output for the real system, because
# currently we don't support changes in system size or fragments in cclib. When we do,
# we will want to parse the model systems, too, and that is what nmodels could track.
if "ONIOM: generating point" in line and line.strip()[-13:] == 'model system.' and getattr(self, 'nmodels', 0) > 0:
inputfile.seek(0, 2)
# With the gfinput keyword, the atomic basis set functions are printed like this:
#
# AO basis set in the form of general basis input (Overlap normalization):
# 1 0
# S 3 1.00 0.000000000000
# 0.7161683735D+02 0.1543289673D+00
# 0.1304509632D+02 0.5353281423D+00
# 0.3530512160D+01 0.4446345422D+00
# SP 3 1.00 0.000000000000
# 0.2941249355D+01 -0.9996722919D-01 0.1559162750D+00
# 0.6834830964D+00 0.3995128261D+00 0.6076837186D+00
# 0.2222899159D+00 0.7001154689D+00 0.3919573931D+00
# ****
# 2 0
# S 3 1.00 0.000000000000
# 0.7161683735D+02 0.1543289673D+00
# ...
#
# The same is also printed when the gfprint keyword is used, but the
# interstitial lines differ and there are no stars between atoms:
#
# AO basis set (Overlap normalization):
# Atom C1 Shell 1 S 3 bf 1 - 1 0.509245180608 -2.664678875191 0.000000000000
# 0.7161683735D+02 0.1543289673D+00
# 0.1304509632D+02 0.5353281423D+00
# 0.3530512160D+01 0.4446345422D+00
# Atom C1 Shell 2 SP 3 bf 2 - 5 0.509245180608 -2.664678875191 0.000000000000
# 0.2941249355D+01 -0.9996722919D-01 0.1559162750D+00
# ...
# ONIOM calculations result in basis sets being reported for atoms that are not in order of atom number, which breaks this code (line 390 relies on atoms coming in order).
if line[1:13] == "AO basis set" and not self.oniom:
self.gbasis = []
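# gbasis ends up with one entry per atom: a list of (subshell, contractions)
# tuples, where contractions is a list of (exponent, coefficient) pairs.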
# For counterpoise fragment calculations, skip these lines.
if self.counterpoise != 0:
return
atom_line = inputfile.next()
self.gfprint = atom_line.split()[0] == "Atom"
self.gfinput = not self.gfprint
# Note how the shell information is on a separate line for gfinput,
# whereas for gfprint it is on the same line as atom information.
if self.gfinput:
shell_line = inputfile.next()
shell = []
while len(self.gbasis) < self.natom:
if self.gfprint:
cols = atom_line.split()
subshells = cols[4]
ngauss = int(cols[5])
else:
cols = shell_line.split()
subshells = cols[0]
ngauss = int(cols[1])
parameters = []
for ig in range(ngauss):
line = inputfile.next()
parameters.append(list(map(self.float, line.split())))
for iss, ss in enumerate(subshells):
contractions = []
for param in parameters:
exponent = param[0]
coefficient = param[iss+1]
contractions.append((exponent, coefficient))
subshell = (ss, contractions)
shell.append(subshell)
if self.gfprint:
line = inputfile.next()
if line.split()[0] == "Atom":
atomnum = int(re.sub(r"\D", "", line.split()[1]))
if atomnum == len(self.gbasis) + 2:
self.gbasis.append(shell)
shell = []
atom_line = line
else:
self.gbasis.append(shell)
else:
line = inputfile.next()
if line.strip() == "****":
self.gbasis.append(shell)
shell = []
atom_line = inputfile.next()
shell_line = inputfile.next()
else:
shell_line = line
# Find the targets for SCF convergence (QM calcs).
# Not for BOMD as targets are not available in the summary
if not self.BOMD and line[1:44] == 'Requested convergence on RMS density matrix':
if not hasattr(self, "scftargets"):
self.scftargets = []
# The following can happen with ONIOM which are mixed SCF
# and semi-empirical
if type(self.scftargets) == type(numpy.array([])):
self.scftargets = []
scftargets = []
# The RMS density matrix.
scftargets.append(self.float(line.split('=')[1].split()[0]))
line = next(inputfile)
# The MAX density matrix.
scftargets.append(self.float(line.strip().split('=')[1][:-1]))
line = next(inputfile)
# For G03, there's also the energy (not for G98).
if line[1:10] == "Requested":
scftargets.append(self.float(line.strip().split('=')[1][:-1]))
self.scftargets.append(scftargets)
# Extract SCF convergence information (QM calcs).
if line[1:10] == 'Cycle 1':
if not hasattr(self, "scfvalues"):
self.scfvalues = []
scfvalues = []
line = next(inputfile)
while line.find("SCF Done") == -1:
self.updateprogress(inputfile, "QM convergence", self.fupdate)
if line.find(' E=') == 0:
self.logger.debug(line)
# RMSDP=3.74D-06 MaxDP=7.27D-05 DE=-1.73D-07 OVMax= 3.67D-05
# or
# RMSDP=1.13D-05 MaxDP=1.08D-04 OVMax= 1.66D-04
if line.find(" RMSDP") == 0:
# Fields of interest:
# RMSDP
# MaxDP
# (DE) -> Only add the energy if it's a target criterion
matches = self.re_scinot.findall(line)
matches = {
match[0]: self.float(match[1])
for match in matches
}
scfvalues_step = [
matches.get('RMSDP', numpy.nan),
matches.get('MaxDP', numpy.nan)
]
if len(self.scftargets[0]) == 3:
scfvalues_step.append(matches.get('DE', numpy.nan))
scfvalues.append(scfvalues_step)
try:
line = next(inputfile)
# May be interrupted by EOF.
except StopIteration:
self.logger.warning('File terminated before end of last SCF!')
break
self.scfvalues.append(scfvalues)
# Extract SCF convergence information (AM1, INDO and other semi-empirical calcs).
# The output (for AM1) looks like this:
# Ext34=T Pulay=F Camp-King=F BShift= 0.00D+00
# It= 1 PL= 0.103D+01 DiagD=T ESCF= 31.564733 Diff= 0.272D+02 RMSDP= 0.152D+00.
# It= 2 PL= 0.114D+00 DiagD=T ESCF= 7.265370 Diff=-0.243D+02 RMSDP= 0.589D-02.
# ...
# It= 11 PL= 0.184D-04 DiagD=F ESCF= 4.687669 Diff= 0.260D-05 RMSDP= 0.134D-05.
# It= 12 PL= 0.105D-04 DiagD=F ESCF= 4.687669 Diff=-0.686D-07 RMSDP= 0.215D-05.
# 4-point extrapolation.
# It= 13 PL= 0.110D-05 DiagD=F ESCF= 4.687669 Diff=-0.111D-06 RMSDP= 0.653D-07.
# Energy= 0.172272018655 NIter= 14.
if line[1:4] == 'It=':
scftargets = numpy.array([1E-7], "d") # This is the target value for the rms
scfvalues = [[]]
while line.find(" Energy") == -1:
self.updateprogress(inputfile, "AM1 Convergence")
if line[1:4] == "It=":
parts = line.strip().split()
scfvalues[0].append(self.float(parts[-1][:-1]))
line = next(inputfile)
# If an AM1 or INDO guess is used (Guess=INDO in the input, for example),
# this will be printed after a single iteration, so that is the line
# that should trigger a break from this loop. At least that's what we see
# for regression Gaussian/Gaussian09/guessIndo_modified_ALT.out
if line[:14] == " Initial guess":
break
# Attach the attributes to the object only after the energy is found.
if line.find(" Energy") == 0:
self.scftargets = scftargets
self.scfvalues = scfvalues
# Note: this needs to follow the section where 'SCF Done' is used
# to terminate a loop when extracting SCF convergence information.
if not self.BOMD and line[1:9] == 'SCF Done':
t1 = line.split()[2]
if t1 == 'E(RHF)':
self.metadata["methods"].append("HF")
else:
self.metadata["methods"].append("DFT")
self.metadata["functional"] = t1[t1.index("(") + 2:t1.rindex(")")]
if not hasattr(self, "scfenergies"):
self.scfenergies = []
self.scfenergies.append(utils.convertor(self.float(line.split()[4]), "hartree", "eV"))
# gmagoon 5/27/09: added scfenergies reading for PM3 case
# Example line: " Energy= -0.077520562724 NIter= 14."
# See regression Gaussian03/QVGXLLKOCUKJST-UHFFFAOYAJmult3Fixed.out
if line[1:8] == 'Energy=':
if not hasattr(self, "scfenergies"):
self.scfenergies = []
self.scfenergies.append(utils.convertor(self.float(line.split()[1]), "hartree", "eV"))
# Total energies after Moller-Plesset corrections.
# Second order correction is always first, so its first occurrence
# triggers creation of mpenergies (list of lists of energies).
# Further MP2 corrections are appended as found.
#
# Example MP2 output line:
# E2 = -0.9505918144D+00 EUMP2 = -0.28670924198852D+03
# Warning! this output line is subtly different for MP3/4/5 runs
if "EUMP2" in line[27:34]:
self.metadata["methods"].append("MP2")
if not hasattr(self, "mpenergies"):
self.mpenergies = []
self.mpenergies.append([])
mp2energy = self.float(line.split("=")[2])
self.mpenergies[-1].append(utils.convertor(mp2energy, "hartree", "eV"))
# Example MP3 output line:
# E3= -0.10518801D-01 EUMP3= -0.75012800924D+02
if line[34:39] == "EUMP3":
self.metadata["methods"].append("MP3")
mp3energy = self.float(line.split("=")[2])
self.mpenergies[-1].append(utils.convertor(mp3energy, "hartree", "eV"))
# Example MP4 output lines:
# E4(DQ)= -0.31002157D-02 UMP4(DQ)= -0.75015901139D+02
# E4(SDQ)= -0.32127241D-02 UMP4(SDQ)= -0.75016013648D+02
# E4(SDTQ)= -0.32671209D-02 UMP4(SDTQ)= -0.75016068045D+02
# Only the energy for the highest level of substitutions is used (SDTQ by default).
if line[34:42] == "UMP4(DQ)":
self.metadata["methods"].append("MP4")
mp4energy = self.float(line.split("=")[2])
line = next(inputfile)
if line[34:43] == "UMP4(SDQ)":
mp4energy = self.float(line.split("=")[2])
line = next(inputfile)
if line[34:44] == "UMP4(SDTQ)":
mp4energy = self.float(line.split("=")[2])
self.mpenergies[-1].append(utils.convertor(mp4energy, "hartree", "eV"))
# Example MP5 output line:
# DEMP5 = -0.11048812312D-02 MP5 = -0.75017172926D+02
if line[29:32] == "MP5":
self.metadata["methods"].append("MP5")
mp5energy = self.float(line.split("=")[2])
self.mpenergies[-1].append(utils.convertor(mp5energy, "hartree", "eV"))
# Total energies after Coupled Cluster corrections.
# Second order MBPT energies (MP2) are also calculated for these runs,
# but the output is the same as when parsing for mpenergies.
# Read the consecutive correlated energies
# but append only the last one to ccenergies.
# Only the highest level energy is appended - e.g. CCSD(T), not CCSD.
if line[1:10] == "DE(Corr)=" and line[27:35] == "E(CORR)=":
self.metadata["methods"].append("CCSD")
self.ccenergy = self.float(line.split()[3])
if line[1:10] == "T5(CCSD)=":
line = next(inputfile)
if line[1:9] == "CCSD(T)=":
self.metadata["methods"].append("CCSD-T")
self.ccenergy = self.float(line.split()[1])
if line[12:53] == "Population analysis using the SCF density":
if hasattr(self, "ccenergy"):
if not hasattr(self, "ccenergies"):
self.ccenergies = []
self.ccenergies.append(utils.convertor(self.ccenergy, "hartree", "eV"))
del self.ccenergy
# Find step number for current optimization/IRC
# Matches "Step number 123", "Pt XX Step number 123" and "PtXXX Step number 123"
if " Step number" in line:
step = int(line.split()[line.split().index('Step') + 2])
if not hasattr(self, "optstatus"):
self.optstatus = []
self.optstatus.append(data.ccData.OPT_UNKNOWN)
if step == 1:
self.optstatus[-1] += data.ccData.OPT_NEW
# Geometry convergence information.
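# Illustrative sketch (values made up, not from any regression file) of the
# Gaussian block matched below; parts[2] holds the current value and parts[3]
# the threshold for each of the four criteria:
#          Item               Value     Threshold  Converged?
#  Maximum Force            0.000319     0.000450     YES
#  RMS     Force            0.000096     0.000300     YES
#  Maximum Displacement     0.001491     0.001800     YES
#  RMS     Displacement     0.000588     0.001200     YES
# >>> " Maximum Force            0.000319     0.000450     YES".split()[2:4]
# ['0.000319', '0.000450']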
if line[49:59] == 'Converged?':
if not hasattr(self, "geotargets"):
self.geovalues = []
self.geotargets = numpy.array([0.0, 0.0, 0.0, 0.0], "d")
newlist = [0]*4
for i in range(4):
line = next(inputfile)
self.logger.debug(line)
parts = line.split()
try:
value = self.float(parts[2])
except ValueError:
self.logger.error("Problem parsing the value for geometry optimisation: %s is not a number." % parts[2])
else:
newlist[i] = value
self.geotargets[i] = self.float(parts[3])
self.geovalues.append(newlist)
# Gradients.
# Read in the cartesian energy gradients (forces) from a block like this:
# -------------------------------------------------------------------
# Center Atomic Forces (Hartrees/Bohr)
# Number Number X Y Z
# -------------------------------------------------------------------
# 1 1 -0.012534744 -0.021754635 -0.008346094
# 2 6 0.018984731 0.032948887 -0.038003451
# 3 1 -0.002133484 -0.006226040 0.023174772
# 4 1 -0.004316502 -0.004968213 0.023174772
# -2 -0.001830728 -0.000743108 -0.000196625
# ------------------------------------------------------------------
#
# The "-2" line is for a dummy atom
#
# When the optimization is done in internal coordinates, Gaussian also
# prints the forces in internal coordinates, which can be derived from
# the ones above. This block looks like this:
# Variable Old X -DE/DX Delta X Delta X Delta X New X
# (Linear) (Quad) (Total)
# ch 2.05980 0.01260 0.00000 0.01134 0.01134 2.07114
# hch 1.75406 0.09547 0.00000 0.24861 0.24861 2.00267
# hchh 2.09614 0.01261 0.00000 0.16875 0.16875 2.26489
# Item Value Threshold Converged?
# We could get the gradients in BOMD, but it is more complex because
# they are not in the summary, and they are not as relevant as for
# an optimization
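# Illustrative sketch (not part of the original parser) of the fixed-width
# slicing used below: each Cartesian force component occupies a 15-character
# field starting at column 23 of the line.
# >>> line = " " * 23 + "   -0.012534744   -0.021754635   -0.008346094"
# >>> [float(line[23 + N * 15:38 + N * 15]) for N in range(3)]
# [-0.012534744, -0.021754635, -0.008346094]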
if not self.BOMD and line[37:43] == "Forces":
if not hasattr(self, "grads"):
self.grads = []
self.skip_lines(inputfile, ['header', 'd'])
forces = []
line = next(inputfile)
while list(set(line.strip())) != ['-']:
tmpforces = []
for N in range(3): # Fx, Fy, Fz
force = line[23+N*15:38+N*15]
if force.startswith("*"):
force = "NaN"
tmpforces.append(float(force))
forces.append(tmpforces)
line = next(inputfile)
self.grads.append(forces)
# Extract PES scan data
#Summary of the potential surface scan:
# N A SCF
#---- --------- -----------
# 1 109.0000 -76.43373
# 2 119.0000 -76.43011
# 3 129.0000 -76.42311
# 4 139.0000 -76.41398
# 5 149.0000 -76.40420
# 6 159.0000 -76.39541
# 7 169.0000 -76.38916
# 8 179.0000 -76.38664
# 9 189.0000 -76.38833
# 10 199.0000 -76.39391
# 11 209.0000 -76.40231
#---- --------- -----------
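# Illustrative sketch (not part of the original parser): each data row splits
# into an index, one or more scanned parameters, and the SCF energy, so the
# last field is the energy and the middle fields are the parameters.
# >>> broken = "    1       109.0000    -76.43373".split()
# >>> float(broken[-1]), [float(x) for x in broken[1:-1]]
# (-76.43373, [109.0])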
if "Summary of the potential surface scan:" in line:
scanenergies = []
scanparm = []
colmnames = next(inputfile)
hyphens = next(inputfile)
line = next(inputfile)
while line != hyphens:
broken = line.split()
scanenergies.append(float(broken[-1]))
scanparm.append([float(x) for x in broken[1:-1]])  # store a list, not a lazy map object
line = next(inputfile)
if not hasattr(self, "scanenergies"):
self.scanenergies = []
self.scanenergies = scanenergies
if not hasattr(self, "scanparm"):
self.scanparm = []
self.scanparm = scanparm
if not hasattr(self, "scannames"):
self.scannames = colmnames.split()[1:-1]
# Orbital symmetries.
if line[1:20] == 'Orbital symmetries:' and not hasattr(self, "mosyms"):
# For counterpoise fragments, skip these lines.
if self.counterpoise != 0:
return
self.updateprogress(inputfile, "MO Symmetries", self.fupdate)
self.mosyms = [[]]
line = next(inputfile)
unres = False
if line.find("Alpha Orbitals") == 1:
unres = True
line = next(inputfile)
i = 0
while len(line) > 18 and line[17] == '(':
if line.find('Virtual') >= 0:
self.homos = [i - 1]
parts = line[17:].split()
for x in parts:
self.mosyms[0].append(self.normalisesym(x.strip('()')))
i += 1
line = next(inputfile)
if unres:
line = next(inputfile)
# Repeat with beta orbital information
i = 0
self.mosyms.append([])
while len(line) > 18 and line[17] == '(':
if line.find('Virtual') >= 0:
# Here we consider beta
# If there was also an alpha virtual orbital,
# we will store two indices in the array
# Otherwise there is no alpha virtual orbital,
# only beta virtual orbitals, and we initialize
# the array with one element. See the regression
# QVGXLLKOCUKJST-UHFFFAOYAJmult3Fixed.out
# donated by Gregory Magoon (gmagoon).
if (hasattr(self, "homos")):
# Extend the array to two elements
# 'HOMO' indexes the HOMO in the arrays
self.homos.append(i-1)
else:
# 'HOMO' indexes the HOMO in the arrays
self.homos = [i - 1]
parts = line[17:].split()
for x in parts:
self.mosyms[1].append(self.normalisesym(x.strip('()')))
i += 1
line = next(inputfile)
# Some calculations won't explicitly print the number of basis functions used,
# and will occasionally drop some without warning. We can infer the number,
# however, from the MO symmetries printed here. Specifically, this fixes
# regression Gaussian/Gaussian09/dvb_sp_terse.log (#23 on github).
self.set_attribute('nmo', len(self.mosyms[-1]))
# Alpha/Beta electron eigenvalues.
if line[1:6] == "Alpha" and line.find("eigenvalues") >= 0:
# For counterpoise fragments, skip these lines.
if self.counterpoise != 0:
return
# For ONIOM calcs, ignore this section in order to bypass assertion failure.
if self.oniom:
return
self.updateprogress(inputfile, "Eigenvalues", self.fupdate)
self.moenergies = [[]]
HOMO = -2
while line.find('Alpha') == 1:
if line.split()[1] == "virt." and HOMO == -2:
# If there aren't any symmetries, this is a good way to find the HOMO.
HOMO = len(self.moenergies[0])-1
self.homos = [HOMO]
# Convert to floats and append to moenergies, but sometimes Gaussian
# doesn't print correctly so test for ValueError (bug 1756789).
part = line[28:]
i = 0
while i*10+4 < len(part):
s = part[i*10:(i+1)*10]
try:
x = self.float(s)
except ValueError:
x = numpy.nan
self.moenergies[0].append(utils.convertor(x, "hartree", "eV"))
i += 1
line = next(inputfile)
# If, at this point, self.homos is unset, then there were no
# alpha virtual orbitals.
if not hasattr(self, "homos"):
HOMO = len(self.moenergies[0])-1
self.homos = [HOMO]
if line.find('Beta') == 2:
self.moenergies.append([])
HOMO = -2
while line.find('Beta') == 2:
if line.split()[1] == "virt." and HOMO == -2:
# If there aren't any symmetries, this is a good way to find the HOMO.
# Also, check for consistency if homos was already parsed.
HOMO = len(self.moenergies[1])-1
self.homos.append(HOMO)
part = line[28:]
i = 0
while i*10+4 < len(part):
x = part[i*10:(i+1)*10]
self.moenergies[1].append(utils.convertor(self.float(x), "hartree", "eV"))
i += 1
line = next(inputfile)
self.moenergies = [numpy.array(x, "d") for x in self.moenergies]
# Start of the IR/Raman frequency section.
# Caution is advised here, as additional frequency blocks
# can be printed by Gaussian (with slightly different formats),
# often doubling the information printed.
# See, for a non-standard example, regression Gaussian98/test_H2.log
# If either the Gaussian freq=hpmodes keyword or IOP(7/33=1) is used,
# an extra frequency block with higher-precision vibdisps is
# printed before the normal frequency block.
# Note that the code parses only the vibsyms and vibdisps
# from the high-precision block, but parses vibsyms, vibfreqs,
# vibramans and vibirs from the normal block. vibsyms parsed
# from the high-precision block are discarded and replaced by those
# from the normal block while the high-precision vibdisps, if present,
# are used to overwrite default-precision vibdisps at the end of the parse.
if line[1:14] == "Harmonic freq": # This matches in both freq block types
self.updateprogress(inputfile, "Frequency Information", self.fupdate)
# The whole block should not have any blank lines.
while line.strip() != "":
# The line with indices
if line[1:15].strip() == "" and line[15:60].split()[0].isdigit():
freqbase = int(line[15:60].split()[0])
if freqbase == 1 and hasattr(self, 'vibsyms'):
# we are coming across duplicated information.
# We might be parsing a default-precision block having
# already parsed (only) vibsyms and displacements from
# the high-precision block, or might be encountering
# a second low-precision block (see e.g. 25DMF_HRANH.log
# regression).
self.vibsyms = []
if hasattr(self, "vibirs"):
self.vibirs = []
if hasattr(self, 'vibfreqs'):
self.vibfreqs = []
if hasattr(self, 'vibramans'):
self.vibramans = []
if hasattr(self, 'vibdisps'):
self.vibdisps = []
# Lines with symmetries and symm. indices begin with whitespace.
if line[1:15].strip() == "" and not line[15:60].split()[0].isdigit():
if not hasattr(self, 'vibsyms'):
self.vibsyms = []
syms = line.split()
self.vibsyms.extend(syms)
if line[1:15] == "Frequencies --": # note: this matches in both freq block types
if not hasattr(self, 'vibfreqs'):
self.vibfreqs = []
freqs = [self.float(f) for f in line[15:].split()]
self.vibfreqs.extend(freqs)
if line[1:15] == "IR Inten --": # note: matches only low-precision block
if not hasattr(self, 'vibirs'):
self.vibirs = []
irs = []
for ir in line[15:].split():
try:
irs.append(self.float(ir))
except ValueError:
irs.append(self.float('nan'))
self.vibirs.extend(irs)
if line[1:15] == "Raman Activ --": # note: matches only low-precision block
if not hasattr(self, 'vibramans'):
self.vibramans = []
ramans = []
for raman in line[15:].split():
try:
ramans.append(self.float(raman))
except ValueError:
ramans.append(self.float('nan'))
self.vibramans.extend(ramans)
# Block with (default-precision) displacements should start with this.
# 1 2 3
# A A A
# Frequencies -- 370.7936 370.7987 618.0103
# Red. masses -- 2.3022 2.3023 1.9355
# Frc consts -- 0.1865 0.1865 0.4355
# IR Inten -- 0.0000 0.0000 0.0000
# Atom AN X Y Z X Y Z X Y Z
# 1 6 0.00 0.00 -0.04 0.00 0.00 0.19 0.00 0.00 0.12
# 2 6 0.00 0.00 0.19 0.00 0.00 -0.06 0.00 0.00 -0.12
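# Illustrative sketch (not part of the original parser): the numbers on each
# atom line are consumed in triplets, one (x, y, z) displacement per mode.
# >>> numbers = [0.00, 0.00, -0.04, 0.00, 0.00, 0.19, 0.00, 0.00, 0.12]
# >>> [numbers[3 * n:3 * n + 3] for n in range(len(numbers) // 3)]
# [[0.0, 0.0, -0.04], [0.0, 0.0, 0.19], [0.0, 0.0, 0.12]]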
if line.strip().split()[0:3] == ["Atom", "AN", "X"]:
if not hasattr(self, 'vibdisps'):
self.vibdisps = []
disps = []
for n in range(self.natom):
line = next(inputfile)
numbers = [float(s) for s in line[10:].split()]
N = len(numbers) // 3
if not disps:
for n in range(N):
disps.append([])
for n in range(N):
disps[n].append(numbers[3*n:3*n+3])
self.vibdisps.extend(disps)
# Block with high-precision (freq=hpmodes) displacements should start with this.
# 1 2 3 4 5
# A A A A A
# Frequencies --- 370.7936 370.7987 618.0103 647.7864 647.7895
# Reduced masses --- 2.3022 2.3023 1.9355 6.4600 6.4600
# Force constants --- 0.1865 0.1865 0.4355 1.5971 1.5972
# IR Intensities --- 0.0000 0.0000 0.0000 0.0000 0.0000
# Coord Atom Element:
# 1 1 6 0.00000 0.00000 0.00000 -0.18677 0.05592
# 2 1 6 0.00000 0.00000 0.00000 0.28440 0.21550
# 3 1 6 -0.04497 0.19296 0.11859 0.00000 0.00000
# 1 2 6 0.00000 0.00000 0.00000 0.03243 0.37351
# 2 2 6 0.00000 0.00000 0.00000 0.14503 -0.06117
# 3 2 6 0.18959 -0.05753 -0.11859 0.00000 0.00000
if line.strip().split()[0:3] == ["Coord", "Atom", "Element:"]:
# Wait until very end of parsing to assign vibdispshp to self.vibdisps
# as otherwise the higher precision displacements will be overwritten
# by low precision displacements which are printed further down file
if not hasattr(self, 'vibdispshp'):
self.vibdispshp = []
disps = []
for n in range(3*self.natom):
line = next(inputfile)
numbers = [float(s) for s in line[16:].split()]
atomindex = int(line[4:10])-1 # atom index, starting at zero
numbermodes = len(numbers)
if not disps:
for mode in range(numbermodes):
# For each mode, make list of list [atom][coord_index]
disps.append([[] for x in range(0, self.natom)])
for mode in range(numbermodes):
disps[mode][atomindex].append(numbers[mode])
self.vibdispshp.extend(disps)
line = next(inputfile)
# Electronic transitions.
if line[1:14] == "Excited State":
if not hasattr(self, "etenergies"):
self.etenergies = []
self.etoscs = []
self.etsyms = []
self.etsecs = []
# Need to deal with lines like:
# (restricted calc)
# Excited State 1: Singlet-BU 5.3351 eV 232.39 nm f=0.1695
# (unrestricted calc) (first excited state is 2!)
# Excited State 2: ?Spin -A 0.1222 eV 10148.75 nm f=0.0000
# (Gaussian 09 ZINDO)
# Excited State 1: Singlet-?Sym 2.5938 eV 478.01 nm f=0.0000 <S**2>=0.000
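# Illustrative sketch (not part of the original parser) of how the regular
# expression below picks the symmetry label and the energy out of such lines:
# >>> import re
# >>> p = re.compile(r":(?P<sym>.*?)(?P<energy>-?\d*\.\d*) eV")
# >>> m = p.search(" Excited State   1:      Singlet-BU     5.3351 eV  232.39 nm  f=0.1695")
# >>> m.group("sym").strip(), float(m.group("energy"))
# ('Singlet-BU', 5.3351)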
p = re.compile(r":(?P<sym>.*?)(?P<energy>-?\d*\.\d*) eV")
groups = p.search(line).groups()
self.etenergies.append(utils.convertor(self.float(groups[1]), "eV", "wavenumber"))
self.etoscs.append(self.float(line.split("f=")[-1].split()[0]))
self.etsyms.append(groups[0].strip())
line = next(inputfile)
p = re.compile("(\d+)")
CIScontrib = []
while line.find(" ->") >= 0: # This is a contribution to the transition
parts = line.split("->")
self.logger.debug(parts)
# Has to deal with lines like:
# 32 -> 38 0.04990
# 35A -> 45A 0.01921
frommoindex = 0 # For restricted or alpha unrestricted
fromMO = parts[0].strip()
if fromMO[-1] == "B":
frommoindex = 1 # For beta unrestricted
fromMO = int(p.match(fromMO).group())-1 # subtract 1 so that it is an index into moenergies
t = parts[1].split()
tomoindex = 0
toMO = t[0]
if toMO[-1] == "B":
tomoindex = 1
toMO = int(p.match(toMO).group())-1 # subtract 1 so that it is an index into moenergies
percent = self.float(t[1])
# For restricted calculations, the percentage will be corrected
# after parsing (see after_parsing() above).
CIScontrib.append([(fromMO, frommoindex), (toMO, tomoindex), percent])
line = next(inputfile)
self.etsecs.append(CIScontrib)
# Circular dichroism data (different for G03 vs G09)
#
# G03
#
# ## <0|r|b> * <b|rxdel|0>  (Au), Rotatory Strengths (R) in
# ## cgs (10**-40 erg-esu-cm/Gauss)
# ## state X Y Z R(length)
# ## 1 0.0006 0.0096 -0.0082 -0.4568
# ## 2 0.0251 -0.0025 0.0002 -5.3846
# ## 3 0.0168 0.4204 -0.3707 -15.6580
# ## 4 0.0721 0.9196 -0.9775 -3.3553
#
# G09
#
# ## 1/2[<0|r|b>*<b|rxdel|0> + (<0|rxdel|b>*<b|r|0>)*]
# ## Rotatory Strengths (R) in cgs (10**-40 erg-esu-cm/Gauss)
# ## state XX YY ZZ R(length) R(au)
# ## 1 -0.3893 -6.7546 5.7736 -0.4568 -0.0010
# ## 2 -17.7437 1.7335 -0.1435 -5.3845 -0.0114
# ## 3 -11.8655 -297.2604 262.1519 -15.6580 -0.0332
if line[1:52] == "<0|r|b> * <b|rxdel|0>  (Au), Rotatory Strengths (R)" or \
line[1:50] == "1/2[<0|r|b>*<b|rxdel|0> + (<0|rxdel|b>*<b|r|0>)*]":
self.etrotats = []
self.skip_lines(inputfile, ['units'])
headers = next(inputfile)
Ncolms = len(headers.split())
line = next(inputfile)
parts = line.strip().split()
while len(parts) == Ncolms:
try:
R = self.float(parts[4])
except ValueError:
# nan or -nan if there is no first excited state
# (for unrestricted calculations)
pass
else:
self.etrotats.append(R)
line = next(inputfile)
parts = line.strip().split()
self.etrotats = numpy.array(self.etrotats, "d")
# Number of basis functions.
# Has to deal with lines like:
# NBasis = 434 NAE= 97 NBE= 97 NFC= 34 NFV= 0
# and...
# NBasis = 148 MinDer = 0 MaxDer = 0
# Although the former is in every file, it doesn't occur before
# the overlap matrix is printed.
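# Illustrative sketch (not part of the original parser): the first number
# after the '=' sign is the value of interest on either form of the line.
# >>> line = " NBasis =  148  MinDer = 0  MaxDer = 0"
# >>> int(line.split('=')[1].split()[0])
# 148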
if line[1:7] == "NBasis" or line[4:10] == "NBasis":
# For counterpoise fragment, skip these lines.
if self.counterpoise != 0:
return
# For ONIOM calcs, ignore this section in order to bypass assertion failure.
if self.oniom:
return
# If nbasis was already parsed, check if it changed. If it did, issue a warning.
# In the future, we will probably want to have nbasis, as well as nmo below,
# as a list so that we don't need to pick one value when it changes.
nbasis = int(line.split('=')[1].split()[0])
if hasattr(self, "nbasis"):
try:
assert nbasis == self.nbasis
except AssertionError:
self.logger.warning("Number of basis functions (nbasis) has changed from %i to %i" % (self.nbasis, nbasis))
self.nbasis = nbasis
# Number of linearly independent basis functions (NBsUse).
if line[1:7] == "NBsUse":
# For counterpoise fragment, skip these lines.
if self.counterpoise != 0:
return
# For ONIOM calcs, ignore this section in order to bypass assertion failure.
if self.oniom:
return
nmo = int(line.split('=')[1].split()[0])
self.set_attribute('nmo', nmo)
# For AM1 calculations, set nbasis by a second method,
# as the usual 'NBasis=' line may not always be printed.
if line[7:22] == "basis functions":
nbasis = int(line.split()[0])
self.set_attribute('nbasis', nbasis)
# Molecular orbital overlap matrix.
# Has to deal with lines such as:
# *** Overlap ***
# ****** Overlap ******
# Note that Gaussian sometimes drops basis functions,
# causing the overlap matrix as parsed below to not be
# symmetric (which is a problem for population analyses, etc.)
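# Illustrative sketch (not part of the original parser): the overlap matrix
# is printed in lower-triangular blocks with Fortran D exponents, so each
# value is converted and then mirrored into both (row, col) and (col, row).
# >>> float("0.214836D-01".replace("D", "E"))
# 0.0214836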
if line[1:4] == "***" and (line[5:12] == "Overlap" or line[8:15] == "Overlap"):
# Ensure that this is the main calc and not a fragment
if self.counterpoise != 0:
return
self.aooverlaps = numpy.zeros((self.nbasis, self.nbasis), "d")
# Overlap integrals for basis fn#1 are in aooverlaps[0]
base = 0
colmNames = next(inputfile)
while base < self.nbasis:
self.updateprogress(inputfile, "Overlap", self.fupdate)
for i in range(self.nbasis-base): # Fewer lines this time
line = next(inputfile)
parts = line.split()
for j in range(len(parts)-1): # Some lines are longer than others
k = float(parts[j+1].replace("D", "E"))
self.aooverlaps[base+j, i+base] = k
self.aooverlaps[i+base, base+j] = k
base += 5
colmNames = next(inputfile)
self.aooverlaps = numpy.array(self.aooverlaps, "d")
# Molecular orbital coefficients (mocoeffs).
# Essentially only produced for SCF calculations.
# This is also the place where aonames and atombasis are parsed.
if line[5:35] == "Molecular Orbital Coefficients" or line[5:41] == "Alpha Molecular Orbital Coefficients" or line[5:40] == "Beta Molecular Orbital Coefficients":
# If counterpoise fragment, return without parsing orbital info
if self.counterpoise != 0:
return
# Skip this for ONIOM calcs
if self.oniom:
return
if line[5:40] == "Beta Molecular Orbital Coefficients":
beta = True
if self.popregular:
return
# This was continue before refactoring the parsers.
#continue # Not going to extract mocoeffs
# Need to add an extra array to self.mocoeffs
self.mocoeffs.append(numpy.zeros((self.nmo, self.nbasis), "d"))
else:
beta = False
self.aonames = []
self.atombasis = []
mocoeffs = [numpy.zeros((self.nmo, self.nbasis), "d")]
base = 0
self.popregular = False
for base in range(0, self.nmo, 5):
self.updateprogress(inputfile, "Coefficients", self.fupdate)
colmNames = next(inputfile)
if not colmNames.split():
self.logger.warning("Molecular coefficients header found but no coefficients.")
break
if base == 0 and int(colmNames.split()[0]) != 1:
# Implies that this is a POP=REGULAR calculation
# and so, only aonames (not mocoeffs) will be extracted
self.popregular = True
symmetries = next(inputfile)
eigenvalues = next(inputfile)
for i in range(self.nbasis):
line = next(inputfile)
if i == 0:
# Find location of the start of the basis function name
start_of_basis_fn_name = line.find(line.split()[3]) - 1
if base == 0 and not beta: # Just do this the first time 'round
parts = line[:start_of_basis_fn_name].split()
if len(parts) > 1: # New atom
if i > 0:
self.atombasis.append(atombasis)
atombasis = []
atomname = "%s%s" % (parts[2], parts[1])
orbital = line[start_of_basis_fn_name:20].strip()
self.aonames.append("%s_%s" % (atomname, orbital))
atombasis.append(i)
part = line[21:].replace("D", "E").rstrip()
temp = []
for j in range(0, len(part), 10):
temp.append(float(part[j:j+10]))
if beta:
self.mocoeffs[1][base:base + len(part) // 10, i] = temp
else:
mocoeffs[0][base:base + len(part) // 10, i] = temp
if base == 0 and not beta: # Do the last update of atombasis
self.atombasis.append(atombasis)
if self.popregular:
# We now have aonames, so no need to continue
break
if not self.popregular and not beta:
self.mocoeffs = mocoeffs
# Natural orbital coefficients (nocoeffs) and occupation numbers (nooccnos),
# which are, respectively, the eigenvectors and eigenvalues of the
# diagonalized one-electron density matrix. These orbitals are formed after
# configuration interaction (CI) calculations, but not only then. Similarly to mocoeffs,
# we can parse and check aonames and atombasis here.
#
# Natural Orbital Coefficients:
# 1 2 3 4 5
# Eigenvalues -- 2.01580 2.00363 2.00000 2.00000 1.00000
# 1 1 O 1S 0.00000 -0.15731 -0.28062 0.97330 0.00000
# 2 2S 0.00000 0.75440 0.57746 0.07245 0.00000
# ...
#
def natural_orbital_single_spin_parsing(inputfile, updateprogress_title):
coeffs = numpy.zeros((self.nmo, self.nbasis), "d")
occnos = []
aonames = []
atombasis = []
for base in range(0, self.nmo, 5):
self.updateprogress(inputfile, updateprogress_title, self.fupdate)
colmNames = next(inputfile)
eigenvalues = next(inputfile)
occnos.extend(map(float, eigenvalues.split()[2:]))
for i in range(self.nbasis):
line = next(inputfile)
# Just do this the first time 'round.
if base == 0:
# Changed below from :12 to :11 to deal with Elmar Neumann's example.
parts = line[:11].split()
# New atom.
if len(parts) > 1:
if i > 0:
atombasis.append(basisonatom)
basisonatom = []
atomname = "%s%s" % (parts[2], parts[1])
orbital = line[11:20].strip()
aonames.append("%s_%s" % (atomname, orbital))
basisonatom.append(i)
part = line[21:].replace("D", "E").rstrip()
temp = []
for j in range(0, len(part), 10):
temp.append(float(part[j:j+10]))
coeffs[base:base + len(part) // 10, i] = temp
# Do the last update of atombasis.
if base == 0:
atombasis.append(basisonatom)
return occnos, coeffs, aonames, atombasis
if line[5:33] == "Natural Orbital Coefficients":
updateprogress_title = "Natural orbitals"
nooccnos, nocoeffs, aonames, atombasis = natural_orbital_single_spin_parsing(inputfile, updateprogress_title)
self.set_attribute("nocoeffs", nocoeffs)
self.set_attribute("nooccnos", nooccnos)
self.set_attribute("atombasis", atombasis)
self.set_attribute("aonames", aonames)
# Natural spin orbital coefficients (nsocoeffs) and occupation numbers (nsooccnos).
# Parsed attributes are similar to the natural orbitals above, except that
# the natural spin orbitals and occupation numbers are the eigenvectors
# and eigenvalues of the one-particle spin density matrices.
# Alpha Natural Orbital Coefficients:
# 1 2 3 4 5
# Eigenvalues -- 1.00000 1.00000 0.99615 0.99320 0.99107
# 1 1 O 1S 0.70425 0.70600 -0.16844 -0.14996 -0.00000
# 2 2S 0.01499 0.01209 0.36089 0.34940 -0.00000
# ...
# Beta Natural Orbital Coefficients:
# 1 2 3 4 5
# Eigenvalues -- 1.00000 1.00000 0.99429 0.98790 0.98506
# 1 1 O 1S 0.70822 0.70798 -0.15316 -0.13458 0.00465
# 2 2S 0.00521 0.00532 0.33837 0.33189 -0.01301
# 3 3S -0.02542 -0.00841 0.28649 0.53224 0.18902
# ...
if line[5:39] == "Alpha Natural Orbital Coefficients":
updateprogress_title = "Natural Spin orbitals (alpha)"
nsooccnos, nsocoeffs, aonames, atombasis = natural_orbital_single_spin_parsing(inputfile, updateprogress_title)
if self.unified_no_nso:
self.append_attribute("nocoeffs", nsocoeffs)
self.append_attribute("nooccnos", nsooccnos)
else:
self.append_attribute("nsocoeffs", nsocoeffs)
self.append_attribute("nsooccnos", nsooccnos)
self.set_attribute("atombasis", atombasis)
self.set_attribute("aonames", aonames)
if line[5:38] == "Beta Natural Orbital Coefficients":
updateprogress_title = "Natural Spin orbitals (beta)"
nsooccnos, nsocoeffs, aonames, atombasis = natural_orbital_single_spin_parsing(inputfile, updateprogress_title)
if self.unified_no_nso:
self.append_attribute("nocoeffs", nsocoeffs)
self.append_attribute("nooccnos", nsooccnos)
else:
self.append_attribute("nsocoeffs", nsocoeffs)
self.append_attribute("nsooccnos", nsooccnos)
self.set_attribute("atombasis", atombasis)
self.set_attribute("aonames", aonames)
# For FREQ=Anharm, extract anharmonicity constants
if line[1:40] == "X matrix of Anharmonic Constants (cm-1)":
Nvibs = len(self.vibfreqs)
self.vibanharms = numpy.zeros((Nvibs, Nvibs), "d")
base = 0
colmNames = next(inputfile)
while base < Nvibs:
for i in range(Nvibs-base): # Fewer lines this time
line = next(inputfile)
parts = line.split()
for j in range(len(parts)-1): # Some lines are longer than others
k = float(parts[j+1].replace("D", "E"))
self.vibanharms[base+j, i+base] = k
self.vibanharms[i+base, base+j] = k
base += 5
colmNames = next(inputfile)
# Pseudopotential charges.
if line.find("Pseudopotential Parameters") > -1:
self.skip_lines(inputfile, ['e', 'label1', 'label2', 'e'])
line = next(inputfile)
if line.find("Centers:") < 0:
return
# This was continue before parser refactoring.
# continue
# Needs to handle output like the following:
#
# Center Atomic Valence Angular Power Coordinates
# Number Number Electrons Momentum of R Exponent Coefficient X Y Z
# ===================================================================================================================================
# Centers: 1
# Centers: 16
# Centers: 21 24
# Centers: 99100101102
# 1 44 16 -4.012684 -0.696698 0.006750
# F and up
# 0 554.3796303 -0.05152700
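# Illustrative sketch (not part of the original parser): the atom indices on a
# "Centers:" line are packed into 3-character fields, so they are recovered by
# slicing rather than splitting; the trailing newline is why the last index is
# still covered by range(0, len(temp) - 3, 3).
# >>> temp = " 99100101102\n"
# >>> [int(temp[i:i + 3]) for i in range(0, len(temp) - 3, 3)]
# [99, 100, 101, 102]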
centers = []
while line.find("Centers:") >= 0:
temp = line[10:]
for i in range(0, len(temp)-3, 3):
centers.append(int(temp[i:i+3]))
line = next(inputfile)
centers.sort() # Not always in increasing order
self.coreelectrons = numpy.zeros(self.natom, "i")
for center in centers:
front = line[:10].strip()
while not (front and int(front) == center):
line = next(inputfile)
front = line[:10].strip()
info = line.split()
self.coreelectrons[center-1] = int(info[1]) - int(info[2])
line = next(inputfile)
# This will be printed for counterpoise calculations only.
# To prevent crashing, we need to know which fragment is being considered.
# Other information is also printed in lines that start like this.
if line[1:14] == 'Counterpoise:':
if line[42:50] == "fragment":
self.counterpoise = int(line[51:54])
# This will be printed only during ONIOM calcs; use it to set a flag
# that will allow assertion failures to be bypassed in the code.
if line[1:7] == "ONIOM:":
self.oniom = True
# This will be printed only during BOMD calcs.
if line.startswith(" INPUT DATA FOR L118"):
self.BOMD = True
# Atomic charges are straightforward to parse, although the header
# has changed over time somewhat.
#
# Mulliken charges:
# 1
# 1 C -0.004513
# 2 C -0.077156
# ...
# Sum of Mulliken charges = 0.00000
# Mulliken charges with hydrogens summed into heavy atoms:
# 1
# 1 C -0.004513
# 2 C 0.002063
# ...
#
if line[1:25] == "Mulliken atomic charges:" or line[1:18] == "Mulliken charges:" or \
line[1:23] == "Lowdin Atomic Charges:" or line[1:16] == "Lowdin charges:":
if not hasattr(self, "atomcharges"):
self.atomcharges = {}
ones = next(inputfile)
charges = []
nline = next(inputfile)
while not "Sum of" in nline:
charges.append(float(nline.split()[2]))
nline = next(inputfile)
if "Mulliken" in line:
self.atomcharges["mulliken"] = charges
else:
self.atomcharges["lowdin"] = charges
if line.strip() == "Natural Population":
if not hasattr(self, 'atomcharges'):
self.atomcharges = {}
line1 = next(inputfile)
line2 = next(inputfile)
if line1.split()[0] == 'Natural' and line2.split()[2] == 'Charge':
dashes = next(inputfile)
charges = []
for i in range(self.natom):
nline = next(inputfile)
charges.append(float(nline.split()[2]))
self.atomcharges["natural"] = charges
# Extract thermochemistry
#Temperature 298.150 Kelvin. Pressure 1.00000 Atm.
#Zero-point correction= 0.342233 (Hartree/
#Thermal correction to Energy= 0.
#Thermal correction to Enthalpy= 0.
#Thermal correction to Gibbs Free Energy= 0.302940
#Sum of electronic and zero-point Energies= -563.649744
#Sum of electronic and thermal Energies= -563.636699
#Sum of electronic and thermal Enthalpies= -563.635755
#Sum of electronic and thermal Free Energies= -563.689037
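# Illustrative sketch (not part of the original parser): the split indices
# below differ because "Free Energies" is two tokens while "Enthalpies" is one.
# >>> "Sum of electronic and thermal Enthalpies=         -563.635755".split()[6]
# '-563.635755'
# >>> "Sum of electronic and thermal Free Energies=      -563.689037".split()[7]
# '-563.689037'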
if "Sum of electronic and thermal Enthalpies" in line:
self.set_attribute('enthalpy', float(line.split()[6]))
if "Sum of electronic and thermal Free Energies=" in line:
self.set_attribute('freeenergy', float(line.split()[7]))
if line[1:13] == "Temperature ":
self.set_attribute('temperature', float(line.split()[1]))
self.set_attribute('pressure', float(line.split()[4]))
# Static polarizability (from `polar`), lower triangular
# matrix.
if line[1:26] == "SCF Polarizability for W=":
self.hp_polarizabilities = True
if not hasattr(self, 'polarizabilities'):
self.polarizabilities = []
polarizability = numpy.zeros(shape=(3, 3))
self.skip_line(inputfile, 'directions')
for i in range(3):
line = next(inputfile)
polarizability[i, :i+1] = [self.float(x) for x in line.split()[1:]]
polarizability = utils.symmetrize(polarizability, use_triangle='lower')
self.polarizabilities.append(polarizability)
# Static polarizability (from `freq`), lower triangular matrix.
if line[1:16] == "Polarizability=":
self.hp_polarizabilities = True
if not hasattr(self, 'polarizabilities'):
self.polarizabilities = []
polarizability = numpy.zeros(shape=(3, 3))
polarizability_list = []
polarizability_list.extend([line[16:31], line[31:46], line[46:61]])
line = next(inputfile)
polarizability_list.extend([line[16:31], line[31:46], line[46:61]])
indices = numpy.tril_indices(3)
polarizability[indices] = [self.float(x) for x in polarizability_list]
polarizability = utils.symmetrize(polarizability, use_triangle='lower')
self.polarizabilities.append(polarizability)
# Static polarizability, compressed into a single line from
# terse printing.
# Order is XX, YX, YY, ZX, ZY, ZZ (lower triangle).
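# Illustrative sketch (not part of the original parser): numpy.tril_indices(3)
# walks the lower triangle row by row, which matches the XX, YX, YY, ZX, ZY,
# ZZ ordering of the values on the line.
# >>> import numpy
# >>> numpy.tril_indices(3)
# (array([0, 1, 1, 2, 2, 2]), array([0, 0, 1, 0, 1, 2]))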
if line[2:23] == "Exact polarizability:":
if not self.hp_polarizabilities:
if not hasattr(self, 'polarizabilities'):
self.polarizabilities = []
polarizability = numpy.zeros(shape=(3, 3))
indices = numpy.tril_indices(3)
polarizability[indices] = [self.float(x) for x in
[line[23:31], line[31:39], line[39:47], line[47:55], line[55:63], line[63:71]]]
polarizability = utils.symmetrize(polarizability, use_triangle='lower')
self.polarizabilities.append(polarizability)
# IRC Computation convergence checks.
#
# -------- Sample extract for IRC step --------
#
# IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC
# ------------------------------------------------------------------------
# INPUT DATA FOR L123
# ------------------------------------------------------------------------
# GENERAL PARAMETERS:
# Follow reaction path in both directions.
# Maximum points per path = 200
# Step size = 0.100 bohr
# Integration scheme = HPC
# Redo corrector integration= Yes
# DWI Weight Power = 2
# DWI will use Hessian update vectors when possible.
# Max correction cycles = 50
# Initial Hessian = CalcFC
# Hessian evaluation = Analytic every 5 predictor steps
# = Analytic every 5 corrector steps
# Hessian updating method = Bofill
# ------------------------------------------------------------------------
# IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC
#
# -------- Sample extract for converged step --------
# IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC
# Pt 1 Step number 1 out of a maximum of 50
# Modified Bulirsch-Stoer Extrapolation Cycles:
# EPS = 0.000010000000000
# Maximum DWI energy std dev = 0.000000595 at pt 1
# Maximum DWI gradient std dev = 0.135684493 at pt 2
# CORRECTOR INTEGRATION CONVERGENCE:
# Recorrection delta-x convergence threshold: 0.010000
# Delta-x Convergence Met
# Point Number: 1 Path Number: 1
# CHANGE IN THE REACTION COORDINATE = 0.16730
# NET REACTION COORDINATE UP TO THIS POINT = 0.16730
# # OF POINTS ALONG THE PATH = 1
# # OF STEPS = 1
#
# Calculating another point on the path.
# IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC
#
# -------- Sample extract for unconverged intermediate step --------
# IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC
# Error in corrector energy = -0.0000457166
# Magnitude of corrector gradient = 0.0140183779
# Magnitude of analytic gradient = 0.0144021969
# Magnitude of difference = 0.0078709968
# Angle between gradients (degrees)= 32.1199
# Pt 40 Step number 2 out of a maximum of 20
# Modified Bulirsch-Stoer Extrapolation Cycles:
# EPS = 0.000010000000000
# Maximum DWI energy std dev = 0.000007300 at pt 31
# Maximum DWI gradient std dev = 0.085197906 at pt 59
# CORRECTOR INTEGRATION CONVERGENCE:
# Recorrection delta-x convergence threshold: 0.010000
# Delta-x Convergence NOT Met
# IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC-IRC
if line[1:20] == "INPUT DATA FOR L123": # First IRC step
if not hasattr(self, "optstatus"):
self.optstatus = []
self.optstatus.append(data.ccData.OPT_NEW)
if line[3:22] == "Delta-x Convergence":
if line[23:30] == "NOT Met":
self.optstatus[-1] += data.ccData.OPT_UNCONVERGED
elif line[23:26] == "Met":
self.optstatus[-1] += data.ccData.OPT_DONE
if not hasattr(self, 'optdone'):
self.optdone = []
self.optdone.append(len(self.optstatus) - 1)
if line[:31] == ' Normal termination of Gaussian':
self.metadata['success'] = True
cclib-1.6.2/cclib/parser/jaguarparser.py 0000664 0000000 0000000 00000073207 13535330462 0020221 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2018, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Parser for Jaguar output files"""
import re
import numpy
from cclib.parser import logfileparser
from cclib.parser import utils
class Jaguar(logfileparser.Logfile):
"""A Jaguar output file"""
def __init__(self, *args, **kwargs):
# Call the __init__ method of the superclass
super(Jaguar, self).__init__(logname="Jaguar", *args, **kwargs)
def __str__(self):
"""Return a string representation of the object."""
return "Jaguar output file %s" % (self.filename)
def __repr__(self):
"""Return a representation of the object."""
return 'Jaguar("%s")' % (self.filename)
def normalisesym(self, label):
"""Normalise the symmetries used by Jaguar.
To normalise, three rules need to be applied:
(1) To handle orbitals of E symmetry, retain everything before the /
(2) Replace two p's by "
(3) Replace any remaining single p's by '
"""
ans = label.split("/")[0].replace("pp", '"').replace("p", "'")
return ans
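# Illustrative sketch of the normalisesym rules above (labels made up, not
# taken from a regression file): two p's become a double prime and any
# remaining single p becomes a prime.
# >>> "A2pp".replace("pp", '"').replace("p", "'")
# 'A2"'
# >>> "A1p/E1p".split("/")[0].replace("pp", '"').replace("p", "'")
# "A1'"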
def before_parsing(self):
# We need to track whether we are inside geometry optimization in order
# to parse SCF targets/values correctly.
self.geoopt = False
def after_parsing(self):
# This is to make sure we always have optdone after geometry optimizations,
# even if it is to be empty for unconverged runs. We have yet to test this
# with a regression for Jaguar, though.
if self.geoopt and not hasattr(self, 'optdone'):
self.optdone = []
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
# Extract the package version number.
if "Jaguar version" in line:
tokens = line.split()
# Don't add revision information to the main package
# version for now.
# package_version = "{}.r{}".format(tokens[3][:-1], tokens[5])
package_version = tokens[3][:-1]
self.metadata["package_version"] = package_version
# Extract the basis set name
if line[2:12] == "basis set:":
self.metadata["basis_set"] = line.split()[2]
# Extract charge and multiplicity
if line[2:22] == "net molecular charge":
self.set_attribute('charge', int(line.split()[-1]))
self.set_attribute('mult', int(next(inputfile).split()[-1]))
# The Gaussian basis set information is printed before the geometry, and we need
# to do some indexing to get this into cclib format, because fn increments
# for each angular momentum, but cclib does not (we have just P instead of
# all three X/Y/Z with the same parameters). On the other hand, fn enumerates
# the atomic orbitals correctly, so use it to build atombasis.
#
# Gaussian basis set information
#
# renorm mfac*renorm
# atom fn prim L z coef coef coef
# -------- ----- ---- --- ------------- ----------- ----------- -----------
# C1 1 1 S 7.161684E+01 1.5433E-01 2.7078E+00 2.7078E+00
# C1 1 2 S 1.304510E+01 5.3533E-01 2.6189E+00 2.6189E+00
# ...
# C1 3 6 X 2.941249E+00 2.2135E-01 1.2153E+00 1.2153E+00
# 4 Y 1.2153E+00
# 5 Z 1.2153E+00
# C1 2 8 S 2.222899E-01 1.0000E+00 2.3073E-01 2.3073E-01
# C1 3 7 X 6.834831E-01 8.6271E-01 7.6421E-01 7.6421E-01
# ...
# C2 6 1 S 7.161684E+01 1.5433E-01 2.7078E+00 2.7078E+00
# ...
#
if line.strip() == "Gaussian basis set information":
self.skip_lines(inputfile, ['b', 'renorm', 'header', 'd'])
# This is probably the only place we can get this information from Jaguar.
self.gbasis = []
atombasis = []
line = next(inputfile)
fn_per_atom = []
while line.strip():
if len(line.split()) > 3:
aname = line.split()[0]
fn = int(line.split()[1])
prim = int(line.split()[2])
L = line.split()[3]
z = float(line.split()[4])
coef = float(line.split()[5])
# The primitive count is reset for each atom, so use that for adding
# new elements to atombasis and gbasis. We could also probably do this
# using the atom name, although that might not always be unique.
if prim == 1:
atombasis.append([])
fn_per_atom = []
self.gbasis.append([])
# Remember that fn is repeated when functions are contracted.
if not fn-1 in atombasis[-1]:
atombasis[-1].append(fn-1)
# Here we use fn only to know when a new contraction is encountered,
# so we don't need to decrement it, and we don't even use all values.
# What's more, since we only wish to save the parameters for each subshell
# once, we don't even need to consider lines for orbitals other than
# those for X*, making things a bit easier.
if not fn in fn_per_atom:
fn_per_atom.append(fn)
label = {'S': 'S', 'X': 'P', 'XX': 'D', 'XXX': 'F'}[L]
self.gbasis[-1].append((label, []))
igbasis = fn_per_atom.index(fn)
self.gbasis[-1][igbasis][1].append([z, coef])
else:
fn = int(line.split()[0])
L = line.split()[1]
# Some AO indices are only printed in these lines, for L > 0.
if not fn-1 in atombasis[-1]:
atombasis[-1].append(fn-1)
line = next(inputfile)
# The indices for atombasis can also be read later from the molecular orbital output.
self.set_attribute('atombasis', atombasis)
# This length of atombasis should always be the number of atoms.
self.set_attribute('natom', len(self.atombasis))
# Effective Core Potential
#
# Atom Electrons represented by ECP
# Mo 36
# Maximum angular term 3
# F Potential 1/r^n Exponent Coefficient
# ----- -------- -----------
# 0 140.4577691 -0.0469492
# 1 89.4739342 -24.9754989
# ...
# S-F Potential 1/r^n Exponent Coefficient
# ----- -------- -----------
# 0 33.7771969 2.9278406
# 1 10.0120020 34.3483716
# ...
# O 0
# Cl 10
# Maximum angular term 2
# D Potential 1/r^n Exponent Coefficient
# ----- -------- -----------
# 1 94.8130000 -10.0000000
# ...
if line.strip() == "Effective Core Potential":
self.skip_line(inputfile, 'blank')
line = next(inputfile)
assert line.split()[0] == "Atom"
assert " ".join(line.split()[1:]) == "Electrons represented by ECP"
self.coreelectrons = []
line = next(inputfile)
while line.strip():
if len(line.split()) == 2:
self.coreelectrons.append(int(line.split()[1]))
line = next(inputfile)
if line[2:14] == "new geometry" or line[1:21] == "Symmetrized geometry" or line.find("Input geometry") > 0:
# Get the atom coordinates
if not hasattr(self, "atomcoords") or line[1:21] == "Symmetrized geometry":
# Wipe the "Input geometry" if "Symmetrized geometry" present
self.atomcoords = []
p = re.compile("(\D+)\d+") # One/more letters followed by a number
atomcoords = []
atomnos = []
angstrom = next(inputfile)
title = next(inputfile)
line = next(inputfile)
while line.strip():
temp = line.split()
element = p.findall(temp[0])[0]
atomnos.append(self.table.number[element])
atomcoords.append(list(map(float, temp[1:])))
line = next(inputfile)
self.atomcoords.append(atomcoords)
self.atomnos = numpy.array(atomnos, "i")
self.set_attribute('natom', len(atomcoords))
# Hartree-Fock energy after SCF
if line[1:18] == "SCFE: SCF energy:":
self.metadata["methods"].append("HF")
if not hasattr(self, "scfenergies"):
self.scfenergies = []
temp = line.strip().split()
scfenergy = float(temp[temp.index("hartrees") - 1])
scfenergy = utils.convertor(scfenergy, "hartree", "eV")
self.scfenergies.append(scfenergy)
# Energy after LMP2 correction
if line[1:18] == "Total LMP2 Energy":
self.metadata["methods"].append("LMP2")
if not hasattr(self, "mpenergies"):
self.mpenergies = [[]]
lmp2energy = float(line.split()[-1])
lmp2energy = utils.convertor(lmp2energy, "hartree", "eV")
self.mpenergies[-1].append(lmp2energy)
if line[15:45] == "Geometry optimization complete":
if not hasattr(self, 'optdone'):
self.optdone = []
self.optdone.append(len(self.geovalues) - 1)
if line.find("number of occupied orbitals") > 0:
# Get number of MOs
occs = int(line.split()[-1])
line = next(inputfile)
virts = int(line.split()[-1])
self.nmo = occs + virts
self.homos = numpy.array([occs-1], "i")
self.unrestrictedflag = False
if line[1:28] == "number of occupied orbitals":
self.homos = numpy.array([float(line.strip().split()[-1])-1], "i")
if line[2:27] == "number of basis functions":
nbasis = int(line.strip().split()[-1])
self.set_attribute('nbasis', nbasis)
if line.find("number of alpha occupied orb") > 0:
# Get number of MOs for an unrestricted calc
aoccs = int(line.split()[-1])
line = next(inputfile)
avirts = int(line.split()[-1])
line = next(inputfile)
boccs = int(line.split()[-1])
line = next(inputfile)
bvirt = int(line.split()[-1])
self.nmo = aoccs + avirts
self.homos = numpy.array([aoccs-1, boccs-1], "i")
self.unrestrictedflag = True
if line[0:4] == "etot":
# Get SCF convergence information
if not hasattr(self, "scfvalues"):
self.scfvalues = []
self.scftargets = [[5E-5, 5E-6]]
values = []
while line[0:4] == "etot":
# Jaguar 4.2
# etot 1 N N 0 N -382.08751886450 2.3E-03 1.4E-01
# etot 2 Y Y 0 N -382.27486023153 1.9E-01 1.4E-03 5.7E-02
# Jaguar 6.5
# etot 1 N N 0 N -382.08751881733 2.3E-03 1.4E-01
# etot 2 Y Y 0 N -382.27486018708 1.9E-01 1.4E-03 5.7E-02
temp = line.split()[7:]
if len(temp) == 3:
denergy = float(temp[0])
else:
denergy = 0 # Should really be greater than target value
# or should we just ignore the values in this line
ddensity = float(temp[-2])
maxdiiserr = float(temp[-1])
if not self.geoopt:
values.append([denergy, ddensity])
else:
values.append([ddensity])
try:
line = next(inputfile)
except StopIteration:
self.logger.warning('File terminated before end of last SCF! Last error: {}'.format(maxdiiserr))
break
self.scfvalues.append(values)
# MO energies and symmetries.
# Jaguar 7.0: provides energies and symmetries for both
# restricted and unrestricted calculations, like this:
# Alpha Orbital energies/symmetry label:
# -10.25358 Bu -10.25353 Ag -10.21931 Bu -10.21927 Ag
# -10.21792 Bu -10.21782 Ag -10.21773 Bu -10.21772 Ag
# ...
# Jaguar 6.5: prints both only for restricted calculations,
# so for unrestricted calculations the output looks like this:
# Alpha Orbital energies:
# -10.25358 -10.25353 -10.21931 -10.21927 -10.21792 -10.21782
# -10.21773 -10.21772 -10.21537 -10.21537 -1.02078 -0.96193
# ...
# Presence of 'Orbital energies' is enough to catch all versions.
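# Illustrative sketch (not part of the original parser): when symmetry labels
# are present the tokens alternate energy/label, so even indices are energies
# and odd indices are the labels.
# >>> line = " -10.25358 Bu   -10.25353 Ag".split()
# >>> [float(line[2 * i]) for i in range(len(line) // 2)]
# [-10.25358, -10.25353]
# >>> [line[2 * i + 1] for i in range(len(line) // 2)]
# ['Bu', 'Ag']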
if "Orbital energies" in line:
# Parsing results is identical for restricted/unrestricted
# calculations, just assert later that alpha/beta order is OK.
spin = int(line[2:6] == "Beta")
# Check if symmetries are printed also.
issyms = "symmetry label" in line
if not hasattr(self, "moenergies"):
self.moenergies = []
if issyms and not hasattr(self, "mosyms"):
self.mosyms = []
# Grow moenergies/mosyms and make sure they are empty when
# parsed multiple times - currently cclib returns only
# the final output (e.g. in a geometry optimization).
if len(self.moenergies) < spin+1:
self.moenergies.append([])
self.moenergies[spin] = []
if issyms:
if len(self.mosyms) < spin+1:
self.mosyms.append([])
self.mosyms[spin] = []
line = next(inputfile).split()
while len(line) > 0:
if issyms:
energies = [float(line[2*i]) for i in range(len(line)//2)]
syms = [line[2*i+1] for i in range(len(line)//2)]
else:
energies = [float(e) for e in line]
energies = [utils.convertor(e, "hartree", "eV") for e in energies]
self.moenergies[spin].extend(energies)
if issyms:
syms = [self.normalisesym(s) for s in syms]
self.mosyms[spin].extend(syms)
line = next(inputfile).split()
line = next(inputfile)
# The second trigger string is in the version 8.3 unit test and the first one was
# encountered in version 6.x and is followed by a slightly different format. In particular,
# the line with occupations is missing in each block. Here is a fragment of this block
# from version 8.3:
#
# *****************************************
#
# occupied + virtual orbitals: final wave function
#
# *****************************************
#
#
# 1 2 3 4 5
# eigenvalues- -11.04064 -11.04058 -11.03196 -11.03196 -11.02881
# occupations- 2.00000 2.00000 2.00000 2.00000 2.00000
# 1 C1 S 0.70148 0.70154 -0.00958 -0.00991 0.00401
# 2 C1 S 0.02527 0.02518 0.00380 0.00374 0.00371
# ...
#
if line.find("Occupied + virtual Orbitals- final wvfn") > 0 or \
line.find("occupied + virtual orbitals: final wave function") > 0:
self.skip_lines(inputfile, ['b', 's', 'b', 'b'])
if not hasattr(self, "mocoeffs"):
self.mocoeffs = []
aonames = []
lastatom = "X"
readatombasis = False
if not hasattr(self, "atombasis"):
self.atombasis = []
for i in range(self.natom):
self.atombasis.append([])
readatombasis = True
offset = 0
spin = 1 + int(self.unrestrictedflag)
for s in range(spin):
mocoeffs = numpy.zeros((len(self.moenergies[s]), self.nbasis), "d")
if s == 1: # beta case
self.skip_lines(inputfile, ['s', 'b', 'title', 'b', 's', 'b', 'b'])
for k in range(0, len(self.moenergies[s]), 5):
self.updateprogress(inputfile, "Coefficients")
# All known version have a line with indices followed by the eigenvalues.
self.skip_lines(inputfile, ['numbers', 'eigens'])
# Newer version also have a line with occupation numbers here.
line = next(inputfile)
if "occupations-" in line:
line = next(inputfile)
for i in range(self.nbasis):
info = line.split()
# Fill atombasis only first time around.
if readatombasis and k == 0:
orbno = int(info[0])
atom = info[1]
if atom[1].isalpha():
atomno = int(atom[2:])
else:
atomno = int(atom[1:])
self.atombasis[atomno-1].append(orbno-1)
if not hasattr(self, "aonames"):
if lastatom != info[1]:
scount = 1
pcount = 3
dcount = 6 # six d orbitals in Jaguar
if info[2] == 'S':
aonames.append("%s_%i%s" % (info[1], scount, info[2]))
scount += 1
if info[2] == 'X' or info[2] == 'Y' or info[2] == 'Z':
aonames.append("%s_%iP%s" % (info[1], pcount / 3, info[2]))
pcount += 1
if info[2] == 'XX' or info[2] == 'YY' or info[2] == 'ZZ' or \
info[2] == 'XY' or info[2] == 'XZ' or info[2] == 'YZ':
aonames.append("%s_%iD%s" % (info[1], dcount / 6, info[2]))
dcount += 1
lastatom = info[1]
for j in range(len(info[3:])):
mocoeffs[j+k, i] = float(info[3+j])
line = next(inputfile)
if not hasattr(self, "aonames"):
self.aonames = aonames
offset += 5
self.mocoeffs.append(mocoeffs)
# Atomic charges from Mulliken population analysis:
#
# Atom C1 C2 C3 C4 C5
# Charge 0.00177 -0.06075 -0.05956 0.00177 -0.06075
#
# Atom H6 H7 H8 C9 C10
# ...
if line.strip() == "Atomic charges from Mulliken population analysis:":
if not hasattr(self, 'atomcharges'):
self.atomcharges = {}
charges = []
self.skip_line(inputfile, "blank")
line = next(inputfile)
while "sum of atomic charges" not in line:
assert line.split()[0] == "Atom"
line = next(inputfile)
assert line.split()[0] == "Charge"
charges.extend([float(c) for c in line.split()[1:]])
self.skip_line(inputfile, "blank")
line = next(inputfile)
self.atomcharges['mulliken'] = charges
if (line[2:6] == "olap") or (line.strip() == "overlap matrix:"):
if line[6] == "-":
return
# This was continue (in loop) before parser refactoring.
# continue # avoid "olap-dev"
self.aooverlaps = numpy.zeros((self.nbasis, self.nbasis), "d")
for i in range(0, self.nbasis, 5):
self.updateprogress(inputfile, "Overlap")
self.skip_lines(inputfile, ['b', 'header'])
for j in range(i, self.nbasis):
temp = list(map(float, next(inputfile).split()[1:]))
self.aooverlaps[j, i:(i+len(temp))] = temp
self.aooverlaps[i:(i+len(temp)), j] = temp
if line[2:24] == "start of program geopt":
if not self.geoopt:
# Need to keep only the RMS density change info
# if this is a geometry optimization
self.scftargets = [[self.scftargets[0][0]]]
if hasattr(self, "scfvalues"):
self.scfvalues[0] = [[x[0]] for x in self.scfvalues[0]]
self.geoopt = True
else:
self.scftargets.append([5E-5])
# Get Geometry Opt convergence information
#
# geometry optimization step 7
# energy: -382.30219111487 hartrees
# [ turning on trust-radius adjustment ]
# ** restarting optimization from step 6 **
#
#
# Level shifts adjusted to satisfy step-size constraints
# Step size: 0.0360704
# Cos(theta): 0.8789215
# Final level shift: -8.6176299E-02
#
# energy change: 2.5819E-04 . ( 5.0000E-05 )
# gradient maximum: 5.0947E-03 . ( 4.5000E-04 )
# gradient rms: 1.2996E-03 . ( 3.0000E-04 )
# displacement maximum: 1.3954E-02 . ( 1.8000E-03 )
# displacement rms: 4.6567E-03 . ( 1.2000E-03 )
#
if line[2:28] == "geometry optimization step":
if not hasattr(self, "geovalues"):
self.geovalues = []
self.geotargets = numpy.zeros(5, "d")
gopt_step = int(line.split()[-1])
energy = next(inputfile)
blank = next(inputfile)
# A quick hack for messages that show up right after the energy
# at this point, which include:
# ** restarting optimization from step 2 **
# [ turning on trust-radius adjustment ]
# as found in regression file ptnh3_2_H2O_2_2plus.out and other logfiles.
restarting_from_1 = False
while blank.strip():
if blank.strip() == "** restarting optimization from step 1 **":
restarting_from_1 = True
blank = next(inputfile)
# One or more blank lines, depending on content.
line = next(inputfile)
while not line.strip():
line = next(inputfile)
# Note that the level shift message is followed by a blank, too.
if "Level shifts adjusted" in line:
while line.strip():
line = next(inputfile)
line = next(inputfile)
# The first optimization step does not produce an energy change, and
# there is also no energy change when the optimization is restarted
# from step 1 (since step 1 had no change).
values = []
target_index = 0
if (gopt_step == 1) or restarting_from_1:
values.append(0.0)
target_index = 1
while line.strip():
if len(line) > 40 and line[41] == "(":
# A new geo convergence value
values.append(float(line[26:37]))
self.geotargets[target_index] = float(line[43:54])
target_index += 1
line = next(inputfile)
self.geovalues.append(values)
# IR output looks like this:
# frequencies 72.45 113.25 176.88 183.76 267.60 312.06
# symmetries Au Bg Au Bu Ag Bg
# intensities 0.07 0.00 0.28 0.52 0.00 0.00
# reduc. mass 1.90 0.74 1.06 1.42 1.19 0.85
# force const 0.01 0.01 0.02 0.03 0.05 0.05
# C1 X 0.00000 0.00000 0.00000 -0.05707 -0.06716 0.00000
# C1 Y 0.00000 0.00000 0.00000 0.00909 -0.02529 0.00000
# C1 Z 0.04792 -0.06032 -0.01192 0.00000 0.00000 0.11613
# C2 X 0.00000 0.00000 0.00000 -0.06094 -0.04635 0.00000
# ... etc. ...
# This is a complete output; some files will not have intensities,
# and older Jaguar versions sometimes skip the symmetries.
if line[2:23] == "start of program freq":
self.skip_line(inputfile, 'blank')
# Version 8.3 has two blank lines here, earlier versions just one.
line = next(inputfile)
if not line.strip():
line = next(inputfile)
self.vibfreqs = []
self.vibdisps = []
forceconstants = False
intensities = False
while line.strip():
if "force const" in line:
forceconstants = True
if "intensities" in line:
intensities = True
line = next(inputfile)
# In older versions, the last block had an extra blank line after it,
# which could be caught. This is not true in newer versions (including 8.3),
# but in general it would be better to bound this loop more strictly.
freqs = next(inputfile)
while freqs.strip() and not "imaginary frequencies" in freqs:
# Number of modes (columns printed in this block).
nmodes = len(freqs.split())-1
# Append the frequencies.
self.vibfreqs.extend(list(map(float, freqs.split()[1:])))
line = next(inputfile).split()
# May skip symmetries (older Jaguar versions).
if line[0] == "symmetries":
if not hasattr(self, "vibsyms"):
self.vibsyms = []
self.vibsyms.extend(list(map(self.normalisesym, line[1:])))
line = next(inputfile).split()
if intensities:
if not hasattr(self, "vibirs"):
self.vibirs = []
self.vibirs.extend(list(map(float, line[1:])))
line = next(inputfile).split()
if forceconstants:
line = next(inputfile)
# Start parsing the displacements.
# Variable 'q' holds up to 7 lists of triplets.
q = [[] for i in range(7)]
for n in range(self.natom):
# Variable 'p' holds up to 7 triplets.
p = [[] for i in range(7)]
for i in range(3):
line = next(inputfile)
disps = [float(disp) for disp in line.split()[2:]]
for j in range(nmodes):
p[j].append(disps[j])
for i in range(nmodes):
q[i].append(p[i])
self.vibdisps.extend(q[:nmodes])
self.skip_line(inputfile, 'blank')
freqs = next(inputfile)
# Convert new data to arrays.
self.vibfreqs = numpy.array(self.vibfreqs, "d")
self.vibdisps = numpy.array(self.vibdisps, "d")
if hasattr(self, "vibirs"):
self.vibirs = numpy.array(self.vibirs, "d")
# Parse excited state output (for CIS calculations).
# Jaguar calculates only singlet states.
if line[2:15] == "Excited State":
if not hasattr(self, "etenergies"):
self.etenergies = []
if not hasattr(self, "etoscs"):
self.etoscs = []
if not hasattr(self, "etsecs"):
self.etsecs = []
self.etsyms = []
etenergy = float(line.split()[3])
etenergy = utils.convertor(etenergy, "eV", "wavenumber")
self.etenergies.append(etenergy)
self.skip_lines(inputfile, ['line', 'line', 'line', 'line'])
line = next(inputfile)
self.etsecs.append([])
# Jaguar calculates only singlet states.
self.etsyms.append('Singlet-A')
while line.strip() != "":
fromMO = int(line.split()[0])-1
toMO = int(line.split()[2])-1
coeff = float(line.split()[-1])
self.etsecs[-1].append([(fromMO, 0), (toMO, 0), coeff])
line = next(inputfile)
# Skip 3 lines, then read the line with the oscillator strength
for i in range(4):
line = next(inputfile)
strength = float(line.split()[-1])
self.etoscs.append(strength)
if line[:20] == ' Total elapsed time:' \
or line[:18] == ' Total cpu seconds':
self.metadata['success'] = True
cclib-1.6.2/cclib/parser/logfileparser.py 0000664 0000000 0000000 00000052326 13535330462 0020370 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2019, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Generic output file parser and related tools"""
import bz2
import fileinput
import gzip
import inspect
import io
import logging
import os
import random
import sys
import zipfile
if sys.version_info.major == 2:
getargspec = inspect.getargspec
else:
getargspec = inspect.getfullargspec
import numpy
from cclib.parser import utils
from cclib.parser.data import ccData
from cclib.parser.data import ccData_optdone_bool
# This seems to avoid a problem with Avogadro.
logging.logMultiprocessing = 0
class myBZ2File(bz2.BZ2File):
"""Return string instead of bytes"""
def __next__(self):
line = super(bz2.BZ2File, self).__next__()
return line.decode("ascii", "replace")
def next(self):
line = self.__next__()
return line
class myGzipFile(gzip.GzipFile):
"""Return string instead of bytes"""
def __next__(self):
super_ob = super(gzip.GzipFile, self)
# seemingly different versions of gzip can have either next or __next__
if hasattr(super_ob, 'next'):
line = super_ob.next()
else:
line = super_ob.__next__()
return line.decode("ascii", "replace")
def next(self):
line = self.__next__()
return line
class myFileinputFile(fileinput.FileInput):
"""Implement next() method"""
def next(self):
line = next(self)
return line
class FileWrapper(object):
"""Wrap a file-like object or stream with some custom tweaks"""
def __init__(self, source, pos=0):
self.src = source
# Most file-like objects have seek and tell methods, but streams returned
# by urllib.urlopen in Python2 do not, which will raise an AttributeError
# in this code. On the other hand, in Python3 these methods do exist since
# urllib uses the stream class in the io library, but they raise a different
# error, namely io.UnsupportedOperation. That is why it is hard to be more
# specific with except block here.
try:
self.src.seek(0, 2)
self.size = self.src.tell()
self.src.seek(pos, 0)
except (AttributeError, IOError, io.UnsupportedOperation):
# Stream returned by urllib should have size information.
if hasattr(self.src, 'headers') and 'content-length' in self.src.headers:
self.size = int(self.src.headers['content-length'])
else:
self.size = pos
# Assume the position is what was passed to the constructor.
self.pos = pos
def next(self):
line = next(self.src)
self.pos += len(line)
return line
def __next__(self):
return self.next()
def __iter__(self):
return self
def close(self):
self.src.close()
def seek(self, pos, ref):
# If we are seeking to the end, we can usually emulate it. As explained above,
# we cannot be too specific with the except clause due to differences
# between Python2 and 3. Yet another reason to drop Python 2 soon!
try:
self.src.seek(pos, ref)
except:
if ref == 2:
self.src.read()
else:
raise
if ref == 0:
self.pos = pos
if ref == 1:
self.pos += pos
if ref == 2 and hasattr(self, 'size'):
self.pos = self.size
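# A minimal illustration of FileWrapper (a sketch, not part of the library API
# contract): wrapping an in-memory stream gives the position tracking that the
# progress reporting in Logfile.parse() relies on.
#
#     wrapper = FileWrapper(io.StringIO("line one\nline two\n"))
#     next(wrapper)      # returns "line one\n"
#     wrapper.pos        # 9, the number of characters read so far
#     wrapper.size       # 18, the total size determined via seek/tell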
def openlogfile(filename, object=None):
"""Return a file object given a filename or if object specified decompresses it
if needed and wrap it up.
Given the filename or file object of a log file or a gzipped, zipped, or bzipped
log file, this function returns a file-like object.
Given a list of filenames, this function returns a FileInput object,
which can be used for seamless iteration without concatenation.
"""
# If there is a single string argument given.
if type(filename) is str:
extension = os.path.splitext(filename)[1]
if extension == ".gz":
fileobject = myGzipFile(filename, "r", fileobj=object)
elif extension == ".zip":
zip = zipfile.ZipFile(object, "r") if object else zipfile.ZipFile(filename, "r")
assert len(zip.namelist()) == 1, "ERROR: Zip file contains more than 1 file"
fileobject = io.StringIO(zip.read(zip.namelist()[0]).decode("ascii", "ignore"))
elif extension in ['.bz', '.bz2']:
# Module 'bz2' is not always importable.
assert bz2 is not None, "ERROR: module bz2 cannot be imported"
fileobject = myBZ2File(object, "r") if object else myBZ2File(filename, "r")
else:
# Assuming that object is a text file encoded in UTF-8.
fileobject = io.StringIO(object.decode('utf-8')) if object \
else FileWrapper(io.open(filename, "r", errors='ignore'))
return fileobject
elif hasattr(filename, "__iter__"):
# This is needed, because fileinput will assume stdin when filename is empty.
if len(filename) == 0:
return None
# Compression (gzip and bzip) is supported as of Python 2.5.
if sys.version_info >= (2, 5):
fileobject = fileinput.input(filename, openhook=fileinput.hook_compressed)
else:
fileobject = myFileinputFile(filename)
return fileobject
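# Usage sketch for openlogfile (illustrative only; "water.out.gz" is a
# hypothetical path, not a file shipped with cclib):
#
#     inputfile = openlogfile("water.out.gz")
#     first_line = next(inputfile)
#     inputfile.close()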
class Logfile(object):
"""Abstract class for logfile objects.
Subclasses defined by cclib:
ADF, DALTON, GAMESS, GAMESSUK, Gaussian, Jaguar, Molpro, MOPAC,
NWChem, ORCA, Psi, Q-Chem
"""
def __init__(self, source, loglevel=logging.ERROR, logname="Log",
logstream=sys.stderr, datatype=ccData_optdone_bool, **kwds):
"""Initialise the Logfile object.
This should be called by a subclass in its own __init__ method.
Inputs:
source - a logfile, list of logfiles, or stream with at least a read method
loglevel - integer corresponding to a log level from the logging module
logname - name of the source logfile passed to this constructor
logstream - where to output the logging information
datatype - class to use for gathering data attributes
"""
# Set the filename to source if it is a string or a list of strings, which are
# assumed to be filenames. Otherwise, assume the source is a file-like object
# if it has a read method, and we will try to use it like a stream.
self.isfileinput = False
if isinstance(source, str):
self.filename = source
self.isstream = False
elif isinstance(source, list) and all([isinstance(s, str) for s in source]):
self.filename = source
self.isstream = False
elif isinstance(source, fileinput.FileInput):
self.filename = source
self.isstream = False
self.isfileinput = True
elif hasattr(source, "read"):
self.filename = "stream %s" % str(type(source))
self.isstream = True
self.stream = source
else:
raise ValueError("Unexpected source type.")
# Set up the logger.
# Note that calling logging.getLogger() with one name always returns the same instance.
# Presently in cclib, all parser instances of the same class use the same logger,
# which means that care needs to be taken not to duplicate handlers.
self.loglevel = loglevel
self.logname = logname
self.logger = logging.getLogger('%s %s' % (self.logname, self.filename))
self.logger.setLevel(self.loglevel)
if len(self.logger.handlers) == 0:
handler = logging.StreamHandler(logstream)
handler.setFormatter(logging.Formatter("[%(name)s %(levelname)s] %(message)s"))
self.logger.addHandler(handler)
# Set up the metadata.
if not hasattr(self, "metadata"):
self.metadata = {}
self.metadata["package"] = self.logname
self.metadata["methods"] = []
# Indicate if the computation has completed successfully
self.metadata['success'] = False
# Periodic table of elements.
self.table = utils.PeriodicTable()
# This is the class that will be used in the data object returned by parse(), and should
# normally be ccData or a subclass of it.
self.datatype = datatype
# Change the class used if we want optdone to be a list or if the 'future' option
# is used, which might have more consequences in the future.
optdone_as_list = kwds.get("optdone_as_list", False) or kwds.get("future", False)
optdone_as_list = optdone_as_list if isinstance(optdone_as_list, bool) else False
if optdone_as_list:
self.datatype = ccData
# Parsing of Natural Orbitals and Natural Spin Orbitals into one attribute
self.unified_no_nso = kwds.get("future",False)
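# Construction sketch (illustrative): Logfile itself is abstract, so a concrete
# parser subclass is instantiated instead; "benzene.log" is a hypothetical file.
#
#     from cclib.parser import Gaussian
#     parser = Gaussian("benzene.log", loglevel=logging.INFO)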
def __setattr__(self, name, value):
# Send info to logger if the attribute is in the list of attributes.
if name in ccData._attrlist and hasattr(self, "logger"):
# Call logger.info() only if the attribute is new.
if not hasattr(self, name):
if type(value) in [numpy.ndarray, list]:
self.logger.info("Creating attribute %s[]" % name)
else:
self.logger.info("Creating attribute %s: %s" % (name, str(value)))
# Set the attribute.
object.__setattr__(self, name, value)
def parse(self, progress=None, fupdate=0.05, cupdate=0.002):
"""Parse the logfile, using the assumed extract method of the child."""
# Check that the sub-class has an extract attribute,
# that is callable with the proper number of arguments.
if not hasattr(self, "extract"):
raise AttributeError("Class %s has no extract() method." % self.__class__.__name__)
if not callable(self.extract):
raise AttributeError("Method %s._extract not callable." % self.__class__.__name__)
if len(getargspec(self.extract)[0]) != 3:
raise AttributeError("Method %s._extract takes wrong number of arguments." % self.__class__.__name__)
# Save the current list of attributes to keep after parsing.
# The dict of self should be the same after parsing.
_nodelete = list(set(self.__dict__.keys()))
# Initiate the FileInput object for the input files.
# Remember that self.filename can be a list of files.
if not self.isstream:
if not self.isfileinput:
inputfile = openlogfile(self.filename)
else:
inputfile = self.filename
else:
inputfile = FileWrapper(self.stream)
# Initialize self.progress
is_compressed = isinstance(inputfile, myGzipFile) or isinstance(inputfile, myBZ2File)
if progress and not (is_compressed):
self.progress = progress
self.progress.initialize(inputfile.size)
self.progress.step = 0
self.fupdate = fupdate
self.cupdate = cupdate
# Maybe the sub-class has something to do before parsing.
self.before_parsing()
# Loop over lines in the file object and call extract().
# This is where the actual parsing is done.
for line in inputfile:
self.updateprogress(inputfile, "Unsupported information", cupdate)
# This call should check if the line begins a section of extracted data.
# If it does, it parses some lines and sets the relevant attributes (to self).
# Any attributes can be freely set and used across calls, however only those
# in data._attrlist will be moved to final data object that is returned.
try:
self.extract(inputfile, line)
except StopIteration:
self.logger.error("Unexpectedly encountered end of logfile.")
break
# Close input file object.
if not self.isstream:
inputfile.close()
# Maybe the sub-class has something to do after parsing.
self.after_parsing()
# If atomcoords were not parsed, but some input coordinates were ("inputcoords").
# This is originally from the Gaussian parser, a regression fix.
if not hasattr(self, "atomcoords") and hasattr(self, "inputcoords"):
self.atomcoords = numpy.array(self.inputcoords, 'd')
# Set nmo if not set already - to nbasis.
if not hasattr(self, "nmo") and hasattr(self, "nbasis"):
self.nmo = self.nbasis
# Create a default coreelectrons array, unless it's impossible
# to determine.
if not hasattr(self, "coreelectrons") and hasattr(self, "natom"):
self.coreelectrons = numpy.zeros(self.natom, "i")
if hasattr(self, "incorrect_coreelectrons"):
self.__delattr__("coreelectrons")
# Create the data object we want to return. This is normally ccData, but can be changed
# by passing the datatype argument to the constructor. All supported cclib attributes
# are copied to this object, but beware that in order to be moved an attribute must be
# included in the data._attrlist of ccData (or whatever else).
# There is the possibility of passing additional arguments via self.data_args, but
# we use this sparingly in cases where we want to limit the API with options, etc.
data = self.datatype(attributes=self.__dict__)
# Now make sure that the cclib attributes in the data object are all the correct type,
# including arrays and lists of arrays.
data.arrayify()
# Delete all temporary attributes (including cclib attributes).
# All attributes should have been moved to a data object, which will be returned.
for attr in list(self.__dict__.keys()):
if not attr in _nodelete:
self.__delattr__(attr)
# Perform final checks on values of attributes.
data.check_values(logger=self.logger)
# Update self.progress as done.
if hasattr(self, "progress"):
self.progress.update(inputfile.size, "Done")
return data
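# Typical flow (a sketch): once a concrete parser has been constructed, parse()
# drives extract() over the input and returns the populated data object.
#
#     data = parser.parse()
#     if hasattr(data, "scfenergies"):
#         print(data.scfenergies[-1])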
def before_parsing(self):
"""Set parser-specific variables and do other initial things here."""
pass
def after_parsing(self):
"""Correct data or do parser-specific validation after parsing is finished."""
pass
def updateprogress(self, inputfile, msg, xupdate=0.05):
"""Update progress."""
if hasattr(self, "progress") and random.random() < xupdate:
newstep = inputfile.pos
if newstep != self.progress.step:
self.progress.update(newstep, msg)
self.progress.step = newstep
def normalisesym(self, symlabel):
"""Standardise the symmetry labels between parsers.
This method should be overwritten by individual parsers, and should
contain appropriate doctests. If it is not overwritten, this is detected
as an error by unit tests.
"""
raise NotImplementedError("normalisesym(self, symlabel) must be overriden by the parser.")
def float(self, number):
"""Convert a string to a float.
This method should perform certain checks that are specific to cclib,
including avoiding the problem with Ds instead of Es in scientific notation.
Another point is converting strings signifying numerical problems (*****)
to something we can manage (Numpy's NaN).
"""
if list(set(number)) == ['*']:
return numpy.nan
return float(number.replace("D", "E"))
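# Illustrative behavior of float() above (a sketch, not a doctest; ``lf`` stands
# for any concrete Logfile subclass instance):
#
#     lf.float("-1.2345D-02")   # -> -0.012345
#     lf.float("********")      # -> nan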
def new_internal_job(self):
"""Delete attributes that can be problematic in multistep jobs.
TODO: instead of this hack, parse each job in a multistep computation
as a different ccData object (this is for 2.x).
Some computations are actually sequences of several jobs, and some
attributes won't work well if parsed across jobs. These include:
mpenergies: if different jobs go to different orders then
these won't be consistent and can't be converted
to an array easily
"""
for name in ("mpenergies",):
if hasattr(self, name):
delattr(self, name)
def set_attribute(self, name, value, check_change=True):
"""Set an attribute and perform an optional check when it already exists.
Note that this can be used for scalars and lists alike, whenever we want
to set a value for an attribute.
Parameters
----------
name: str
The name of the attribute.
value: str
The value for the attribute.
check_change: bool
By default we want to check that the value does not change
if the attribute already exists.
"""
if check_change and hasattr(self, name):
try:
numpy.testing.assert_equal(getattr(self, name), value)
except AssertionError:
self.logger.warning("Attribute %s changed value (%s -> %s)" % (name, getattr(self, name), value))
setattr(self, name, value)
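# Sketch of intended use from a parser's extract() method (the attribute name
# 'natom' follows cclib conventions; the values here are made up):
#
#     self.set_attribute('natom', 3)    # creates the attribute
#     self.set_attribute('natom', 3)    # identical value, no warning
#     self.set_attribute('natom', 4)    # different value, logs a warning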
def append_attribute(self, name, value):
"""Appends a value to an attribute."""
if not hasattr(self, name):
self.set_attribute(name, [])
getattr(self, name).append(value)
def extend_attribute(self, name, values):
"""Appends an iterable of values to an attribute."""
if not hasattr(self, name):
self.set_attribute(name, [])
getattr(self, name).extend(values)
def _assign_coreelectrons_to_element(self, element, ncore,
ncore_is_total_count=False):
"""Assign core electrons to all instances of the element.
It's usually reasonable to do this for all atoms of a given element,
because mixed usage isn't normally allowed within elements.
Parameters
----------
element: str
the chemical element to set coreelectrons for
ncore: int
the number of core electrons
ncore_is_total_count: bool
whether the ncore argument is the total count, in which case it is
divided by the number of atoms of this element
"""
atomsymbols = [self.table.element[atomno] for atomno in self.atomnos]
indices = [i for i, el in enumerate(atomsymbols) if el == element]
if ncore_is_total_count:
ncore = ncore // len(indices)
if not hasattr(self, 'coreelectrons'):
self.coreelectrons = numpy.zeros(self.natom, 'i')
self.coreelectrons[indices] = ncore
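# Worked illustration (hypothetical numbers): with two Fe atoms present and
# ncore_is_total_count=True, a total of 20 core electrons is split into 10
# per Fe atom before being written into self.coreelectrons.
#
#     self._assign_coreelectrons_to_element('Fe', 20, ncore_is_total_count=True)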
def skip_lines(self, inputfile, sequence):
"""Read trivial line types and check they are what they are supposed to be.
This function will read len(sequence) lines and do certain checks on them,
when the elements of sequence have the appropriate values. Currently the
following elements trigger checks:
'blank' or 'b' - the line should be blank
'dashes' or 'd' - the line should contain only dashes (or spaces)
'equals' or 'e' - the line should contain only equal signs (or spaces)
'stars' or 's' - the line should contain only stars (or spaces)
"""
expected_characters = {
'-': ['dashes', 'd'],
'=': ['equals', 'e'],
'*': ['stars', 's'],
}
lines = []
for expected in sequence:
# Read the line we want to skip.
line = next(inputfile)
# Blank lines are perhaps the most common thing we want to check for.
if expected in ["blank", "b"]:
try:
assert line.strip() == ""
except AssertionError:
frame, fname, lno, funcname, funcline, index = inspect.getouterframes(inspect.currentframe())[1]
parser = fname.split('/')[-1]
msg = "In %s, line %i, line not blank as expected: %s" % (parser, lno, line.strip())
self.logger.warning(msg)
# All cases of heterogeneous lines can be dealt with by the same code.
for character, keys in expected_characters.items():
if expected in keys:
try:
assert all([c == character for c in line.strip() if c != ' '])
except AssertionError:
frame, fname, lno, funcname, funcline, index = inspect.getouterframes(inspect.currentframe())[1]
parser = fname.split('/')[-1]
msg = "In %s, line %i, line not all %s as expected: %s" % (parser, lno, keys[0], line.strip())
self.logger.warning(msg)
continue
# Save the skipped line, and we will return the whole list.
lines.append(line)
return lines
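# Example call (a sketch): skip a blank line, a dashes-only line and another
# blank line, getting the skipped lines back for optional inspection.
#
#     blank1, dashes, blank2 = self.skip_lines(inputfile, ['b', 'd', 'b'])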
skip_line = lambda self, inputfile, expected: self.skip_lines(inputfile, [expected])
cclib-1.6.2/cclib/parser/molcasparser.py 0000664 0000000 0000000 00000126553 13535330462 0020231 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2018, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Parser for Molcas output files"""
from __future__ import print_function
import re
import string
import numpy
from cclib.parser import logfileparser
from cclib.parser import utils
class Molcas(logfileparser.Logfile):
"""A Molcas log file."""
def __init__(self, *args, **kwargs):
# Call the __init__ method of the superclass
super(Molcas, self).__init__(logname="Molcas", *args, **kwargs)
def __str__(self):
"""Return a string repeesentation of the object."""
return "Molcas log file %s" % (self.filename)
def __repr__(self):
"""Return a representation of the object."""
return 'Molcas("%s")' % (self.filename)
#These are yet to be implemented.
def normalisesym(self, label):
"""Does Molcas require symmetry label normalization?"""
def after_parsing(self):
for element, ncore in self.core_array:
self._assign_coreelectrons_to_element(element, ncore)
def before_parsing(self):
# Compile the regex for extracting the element symbol from the
# atom label in the "Molecular structure info" block.
self.re_atomelement = re.compile('([a-zA-Z]+)\d?')
# Compile the dashes-and-or-spaces-only regex.
self.re_dashes_and_spaces = re.compile('^[\s-]+$')
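# Quick illustration (a sketch) of the two regexes above:
#
#     self.re_atomelement.match('C12').groups()[0]      # -> 'C'
#     bool(self.re_dashes_and_spaces.match(' --- '))    # -> True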
# Molcas can do multiple calculations in one job, and each one
# starts from the gateway module. Only parse the first.
# TODO: It would be best to parse each calculation as a separate
# ccData object and return an iterator - something for 2.x
self.gateway_module_count = 0
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
if "Start Module: gateway" in line:
self.gateway_module_count += 1
if self.gateway_module_count > 1:
return
# Extract the version number and optionally the Git tag and hash.
if "version" in line:
match = re.search(r"\s{2,}version\s(\d*\.\d*)", line)
if match:
package_version = match.groups()[0]
self.metadata["package_version"] = package_version
# Don't add revision information to the main package version for now.
if "tag" in line:
tag = line.split()[-1]
if "build" in line:
match = re.search(r"\*\s*build\s(\S*)\s*\*", line)
if match:
revision = match.groups()[0]
## This section is present when executing &GATEWAY.
# ++ Molecular structure info:
# -------------------------
# ************************************************
# **** Cartesian Coordinates / Bohr, Angstrom ****
# ************************************************
# Center Label x y z x y z
# 1 C1 0.526628 -2.582937 0.000000 0.278679 -1.366832 0.000000
# 2 C2 2.500165 -0.834760 0.000000 1.323030 -0.441736 0.000000
if line[25:63] == 'Cartesian Coordinates / Bohr, Angstrom':
if not hasattr(self, 'atomnos'):
self.atomnos = []
self.skip_lines(inputfile, ['stars', 'blank', 'header'])
line = next(inputfile)
atomelements = []
atomcoords = []
while line.strip() not in ('', '--'):
sline = line.split()
atomelement = sline[1].rstrip(string.digits).title()
atomelements.append(atomelement)
atomcoords.append(list(map(float, sline[5:])))
line = next(inputfile)
self.append_attribute('atomcoords', atomcoords)
if self.atomnos == []:
self.atomnos = [self.table.number[ae.title()] for ae in atomelements]
if not hasattr(self, 'natom'):
self.set_attribute('natom', len(self.atomnos))
## This section is present when executing &SCF.
# ++ Orbital specifications:
# -----------------------
# Symmetry species 1
# Frozen orbitals 0
# Occupied orbitals 3
# Secondary orbitals 77
# Deleted orbitals 0
# Total number of orbitals 80
# Number of basis functions 80
# --
if line[:29] == '++ Orbital specifications:':
self.skip_lines(inputfile, ['dashes', 'blank'])
line = next(inputfile)
symmetry_count = 1
while not line.startswith('--'):
if line.strip().startswith('Symmetry species'):
symmetry_count = int(line.split()[-1])
if line.strip().startswith('Total number of orbitals'):
nmos = line.split()[-symmetry_count:]
self.set_attribute('nmo', sum(map(int, nmos)))
if line.strip().startswith('Number of basis functions'):
nbasis = line.split()[-symmetry_count:]
self.set_attribute('nbasis', sum(map(int, nbasis)))
line = next(inputfile)
if line.strip().startswith(('Molecular charge', 'Total molecular charge')):
self.set_attribute('charge', int(float(line.split()[-1])))
# ++ Molecular charges:
# ------------------
# Mulliken charges per centre and basis function type
# ---------------------------------------------------
# C1
# 1s 2.0005
# 2s 2.0207
# 2px 0.0253
# 2pz 0.1147
# 2py 1.8198
# *s -0.0215
# *px 0.0005
# *pz 0.0023
# *py 0.0368
# *d2+ 0.0002
# *d1+ 0.0000
# *d0 0.0000
# *d1- 0.0000
# *d2- 0.0000
# *f3+ 0.0000
# *f2+ 0.0001
# *f1+ 0.0000
# *f0 0.0001
# *f1- 0.0001
# *f2- 0.0000
# *f3- 0.0003
# *g4+ 0.0000
# *g3+ 0.0000
# *g2+ 0.0000
# *g1+ 0.0000
# *g0 0.0000
# *g1- 0.0000
# *g2- 0.0000
# *g3- 0.0000
# *g4- 0.0000
# Total 6.0000
# N-E 0.0000
# Total electronic charge= 6.000000
# Total charge= 0.000000
#--
if line[:24] == '++ Molecular charges:':
atomcharges = []
while line[6:29] != 'Total electronic charge':
line = next(inputfile)
if line[6:9] == 'N-E':
atomcharges.extend(map(float, line.split()[1:]))
# Molcas only performs Mulliken population analysis.
self.set_attribute('atomcharges', {'mulliken': atomcharges})
# Ensure the charge printed here is identical to the
# charge printed before entering the SCF.
self.skip_line(inputfile, 'blank')
line = next(inputfile)
assert line[6:30] == 'Total charge='
if hasattr(self, 'charge'):
assert int(float(line.split()[2])) == self.charge
# This section is present when executing &SCF
# This section parses the total SCF Energy.
# *****************************************************************************************************************************
# * *
# * SCF/KS-DFT Program, Final results *
# * *
# * *
# * *
# * Final Results *
# * *
# *****************************************************************************************************************************
# :: Total SCF energy -37.6045426484
if line[:22] == ':: Total SCF energy' or line[:25] == ':: Total KS-DFT energy':
if not hasattr(self, 'scfenergies'):
self.scfenergies = []
scfenergy = float(line.split()[-1])
self.scfenergies.append(utils.convertor(scfenergy, 'hartree', 'eV'))
## Parsing the scftargets in this section
# ++ Optimization specifications:
# ----------------------------
# SCF Algorithm: Conventional
# Minimized density differences are used
# Number of density matrices in core 9
# Maximum number of NDDO SCF iterations 400
# Maximum number of HF SCF iterations 400
# Threshold for SCF energy change 0.10E-08
# Threshold for density matrix 0.10E-03
# Threshold for Fock matrix 0.15E-03
# Threshold for linear dependence 0.10E-08
# Threshold at which DIIS is turned on 0.15E+00
# Threshold at which QNR/C2DIIS is turned on 0.75E-01
# Threshold for Norm(delta) (QNR/C2DIIS) 0.20E-04
if line[:34] == '++ Optimization specifications:':
self.skip_lines(inputfile, ['d', 'b'])
line = next(inputfile)
if line.strip().startswith('SCF'):
scftargets = []
self.skip_lines(inputfile,
['Minimized', 'Number', 'Maximum', 'Maximum'])
lines = [next(inputfile) for i in range(7)]
targets = [
'Threshold for SCF energy change',
'Threshold for density matrix',
'Threshold for Fock matrix',
'Threshold for Norm(delta)',
]
for y in targets:
scftargets.extend([float(x.split()[-1]) for x in lines if y in x])
self.append_attribute('scftargets', scftargets)
# ++ Convergence information
# SCF iterations: Energy and convergence statistics
#
# Iter Tot. SCF One-electron Two-electron Energy Max Dij or Max Fij DNorm TNorm AccCon Time
# Energy Energy Energy Change Delta Norm in Sec.
# 1 -36.83817703 -50.43096166 13.59278464 0.00E+00 0.16E+00* 0.27E+01* 0.30E+01 0.33E+02 NoneDa 0.
# 2 -36.03405202 -45.74525152 9.71119950 0.80E+00* 0.14E+00* 0.93E-02* 0.26E+01 0.43E+01 Damp 0.
# 3 -37.08936118 -48.41536598 11.32600480 -0.11E+01* 0.12E+00* 0.91E-01* 0.97E+00 0.16E+01 Damp 0.
# 4 -37.31610460 -50.54103969 13.22493509 -0.23E+00* 0.11E+00* 0.96E-01* 0.72E+00 0.27E+01 Damp 0.
# 5 -37.33596239 -49.47021484 12.13425245 -0.20E-01* 0.59E-01* 0.59E-01* 0.37E+00 0.16E+01 Damp 0.
# ...
# Convergence after 26 Macro Iterations
# --
if line[46:91] == 'iterations: Energy and convergence statistics':
self.skip_line(inputfile, 'blank')
while line.split() != ['Energy', 'Energy', 'Energy', 'Change', 'Delta', 'Norm', 'in', 'Sec.']:
line = next(inputfile)
iteration_regex = ("^([0-9]+)" # Iter
"( [ \-0-9]*\.[0-9]{6,9})" # Tot. SCF Energy
"( [ \-0-9]*\.[0-9]{6,9})" # One-electron Energy
"( [ \-0-9]*\.[0-9]{6,9})" # Two-electron Energy
"( [ \-0-9]*\.[0-9]{2}E[\-\+][0-9]{2}\*?)" # Energy Change
"( [ \-0-9]*\.[0-9]{2}E[\-\+][0-9]{2}\*?)" # Max Dij or Delta Norm
"( [ \-0-9]*\.[0-9]{2}E[\-\+][0-9]{2}\*?)" # Max Fij
"( [ \-0-9]*\.[0-9]{2}E[\-\+][0-9]{2}\*?)" # DNorm
"( [ \-0-9]*\.[0-9]{2}E[\-\+][0-9]{2}\*?)" # TNorm
"( [ A-Za-z0-9]*)" # AccCon
"( [ \.0-9]*)$") # Time in Sec.
scfvalues = []
line = next(inputfile)
while not line.strip().startswith("Convergence"):
match = re.match(iteration_regex, line.strip())
if match:
groups = match.groups()
cols = [g.strip() for g in match.groups()]
cols = [c.replace('*', '') for c in cols]
energy = float(cols[4])
density = float(cols[5])
fock = float(cols[6])
dnorm = float(cols[7])
scfvalues.append([energy, density, fock, dnorm])
if line.strip() == "--":
self.logger.warning('File terminated before end of last SCF!')
break
line = next(inputfile)
self.append_attribute('scfvalues', scfvalues)
# Harmonic frequencies in cm-1
#
# IR Intensities in km/mol
#
# 1 2 3 4 5 6
#
# Frequency: i60.14 i57.39 128.18 210.06 298.24 309.65
#
# Intensity: 3.177E-03 2.129E-06 4.767E-01 2.056E-01 6.983E-07 1.753E-07
# Red. mass: 2.42030 2.34024 2.68044 3.66414 2.61721 3.34904
#
# C1 x -0.00000 0.00000 0.00000 -0.05921 0.00000 -0.06807
# C1 y 0.00001 -0.00001 -0.00001 0.00889 0.00001 -0.02479
# C1 z -0.03190 0.04096 -0.03872 0.00001 -0.12398 -0.00002
# C2 x -0.00000 0.00001 0.00000 -0.06504 0.00000 -0.03487
# C2 y 0.00000 -0.00000 -0.00000 0.01045 0.00001 -0.05659
# C2 z -0.03703 -0.03449 -0.07269 0.00000 -0.07416 -0.00001
# C3 x -0.00000 0.00001 0.00000 -0.06409 -0.00001 0.05110
# C3 y -0.00000 0.00001 0.00000 0.00152 0.00000 -0.03263
# C3 z -0.03808 -0.08037 -0.07267 -0.00001 0.07305 0.00000
# ...
# H20 y 0.00245 -0.00394 0.03215 0.03444 -0.10424 -0.10517
# H20 z 0.00002 -0.00001 0.00000 -0.00000 -0.00000 0.00000
#
#
#
# ++ Thermochemistry
if line[1:29] == 'Harmonic frequencies in cm-1':
self.skip_line(inputfile, 'blank')
line = next(inputfile)
while 'Thermochemistry' not in line:
if 'Frequency:' in line:
if not hasattr(self, 'vibfreqs'):
self.vibfreqs = []
vibfreqs = [float(i.replace('i', '-')) for i in line.split()[1:]]
self.vibfreqs.extend(vibfreqs)
if 'Intensity:' in line:
if not hasattr(self, 'vibirs'):
self.vibirs = []
vibirs = map(float, line.split()[1:])
self.vibirs.extend(vibirs)
if 'Red.' in line:
self.skip_line(inputfile, 'blank')
line = next(inputfile)
if not hasattr(self, 'vibdisps'):
self.vibdisps = []
disps = []
for n in range(3*self.natom):
numbers = [float(s) for s in line[17:].split()]
# The atomindex should start at 0 instead of 1.
atomindex = int(re.search(r'\d+$', line.split()[0]).group()) - 1
numbermodes = len(numbers)
if len(disps) == 0:
# Appends empty array of the following
# dimensions (numbermodes, natom, 0) to disps.
for mode in range(numbermodes):
disps.append([[] for x in range(0, self.natom)])
for mode in range(numbermodes):
disps[mode][atomindex].append(numbers[mode])
line = next(inputfile)
self.vibdisps.extend(disps)
line = next(inputfile)
## Parsing thermochemistry attributes here
# ++ Thermochemistry
#
# *********************
# * *
# * THERMOCHEMISTRY *
# * *
# *********************
#
# Mass-centered Coordinates (Angstrom):
# ***********************************************************
# ...
# *****************************************************
# Temperature = 0.00 Kelvin, Pressure = 1.00 atm
# -----------------------------------------------------
# Molecular Partition Function and Molar Entropy:
# q/V (M**-3) S(kcal/mol*K)
# Electronic 0.100000D+01 0.000
# Translational 0.100000D+01 0.000
# Rotational 0.100000D+01 2.981
# Vibrational 0.100000D+01 0.000
# TOTAL 0.100000D+01 2.981
#
# Thermal contributions to INTERNAL ENERGY:
# Electronic 0.000 kcal/mol 0.000000 au.
# Translational 0.000 kcal/mol 0.000000 au.
# Rotational 0.000 kcal/mol 0.000000 au.
# Vibrational 111.885 kcal/mol 0.178300 au.
# TOTAL 111.885 kcal/mol 0.178300 au.
#
# Thermal contributions to
# ENTHALPY 111.885 kcal/mol 0.178300 au.
# GIBBS FREE ENERGY 111.885 kcal/mol 0.178300 au.
#
# Sum of energy and thermal contributions
# INTERNAL ENERGY -382.121931 au.
# ENTHALPY -382.121931 au.
# GIBBS FREE ENERGY -382.121931 au.
# -----------------------------------------------------
# ...
# ENTHALPY -382.102619 au.
# GIBBS FREE ENERGY -382.179819 au.
# -----------------------------------------------------
# --
#
# ++ Isotopic shifts:
if line[4:19] == 'THERMOCHEMISTRY':
temperature_values = []
pressure_values = []
entropy_values = []
internal_energy_values = []
enthalpy_values = []
free_energy_values = []
while 'Isotopic' not in line:
if line[1:12] == 'Temperature':
temperature_values.append(float(line.split()[2]))
pressure_values.append(float(line.split()[6]))
if line[1:48] == 'Molecular Partition Function and Molar Entropy:':
while 'TOTAL' not in line:
line = next(inputfile)
entropy_values.append(utils.convertor(float(line.split()[2]), 'kcal/mol', 'hartree'))
if line[1:40] == 'Sum of energy and thermal contributions':
internal_energy_values.append(float(next(inputfile).split()[2]))
enthalpy_values.append(float(next(inputfile).split()[1]))
free_energy_values.append(float(next(inputfile).split()[3]))
line = next(inputfile)
# When calculations for more than one temperature value are
# performed, the values corresponding to room temperature (298.15 K)
# are returned; if no calculation is performed at 298.15 K, then
# the values corresponding to the last temperature value are returned.
index = -1
if 298.15 in temperature_values:
index = temperature_values.index(298.15)
self.set_attribute('temperature', temperature_values[index])
if len(temperature_values) > 1:
self.logger.warning('More than one value of temperature found')
self.set_attribute('pressure', pressure_values[index])
if len(pressure_values) > 1:
self.logger.warning('More than one value of pressure found')
self.set_attribute('entropy', entropy_values[index])
if len(entropy_values) > 1:
self.logger.warning('More than one value of entropy found')
self.set_attribute('enthalpy', enthalpy_values[index])
if len(enthalpy_values) > 1:
self.logger.warning('More than one value of enthalpy found')
self.set_attribute('freeenergy', free_energy_values[index])
if len(free_energy_values) > 1:
self.logger.warning('More than one value of freeenergy found')
## Parsing Geometrical Optimization attributes in this section.
# ++ Slapaf input parameters:
# ------------------------
#
# Max iterations: 2000
# Convergence test a la Schlegel.
# Convergence criterion on gradient/para.<=: 0.3E-03
# Convergence criterion on step/parameter<=: 0.3E-03
# Convergence criterion on energy change <=: 0.0E+00
# Max change of an internal coordinate: 0.30E+00
# ...
# ...
# **********************************************************************************************************************
# * Energy Statistics for Geometry Optimization *
# **********************************************************************************************************************
# Energy Grad Grad Step Estimated Geom Hessian
# Iter Energy Change Norm Max Element Max Element Final Energy Update Update Index
# 1 -382.30023222 0.00000000 0.107221 0.039531 nrc047 0.085726 nrc047 -382.30533799 RS-RFO None 0
# 2 -382.30702964 -0.00679742 0.043573 0.014908 nrc001 0.068195 nrc001 -382.30871333 RS-RFO BFGS 0
# 3 -382.30805348 -0.00102384 0.014883 0.005458 nrc010 -0.020973 nrc001 -382.30822089 RS-RFO BFGS 0
# ...
# ...
# 18 -382.30823419 -0.00000136 0.001032 0.000100 nrc053 0.012319 nrc053 -382.30823452 RS-RFO BFGS 0
# 19 -382.30823198 0.00000221 0.001051 -0.000092 nrc054 0.066565 nrc053 -382.30823822 RS-RFO BFGS 0
# 20 -382.30820252 0.00002946 0.001132 -0.000167 nrc021 -0.064003 nrc053 -382.30823244 RS-RFO BFGS 0
#
# +----------------------------------+----------------------------------+
# + Cartesian Displacements + Gradient in internals +
# + Value Threshold Converged? + Value Threshold Converged? +
# +-----+----------------------------------+----------------------------------+
# + RMS + 5.7330E-02 1.2000E-03 No + 1.6508E-04 3.0000E-04 Yes +
# +-----+----------------------------------+----------------------------------+
# + Max + 1.2039E-01 1.8000E-03 No + 1.6711E-04 4.5000E-04 Yes +
# +-----+----------------------------------+----------------------------------+
if 'Convergence criterion on energy change' in line:
self.energy_threshold = float(line.split()[6])
# If energy change threshold equals zero,
# then energy change is not a criterion for convergence.
if self.energy_threshold == 0:
self.energy_threshold = numpy.inf
if 'Energy Statistics for Geometry Optimization' in line:
if not hasattr(self, 'geovalues'):
self.geovalues = []
self.skip_lines(inputfile, ['stars', 'header'])
line = next(inputfile)
assert 'Iter Energy Change Norm' in line
# A variable keeping track of ongoing iteration.
iter_number = len(self.geovalues) + 1
# Iterate till blank line.
while line.split() != []:
for i in range(iter_number):
line = next(inputfile)
self.geovalues.append([float(line.split()[2])])
line = next(inputfile)
# Along with energy change, RMS and Max values of change in
# Cartesian Displacements and Gradients are used as optimization
# criteria.
self.skip_lines(inputfile, ['border', 'header', 'header', 'border'])
line = next(inputfile)
assert '+ RMS +' in line
line_rms = line.split()
line = next(inputfile)
line_max = next(inputfile).split()
if not hasattr(self, 'geotargets'):
# The attribute geotargets is an array consisting of the following
# values: [Energy threshold, Max Gradient threshold, RMS Gradient threshold, \
# Max Displacements threshold, RMS Displacements threshold].
max_gradient_threshold = float(line_max[8])
rms_gradient_threshold = float(line_rms[8])
max_displacement_threshold = float(line_max[4])
rms_displacement_threshold = float(line_rms[4])
self.geotargets = [self.energy_threshold, max_gradient_threshold, rms_gradient_threshold, max_displacement_threshold, rms_displacement_threshold]
max_gradient_change = float(line_max[7])
rms_gradient_change = float(line_rms[7])
max_displacement_change = float(line_max[3])
rms_displacement_change = float(line_rms[3])
self.geovalues[iter_number - 1].extend([max_gradient_change, rms_gradient_change, max_displacement_change, rms_displacement_change])
# *********************************************************
# * Nuclear coordinates for the next iteration / Angstrom *
# *********************************************************
# ATOM X Y Z
# C1 0.235560 -1.415847 0.012012
# C2 1.313797 -0.488199 0.015149
# C3 1.087050 0.895510 0.014200
# ...
# ...
# H19 -0.021327 -4.934915 -0.029355
# H20 -1.432030 -3.721047 -0.039835
#
# --
if 'Nuclear coordinates for the next iteration / Angstrom' in line:
self.skip_lines(inputfile, ['s', 'header'])
line = next(inputfile)
atomcoords = []
while line.split() != []:
atomcoords.append([float(c) for c in line.split()[1:]])
line = next(inputfile)
if len(atomcoords) == self.natom:
self.atomcoords.append(atomcoords)
else:
self.logger.warning(
"Parsed coordinates not consistent with previous, skipping. "
"This could be due to symmetry being turned on during the job. "
"Length was %i, now found %i. New coordinates: %s"
% (len(self.atomcoords[-1]), len(atomcoords), str(atomcoords)))
# **********************************************************************************************************************
# * Energy Statistics for Geometry Optimization *
# **********************************************************************************************************************
# Energy Grad Grad Step Estimated Geom Hessian
# Iter Energy Change Norm Max Element Max Element Final Energy Update Update Index
# 1 -382.30023222 0.00000000 0.107221 0.039531 nrc047 0.085726 nrc047 -382.30533799 RS-RFO None 0
# ...
# ...
# 23 -382.30823115 -0.00000089 0.001030 0.000088 nrc053 0.000955 nrc053 -382.30823118 RS-RFO BFGS 0
#
# +----------------------------------+----------------------------------+
# + Cartesian Displacements + Gradient in internals +
# + Value Threshold Converged? + Value Threshold Converged? +
# +-----+----------------------------------+----------------------------------+
# + RMS + 7.2395E-04 1.2000E-03 Yes + 2.7516E-04 3.0000E-04 Yes +
# +-----+----------------------------------+----------------------------------+
# + Max + 1.6918E-03 1.8000E-03 Yes + 8.7768E-05 4.5000E-04 Yes +
# +-----+----------------------------------+----------------------------------+
#
# Geometry is converged in 23 iterations to a Minimum Structure
if 'Geometry is converged' in line:
if not hasattr(self, 'optdone'):
self.optdone = []
self.optdone.append(len(self.atomcoords))
# *********************************************************
# * Nuclear coordinates of the final structure / Angstrom *
# *********************************************************
# ATOM X Y Z
# C1 0.235547 -1.415838 0.012193
# C2 1.313784 -0.488201 0.015297
# C3 1.087036 0.895508 0.014333
# ...
# ...
# H19 -0.021315 -4.934913 -0.029666
# H20 -1.431994 -3.721026 -0.041078
if 'Nuclear coordinates of the final structure / Angstrom' in line:
self.skip_lines(inputfile, ['s', 'header'])
line = next(inputfile)
atomcoords = []
while line.split() != []:
atomcoords.append([float(c) for c in line.split()[1:]])
line = next(inputfile)
if len(atomcoords) == self.natom:
self.atomcoords.append(atomcoords)
else:
self.logger.error(
'Number of atoms (%d) in parsed atom coordinates '
'is smaller than previously (%d), possibly due to '
'symmetry. Ignoring these coordinates.'
% (len(atomcoords), self.natom))
# All orbitals with orbital energies smaller than E(LUMO)+0.5 are printed
#
# ++ Molecular orbitals:
# -------------------
#
# Title: RKS-DFT orbitals
#
# Molecular orbitals for symmetry species 1: a
#
# Orbital 1 2 3 4 5 6 7 8 9 10
# Energy -10.0179 -10.0179 -10.0075 -10.0075 -10.0066 -10.0066 -10.0056 -10.0055 -9.9919 -9.9919
# Occ. No. 2.0000 2.0000 2.0000 2.0000 2.0000 2.0000 2.0000 2.0000 2.0000 2.0000
#
# 1 C1 1s -0.6990 0.6989 0.0342 0.0346 0.0264 -0.0145 -0.0124 -0.0275 -0.0004 -0.0004
# 2 C1 2s -0.0319 0.0317 -0.0034 -0.0033 -0.0078 0.0034 0.0041 0.0073 -0.0002 -0.0002
# ...
# ...
# 58 H18 1s 0.2678
# 59 H19 1s -0.2473
# 60 H20 1s 0.1835
# --
if '++ Molecular orbitals:' in line:
self.skip_lines(inputfile, ['d', 'b'])
line = next(inputfile)
# We don't currently support parsing natural orbitals or active space orbitals.
if 'Natural orbitals' not in line and "Pseudonatural" not in line:
self.skip_line(inputfile, 'b')
# Symmetry is not currently supported, so this line can have one form.
while 'Molecular orbitals for symmetry species 1: a' not in line.strip():
line = next(inputfile)
# Symmetry is not currently supported, so this line can have one form.
if line.strip() != 'Molecular orbitals for symmetry species 1: a':
return
line = next(inputfile)
moenergies = []
homos = 0
mocoeffs = []
while line[:2] != '--':
line = next(inputfile)
if line.strip().startswith('Orbital'):
orbital_index = line.split()[1:]
for i in orbital_index:
mocoeffs.append([])
if 'Energy' in line:
energies = [utils.convertor(float(x), 'hartree', 'eV') for x in line.split()[1:]]
moenergies.extend(energies)
if 'Occ. No.' in line:
for i in line.split()[2:]:
if float(i) != 0:
homos += 1
aonames = []
tokens = line.split()
if tokens and tokens[0] == '1':
while tokens and tokens[0] != '--':
aonames.append("{atom}_{orbital}".format(atom=tokens[1], orbital=tokens[2]))
info = tokens[3:]
j = 0
for i in orbital_index:
mocoeffs[int(i)-1].append(float(info[j]))
j += 1
line = next(inputfile)
tokens = line.split()
self.set_attribute('aonames', aonames)
if len(moenergies) != self.nmo:
moenergies.extend([numpy.nan for x in range(self.nmo - len(moenergies))])
self.append_attribute('moenergies', moenergies)
if not hasattr(self, 'homos'):
self.homos = []
self.homos.extend([homos-1])
while len(mocoeffs) < self.nmo:
nan_array = [numpy.nan for i in range(self.nbasis)]
mocoeffs.append(nan_array)
self.append_attribute('mocoeffs', mocoeffs)
## Parsing MP energy from the &MBPT2 module.
# Conventional algorithm used...
#
# SCF energy = -74.9644564043 a.u.
# Second-order correlation energy = -0.0364237923 a.u.
#
# Total energy = -75.0008801966 a.u.
# Reference weight ( Cref**2 ) = 0.98652
#
# :: Total MBPT2 energy -75.0008801966
#
#
# Zeroth-order energy (E0) = -36.8202538520 a.u.
#
# Shanks-type energy S1(E) = -75.0009150108 a.u.
if 'Total MBPT2 energy' in line:
mpenergies = []
mpenergies.append(utils.convertor(self.float(line.split()[4]), 'hartree', 'eV'))
if not hasattr(self, 'mpenergies'):
self.mpenergies = []
self.mpenergies.append(mpenergies)
# Parsing data ccenergies from &CCSDT module.
# --- Start Module: ccsdt at Thu Jul 26 14:03:23 2018 ---
#
# ()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()()
#
# &CCSDT
# ...
# ...
# 14 -75.01515915 -0.05070274 -0.00000029
# 15 -75.01515929 -0.05070289 -0.00000014
# 16 -75.01515936 -0.05070296 -0.00000007
# Convergence after 17 Iterations
#
#
# Total energy (diff) : -75.01515936 -0.00000007
# Correlation energy : -0.0507029554992
if 'Start Module: ccsdt' in line:
self.skip_lines(inputfile, ['b', '()', 'b'])
line = next(inputfile)
if '&CCSDT' in line:
while not line.strip().startswith('Total energy (diff)'):
line = next(inputfile)
ccenergies = utils.convertor(self.float(line.split()[4]), 'hartree', 'eV')
if not hasattr(self, 'ccenergies'):
self.ccenergies = []
self.ccenergies.append(ccenergies)
# ++ Primitive basis info:
# ---------------------
#
#
# *****************************************************
# ******** Primitive Basis Functions (Valence) ********
# *****************************************************
#
#
# Basis set:C.AUG-CC-PVQZ.........
#
# Type
# s
# No. Exponent Contraction Coefficients
# 1 0.339800000D+05 0.000091 -0.000019 0.000000 0.000000 0.000000 0.000000
# 2 0.508900000D+04 0.000704 -0.000151 0.000000 0.000000 0.000000 0.000000
# ...
# ...
# 29 0.424000000D+00 0.000000 1.000000
#
# Number of primitives 93
# Number of basis functions 80
#
# --
if line.startswith('++ Primitive basis info:'):
self.skip_lines(inputfile, ['d', 'b', 'b', 's', 'header', 's', 'b'])
line = next(inputfile)
gbasis_array = []
while '--' not in line and '****' not in line:
if 'Basis set:' in line:
basis_element_patterns = re.findall('Basis set:([A-Za-z]{1,2})\.', line)
assert len(basis_element_patterns) == 1
basis_element = basis_element_patterns[0].title()
gbasis_array.append((basis_element, []))
if 'Type' in line:
line = next(inputfile)
shell_type = line.split()[0].upper()
self.skip_line(inputfile, 'headers')
line = next(inputfile)
exponents = []
coefficients = []
func_array = []
while line.split():
exponents.append(self.float(line.split()[1]))
coefficients.append([self.float(i) for i in line.split()[2:]])
line = next(inputfile)
for i in range(len(coefficients[0])):
func_tuple = (shell_type, [])
for iexp, exp in enumerate(exponents):
coeff = coefficients[iexp][i]
if coeff != 0:
func_tuple[1].append((exp, coeff))
gbasis_array[-1][1].append(func_tuple)
line = next(inputfile)
atomsymbols = [self.table.element[atomno] for atomno in self.atomnos]
self.gbasis = [[] for i in range(self.natom)]
for element, gbasis in gbasis_array:
mask = [element == possible_element for possible_element in atomsymbols]
indices = [i for (i, x) in enumerate(mask) if x]
for index in indices:
self.gbasis[index] = gbasis
# ++ Basis set information:
# ----------------------
# ...
# Basis set label: MO.ECP.HAY-WADT.5S6P4D.3S3P2D.14E-LANL2DZ.....
#
# Electronic valence basis set:
# ------------------
# Associated Effective Charge 14.000000 au
# Associated Actual Charge 42.000000 au
# Nuclear Model: Point charge
# ...
#
# Effective Core Potential specification:
# =======================================
#
# Label Cartesian Coordinates / Bohr
#
# MO 0.0006141610 -0.0006141610 0.0979067106
# --
if '++ Basis set information:' in line:
self.core_array = []
basis_element = None
ncore = 0
while line[:2] != '--':
if 'Basis set label' in line:
try:
basis_element = line.split()[3].split('.')[0]
basis_element = basis_element[0] + basis_element[1:].lower()
except:
self.logger.warning('Basis set label is missing!')
basis_element = ''
if 'valence basis set:' in line.lower():
self.skip_line(inputfile, 'd')
line = next(inputfile)
if 'Associated Effective Charge' in line:
effective_charge = float(line.split()[3])
actual_charge = float(next(inputfile).split()[3])
element = self.table.element[int(actual_charge)]
ncore = int(actual_charge - effective_charge)
if basis_element:
assert basis_element == element
else:
basis_element = element
if basis_element and ncore:
self.core_array.append((basis_element, ncore))
basis_element = ''
ncore = 0
line = next(inputfile)
cclib-1.6.2/cclib/parser/molproparser.py 0000664 0000000 0000000 00000122013 13535330462 0020246 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2019, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Parser for Molpro output files"""
import itertools
import numpy
from cclib.parser import logfileparser
from cclib.parser import utils
def create_atomic_orbital_names(orbitals):
"""Generate all atomic orbital names that could be used by Molpro.
The names are returned in a dictionary, organized by subshell (S, P, D and so on).
"""
# We can write out the first two manually, since there are not that many.
atomic_orbital_names = {
'S': ['s', '1s'],
'P': ['x', 'y', 'z', '2px', '2py', '2pz'],
}
# Although we could write out all names for the other subshells, it is better
# to generate them if we need to expand further, since the number of functions quickly
# grows and there are both Cartesian and spherical variants to consider.
# For D orbitals, the Cartesian functions are xx, yy, zz, xy, xz and yz, and the
# spherical ones are called 3d0, 3d1-, 3d1+, 3d2- and 3d2+. For F orbitals, the Cartesians
# are xxx, xxy, xxz, xyy, ... and the sphericals are 4f0, 4f1-, 4f1+ and so on.
for i, orb in enumerate(orbitals):
# Cartesian can be generated directly by combinations.
cartesian = list(map(''.join, list(itertools.combinations_with_replacement(['x', 'y', 'z'], i+2))))
# For spherical functions, we need to construct the names.
pre = str(i+3) + orb.lower()
spherical = [pre + '0'] + [pre + str(j) + s for j in range(1, i+3) for s in ['-', '+']]
atomic_orbital_names[orb] = cartesian + spherical
return atomic_orbital_names
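# Illustration (a sketch) of the names generated for the D subshell:
#
#     create_atomic_orbital_names(['D'])['D']
#     # -> ['xx', 'xy', 'xz', 'yy', 'yz', 'zz',
#     #     '3d0', '3d1-', '3d1+', '3d2-', '3d2+']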
class Molpro(logfileparser.Logfile):
"""Molpro file parser"""
atomic_orbital_names = create_atomic_orbital_names(['D', 'F', 'G'])
def __init__(self, *args, **kwargs):
# Call the __init__ method of the superclass
super(Molpro, self).__init__(logname="Molpro", *args, **kwargs)
def __str__(self):
"""Return a string representation of the object."""
return "Molpro log file %s" % (self.filename)
def __repr__(self):
"""Return a representation of the object."""
return 'Molpro("%s")' % (self.filename)
def normalisesym(self, label):
"""Normalise the symmetries used by Molpro."""
ans = label.replace("`", "'").replace("``", "''")
return ans
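# Sketch of the intended mapping (illustrative examples only):
#
#     self.normalisesym("A`")    # -> "A'"
#     self.normalisesym("A``")   # -> "A''"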
def before_parsing(self):
self.electronorbitals = ""
self.insidescf = False
def after_parsing(self):
# If optimization thresholds are default, they are normally not printed and we need
# to set them to the default after parsing. Make sure to set them in the same order that
# they appear in the geometry optimization progress printed in the output,
# namely: energy difference, maximum gradient, maximum step.
if not hasattr(self, "geotargets"):
self.geotargets = []
# Default THRENERG (required accuracy of the optimized energy).
self.geotargets.append(1E-6)
# Default THRGRAD (required accuracy of the optimized gradient).
self.geotargets.append(3E-4)
# Default THRSTEP (convergence threshold for the geometry optimization step).
self.geotargets.append(3E-4)
def _parse_orbitals(self, inputfile, line):
# From this block aonames, atombasis, moenergies and mocoeffs can be parsed. The data is
# flipped compared to most programs (GAMESS, Gaussian), since the MOs are in rows. Also, Molpro
# does not cut the table into parts, rather each MO row has as many lines as it takes to print
# all of the MO coefficients. Each row normally has 10 coefficients, although this can be less
# for the last row and when symmetry is used (each irrep has its own block).
#
# ELECTRON ORBITALS
# =================
#
#
# Orb Occ Energy Couls-En Coefficients
#
# 1 1s 1 1s 1 2px 1 2py 1 2pz 2 1s (...)
# 3 1s 3 1s 3 2px 3 2py 3 2pz 4 1s (...)
# (...)
#
# 1.1 2 -11.0351 -43.4915 0.701460 0.025696 -0.000365 -0.000006 0.000000 0.006922 (...)
# -0.006450 0.004742 -0.001028 -0.002955 0.000000 -0.701460 (...)
# (...)
#
# If an MCSCF calculation was performed, the natural orbitals
# (coefficients and occupation numbers) are printed in a
# format nearly identical to the ELECTRON ORBITALS section.
#
# NATURAL ORBITALS (state averaged)
# =================================
#
# Orb Occ Energy Coefficients
#
# 1 s 1 s 1 s 1 z 1 z 1 xx 1 yy 1 zz 2 s 2 s
# 2 s 2 z 2 z 2 xx 2 yy 2 zz 3 s 3 s 3 z 3 y
#
# 1.1 2.00000 -20.678730 0.000141 -0.000057 0.001631 -0.001377 0.001117 0.000029 0.000293 -0.000852 1.000748 0.001746
# -0.002552 -0.002005 0.001658 -0.001266 -0.001274 -0.001001 0.000215 -0.000131 -0.000242 -0.000126
#
# 2.1 2.00000 -11.322823 1.000682 0.004626 -0.000485 0.006634 -0.002096 -0.003072 -0.003282 -0.001724 -0.000181 0.006734
# -0.002398 -0.000527 0.001335 0.000091 0.000058 0.000396 -0.003219 0.000981 0.000250 -0.000191
# (...)
# The assignment of final cclib attributes is different for
# canonical/natural orbitals.
self.naturalorbitals = (line[1:17] == "NATURAL ORBITALS")
# Make sure we didn't get here by mistake.
assert line[1:18] == "ELECTRON ORBITALS" or self.electronorbitals or self.naturalorbitals
# For unrestricted calculations, ELECTRON ORBITALS is followed on the same line
# by FOR POSITIVE SPIN or FOR NEGATIVE SPIN as appropriate.
spin = (line[19:36] == "FOR NEGATIVE SPIN") or (self.electronorbitals[19:36] == "FOR NEGATIVE SPIN")
if self.naturalorbitals:
self.skip_lines(inputfile, ['equals', 'b', 'headers', 'b'])
else:
if not self.electronorbitals:
self.skip_line(inputfile, 'equals')
self.skip_lines(inputfile, ['b', 'b', 'headers', 'b'])
aonames = []
atombasis = [[] for i in range(self.natom)]
moenergies = []
# Use for both canonical and natural orbital coefficients.
mocoeffs = []
occnos = []
line = next(inputfile)
# Besides a double blank line, stop when the next orbitals are encountered for unrestricted jobs
# or if there are stars on the line which always signifies the end of the block.
while line.strip() and (not "ORBITALS" in line) and (not set(line.strip()) == {'*'}):
# The function names are normally printed just once, but if symmetry is used then each irrep
# has its own mocoeff block with a preceding list of names.
is_aonames = line[:25].strip() == ""
if is_aonames:
# We need to save this offset for parsing the coefficients later.
offset = len(aonames)
aonum = len(aonames)
while line.strip():
for s in line.split():
if s.isdigit():
atomno = int(s)
atombasis[atomno-1].append(aonum)
aonum += 1
else:
functype = s
element = self.table.element[self.atomnos[atomno-1]]
aoname = "%s%i_%s" % (element, atomno, functype)
aonames.append(aoname)
line = next(inputfile)
# Now there can be one or two blank lines.
while not line.strip():
line = next(inputfile)
# Newer versions of Molpro (for example, 2012 test files) will print some
# more things here, such as HOMO and LUMO, but these have less than 10 columns.
if "HOMO" in line or "LUMO" in line:
break
# End of the NATURAL ORBITALS section.
if "Natural orbital dump" in line:
break
# Now parse the MO coefficients, padding the list with an appropriate amount of zeros.
coeffs = [0.0 for i in range(offset)]
while line.strip() != "":
if line[:31].rstrip():
tokens = line.split()
moenergy = float(tokens[2])
moenergy = utils.convertor(moenergy, "hartree", "eV")
moenergies.append(moenergy)
if self.naturalorbitals:
occno = float(tokens[1])
occnos.append(occno)
# Coefficients are in 10.6f format and splitting does not work since there are not
# always spaces between them. If the numbers are very large, there will be stars.
str_coeffs = line[31:]
ncoeffs = len(str_coeffs) // 10
coeff = []
for ic in range(ncoeffs):
p = str_coeffs[ic*10:(ic+1)*10]
try:
c = float(p)
except ValueError as detail:
self.logger.warn("setting coeff element to zero: %s" % detail)
c = 0.0
coeff.append(c)
coeffs.extend(coeff)
line = next(inputfile)
mocoeffs.append(coeffs)
# The loop should keep going until there is a double blank line, and there is
# a single line between each coefficient block.
line = next(inputfile)
if not line.strip():
line = next(inputfile)
# If symmetry was used (offset was needed) then we will need to pad all MO vectors
# up to nbasis for all irreps before the last one.
if offset > 0:
for im, m in enumerate(mocoeffs):
if len(m) < self.nbasis:
mocoeffs[im] = m + [0.0 for i in range(self.nbasis - len(m))]
self.set_attribute('atombasis', atombasis)
self.set_attribute('aonames', aonames)
if self.naturalorbitals:
# Consistent with current cclib conventions, keep only the
# last possible set of natural orbital coefficients and
# occupation numbers.
self.nocoeffs = mocoeffs
self.nooccnos = occnos
else:
# Consistent with current cclib conventions, reset moenergies/mocoeffs if they have been
# previously parsed, since we want to produce only the final values.
if not hasattr(self, "moenergies") or spin == 0:
self.mocoeffs = []
self.moenergies = []
self.moenergies.append(moenergies)
self.mocoeffs.append(mocoeffs)
# Check if last line begins the next ELECTRON ORBITALS section, because we already used
# this line and need to know when this method is called next time.
if line[1:18] == "ELECTRON ORBITALS":
self.electronorbitals = line
else:
self.electronorbitals = ""
return
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
# Extract the package version number.
if "Version" in line:
self.metadata["package_version"] = line.split()[1]
if line[1:19] == "ATOMIC COORDINATES":
if not hasattr(self, "atomcoords"):
self.atomcoords = []
atomcoords = []
atomnos = []
self.skip_lines(inputfile, ['line', 'line', 'line'])
line = next(inputfile)
while line.strip():
temp = line.strip().split()
atomcoords.append([utils.convertor(float(x), "bohr", "Angstrom") for x in temp[3:6]]) # bohrs to angs
atomnos.append(int(round(float(temp[2]))))
line = next(inputfile)
self.atomcoords.append(atomcoords)
self.set_attribute('atomnos', atomnos)
self.set_attribute('natom', len(self.atomnos))
# Use BASIS DATA to parse input for gbasis, aonames and atombasis. If symmetry is used,
# the function number starts from 1 for each irrep (the irrep index comes after the dot).
#
# BASIS DATA
#
# Nr Sym Nuc Type Exponents Contraction coefficients
#
# 1.1 A 1 1s 71.616837 0.154329
# 13.045096 0.535328
# 3.530512 0.444635
# 2.1 A 1 1s 2.941249 -0.099967
# 0.683483 0.399513
# ...
#
if line[1:11] == "BASIS DATA":
# We can do a sanity check with the header.
self.skip_line(inputfile, 'blank')
header = next(inputfile)
assert header.split() == ["Nr", "Sym", "Nuc", "Type", "Exponents", "Contraction", "coefficients"]
self.skip_line(inputfile, 'blank')
aonames = []
atombasis = [[] for i in range(self.natom)]
gbasis = [[] for i in range(self.natom)]
while line.strip():
# We need to read the line at the start of the loop here, because the last function
# will be added when a blank line signalling the end of the block is encountered.
line = next(inputfile)
# The formatting here can exhibit subtle differences, including the number of spaces
# or indentation size. However, we will rely on explicit slices since not all components
# are always available. In fact, components not being there has some meaning (see below).
line_nr = line[1:6].strip()
line_sym = line[7:9].strip()
line_nuc = line[11:15].strip()
line_type = line[16:22].strip()
line_exp = line[25:38].strip()
line_coeffs = line[38:].strip()
# If a new function type is printed or the BASIS DATA block ends with a blank line,
# then add the previous function to gbasis, except for the first function since
# there was no preceding one. When translating the Molpro function name to gbasis,
# note that Molpro prints all components, but we want it only once, with the proper
# shell type (S,P,D,F,G). Molpro names also differ between Cartesian/spherical representations.
if (line_type and aonames) or line.strip() == "":
# All the possible AO names are created with the class. The function should always
# find a match in that dictionary, so we can check for that here and will need to
# update the dict if something unexpected comes up.
funcbasis = None
for fb, names in self.atomic_orbital_names.items():
if functype in names:
funcbasis = fb
assert funcbasis
# There is a separate basis function for each column of contraction coefficients. Since all
# atomic orbitals for a subshell will have the same parameters, we can simply check if
# the function tuple is already in gbasis[i] before adding it.
for i in range(len(coefficients[0])):
func = (funcbasis, [])
for j in range(len(exponents)):
func[1].append((exponents[j], coefficients[j][i]))
if func not in gbasis[funcatom-1]:
gbasis[funcatom-1].append(func)
# If it is a new type, set up the variables for the next shell(s). An exception is symmetry functions,
# which we want to copy from the previous function and don't have a new number on the line. For them,
# we just want to update the nuclear index.
if line_type:
if line_nr:
exponents = []
coefficients = []
functype = line_type
funcatom = int(line_nuc)
# Add any exponents and coefficients to lists.
if line_exp and line_coeffs:
funcexp = float(line_exp)
funccoeffs = [float(s) for s in line_coeffs.split()]
exponents.append(funcexp)
coefficients.append(funccoeffs)
# If the function number is present then add to atombasis and aonames, which is different from
# adding to gbasis since it enumerates AOs rather than basis functions. The number counts functions
# in each irrep from 1 and we could add up the functions for each irrep to get the global count,
# but it is simpler to just see how many aonames we have already parsed. Any symmetry functions
# are also printed, but they don't get numbers, so they are not parsed.
if line_nr:
element = self.table.element[self.atomnos[funcatom-1]]
aoname = "%s%i_%s" % (element, funcatom, functype)
aonames.append(aoname)
funcnr = len(aonames)
atombasis[funcatom-1].append(funcnr-1)
self.set_attribute('aonames', aonames)
self.set_attribute('atombasis', atombasis)
self.set_attribute('gbasis', gbasis)
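# As a rough illustration (not taken from any particular logfile), a hydrogen atom
# described by an STO-3G contraction would contribute a gbasis entry such as
#   [('S', [(3.425251, 0.154329), (0.623914, 0.535328), (0.168855, 0.444635)])],
# i.e. a list of (shell type, [(exponent, coefficient), ...]) tuples per atom.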
if line[1:23] == "NUMBER OF CONTRACTIONS":
nbasis = int(line.split()[3])
self.set_attribute('nbasis', nbasis)
# Basis set name
if line[1:8] == "Library":
self.metadata["basis_set"] = line.split()[4]
# This is used to signal whether we are inside an SCF calculation.
if line[1:8] == "PROGRAM" and line[14:18] == "-SCF":
self.insidescf = True
self.metadata["methods"].append("HF")
# Use this information instead of 'SETTING ...', in case the defaults are standard.
# Note that this is sometimes printed in each geometry optimization step.
if line[1:20] == "NUMBER OF ELECTRONS":
spinup = int(line.split()[3][:-1])
spindown = int(line.split()[4][:-1])
# Nuclear charges (atomnos) should be parsed by now.
nuclear = numpy.sum(self.atomnos)
charge = nuclear - spinup - spindown
self.set_attribute('charge', charge)
mult = spinup - spindown + 1
self.set_attribute('mult', mult)
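# For example (values chosen for illustration), a line reading
#   NUMBER OF ELECTRONS: 5+ 5-
# for a molecule with a total nuclear charge of 10 gives charge = 10 - 5 - 5 = 0
# and mult = 5 - 5 + 1 = 1 (a neutral singlet).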
# Convergence thresholds for the SCF cycle should be contained in a line such as:
# CONVERGENCE THRESHOLDS: 1.00E-05 (Density) 1.40E-07 (Energy)
if self.insidescf and line[1:24] == "CONVERGENCE THRESHOLDS:":
if not hasattr(self, "scftargets"):
self.scftargets = []
scftargets = list(map(float, line.split()[2::2]))
self.scftargets.append(scftargets)
# Usually there are two criteria, but save the names just in case.
self.scftargetnames = line.split()[3::2]
# Read in the print out of the SCF cycle - for scfvalues. For RHF looks like:
# ITERATION DDIFF GRAD ENERGY 2-EL.EN. DIPOLE MOMENTS DIIS
# 1 0.000D+00 0.000D+00 -379.71523700 1159.621171 0.000000 0.000000 0.000000 0
# 2 0.000D+00 0.898D-02 -379.74469736 1162.389787 0.000000 0.000000 0.000000 1
# 3 0.817D-02 0.144D-02 -379.74635529 1162.041033 0.000000 0.000000 0.000000 2
# 4 0.213D-02 0.571D-03 -379.74658063 1162.159929 0.000000 0.000000 0.000000 3
# 5 0.799D-03 0.166D-03 -379.74660889 1162.144256 0.000000 0.000000 0.000000 4
if self.insidescf and line[1:10] == "ITERATION":
if not hasattr(self, "scfvalues"):
self.scfvalues = []
line = next(inputfile)
energy = 0.0
scfvalues = []
while line.strip() != "":
chomp = line.split()
if chomp[0].isdigit():
ddiff = float(chomp[1].replace('D', 'E'))
grad = float(chomp[2].replace('D', 'E'))
newenergy = float(chomp[3])
ediff = newenergy - energy
energy = newenergy
# The convergence thresholds must have been read above.
# Presently, we recognize MAX DENSITY and MAX ENERGY thresholds.
numtargets = len(self.scftargetnames)
values = [numpy.nan]*numtargets
for n, name in zip(list(range(numtargets)), self.scftargetnames):
if "ENERGY" in name.upper():
values[n] = ediff
elif "DENSITY" in name.upper():
values[n] = ddiff
scfvalues.append(values)
try:
line = next(inputfile)
except StopIteration:
self.logger.warning('File terminated before end of last SCF! Last gradient: {}'.format(grad))
break
self.scfvalues.append(numpy.array(scfvalues))
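# Each entry of self.scfvalues is an array with one row per SCF iteration and one
# column per convergence criterion, in the same order as self.scftargetnames above.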
# SCF result - RHF/UHF and DFT (RKS) energies.
if (line[1:5] in ["!RHF", "!UHF", "!RKS"] and line[16:22].lower() == "energy"):
if not hasattr(self, "scfenergies"):
self.scfenergies = []
scfenergy = float(line.split()[4])
self.scfenergies.append(utils.convertor(scfenergy, "hartree", "eV"))
# We are now done with SCF cycle (after a few lines).
self.insidescf = False
# MP2 energies.
if line[1:5] == "!MP2":
self.metadata["methods"].append("MP2")
if not hasattr(self, 'mpenergies'):
self.mpenergies = []
mp2energy = float(line.split()[-1])
mp2energy = utils.convertor(mp2energy, "hartree", "eV")
self.mpenergies.append([mp2energy])
# MP2 energies if MP3 or MP4 is also calculated.
if line[1:5] == "MP2:":
self.metadata["methods"].append("MP2")
if not hasattr(self, 'mpenergies'):
self.mpenergies = []
mp2energy = float(line.split()[2])
mp2energy = utils.convertor(mp2energy, "hartree", "eV")
self.mpenergies.append([mp2energy])
# MP3 (D) and MP4 (DQ or SDQ) energies.
if line[1:8] == "MP3(D):":
self.metadata["methods"].append("MP3")
mp3energy = float(line.split()[2])
mp3energy = utils.convertor(mp3energy, "hartree", "eV")
line = next(inputfile)
self.mpenergies[-1].append(mp3energy)
if line[1:9] == "MP4(DQ):":
self.metadata["methods"].append("MP4")
mp4energy = float(line.split()[2])
line = next(inputfile)
if line[1:10] == "MP4(SDQ):":
self.metadata["methods"].append("MP4")
mp4energy = float(line.split()[2])
mp4energy = utils.convertor(mp4energy, "hartree", "eV")
self.mpenergies[-1].append(mp4energy)
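# Note that self.mpenergies ends up as a list with one inner list per MP2 calculation,
# holding the MP2 energy followed by any MP3/MP4 energies (all converted to eV).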
# The CCSD program handles all closed-shell coupled cluster runs.
if line[1:15] == "PROGRAM * CCSD":
self.metadata["methods"].append("CCSD")
if not hasattr(self, "ccenergies"):
self.ccenergies = []
while line[1:20] != "Program statistics:":
# The last energy (most exact) will be read last and thus saved.
if line[1:5] == "!CCD" or line[1:6] == "!CCSD" or line[1:9] == "!CCSD(T)":
ccenergy = float(line.split()[-1])
ccenergy = utils.convertor(ccenergy, "hartree", "eV")
line = next(inputfile)
self.ccenergies.append(ccenergy)
# Read the occupancy (index of the HOMOs).
# For restricted calculations, there is one line here. For unrestricted, two:
# Final alpha occupancy: ...
# Final beta occupancy: ...
if line[1:17] == "Final occupancy:":
self.homos = [int(line.split()[-1])-1]
if line[1:23] == "Final alpha occupancy:":
self.homos = [int(line.split()[-1])-1]
line = next(inputfile)
self.homos.append(int(line.split()[-1])-1)
# Dipole is always printed on one line after the final RHF energy, and by default
# it seems Molpro uses the origin as the reference point.
if line.strip()[:13] == "Dipole moment":
assert line.split()[2] == "/Debye"
reference = [0.0, 0.0, 0.0]
dipole = [float(d) for d in line.split()[-3:]]
if not hasattr(self, 'moments'):
self.moments = [reference, dipole]
else:
self.moments[1] = dipole
# Static dipole polarizability.
if line.strip() == "SCF dipole polarizabilities":
if not hasattr(self, "polarizabilities"):
self.polarizabilities = []
polarizability = []
self.skip_lines(inputfile, ['b', 'directions'])
for _ in range(3):
line = next(inputfile)
polarizability.append(line.split()[1:])
self.polarizabilities.append(numpy.array(polarizability))
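# Each parsed block should contribute one 3x3 block of values (as printed by Molpro)
# to self.polarizabilities.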
# Check for ELECTRON ORBITALS (canonical molecular orbitals).
if line[1:18] == "ELECTRON ORBITALS" or self.electronorbitals:
self._parse_orbitals(inputfile, line)
# If the MATROP program was called appropriately,
# the atomic orbital overlap matrix S is printed.
# The matrix is printed straight-out, ten elements in each row, both halves.
# Note that if the entire matrix is not printed, then aooverlaps
# will not have dimensions nbasis x nbasis.
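# Each completed row appended to aooverlaps has length nbasis, so when the full
# matrix is printed the result is the square nbasis x nbasis overlap matrix.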
if line[1:9] == "MATRIX S":
if not hasattr(self, "aooverlaps"):
self.aooverlaps = [[]]
self.skip_lines(inputfile, ['b', 'symblocklabel'])
line = next(inputfile)
while line.strip() != "":
elements = [float(s) for s in line.split()]
if len(self.aooverlaps[-1]) + len(elements) <= self.nbasis:
self.aooverlaps[-1] += elements
else:
n = len(self.aooverlaps[-1]) + len(elements) - self.nbasis
self.aooverlaps[-1] += elements[:-n]
self.aooverlaps.append([])
self.aooverlaps[-1] += elements[-n:]
line = next(inputfile)
# Check for MCSCF natural orbitals.
if line[1:17] == "NATURAL ORBITALS":
self._parse_orbitals(inputfile, line)
# Thresholds are printed only if the defaults are changed with GTHRESH.
# In that case, we can fill geotargets with non-default values.
# The block should look like this as of Molpro 2006.1:
# THRESHOLDS:
# ZERO = 1.00D-12 ONEINT = 1.00D-12 TWOINT = 1.00D-11 PREFAC = 1.00D-14 LOCALI = 1.00D-09 EORDER = 1.00D-04
# ENERGY = 0.00D+00 ETEST = 0.00D+00 EDENS = 0.00D+00 THRDEDEF= 1.00D-06 GRADIENT= 1.00D-02 STEP = 1.00D-03
# ORBITAL = 1.00D-05 CIVEC = 1.00D-05 COEFF = 1.00D-04 PRINTCI = 5.00D-02 PUNCHCI = 9.90D+01 OPTGRAD = 3.00D-04
# OPTENERG= 1.00D-06 OPTSTEP = 3.00D-04 THRGRAD = 2.00D-04 COMPRESS= 1.00D-11 VARMIN = 1.00D-07 VARMAX = 1.00D-03
# THRDOUB = 0.00D+00 THRDIV = 1.00D-05 THRRED = 1.00D-07 THRPSP = 1.00D+00 THRDC = 1.00D-10 THRCS = 1.00D-10
# THRNRM = 1.00D-08 THREQ = 0.00D+00 THRDE = 1.00D+00 THRREF = 1.00D-05 SPARFAC = 1.00D+00 THRDLP = 1.00D-07
# THRDIA = 1.00D-10 THRDLS = 1.00D-07 THRGPS = 0.00D+00 THRKEX = 0.00D+00 THRDIS = 2.00D-01 THRVAR = 1.00D-10
# THRLOC = 1.00D-06 THRGAP = 1.00D-06 THRLOCT = -1.00D+00 THRGAPT = -1.00D+00 THRORB = 1.00D-06 THRMLTP = 0.00D+00
# THRCPQCI= 1.00D-10 KEXTA = 0.00D+00 THRCOARS= 0.00D+00 SYMTOL = 1.00D-06 GRADTOL = 1.00D-06 THROVL = 1.00D-08
# THRORTH = 1.00D-08 GRID = 1.00D-06 GRIDMAX = 1.00D-03 DTMAX = 0.00D+00
if line[1:12] == "THRESHOLDS":
self.skip_line(inputfile, 'blank')
line = next(inputfile)
while line.strip():
if "OPTENERG" in line:
start = line.find("OPTENERG")
optenerg = line[start+10:start+20]
if "OPTGRAD" in line:
start = line.find("OPTGRAD")
optgrad = line[start+10:start+20]
if "OPTSTEP" in line:
start = line.find("OPTSTEP")
optstep = line[start+10:start+20]
line = next(inputfile)
# Convert the Fortran D-notation strings into floats before storing them.
self.geotargets = [float(x.replace('D', 'E')) for x in (optenerg, optgrad, optstep)]
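# The order (energy, gradient, step) matches the columns stored in geovalues below.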
# The optimization history is the source for geovalues:
#
# END OF GEOMETRY OPTIMIZATION. TOTAL CPU: 246.9 SEC
#
# ITER. ENERGY(OLD) ENERGY(NEW) DE GRADMAX GRADNORM GRADRMS STEPMAX STEPLEN STEPRMS
# 1 -382.02936898 -382.04914450 -0.01977552 0.11354875 0.20127947 0.01183997 0.12972761 0.20171740 0.01186573
# 2 -382.04914450 -382.05059234 -0.00144784 0.03299860 0.03963339 0.00233138 0.05577169 0.06687650 0.00393391
# 3 -382.05059234 -382.05069136 -0.00009902 0.00694359 0.01069889 0.00062935 0.01654549 0.02016307 0.00118606
# ...
#
# The above is an excerpt from Molpro 2006, but it is a little bit different
# for Molpro 2012, namely the 'END OF GEOMETRY OPTIMIZATION' line occurs after the
# actual history list. It seems there is another consistent line before the
# history, but this might not always be true -- so this is a potential weak link.
if line[1:30] == "END OF GEOMETRY OPTIMIZATION." or line.strip() == "Quadratic Steepest Descent - Minimum Search":
# I think this is the trigger for convergence, and it shows up at the top in Molpro 2006.
geometry_converged = line[1:30] == "END OF GEOMETRY OPTIMIZATION."
self.skip_line(inputfile, 'blank')
# Newer versions of Molpro (at least 2012) print an additional column
# with the timing information for each step. Otherwise, the history looks the same.
headers = next(inputfile).split()
if not len(headers) in (10, 11):
return
# Although criteria can be changed, the printed format should not change.
# In case it does, retrieve the columns for each parameter.
index_ITER = headers.index('ITER.')
index_THRENERG = headers.index('DE')
index_THRGRAD = headers.index('GRADMAX')
index_THRSTEP = headers.index('STEPMAX')
line = next(inputfile)
self.geovalues = []
while line.strip():
line = line.split()
istep = int(line[index_ITER])
geovalues = []
geovalues.append(float(line[index_THRENERG]))
geovalues.append(float(line[index_THRGRAD]))
geovalues.append(float(line[index_THRSTEP]))
self.geovalues.append(geovalues)
line = next(inputfile)
if line.strip() == "Freezing grid":
line = next(inputfile)
# The convergence trigger shows up somewhere at the bottom in Molpro 2012,
# before the final stars. If convergence is not reached, there is an additional
# line that can be checked for. This is a little tricky, though, since it is
# not the last line... so bail out of the loop if convergence failure is detected.
while "*****" not in line:
line = next(inputfile)
if line.strip() == "END OF GEOMETRY OPTIMIZATION.":
geometry_converged = True
if "No convergence" in line:
geometry_converged = False
break
# Finally, deal with optdone, append the last step to it only if we had convergence.
if not hasattr(self, 'optdone'):
self.optdone = []
if geometry_converged:
self.optdone.append(istep-1)
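# Molpro's ITER column is 1-based, so for a single optimization istep-1 should be
# the 0-based index of the final step in self.geovalues.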
# This block should look like this:
# Normal Modes
#
# 1 Au 2 Bu 3 Ag 4 Bg 5 Ag
# Wavenumbers [cm-1] 151.81 190.88 271.17 299.59 407.86
# Intensities [km/mol] 0.33 0.28 0.00 0.00 0.00
# Intensities [relative] 0.34 0.28 0.00 0.00 0.00
# CX1 0.00000 -0.01009 0.02577 0.00000 0.06008
# CY1 0.00000 -0.05723 -0.06696 0.00000 0.06349
# CZ1 -0.02021 0.00000 0.00000 0.11848 0.00000
# CX2 0.00000 -0.01344 0.05582 0.00000 -0.02513
# CY2 0.00000 -0.06288 -0.03618 0.00000 0.00349
# CZ2 -0.05565 0.00000 0.00000 0.07815 0.00000
# ...
# Molpro prints low frequency modes in a subsequent section with the same format,
# which also contains zero frequency modes, with the title:
# Normal Modes of low/zero frequencies
if line[1:13] == "Normal Modes":
islow = (line[1:37] == "Normal Modes of low/zero frequencies")
self.skip_line(inputfile, 'blank')
# Each portion of five modes is followed by a single blank line.
# The whole block is followed by an additional blank line.
line = next(inputfile)
while line.strip():
if line[1:25].isspace():
if not islow: # vibsyms not printed for low freq modes
numbers = list(map(int, line.split()[::2]))
vibsyms = line.split()[1::2]
else:
# give low freq modes an empty str as vibsym
# note there could be other possibilities..
numbers = list(map(int, line.split()))
vibsyms = ['']*len(numbers)
if line[1:12] == "Wavenumbers":
vibfreqs = list(map(float, line.strip().split()[2:]))
if line[1:21] == "Intensities [km/mol]":
vibirs = list(map(float, line.strip().split()[2:]))
# There should always be 3*natom displacement rows.
if line[1:11].isspace() and line[13:25].strip().isdigit():
# There are a maximum of 5 modes per line.
nmodes = len(line.split())-1
vibdisps = []
for i in range(nmodes):
vibdisps.append([])
for n in range(self.natom):
vibdisps[i].append([])
for i in range(nmodes):
disp = float(line.split()[i+1])
vibdisps[i][0].append(disp)
for i in range(self.natom*3 - 1):
line = next(inputfile)
iatom = (i+1)//3
for i in range(nmodes):
disp = float(line.split()[i+1])
vibdisps[i][iatom].append(disp)
line = next(inputfile)
if not line.strip():
if not hasattr(self, "vibfreqs"):
self.vibfreqs = []
if not hasattr(self, "vibsyms"):
self.vibsyms = []
if not hasattr(self, "vibirs") and "vibirs" in dir():
self.vibirs = []
if not hasattr(self, "vibdisps") and "vibdisps" in dir():
self.vibdisps = []
if not islow:
self.vibfreqs.extend(vibfreqs)
self.vibsyms.extend(vibsyms)
if "vibirs" in dir():
self.vibirs.extend(vibirs)
if "vibdisps" in dir():
self.vibdisps.extend(vibdisps)
else:
nonzero = [f > 0 for f in vibfreqs]
vibfreqs = [f for f in vibfreqs if f > 0]
self.vibfreqs = vibfreqs + self.vibfreqs
vibsyms = [vibsyms[i] for i in range(len(vibsyms)) if nonzero[i]]
self.vibsyms = vibsyms + self.vibsyms
if "vibirs" in dir():
vibirs = [vibirs[i] for i in range(len(vibirs)) if nonzero[i]]
self.vibirs = vibirs + self.vibirs
if "vibdisps" in dir():
vibdisps = [vibdisps[i] for i in range(len(vibdisps)) if nonzero[i]]
self.vibdisps = vibdisps + self.vibdisps
line = next(inputfile)
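# Molpro prints the force constant (Hessian) matrix as a lower triangle in blocks of
# up to five columns; the code below reassembles the rows before flattening them
# into the hessian attribute.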
if line[1:16] == "Force Constants":
hessian = []
line = next(inputfile)
hess = []
tmp = []
while line.strip():
try:
list(map(float, line.strip().split()[2:]))
except ValueError:
# A line of column labels starts each new block of the triangle; skip it.
line = next(inputfile)
hess.extend([list(map(float, line.strip().split()[1:]))])
line = next(inputfile)
lig = 0
while (lig == 0) or (len(hess[0]) > 1):
tmp.append(hess.pop(0))
lig += 1
k = 5
while len(hess) != 0:
tmp[k] += hess.pop(0)
k += 1
if (len(tmp[k-1]) == lig):
break
if k >= lig:
k = len(tmp[-1])
for l in tmp:
hessian += l
self.set_attribute("hessian", hessian)
if line[1:14] == "Atomic Masses" and hasattr(self, "hessian"):
line = next(inputfile)
self.amass = list(map(float, line.strip().split()[2:]))
while line.strip():
line = next(inputfile)
self.amass += list(map(float, line.strip().split()[2:]))
#1PROGRAM * POP (Mulliken population analysis)
#
#
# Density matrix read from record 2100.2 Type=RHF/CHARGE (state 1.1)
#
# Population analysis by basis function type
#
# Unique atom s p d f g Total Charge
# 2 C 3.11797 2.88497 0.00000 0.00000 0.00000 6.00294 - 0.00294
# 3 C 3.14091 2.91892 0.00000 0.00000 0.00000 6.05984 - 0.05984
# ...
if line.strip() == "1PROGRAM * POP (Mulliken population analysis)":
self.skip_lines(inputfile, ['b', 'b', 'density_source', 'b', 'func_type', 'b'])
header = next(inputfile)
icharge = header.split().index('Charge')
charges = []
line = next(inputfile)
while line.strip():
cols = line.split()
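# Molpro prints the sign and the value in separate columns (e.g. "- 0.00294"),
# so concatenate the two tokens before converting to float.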
charges.append(float(cols[icharge]+cols[icharge+1]))
line = next(inputfile)
if not hasattr(self, "atomcharges"):
self.atomcharges = {}
self.atomcharges['mulliken'] = charges
if 'GRADIENT FOR STATE' in line:
for _ in range(3):
next(inputfile)
grad = []
lines_read = 0
while lines_read < self.natom:
line = next(inputfile)
# Because Molpro inserts an empty line after every 50th atom.
if line.strip():
grad.append([float(x) for x in line.split()[1:]])
lines_read += 1
if not hasattr(self, 'grads'):
self.grads = []
self.grads.append(grad)
if line[:25] == ' Variable memory released':
self.metadata['success'] = True
cclib-1.6.2/cclib/parser/mopacparser.py 0000664 0000000 0000000 00000023401 13535330462 0020036 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Parser for MOPAC output files"""
# Based on parser in RMG-Py by Greg Magoon
# https://github.com/ReactionMechanismGenerator/RMG-Py/blob/master/external/cclib/parser/mopacparser.py
# Also parts from Ben Albrecht
# https://github.com/ben-albrecht/cclib/blob/master/cclib/parser/mopacparser.py
# Merged and modernized by Geoff Hutchison
from __future__ import print_function
import re
import math
import numpy
from cclib.parser import data
from cclib.parser import logfileparser
from cclib.parser import utils
def symbol2int(symbol):
t = utils.PeriodicTable()
return t.number[symbol]
class MOPAC(logfileparser.Logfile):
"""A MOPAC20XX output file."""
def __init__(self, *args, **kwargs):
# Call the __init__ method of the superclass
super(MOPAC, self).__init__(logname="MOPAC", *args, **kwargs)
def __str__(self):
"""Return a string representation of the object."""
return "MOPAC log file %s" % (self.filename)
def __repr__(self):
"""Return a representation of the object."""
return 'MOPAC("%s")' % (self.filename)
def normalisesym(self, label):
"""MOPAC does not require normalizing symmetry labels."""
return label
def before_parsing(self):
#TODO
# Defaults
charge = 0
self.set_attribute('charge', charge)
mult = 1
self.set_attribute('mult', mult)
# Keep track of whether or not we're performing an
# (un)restricted calculation.
self.unrestricted = False
self.is_rohf = False
# Keep track of 1SCF vs. gopt since gopt is default
self.onescf = False
self.geomdone = False
# Compile the dashes-and-or-spaces-only regex.
self.re_dashes_and_spaces = re.compile(r'^[\s-]+$')
self.star = ' * '
self.stars = ' *******************************************************************************'
self.spinstate = {'SINGLET': 1,
'DOUBLET': 2,
'TRIPLET': 3,
'QUARTET': 4,
'QUINTET': 5,
'SEXTET': 6,
'HEPTET': 7,
'OCTET': 8,
'NONET': 9}
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
if "(Version:" in line:
# Part of the full version can be extracted from here, but is
# missing information about the bitness.
package_version = line[line.find("MOPAC") + 5:line.find("(")]
package_version = package_version[:4]
if "BETA" in line:
package_version = package_version + " BETA"
self.metadata["package_version"] = package_version
# Don't store the full package version in the main field until we know
# what form it takes.
if "For non-commercial use only" in line:
tokens = line.split()
tokens = tokens[8:]
assert len(tokens) == 2
package_version_full = tokens[0]
if tokens[1] != "**":
package_version_full = '-'.join(tokens)[:-2]
# Extract the atomic numbers and coordinates from the optimized geometry
# note that cartesian coordinates section occurs multiple times in the file, and we want to end up using the last instance
# also, note that the section labeled cartesian coordinates doesn't have as many decimal places as the one used here
# Example 1 (not used):
# CARTESIAN COORDINATES
#
# NO. ATOM X Y Z
#
# 1 O 4.7928 -0.8461 0.3641
# 2 O 5.8977 -0.3171 0.0092
# ...
# Example 2 (used):
# ATOM CHEMICAL X Y Z
# NUMBER SYMBOL (ANGSTROMS) (ANGSTROMS) (ANGSTROMS)
#
# 1 O 4.79280259 * -0.84610232 * 0.36409474 *
# 2 O 5.89768035 * -0.31706418 * 0.00917035 *
# ... etc.
if line.split() == ["NUMBER", "SYMBOL", "(ANGSTROMS)", "(ANGSTROMS)", "(ANGSTROMS)"]:
self.updateprogress(inputfile, "Attributes", self.cupdate)
self.inputcoords = []
self.inputatoms = []
blankline = inputfile.next()
atomcoords = []
line = inputfile.next()
while len(line.split()) > 6:
# MOPAC Version 14.019L 64BITS sometimes follows this block immediately with a
# "CARTESIAN COORDINATES" block, with no blank line in between.
tokens = line.split()
self.inputatoms.append(symbol2int(tokens[1]))
xc = float(tokens[2])
yc = float(tokens[4])
zc = float(tokens[6])
atomcoords.append([xc, yc, zc])
line = inputfile.next()
self.inputcoords.append(atomcoords)
if not hasattr(self, "natom"):
self.atomnos = numpy.array(self.inputatoms, 'i')
self.natom = len(self.atomnos)
if 'CHARGE ON SYSTEM =' in line:
charge = int(line.split()[5])
self.set_attribute('charge', charge)
if 'SPIN STATE DEFINED' in line:
# find the multiplicity from the line token (SINGLET, DOUBLET, TRIPLET, etc)
mult = self.spinstate[line.split()[1]]
self.set_attribute('mult', mult)
# Read energy (in kcal/mol, converted to eV)
#
# FINAL HEAT OF FORMATION = -333.88606 KCAL = -1396.97927 KJ
if 'FINAL HEAT OF FORMATION =' in line:
if not hasattr(self, "scfenergies"):
self.scfenergies = []
self.scfenergies.append(utils.convertor(self.float(line.split()[5]), "kcal/mol", "eV"))
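# As a rough sanity check (approximate, for illustration only): the heat of formation
# in the example above, -333.88606 kcal/mol, corresponds to roughly -14.48 eV.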
# Molecular mass parsing (units will be amu)
#
# MOLECULAR WEIGHT == 130.1890
if line[0:35] == ' MOLECULAR WEIGHT =':
self.molmass = self.float(line.split()[3])
#rotational constants
#Example:
# ROTATIONAL CONSTANTS IN CM(-1)
#
# A = 0.01757641 B = 0.00739763 C = 0.00712013
# could also read in moment of inertia, but this should just differ by a constant: rot cons= h/(8*Pi^2*I)
# note that the last occurrence of this in the thermochemistry section has reduced precision,
# so we will want to use the 2nd to last instance
if line[0:40] == ' ROTATIONAL CONSTANTS IN CM(-1)':
blankline = inputfile.next()
rotinfo = inputfile.next()
if not hasattr(self, "rotcons"):
self.rotcons = []
broken = rotinfo.split()
# store the rotational constants as printed, i.e. in cm^-1
a = float(broken[2])
b = float(broken[5])
c = float(broken[8])
self.rotcons.append([a, b, c])
# Start of the IR/Raman frequency section.
# Example:
# VIBRATION 1 1A ATOM PAIR ENERGY CONTRIBUTION RADIAL
# FREQ. 15.08 C 12 -- C 16 +7.9% (999.0%) 0.0%
# T-DIPOLE 0.2028 C 16 -- H 34 +5.8% (999.0%) 28.0%
# TRAVEL 0.0240 C 16 -- H 32 +5.6% (999.0%) 35.0%
# RED. MASS 1.7712 O 1 -- O 4 +5.2% (999.0%) 0.4%
# EFF. MASS7752.8338
#
# VIBRATION 2 2A ATOM PAIR ENERGY CONTRIBUTION RADIAL
# FREQ. 42.22 C 11 -- C 15 +9.0% (985.8%) 0.0%
# T-DIPOLE 0.1675 C 15 -- H 31 +6.6% (843.6%) 3.3%
# TRAVEL 0.0359 C 15 -- H 29 +6.0% (802.8%) 24.5%
# RED. MASS 1.7417 C 13 -- C 17 +5.8% (792.7%) 0.0%
# EFF. MASS1242.2114
if line[1:10] == 'VIBRATION':
self.updateprogress(inputfile, "Frequency Information", self.fupdate)
# get the vib symmetry
if len(line.split()) >= 3:
sym = line.split()[2]
if not hasattr(self, 'vibsyms'):
self.vibsyms = []
self.vibsyms.append(sym)
line = inputfile.next()
if 'FREQ' in line:
if not hasattr(self, 'vibfreqs'):
self.vibfreqs = []
freq = float(line.split()[1])
self.vibfreqs.append(freq)
line = inputfile.next()
if 'T-DIPOLE' in line:
if not hasattr(self, 'vibirs'):
self.vibirs = []
tdipole = float(line.split()[1])
# transform to km/mol
self.vibirs.append(math.sqrt(tdipole))
# Orbital eigenvalues, e.g.
# ALPHA EIGENVALUES
# BETA EIGENVALUES
# or just "EIGENVALUES" for closed-shell
if 'EIGENVALUES' in line:
if not hasattr(self, 'moenergies'):
self.moenergies = [] # list of arrays
energies = []
line = inputfile.next()
while len(line.split()) > 0:
energies.extend([float(i) for i in line.split()])
line = inputfile.next()
self.moenergies.append(energies)
# todo:
# Partial charges and dipole moments
# Example:
# NET ATOMIC CHARGES
if line[:16] == '== MOPAC DONE ==':
self.metadata['success'] = True
cclib-1.6.2/cclib/parser/nwchemparser.py 0000664 0000000 0000000 00000151415 13535330462 0020227 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Parser for NWChem output files"""
import itertools
import re
import numpy
from cclib.parser import logfileparser
from cclib.parser import utils
class NWChem(logfileparser.Logfile):
"""An NWChem log file."""
def __init__(self, *args, **kwargs):
# Call the __init__ method of the superclass
super(NWChem, self).__init__(logname="NWChem", *args, **kwargs)
def __str__(self):
"""Return a string representation of the object."""
return "NWChem log file %s" % (self.filename)
def __repr__(self):
"""Return a representation of the object."""
return 'NWChem("%s")' % (self.filename)
def normalisesym(self, label):
"""NWChem does not require normalizing symmetry labels."""
return label
name2element = lambda self, lbl: "".join(itertools.takewhile(str.isalpha, str(lbl)))
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
# Extract the version number.
if "nwchem branch" in line:
self.metadata["package_version"] = line.split()[3]
# Don't add revision information to the main package version for now.
if "nwchem revision" in line:
revision = line.split()[3]
# This is printed in the input module, so should always be the first coordinates,
# and contains some basic information we want to parse as well. However, this is not
# the only place where the coordinates are printed during geometry optimization,
# since the gradients module has a separate coordinate printout, which happens
# alongside the coordinate gradients. This geometry printout happens at the
# beginning of each optimization step only.
if line.strip() == 'Geometry "geometry" -> ""' or line.strip() == 'Geometry "geometry" -> "geometry"':
self.skip_lines(inputfile, ['dashes', 'blank', 'units', 'blank', 'header', 'dashes'])
if not hasattr(self, 'atomcoords'):
self.atomcoords = []
line = next(inputfile)
coords = []
atomnos = []
while line.strip():
# The column labeled 'tag' is usually empty, but I'm not sure whether it can have spaces,
# so for now assume that it can and that there will be seven columns in that case.
if len(line.split()) == 6:
index, atomname, nuclear, x, y, z = line.split()
else:
index, atomname, tag, nuclear, x, y, z = line.split()
coords.append(list(map(float, [x, y, z])))
atomnos.append(int(float(nuclear)))
line = next(inputfile)
self.atomcoords.append(coords)
self.set_attribute('atomnos', atomnos)
# If the geometry is printed in XYZ format, it will have the number of atoms.
if line[12:31] == "XYZ format geometry":
self.skip_line(inputfile, 'dashes')
natom = int(next(inputfile).strip())
self.set_attribute('natom', natom)
if line.strip() == "NWChem Geometry Optimization":
self.skip_lines(inputfile, ['d', 'b', 'b', 'b', 'b', 'title', 'b', 'b'])
line = next(inputfile)
while line.strip():
if "maximum gradient threshold" in line:
gmax = float(line.split()[-1])
if "rms gradient threshold" in line:
grms = float(line.split()[-1])
if "maximum cartesian step threshold" in line:
xmax = float(line.split()[-1])
if "rms cartesian step threshold" in line:
xrms = float(line.split()[-1])
line = next(inputfile)
self.set_attribute('geotargets', [gmax, grms, xmax, xrms])
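# The order (gmax, grms, xmax, xrms) matches the columns later stored in geovalues.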
# NWChem does not normally print the basis set for each atom, but rather
# chooses the concise option of printing Gaussian coefficients for each
# atom type/element only once. Therefore, we need to first parse those
# coefficients and afterwards build the appropriate gbasis attribute based
# on that and atom types/elements already parsed (atomnos). However, if atoms
# are given different names (number after element, like H1 and H2), then NWChem
# generally prints the gaussian parameters for all unique names, like this:
#
# Basis "ao basis" -> "ao basis" (cartesian)
# -----
# O (Oxygen)
# ----------
# Exponent Coefficients
# -------------- ---------------------------------------------------------
# 1 S 1.30709320E+02 0.154329
# 1 S 2.38088610E+01 0.535328
# (...)
#
# H1 (Hydrogen)
# -------------
# Exponent Coefficients
# -------------- ---------------------------------------------------------
# 1 S 3.42525091E+00 0.154329
# (...)
#
# H2 (Hydrogen)
# -------------
# Exponent Coefficients
# -------------- ---------------------------------------------------------
# 1 S 3.42525091E+00 0.154329
# (...)
#
# This current parsing code below assumes all atoms of the same element
# use the same basis set, but that might not be true, and this will probably
# need to be considered in the future when such a logfile appears.
if line.strip() == """Basis "ao basis" -> "ao basis" (cartesian)""":
self.skip_line(inputfile, 'dashes')
gbasis_dict = {}
line = next(inputfile)
while line.strip():
atomname = line.split()[0]
atomelement = self.name2element(atomname)
gbasis_dict[atomelement] = []
self.skip_lines(inputfile, ['d', 'labels', 'd'])
shells = []
line = next(inputfile)
while line.strip() and line.split()[0].isdigit():
shell = None
while line.strip():
nshell, type, exp, coeff = line.split()
nshell = int(nshell)
assert len(shells) == nshell - 1
if not shell:
shell = (type, [])
else:
assert shell[0] == type
exp = float(exp)
coeff = float(coeff)
shell[1].append((exp, coeff))
line = next(inputfile)
shells.append(shell)
line = next(inputfile)
gbasis_dict[atomelement].extend(shells)
gbasis = []
for i in range(self.natom):
atomtype = self.table.element[self.atomnos[i]]
gbasis.append(gbasis_dict[atomtype])
self.set_attribute('gbasis', gbasis)
# Normally the indexes of AOs assigned to specific atoms are also not printed,
# so we need to infer that. We could do that from the previous section, but
# it might be worthwhile to take numbers from two different places, hence
# the code below, which builds atombasis based on the number of functions
# listed in this summary of the AO basis. Similar to previous section, here
# we assume all atoms of the same element have the same basis sets, but
# this will probably need to be revised later.
# The section we can glean info about aonames from looks like:
#
# Summary of "ao basis" -> "ao basis" (cartesian)
# ------------------------------------------------------------------------------
# Tag Description Shells Functions and Types
# ---------------- ------------------------------ ------ ---------------------
# C sto-3g 3 5 2s1p
# H sto-3g 1 1 1s
#
# However, we need to make sure not to match the following entry lines:
#
# * Summary of "ao basis" -> "" (cartesian)
# * Summary of allocated global arrays
#
# Unfortunately, "ao basis" isn't unique because it can be renamed to anything for
# later reference: http://www.nwchem-sw.org/index.php/Basis
# It also appears that we have to handle cartesian vs. spherical
if line[1:11] == "Summary of":
match = re.match(r' Summary of "([^"]*)" -> "([^"]*)" \((.+)\)', line)
if match and match.group(1) == match.group(2):
self.skip_lines(inputfile, ['d', 'title', 'd'])
self.shells = {}
self.shells["type"] = match.group(3)
atombasis_dict = {}
line = next(inputfile)
while line.strip():
atomname, desc, shells, funcs, types = line.split()
atomelement = self.name2element(atomname)
self.metadata["basis_set"] = desc
self.shells[atomname] = types
atombasis_dict[atomelement] = int(funcs)
line = next(inputfile)
last = 0
atombasis = []
for atom in self.atomnos:
atomelement = self.table.element[atom]
nfuncs = atombasis_dict[atomelement]
atombasis.append(list(range(last, last+nfuncs)))
last = atombasis[-1][-1] + 1
self.set_attribute('atombasis', atombasis)
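# As a small illustration (hypothetical fragment): with 5 functions on each C and
# 1 on each H (sto-3g above), a C,H,H atom ordering would give
#   atombasis = [[0, 1, 2, 3, 4], [5], [6]].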
# This section contains general parameters for Hartree-Fock calculations,
# which do not contain the 'General Information' section like most jobs.
if line.strip() == "NWChem SCF Module":
# If the calculation doesn't have a title specified, there
# aren't as many lines to skip here.
self.skip_lines(inputfile, ['d', 'b', 'b'])
line = next(inputfile)
if line.strip():
self.skip_lines(inputfile, ['b', 'b', 'b'])
line = next(inputfile)
while line.strip():
if line[2:8] == "charge":
charge = int(float(line.split()[-1]))
self.set_attribute('charge', charge)
if line[2:13] == "open shells":
unpaired = int(line.split()[-1])
self.set_attribute('mult', 2*unpaired + 1)
if line[2:7] == "atoms":
natom = int(line.split()[-1])
self.set_attribute('natom', natom)
if line[2:11] == "functions":
nfuncs = int(line.split()[-1])
self.set_attribute("nbasis", nfuncs)
line = next(inputfile)
# This section contains general parameters for DFT calculations, as well as
# for the many-electron theory module.
if line.strip() == "General Information":
if hasattr(self, 'linesearch') and self.linesearch:
return
while line.strip():
if "No. of atoms" in line:
self.set_attribute('natom', int(line.split()[-1]))
if "Charge" in line:
self.set_attribute('charge', int(line.split()[-1]))
if "Spin multiplicity" in line:
mult = line.split()[-1]
if mult == "singlet":
mult = 1
self.set_attribute('mult', int(mult))
if "AO basis - number of function" in line:
nfuncs = int(line.split()[-1])
self.set_attribute('nbasis', nfuncs)
# These will be present only in the DFT module.
if "Convergence on energy requested" in line:
target_energy = self.float(line.split()[-1])
if "Convergence on density requested" in line:
target_density = self.float(line.split()[-1])
if "Convergence on gradient requested" in line:
target_gradient = self.float(line.split()[-1])
line = next(inputfile)
# Pretty nasty temporary hack to set scftargets only in the SCF module.
if "target_energy" in dir() and "target_density" in dir() and "target_gradient" in dir():
if not hasattr(self, 'scftargets'):
self.scftargets = []
self.scftargets.append([target_energy, target_density, target_gradient])
#DFT functional information
if "XC Information" in line:
line = next(inputfile)
line = next(inputfile)
self.metadata["functional"] = line.split()[0]
# If the full overlap matrix is printed, it looks like this:
#
# global array: Temp Over[1:60,1:60], handle: -996
#
# 1 2 3 4 5 6
# ----------- ----------- ----------- ----------- ----------- -----------
# 1 1.00000 0.24836 -0.00000 -0.00000 0.00000 0.00000
# 2 0.24836 1.00000 0.00000 -0.00000 0.00000 0.00030
# 3 -0.00000 0.00000 1.00000 0.00000 0.00000 -0.00014
# ...
if "global array: Temp Over[" in line:
self.set_attribute('nbasis', int(line.split('[')[1].split(',')[0].split(':')[1]))
self.set_attribute('nmo', int(line.split(']')[0].split(',')[1].split(':')[1]))
aooverlaps = []
while len(aooverlaps) < self.nbasis:
self.skip_line(inputfile, 'blank')
indices = [int(i) for i in inputfile.next().split()]
assert indices[0] == len(aooverlaps) + 1
self.skip_line(inputfile, "dashes")
data = [inputfile.next().split() for i in range(self.nbasis)]
indices = [int(d[0]) for d in data]
assert indices == list(range(1, self.nbasis+1))
for i in range(1, len(data[0])):
vector = [float(d[i]) for d in data]
aooverlaps.append(vector)
self.set_attribute('aooverlaps', aooverlaps)
if line.strip() in ("The SCF is already converged", "The DFT is already converged"):
if self.linesearch:
return
if hasattr(self, 'scftargets'):
self.scftargets.append(self.scftargets[-1])
if hasattr(self, 'scfvalues'):
self.scfvalues.append(self.scfvalues[-1])
# The default (only?) SCF algorithm for Hartree-Fock is a preconditioned conjugate
# gradient method that apparently "always" converges, so this header should reliably
# signal a start of the SCF cycle. The convergence targets are also printed here.
if line.strip() == "Quadratically convergent ROHF":
if hasattr(self, 'linesearch') and self.linesearch:
return
while not "Final" in line:
# Only the norm of the orbital gradient is used to test convergence.
if line[:22] == " Convergence threshold":
target = float(line.split()[-1])
if not hasattr(self, "scftargets"):
self.scftargets = []
self.scftargets.append([target])
# This is critical for the stop condition of the section,
# because the 'Final Fock-matrix accuracy' is along the way.
# It would be prudent to find a more robust stop condition.
while list(set(line.strip())) != ["-"]:
line = next(inputfile)
if line.split() == ['iter', 'energy', 'gnorm', 'gmax', 'time']:
values = []
self.skip_line(inputfile, 'dashes')
line = next(inputfile)
while line.strip():
it, energy, gnorm, gmax, time = line.split()
gnorm = self.float(gnorm)
values.append([gnorm])
try:
line = next(inputfile)
# Is this the end of the file for some reason?
except StopIteration:
self.logger.warning('File terminated before end of last SCF! Last gradient norm: {}'.format(gnorm))
break
if not hasattr(self, 'scfvalues'):
self.scfvalues = []
self.scfvalues.append(values)
try:
line = next(inputfile)
except StopIteration:
self.logger.warning('File terminated?')
break
# The SCF for DFT does not use the same algorithm as Hartree-Fock, but always
# seems to use the following format to report SCF convergence:
# convergence iter energy DeltaE RMS-Dens Diis-err time
# ---------------- ----- ----------------- --------- --------- --------- ------
# d= 0,ls=0.0,diis 1 -382.2544324446 -8.28D+02 1.42D-02 3.78D-01 23.2
# d= 0,ls=0.0,diis 2 -382.3017298534 -4.73D-02 6.99D-03 3.82D-02 39.3
# d= 0,ls=0.0,diis 3 -382.2954343173 6.30D-03 4.21D-03 7.95D-02 55.3
# ...
if line.split() == ['convergence', 'iter', 'energy', 'DeltaE', 'RMS-Dens', 'Diis-err', 'time']:
if hasattr(self, 'linesearch') and self.linesearch:
return
self.skip_line(inputfile, 'dashes')
line = next(inputfile)
values = []
while line.strip():
# Sometimes there are things in between iterations with fewer columns,
# and we want to skip those lines, most probably. An exception might be
# unrestricted calculations, which show extra RMS density and DIIS
# errors, although it is not clear yet whether these are for the
# beta orbitals or something else. The iterations look like this in that case:
# convergence iter energy DeltaE RMS-Dens Diis-err time
# ---------------- ----- ----------------- --------- --------- --------- ------
# d= 0,ls=0.0,diis 1 -382.0243202601 -8.28D+02 7.77D-03 1.04D-01 30.0
# 7.68D-03 1.02D-01
# d= 0,ls=0.0,diis 2 -382.0647539758 -4.04D-02 4.64D-03 1.95D-02 59.2
# 5.39D-03 2.36D-02
# ...
if len(line[17:].split()) == 6:
iter, energy, deltaE, dens, diis, time = line[17:].split()
val_energy = self.float(deltaE)
val_density = self.float(dens)
val_gradient = self.float(diis)
values.append([val_energy, val_density, val_gradient])
try:
line = next(inputfile)
# Is this the end of the file for some reason?
except StopIteration:
self.logger.warning('File terminated before end of last SCF! Last error: {}'.format(diis))
break
if not hasattr(self, 'scfvalues'):
self.scfvalues = []
self.scfvalues.append(values)
# These triggers are supposed to catch the current step in a geometry optimization search
# and determine whether we are currently in the main (initial) SCF cycle of that step
# or in the subsequent line search. The step is printed between dashes like this:
#
# --------
# Step 0
# --------
#
# and the summary lines that describe the main SCF cycle for the first step look like this:
#
#@ Step Energy Delta E Gmax Grms Xrms Xmax Walltime
#@ ---- ---------------- -------- -------- -------- -------- -------- --------
#@ 0 -379.76896249 0.0D+00 0.04567 0.01110 0.00000 0.00000 4.2
# ok ok
#
# However, for subsequent steps the format is a bit different:
#
# Step Energy Delta E Gmax Grms Xrms Xmax Walltime
# ---- ---------------- -------- -------- -------- -------- -------- --------
#@ 2 -379.77794602 -7.4D-05 0.00118 0.00023 0.00440 0.01818 14.8
# ok
#
# There is also a summary of the line search (which we don't use now), like this:
#
# Line search:
# step= 1.00 grad=-1.8D-05 hess= 8.9D-06 energy= -379.777955 mode=accept
# new step= 1.00 predicted energy= -379.777955
#
if line[10:14] == "Step":
self.geostep = int(line.split()[-1])
self.skip_line(inputfile, 'dashes')
self.linesearch = False
if line[0] == "@" and line.split()[1] == "Step":
at_and_dashes = next(inputfile)
line = next(inputfile)
assert int(line.split()[1]) == self.geostep == 0
gmax = float(line.split()[4])
grms = float(line.split()[5])
xrms = float(line.split()[6])
xmax = float(line.split()[7])
if not hasattr(self, 'geovalues'):
self.geovalues = []
self.geovalues.append([gmax, grms, xmax, xrms])
self.linesearch = True
if line[2:6] == "Step":
self.skip_line(inputfile, 'dashes')
line = next(inputfile)
assert int(line.split()[1]) == self.geostep
if self.linesearch:
#print(line)
return
gmax = float(line.split()[4])
grms = float(line.split()[5])
xrms = float(line.split()[6])
xmax = float(line.split()[7])
if not hasattr(self, 'geovalues'):
self.geovalues = []
self.geovalues.append([gmax, grms, xmax, xrms])
self.linesearch = True
# There is a clear message when the geometry optimization has converged:
#
# ----------------------
# Optimization converged
# ----------------------
#
if line.strip() == "Optimization converged":
self.skip_line(inputfile, 'dashes')
if not hasattr(self, 'optdone'):
self.optdone = []
self.optdone.append(len(self.geovalues) - 1)
if "Failed to converge" in line and hasattr(self, 'geovalues'):
if not hasattr(self, 'optdone'):
self.optdone = []
# extract the theoretical method
if "Total SCF energy" in line:
self.metadata["methods"].append("HF")
if "Total DFT energy" in line:
self.metadata["methods"].append("DFT")
# The line containing the final SCF energy seems to be always identifiable like this.
if "Total SCF energy" in line or "Total DFT energy" in line:
# NWChem often does a line search during geometry optimization steps, reporting
# the SCF information but not the coordinates (which are not necessarily 'intermediate'
# since the step size can become smaller). We want to skip these SCF cycles,
# unless the coordinates can also be extracted (possibly from the gradients?).
if hasattr(self, 'linesearch') and self.linesearch:
return
if not hasattr(self, "scfenergies"):
self.scfenergies = []
energy = float(line.split()[-1])
energy = utils.convertor(energy, "hartree", "eV")
self.scfenergies.append(energy)
# The final MO orbitals are printed in a simple list, but apparently not for
# DFT calcs, and often this list does not contain all MOs, so make sure to
# parse them from the MO analysis below if possible. This section will be like this:
#
# Symmetry analysis of molecular orbitals - final
# -----------------------------------------------
#
# Numbering of irreducible representations:
#
# 1 ag 2 au 3 bg 4 bu
#
# Orbital symmetries:
#
# 1 bu 2 ag 3 bu 4 ag 5 bu
# 6 ag 7 bu 8 ag 9 bu 10 ag
# ...
if line.strip() == "Symmetry analysis of molecular orbitals - final":
self.skip_lines(inputfile, ['d', 'b', 'numbering', 'b', 'reps', 'b', 'syms', 'b'])
if not hasattr(self, 'mosyms'):
self.mosyms = [[None]*self.nbasis]
line = next(inputfile)
while line.strip():
ncols = len(line.split())
assert ncols % 2 == 0
for i in range(ncols//2):
index = int(line.split()[i*2]) - 1
sym = line.split()[i*2+1]
sym = sym[0].upper() + sym[1:]
if self.mosyms[0][index]:
if self.mosyms[0][index] != sym:
self.logger.warning("Symmetry of MO %i has changed" % (index+1))
self.mosyms[0][index] = sym
line = next(inputfile)
# The same format is used for HF and DFT molecular orbital analysis. We want to parse
# the MO energies from this section, although it is printed already before this with
# less precision (might be useful to parse that if this is not available). Also, this
# section contains coefficients for the leading AO contributions, so it might also
# be useful to parse and use those values if the full vectors are not printed.
#
# The block looks something like this (two separate alpha/beta blocks in the unrestricted case):
#
# ROHF Final Molecular Orbital Analysis
# -------------------------------------
#
# Vector 1 Occ=2.000000D+00 E=-1.104059D+01 Symmetry=bu
# MO Center= 1.4D-17, 0.0D+00, -6.5D-37, r^2= 2.1D+00
# Bfn. Coefficient Atom+Function Bfn. Coefficient Atom+Function
# ----- ------------ --------------- ----- ------------ ---------------
# 1 0.701483 1 C s 6 -0.701483 2 C s
#
# Vector 2 Occ=2.000000D+00 E=-1.104052D+01 Symmetry=ag
# ...
# Vector 12 Occ=2.000000D+00 E=-1.020253D+00 Symmetry=bu
# MO Center= -1.4D-17, -5.6D-17, 2.9D-34, r^2= 7.9D+00
# Bfn. Coefficient Atom+Function Bfn. Coefficient Atom+Function
# ----- ------------ --------------- ----- ------------ ---------------
# 36 -0.298699 11 C s 41 0.298699 12 C s
# 2 0.270804 1 C s 7 -0.270804 2 C s
# 48 -0.213655 15 C s 53 0.213655 16 C s
# ...
#
if "Final" in line and "Molecular Orbital Analysis" in line:
# Unrestricted jobs have two such blocks, for alpha and beta orbitals, and
# we need to keep track of which one we're parsing (always alpha in restricted case).
unrestricted = ("Alpha" in line) or ("Beta" in line)
alphabeta = int("Beta" in line)
self.skip_lines(inputfile, ['dashes', 'blank'])
nvectors = []
mooccnos = []
energies = []
symmetries = [None]*self.nbasis
line = next(inputfile)
while line[:7] == " Vector":
# Note: the vector count starts from 1 in NWChem.
nvector = int(line[7:12])
nvectors.append(nvector)
# A nonzero occupancy for SCF jobs means the orbital is occupied.
mooccno = int(self.float(line[18:30]))
mooccnos.append(mooccno)
# If the printout does not start from the first MO, assume None for all previous orbitals.
if len(energies) == 0 and nvector > 1:
for i in range(1, nvector):
energies.append(None)
energy = self.float(line[34:47])
energy = utils.convertor(energy, "hartree", "eV")
energies.append(energy)
# When symmetry is not used, this part of the line is missing.
if line[47:58].strip() == "Symmetry=":
sym = line[58:].strip()
sym = sym[0].upper() + sym[1:]
symmetries[nvector-1] = sym
line = next(inputfile)
if "MO Center" in line:
line = next(inputfile)
if "Bfn." in line:
line = next(inputfile)
if "-----" in line:
line = next(inputfile)
while line.strip():
line = next(inputfile)
line = next(inputfile)
self.set_attribute('nmo', nvector)
if not hasattr(self, 'moenergies') or (len(self.moenergies) > alphabeta):
self.moenergies = []
self.moenergies.append(energies)
if not hasattr(self, 'mosyms') or (len(self.mosyms) > alphabeta):
self.mosyms = []
self.mosyms.append(symmetries)
if not hasattr(self, 'homos') or (len(self.homos) > alphabeta):
self.homos = []
nvector_index = mooccnos.index(0) - 1
if nvector_index > -1:
self.homos.append(nvectors[nvector_index] - 1)
else:
self.homos.append(-1)
# If this was a restricted open-shell calculation, append
# to HOMOs twice since only one Molecular Orbital Analysis
# section is in the output file.
if (not unrestricted) and (1 in mooccnos):
nvector_index = mooccnos.index(1) - 1
if nvector_index > -1:
self.homos.append(nvectors[nvector_index] - 1)
else:
self.homos.append(-1)
# This is where the full MO vectors are printed, but a special
# directive is needed for it in the `scf` or `dft` block:
# print "final vectors" "final vectors analysis"
# which gives:
#
# Final MO vectors
# ----------------
#
#
# global array: alpha evecs[1:60,1:60], handle: -995
#
# 1 2 3 4 5 6
# ----------- ----------- ----------- ----------- ----------- -----------
# 1 -0.69930 -0.69930 -0.02746 -0.02769 -0.00313 -0.02871
# 2 -0.03156 -0.03135 0.00410 0.00406 0.00078 0.00816
# 3 0.00002 -0.00003 0.00067 0.00065 -0.00526 -0.00120
# ...
#
if line.strip() == "Final MO vectors":
if not hasattr(self, 'mocoeffs'):
self.mocoeffs = []
self.skip_lines(inputfile, ['d', 'b', 'b'])
# The columns are MOs and the rows are AOs, but that's an educated guess since no
# atom information is printed alongside the indices. This next line gives
# the dimensions, which we can check against values set before this. Also, this line
# specifies whether we are dealing with alpha or beta vectors.
array_info = next(inputfile)
while ("global array" in array_info):
alphabeta = int(line.split()[2] == "beta")
size = array_info.split('[')[1].split(']')[0]
nbasis = int(size.split(',')[0].split(':')[1])
nmo = int(size.split(',')[1].split(':')[1])
self.set_attribute('nbasis', nbasis)
self.set_attribute('nmo', nmo)
self.skip_line(inputfile, 'blank')
mocoeffs = []
while len(mocoeffs) < self.nmo:
nmos = list(map(int, next(inputfile).split()))
assert len(mocoeffs) == nmos[0] - 1
for n in nmos:
mocoeffs.append([])
self.skip_line(inputfile, 'dashes')
for nb in range(nbasis):
line = next(inputfile)
index = int(line.split()[0])
assert index == nb+1
coefficients = list(map(float, line.split()[1:]))
assert len(coefficients) == len(nmos)
for i, c in enumerate(coefficients):
mocoeffs[nmos[i]-1].append(c)
self.skip_line(inputfile, 'blank')
self.mocoeffs.append(mocoeffs)
array_info = next(inputfile)
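# After this loop each entry of self.mocoeffs holds nmo rows of nbasis coefficients,
# i.e. mocoeffs[i][j] is the coefficient of basis function j in molecular orbital i.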
# For Hartree-Fock, the atomic Mulliken charges are typically printed like this:
#
# Mulliken analysis of the total density
# --------------------------------------
#
# Atom Charge Shell Charges
# ----------- ------ -------------------------------------------------------
# 1 C 6 6.00 1.99 1.14 2.87
# 2 C 6 6.00 1.99 1.14 2.87
# ...
if line.strip() == "Mulliken analysis of the total density":
if not hasattr(self, "atomcharges"):
self.atomcharges = {}
self.skip_lines(inputfile, ['d', 'b', 'header', 'd'])
charges = []
line = next(inputfile)
while line.strip():
index, atomname, nuclear, atom = line.split()[:4]
shells = line.split()[4:]
charges.append(float(atom)-float(nuclear))
line = next(inputfile)
self.atomcharges['mulliken'] = charges
# Note that the 'overlap population' as printed in the Mulliken population analysis
# is not the same thing as the 'overlap matrix'. In fact, it is the overlap matrix
# multiplied elementwise by the density matrix.
#
# ----------------------------
# Mulliken population analysis
# ----------------------------
#
# ----- Total overlap population -----
#
# 1 2 3 4 5 6 7
#
# 1 1 C s 2.0694818227 -0.0535883400 -0.0000000000 -0.0000000000 -0.0000000000 -0.0000000000 0.0000039991
# 2 1 C s -0.0535883400 0.8281341291 0.0000000000 -0.0000000000 0.0000000000 0.0000039991 -0.0009906747
# ...
#
# DFT does not seem to print the separate listing of Mulliken charges
# by default, but they are printed by this module later on. They are also printed
# for Hartree-Fock runs, though, so in that case make sure they are consistent.
if line.strip() == "Mulliken population analysis":
self.skip_lines(inputfile, ['d', 'b', 'total_overlap_population', 'b'])
overlaps = []
line = next(inputfile)
while all([c.isdigit() for c in line.split()]):
# There is always a line with the MO indices printed in this block.
indices = [int(i)-1 for i in line.split()]
for i in indices:
overlaps.append([])
# There is usually a blank line after the MO indices, but
# there are exceptions, so check if line is blank first.
line = next(inputfile)
if not line.strip():
line = next(inputfile)
# Now we can iterate over atomic orbitals.
for nao in range(self.nbasis):
data = list(map(float, line.split()[4:]))
for i, d in enumerate(data):
overlaps[indices[i]].append(d)
line = next(inputfile)
line = next(inputfile)
# This header should be printed later, before the charges are printed, which of course
# are just sums of the overlaps and could be calculated. But we just go ahead and
# parse them, make sure they're consistent with previously parsed values and
# use these since they are more precise (previous precision could have been just 0.01).
while "Total gross population on atoms" not in line:
line = next(inputfile)
self.skip_line(inputfile, 'blank')
charges = []
for i in range(self.natom):
line = next(inputfile)
iatom, element, ncharge, epop = line.split()
iatom = int(iatom)
ncharge = float(ncharge)
epop = float(epop)
assert iatom == (i+1)
charges.append(epop-ncharge)
if not hasattr(self, 'atomcharges'):
self.atomcharges = {}
if not "mulliken" in self.atomcharges:
self.atomcharges['mulliken'] = charges
else:
assert max(self.atomcharges['mulliken'] - numpy.array(charges)) < 0.01
self.atomcharges['mulliken'] = charges
# NWChem prints the dipole moment in atomic units first, and we could just fast forward
# to the values in Debye, which are also printed. But we can also just convert them
# right away and so parse a little bit less. Note how the reference point is printed
# here within the block nicely, as it is for all moments later.
#
# -------------
# Dipole Moment
# -------------
#
# Center of charge (in au) is the expansion point
# X = 0.0000000 Y = 0.0000000 Z = 0.0000000
#
# Dipole moment 0.0000000000 Debye(s)
# DMX 0.0000000000 DMXEFC 0.0000000000
# DMY 0.0000000000 DMYEFC 0.0000000000
# DMZ -0.0000000000 DMZEFC 0.0000000000
#
# ...
#
if line.strip() == "Dipole Moment":
self.skip_lines(inputfile, ['d', 'b'])
reference_comment = next(inputfile)
assert "(in au)" in reference_comment
reference = next(inputfile).split()
self.reference = [reference[-7], reference[-4], reference[-1]]
self.reference = numpy.array([float(x) for x in self.reference])
self.reference = utils.convertor(self.reference, 'bohr', 'Angstrom')
self.skip_line(inputfile, 'blank')
magnitude = next(inputfile)
assert magnitude.split()[-1] == "A.U."
dipole = []
for i in range(3):
line = next(inputfile)
dipole.append(float(line.split()[1]))
dipole = utils.convertor(numpy.array(dipole), "ebohr", "Debye")
if not hasattr(self, 'moments'):
self.moments = [self.reference, dipole]
else:
self.moments[1] = dipole
# The quadrupole moment is pretty straightforward to parse. There are several
# blocks printed, and the first one called 'second moments' contains the raw
# moments, and later traceless values are printed. The moments, however, are
# not in lexicographical order, so we need to sort them. Also, the first block
# is in atomic units, so remember to convert to Buckinghams along the way.
#
# -----------------
# Quadrupole Moment
# -----------------
#
# Center of charge (in au) is the expansion point
# X = 0.0000000 Y = 0.0000000 Z = 0.0000000
#
# < R**2 > = ********** a.u. ( 1 a.u. = 0.280023 10**(-16) cm**2 )
# ( also called diamagnetic susceptibility )
#
# Second moments in atomic units
#
# Component Electronic+nuclear Point charges Total
# --------------------------------------------------------------------------
# XX -38.3608511210 0.0000000000 -38.3608511210
# YY -39.0055467347 0.0000000000 -39.0055467347
# ...
#
if line.strip() == "Quadrupole Moment":
self.skip_lines(inputfile, ['d', 'b'])
reference_comment = next(inputfile)
assert "(in au)" in reference_comment
reference = next(inputfile).split()
self.reference = [reference[-7], reference[-4], reference[-1]]
self.reference = numpy.array([float(x) for x in self.reference])
self.reference = utils.convertor(self.reference, 'bohr', 'Angstrom')
self.skip_lines(inputfile, ['b', 'units', 'susc', 'b'])
line = next(inputfile)
assert line.strip() == "Second moments in atomic units"
self.skip_lines(inputfile, ['b', 'header', 'd'])
# Parse into a dictionary and then sort by the component key.
quadrupole = {}
for i in range(6):
line = next(inputfile)
quadrupole[line.split()[0]] = float(line.split()[-1])
lex = sorted(quadrupole.keys())
quadrupole = [quadrupole[key] for key in lex]
quadrupole = utils.convertor(numpy.array(quadrupole), "ebohr2", "Buckingham")
# The checking of potential previous values is a bit more involved here,
# because it turns out NWChem has separate keywords for dipole, quadrupole
# and octupole output. So, it is perfectly possible to print the quadrupole
# and not the dipole... if that is the case set the former to None and
# issue a warning. Also, a regression has been added to cover this case.
if not hasattr(self, 'moments') or len(self.moments) < 2:
self.logger.warning("Found quadrupole moments but no previous dipole")
self.moments = [self.reference, None, quadrupole]
else:
if len(self.moments) == 2:
self.moments.append(quadrupole)
else:
                    assert numpy.all(self.moments[2] == quadrupole)
# The octupole moment is analogous to the quadrupole, but there are more components
# and the checking of previously parsed dipole and quadrupole moments is more involved,
# with a corresponding test also added to regressions.
#
# ---------------
# Octupole Moment
# ---------------
#
# Center of charge (in au) is the expansion point
# X = 0.0000000 Y = 0.0000000 Z = 0.0000000
#
# Third moments in atomic units
#
# Component Electronic+nuclear Point charges Total
# --------------------------------------------------------------------------
# XXX -0.0000000000 0.0000000000 -0.0000000000
# YYY -0.0000000000 0.0000000000 -0.0000000000
# ...
#
if line.strip() == "Octupole Moment":
self.skip_lines(inputfile, ['d', 'b'])
reference_comment = next(inputfile)
assert "(in au)" in reference_comment
reference = next(inputfile).split()
self.reference = [reference[-7], reference[-4], reference[-1]]
self.reference = numpy.array([float(x) for x in self.reference])
self.reference = utils.convertor(self.reference, 'bohr', 'Angstrom')
self.skip_line(inputfile, 'blank')
line = next(inputfile)
assert line.strip() == "Third moments in atomic units"
self.skip_lines(inputfile, ['b', 'header', 'd'])
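            # As for the quadrupole, collect the ten components into a dictionary and sort
            # them lexicographically, giving XXX, XXY, XXZ, XYY, XYZ, XZZ, YYY, YYZ, YZZ, ZZZ.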
octupole = {}
for i in range(10):
line = next(inputfile)
octupole[line.split()[0]] = float(line.split()[-1])
lex = sorted(octupole.keys())
octupole = [octupole[key] for key in lex]
octupole = utils.convertor(numpy.array(octupole), "ebohr3", "Debye.ang2")
if not hasattr(self, 'moments') or len(self.moments) < 2:
self.logger.warning("Found octupole moments but no previous dipole or quadrupole moments")
self.moments = [self.reference, None, None, octupole]
elif len(self.moments) == 2:
self.logger.warning("Found octupole moments but no previous quadrupole moments")
self.moments.append(None)
self.moments.append(octupole)
else:
if len(self.moments) == 3:
self.moments.append(octupole)
else:
                    assert numpy.all(self.moments[3] == octupole)
if "Total MP2 energy" in line:
self.metadata["methods"].append("MP2")
mpenerg = float(line.split()[-1])
if not hasattr(self, "mpenergies"):
self.mpenergies = []
self.mpenergies.append([])
self.mpenergies[-1].append(utils.convertor(mpenerg, "hartree", "eV"))
if "CCSD total energy / hartree" in line or "total CCSD energy:" in line:
self.metadata["methods"].append("CCSD")
ccenerg = float(line.split()[-1])
if not hasattr(self, "ccenergies"):
self.ccenergies = []
self.ccenergies.append([])
self.ccenergies[-1].append(utils.convertor(ccenerg, "hartree", "eV"))
if "CCSD(T) total energy / hartree" in line:
self.metadata["methods"].append("CCSD(T)")
ccenerg = float(line.split()[-1])
if not hasattr(self, "ccenergies"):
self.ccenergies = []
self.ccenergies.append([])
self.ccenergies[-1].append(utils.convertor(ccenerg, "hartree", "eV"))
# Static and dynamic polarizability.
if "Linear Response polarizability / au" in line:
if not hasattr(self, "polarizabilities"):
self.polarizabilities = []
polarizability = []
line = next(inputfile)
assert line.split()[0] == "Frequency"
line = next(inputfile)
assert line.split()[0] == "Wavelength"
self.skip_lines(inputfile, ['coordinates', 'd'])
for _ in range(3):
line = next(inputfile)
polarizability.append(line.split()[1:])
self.polarizabilities.append(numpy.array(polarizability))
if line[:18] == ' Total times cpu:':
self.metadata['success'] = True
if line.strip() == "NWChem QMD Module":
self.is_BOMD = True
# Born-Oppenheimer molecular dynamics (BOMD): time.
if "QMD Run Information" in line:
self.skip_line(inputfile, 'd')
line = next(inputfile)
assert "Time elapsed (fs)" in line
time = float(line.split()[4])
self.append_attribute('time', time)
# BOMD: geometry coordinates when `print low`.
if line.strip() == "DFT ENERGY GRADIENTS":
if self.is_BOMD:
self.skip_lines(inputfile, ['b', 'atom coordinates gradient', 'xyzxyz'])
line = next(inputfile)
atomcoords_step = []
while line.strip():
tokens = line.split()
assert len(tokens) == 8
atomcoords_step.append([float(c) for c in tokens[2:5]])
line = next(inputfile)
self.atomcoords.append(atomcoords_step)
def before_parsing(self):
"""NWChem-specific routines performed before parsing a file.
"""
# The only reason we need this identifier is if `print low` is
# set in the input file, which we assume is likely for a BOMD
# trajectory. This will enable parsing coordinates from the
# 'DFT ENERGY GRADIENTS' section.
self.is_BOMD = False
def after_parsing(self):
"""NWChem-specific routines for after parsing a file.
        Currently, expands self.shells into self.aonames.
"""
# setup a few necessary things, including a regular expression
# for matching the shells
table = utils.PeriodicTable()
elements = [table.element[x] for x in self.atomnos]
        pattern = re.compile(r"(\ds)+(\dp)*(\dd)*(\df)*(\dg)*")
labels = {}
labels['s'] = ["%iS"]
labels['p'] = ["%iPX", "%iPY", "%iPZ"]
if self.shells['type'] == 'spherical':
labels['d'] = ['%iD-2', '%iD-1', '%iD0', '%iD1', '%iD2']
labels['f'] = ['%iF-3', '%iF-2', '%iF-1', '%iF0',
'%iF1', '%iF2', '%iF3']
labels['g'] = ['%iG-4', '%iG-3', '%iG-2', '%iG-1', '%iG0',
'%iG1', '%iG2', '%iG3', '%iG4']
elif self.shells['type'] == 'cartesian':
labels['d'] = ['%iDXX', '%iDXY', '%iDXZ',
'%iDYY', '%iDYZ',
'%iDZZ']
labels['f'] = ['%iFXXX', '%iFXXY', '%iFXXZ',
'%iFXYY', '%iFXYZ', '%iFXZZ',
'%iFYYY', '%iFYYZ', '%iFYZZ',
'%iFZZZ']
labels['g'] = ['%iGXXXX', '%iGXXXY', '%iGXXXZ',
'%iGXXYY', '%iGXXYZ', '%iGXXZZ',
'%iGXYYY', '%iGXYYZ', '%iGXYZZ',
'%iGXZZZ', '%iGYYYY', '%iGYYYZ',
'%iGYYZZ', '%iGYZZZ', '%iGZZZZ']
else:
self.logger.warning("Found a non-standard aoname representation type.")
return
# now actually build aonames
# involves expanding 2s1p into appropriate types
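        # For example, a first carbon atom whose shell string is '3s2p1d' (spherical) should
        # expand to C1_1S, C1_2S, C1_3S, C1_2PX, ..., C1_3PZ, C1_3D-2, ..., C1_3D2.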
self.aonames = []
for i, element in enumerate(elements):
try:
shell_text = self.shells[element]
except KeyError:
del self.aonames
msg = "Cannot determine aonames for at least one atom."
self.logger.warning(msg)
break
prefix = "%s%i_" % (element, i + 1) # (e.g. C1_)
matches = pattern.match(shell_text)
for j, group in enumerate(matches.groups()):
if group is None:
continue
count = int(group[:-1])
label = group[-1]
for k in range(count):
temp = [x % (j + k + 1) for x in labels[label]]
self.aonames.extend([prefix + x for x in temp])
# If we parsed a BOMD trajectory, the first two parsed
# geometries are identical, and all from the second onward are
# in Bohr. Delete the first one and perform the unit
# conversion.
if self.is_BOMD:
self.atomcoords = utils.convertor(numpy.asarray(self.atomcoords)[1:, ...],
'bohr', 'Angstrom')
cclib-1.6.2/cclib/parser/orcaparser.py 0000664 0000000 0000000 00000232271 13535330462 0017672 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2019, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Parser for ORCA output files"""
from __future__ import print_function
import numpy
import re
from cclib.parser import logfileparser
from cclib.parser import utils
class ORCA(logfileparser.Logfile):
"""An ORCA log file."""
def __init__(self, *args, **kwargs):
# Call the __init__ method of the superclass
super(ORCA, self).__init__(logname="ORCA", *args, **kwargs)
def __str__(self):
"""Return a string representation of the object."""
return "ORCA log file %s" % (self.filename)
def __repr__(self):
"""Return a representation of the object."""
return 'ORCA("%s")' % (self.filename)
def normalisesym(self, label):
"""ORCA does not require normalizing symmetry labels."""
return label
def before_parsing(self):
        # A geometry optimization is considered started only once we parse
        # a cycle (so gopt_cycle will then be larger than zero).
self.gopt_cycle = 0
# Keep track of whether this is a relaxed scan calculation
self.is_relaxed_scan = False
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
# Extract the version number.
if "Program Version" == line.strip()[:15]:
self.metadata["package_version"] = line.split()[2]
# ================================================================================
# WARNINGS
# Please study these warnings very carefully!
# ================================================================================
#
# Warning: TCutStore was < 0. Adjusted to Thresh (uncritical)
#
# WARNING: your system is open-shell and RHF/RKS was chosen
# ===> : WILL SWITCH to UHF/UKS
#
#
# INFO : the flag for use of LIBINT has been found!
#
# ================================================================================
if "WARNINGS" == line.strip():
self.skip_lines(inputfile, ['text', '=', 'blank'])
if 'warnings' not in self.metadata:
self.metadata['warnings'] = []
if 'info' not in self.metadata:
self.metadata['info'] = []
line = next(inputfile)
while line[0] != '=':
if line.lower()[:7] == 'warning':
self.metadata['warnings'].append('')
while len(line) > 1:
self.metadata['warnings'][-1] += line[9:].strip()
line = next(inputfile)
elif line.lower()[:4] == 'info':
self.metadata['info'].append('')
while len(line) > 1:
self.metadata['info'][-1] += line[9:].strip()
line = next(inputfile)
line = next(inputfile)
# ================================================================================
# INPUT FILE
# ================================================================================
# NAME = input.dat
# | 1> %pal nprocs 4 end
# | 2> ! B3LYP def2-svp
# | 3> ! Grid4
# | 4>
# | 5> *xyz 0 3
# | 6> O 0 0 0
# | 7> O 0 0 1.5
# | 8> *
# | 9>
# | 10> ****END OF INPUT****
# ================================================================================
if "INPUT FILE" == line.strip():
self.skip_line(inputfile, '=')
self.metadata['input_file_name'] = next(inputfile).split()[-1]
# First, collect all the lines...
lines = []
for line in inputfile:
if line[0] != '|':
break
lines.append(line[6:])
self.metadata['input_file_contents'] = ''.join(lines[:-1])
lines_iter = iter(lines[:-1])
keywords = []
coords = []
# ...then parse them separately.
for line in lines_iter:
line = line.strip()
if not line:
continue
# Keywords block
if line[0] == '!':
keywords += line[1:].split()
# Impossible to parse without knowing whether a keyword opens a new block
elif line[0] == '%':
pass
# Geometry block
elif line[0] == '*':
coord_type, charge, multiplicity = line[1:].split()[:3]
self.set_attribute('charge', int(charge))
self.set_attribute('multiplicity', int(multiplicity))
coord_type = coord_type.lower()
self.metadata['coord_type'] = coord_type
if coord_type == 'xyz':
def splitter(line):
atom, x, y, z = line.split()[:4]
return [atom, float(x), float(y), float(z)]
elif coord_type in ['int', 'internal']:
def splitter(line):
atom, a1, a2, a3, bond, angle, dihedral = line.split()[:7]
return [atom, int(a1), int(a2), int(a3), float(bond), float(angle), float(dihedral)]
elif coord_type == 'gzmt':
def splitter(line):
vals = line.split()[:7]
if len(vals) == 7:
atom, a1, bond, a2, angle, a3, dihedral = vals
return [atom, int(a1), float(bond), int(a2), float(angle), int(a3), float(dihedral)]
elif len(vals) == 5:
return [vals[0], int(vals[1]), float(vals[2]), int(vals[3]), float(vals[4])]
elif len(vals) == 3:
return [vals[0], int(vals[1]), float(vals[2])]
elif len(vals) == 1:
return [vals[0]]
self.logger.warning('Incorrect number of atoms in input geometry.')
elif 'file' in coord_type:
pass
else:
self.logger.warning('Invalid coordinate type.')
if 'file' not in coord_type:
for line in lines_iter:
if not line:
continue
if line[0] == '#' or line.strip(' ') == '\n':
continue
if line[0] == '*' or line.strip() == "end":
break
# Strip basis specification that can appear after coordinates
line = line.split('newGTO')[0].strip()
coords.append(splitter(line))
self.metadata['keywords'] = keywords
self.metadata['coords'] = coords
if line[0:15] == "Number of atoms":
natom = int(line.split()[-1])
self.set_attribute('natom', natom)
if line[1:13] == "Total Charge":
charge = int(line.split()[-1])
self.set_attribute('charge', charge)
line = next(inputfile)
mult = int(line.split()[-1])
self.set_attribute('mult', mult)
# SCF convergence output begins with:
#
# --------------
# SCF ITERATIONS
# --------------
#
# However, there are two common formats which need to be handled, implemented as separate functions.
if line.strip() == "SCF ITERATIONS":
self.skip_line(inputfile, 'dashes')
line = next(inputfile)
columns = line.split()
# "Starting incremental Fock matrix formation" doesn't
# necessarily appear before the extended format.
if not columns:
self.parse_scf_expanded_format(inputfile, columns)
# A header with distinct columns indicates the condensed
# format.
elif columns[1] == "Energy":
self.parse_scf_condensed_format(inputfile, columns)
# Assume the extended format.
else:
self.parse_scf_expanded_format(inputfile, columns)
# Information about the final iteration, which also includes the convergence
# targets and the convergence values, is printed separately, in a section like this:
#
# *****************************************************
# * SUCCESS *
# * SCF CONVERGED AFTER 9 CYCLES *
# *****************************************************
#
# ...
#
# Total Energy : -382.04963064 Eh -10396.09898 eV
#
# ...
#
# ------------------------- ----------------
# FINAL SINGLE POINT ENERGY -382.049630637
# ------------------------- ----------------
#
# We cannot use this last message as a stop condition in general, because
# often there is vibrational output before it. So we use the 'Total Energy'
# line. However, what comes after that is different for single point calculations
# and in the inner steps of geometry optimizations.
if "SCF CONVERGED AFTER" in line:
if not hasattr(self, "scfenergies"):
self.scfenergies = []
if not hasattr(self, "scfvalues"):
self.scfvalues = []
if not hasattr(self, "scftargets"):
self.scftargets = []
while not "Total Energy :" in line:
line = next(inputfile)
energy = utils.convertor(float(line.split()[3]), "hartree", "eV")
self.scfenergies.append(energy)
self._append_scfvalues_scftargets(inputfile, line)
        # Sometimes the SCF does not converge, but does not halt the
        # run (like in bug 3184890). In that case, we should remain
        # consistent and use the energy from the last reported SCF
        # cycle, for which ORCA prints a banner like this:
#
# *****************************************************
# * ERROR *
# * SCF NOT CONVERGED AFTER 8 CYCLES *
# *****************************************************
if "SCF NOT CONVERGED AFTER" in line:
if not hasattr(self, "scfenergies"):
self.scfenergies = []
if not hasattr(self, "scfvalues"):
self.scfvalues = []
if not hasattr(self, "scftargets"):
self.scftargets = []
energy = utils.convertor(self.scfvalues[-1][-1][0], "hartree", "eV")
self.scfenergies.append(energy)
self._append_scfvalues_scftargets(inputfile, line)
# The convergence targets for geometry optimizations are printed at the
        # beginning of the output, although their order and description differ from
        # what is printed later on. So, try to standardize the names of the criteria
# and save them for later so that we can get the order right.
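        # After this normalization the names are typically 'energy change', 'max gradient',
        # 'rms gradient', 'max step' and 'rms step'.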
#
# *****************************
# * Geometry Optimization Run *
# *****************************
#
# Geometry optimization settings:
# Update method Update .... BFGS
# Choice of coordinates CoordSys .... Redundant Internals
# Initial Hessian InHess .... Almoef's Model
#
# Convergence Tolerances:
# Energy Change TolE .... 5.0000e-06 Eh
# Max. Gradient TolMAXG .... 3.0000e-04 Eh/bohr
# RMS Gradient TolRMSG .... 1.0000e-04 Eh/bohr
# Max. Displacement TolMAXD .... 4.0000e-03 bohr
# RMS Displacement TolRMSD .... 2.0000e-03 bohr
#
if line[25:50] == "Geometry Optimization Run":
stars = next(inputfile)
blank = next(inputfile)
line = next(inputfile)
while line[0:23] != "Convergence Tolerances:":
line = next(inputfile)
if hasattr(self, 'geotargets'):
self.logger.warning('The geotargets attribute should not exist yet. There is a problem in the parser.')
self.geotargets = []
self.geotargets_names = []
# There should always be five tolerance values printed here.
for i in range(5):
line = next(inputfile)
name = line[:25].strip().lower().replace('.', '').replace('displacement', 'step')
target = float(line.split()[-2])
self.geotargets_names.append(name)
self.geotargets.append(target)
# The convergence targets for relaxed surface scan steps are printed at the
        # beginning of the output, although their order and description differ from
        # what is printed later on. So, try to standardize the names of the criteria
# and save them for later so that we can get the order right.
#
# *************************************************************
# * RELAXED SURFACE SCAN STEP 12 *
# * *
# * Dihedral ( 11, 10, 3, 4) : 180.00000000 *
# *************************************************************
#
# Geometry optimization settings:
# Update method Update .... BFGS
# Choice of coordinates CoordSys .... Redundant Internals
# Initial Hessian InHess .... Almoef's Model
#
# Convergence Tolerances:
# Energy Change TolE .... 5.0000e-06 Eh
# Max. Gradient TolMAXG .... 3.0000e-04 Eh/bohr
# RMS Gradient TolRMSG .... 1.0000e-04 Eh/bohr
# Max. Displacement TolMAXD .... 4.0000e-03 bohr
# RMS Displacement TolRMSD .... 2.0000e-03 bohr
if line[25:50] == "RELAXED SURFACE SCAN STEP":
self.is_relaxed_scan = True
blank = next(inputfile)
info = next(inputfile)
stars = next(inputfile)
blank = next(inputfile)
line = next(inputfile)
while line[0:23] != "Convergence Tolerances:":
line = next(inputfile)
self.geotargets = []
self.geotargets_names = []
# There should always be five tolerance values printed here.
for i in range(5):
line = next(inputfile)
name = line[:25].strip().lower().replace('.', '').replace('displacement', 'step')
target = float(line.split()[-2])
self.geotargets_names.append(name)
self.geotargets.append(target)
# ------------------
# CARTESIAN GRADIENT
# ------------------
#
# 1 H : 0.000000004 0.019501450 -0.021537091
# 2 O : 0.000000054 -0.042431648 0.042431420
# 3 H : 0.000000004 0.021537179 -0.019501388
if line[:18] == 'CARTESIAN GRADIENT':
next(inputfile)
next(inputfile)
grads = []
line = next(inputfile).strip()
while line:
idx, atom, colon, x, y, z = line.split()
grads.append((float(x), float(y), float(z)))
line = next(inputfile).strip()
if not hasattr(self, 'grads'):
self.grads = []
self.grads.append(grads)
# After each geometry optimization step, ORCA prints the current convergence
# parameters and the targets (again), so it is a good idea to check that they
# have not changed. Note that the order of these criteria here are different
# than at the beginning of the output, so make use of the geotargets_names created
# before and save the new geovalues in correct order.
#
# ----------------------|Geometry convergence|---------------------
# Item value Tolerance Converged
# -----------------------------------------------------------------
# Energy change 0.00006021 0.00000500 NO
# RMS gradient 0.00031313 0.00010000 NO
# RMS step 0.01596159 0.00200000 NO
# MAX step 0.04324586 0.00400000 NO
# ....................................................
# Max(Bonds) 0.0218 Max(Angles) 2.48
# Max(Dihed) 0.00 Max(Improp) 0.00
# -----------------------------------------------------------------
#
if line[33:53] == "Geometry convergence":
headers = next(inputfile)
dashes = next(inputfile)
names = []
values = []
targets = []
line = next(inputfile)
# Handle both the dots only and dashes only cases
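            # (a separator line collapses to a single unique character, so the set test below
            # catches both the dotted and the dashed terminators)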
while len(list(set(line.strip()))) != 1:
name = line[10:28].strip().lower()
tokens = line.split()
value = float(tokens[2])
target = float(tokens[3])
names.append(name)
values.append(value)
targets.append(target)
line = next(inputfile)
# The energy change is normally not printed in the first iteration, because
# there was no previous energy -- in that case assume zero. There are also some
# edge cases where the energy change is not printed, for example when internal
# angles become improper and internal coordinates are rebuilt as in regression
# CuI-MePY2-CH3CN_optxes, and in such cases use NaN.
newvalues = []
for i, n in enumerate(self.geotargets_names):
if (n == "energy change") and (n not in names):
if self.is_relaxed_scan:
newvalues.append(0.0)
else:
newvalues.append(numpy.nan)
else:
newvalues.append(values[names.index(n)])
assert targets[names.index(n)] == self.geotargets[i]
self.append_attribute("geovalues", newvalues)
""" Grab cartesian coordinates
---------------------------------
CARTESIAN COORDINATES (ANGSTROEM)
---------------------------------
H 0.000000 0.000000 0.000000
O 0.000000 0.000000 1.000000
H 0.000000 1.000000 1.000000
"""
if line[0:33] == "CARTESIAN COORDINATES (ANGSTROEM)":
next(inputfile)
atomnos = []
atomcoords = []
line = next(inputfile)
while len(line) > 1:
atom, x, y, z = line.split()
if atom[-1] != ">":
atomnos.append(self.table.number[atom])
atomcoords.append([float(x), float(y), float(z)])
line = next(inputfile)
self.set_attribute('natom', len(atomnos))
self.set_attribute('atomnos', atomnos)
self.append_attribute("atomcoords", atomcoords)
""" Grab atom masses
----------------------------
CARTESIAN COORDINATES (A.U.)
----------------------------
NO LB ZA FRAG MASS X Y Z
0 H 1.0000 0 1.008 0.000000 0.000000 0.000000
1 O 8.0000 0 15.999 0.000000 0.000000 1.889726
2 H 1.0000 0 1.008 0.000000 1.889726 1.889726
"""
if line[0:28] == "CARTESIAN COORDINATES (A.U.)" and not hasattr(self, 'atommasses'):
next(inputfile)
next(inputfile)
line = next(inputfile)
self.atommasses = []
while len(line) > 1:
if line[:32] == '* core charge reduced due to ECP':
break
if line.strip() == "> coreless ECP center with (optional) point charge":
break
no, lb, za, frag, mass, x, y, z = line.split()
if lb[-1] != ">":
self.atommasses.append(float(mass))
line = next(inputfile)
if line[21:68] == "FINAL ENERGY EVALUATION AT THE STATIONARY POINT":
if not hasattr(self, 'optdone'):
self.optdone = []
self.optdone.append(len(self.atomcoords))
if "The optimization did not converge" in line:
if not hasattr(self, 'optdone'):
self.optdone = []
if line[0:16] == "ORBITAL ENERGIES":
self.skip_lines(inputfile, ['d', 'text', 'text'])
self.mooccnos = [[]]
self.moenergies = [[]]
line = next(inputfile)
while len(line) > 20: # restricted calcs are terminated by ------
info = line.split()
mooccno = int(float(info[1]))
moenergy = float(info[2])
self.mooccnos[0].append(mooccno)
self.moenergies[0].append(utils.convertor(moenergy, "hartree", "eV"))
line = next(inputfile)
line = next(inputfile)
# handle beta orbitals for UHF
if line[17:35] == "SPIN DOWN ORBITALS":
text = next(inputfile)
self.mooccnos.append([])
self.moenergies.append([])
line = next(inputfile)
while len(line) > 20: # actually terminated by ------
info = line.split()
mooccno = int(float(info[1]))
moenergy = float(info[2])
self.mooccnos[1].append(mooccno)
self.moenergies[1].append(utils.convertor(moenergy, "hartree", "eV"))
line = next(inputfile)
if not hasattr(self, 'homos'):
doubly_occupied = self.mooccnos[0].count(2)
singly_occupied = self.mooccnos[0].count(1)
# Restricted closed-shell.
if doubly_occupied > 0 and singly_occupied == 0:
self.set_attribute('homos', [doubly_occupied - 1])
# Restricted open-shell.
elif doubly_occupied > 0 and singly_occupied > 0:
self.set_attribute('homos', [doubly_occupied + singly_occupied - 1,
doubly_occupied - 1])
# Unrestricted.
else:
assert len(self.moenergies) == 2
assert doubly_occupied == 0
assert self.mooccnos[1].count(2) == 0
nbeta = self.mooccnos[1].count(1)
self.set_attribute('homos', [singly_occupied - 1, nbeta - 1])
# So nbasis was parsed at first with the first pattern, but it turns out that
# semiempirical methods (at least AM1 as reported by Julien Idé) do not use this.
        # For this reason, also check for the second pattern, and use it as an assert
# if nbasis was already parsed. Regression PCB_1_122.out covers this test case.
if line[1:32] == "# of contracted basis functions":
self.set_attribute('nbasis', int(line.split()[-1]))
if line[1:27] == "Basis Dimension Dim":
self.set_attribute('nbasis', int(line.split()[-1]))
if line[0:14] == "OVERLAP MATRIX":
self.skip_line(inputfile, 'dashes')
self.aooverlaps = numpy.zeros((self.nbasis, self.nbasis), "d")
for i in range(0, self.nbasis, 6):
self.updateprogress(inputfile, "Overlap")
header = next(inputfile)
size = len(header.split())
for j in range(self.nbasis):
line = next(inputfile)
broken = line.split()
self.aooverlaps[j, i:i+size] = list(map(float, broken[1:size+1]))
# Molecular orbital coefficients are parsed here, but also related things
        # like atombasis and aonames if possible.
#
# Normally the output is easy to parse like this:
# ------------------
# MOLECULAR ORBITALS
# ------------------
# 0 1 2 3 4 5
# -19.28527 -19.26828 -19.26356 -19.25801 -19.25765 -19.21471
# 2.00000 2.00000 2.00000 2.00000 2.00000 2.00000
# -------- -------- -------- -------- -------- --------
# 0C 1s 0.000002 -0.000001 0.000000 0.000000 -0.000000 0.000001
# 0C 2s -0.000007 0.000006 -0.000002 -0.000000 0.000001 -0.000003
# 0C 3s -0.000086 -0.000061 0.000058 -0.000033 -0.000027 -0.000058
# ...
#
# But when the numbers get big, things get yucky since ORCA does not use
# fixed width formatting for the floats, and does not insert extra spaces
# when the numbers get wider. So things get stuck together overflowing columns,
# like this:
# 12C 6s -11.608845-53.775398161.302640-76.633779 29.914985 22.083999
#
        # One assumption that seems to hold is that there are always six digits after
        # the decimal point in the coefficients, so we can try to use that to delineate numbers
# when the parsing gets rough. This is what we do below with a regex, and a case
# like this is tested in regression ORCA/ORCA4.0/invalid-literal-for-float.out
# which was reported in https://github.com/cclib/cclib/issues/629
if line[0:18] == "MOLECULAR ORBITALS":
self.skip_line(inputfile, 'dashes')
aonames = []
atombasis = [[] for i in range(self.natom)]
mocoeffs = [numpy.zeros((self.nbasis, self.nbasis), "d")]
for spin in range(len(self.moenergies)):
if spin == 1:
self.skip_line(inputfile, 'blank')
mocoeffs.append(numpy.zeros((self.nbasis, self.nbasis), "d"))
for i in range(0, self.nbasis, 6):
self.updateprogress(inputfile, "Coefficients")
self.skip_lines(inputfile, ['numbers', 'energies', 'occs'])
dashes = next(inputfile)
for j in range(self.nbasis):
line = next(inputfile)
# Only need this in the first iteration.
if spin == 0 and i == 0:
atomname = line[3:5].split()[0]
num = int(line[0:3])
orbital = line.split()[1].upper()
aonames.append("%s%i_%s" % (atomname, num+1, orbital))
atombasis[num].append(j)
                        # This regex will tease out all numbers with exactly
# six digits after the decimal point.
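                        # For example, re.findall(r'-?\d+\.\d{6}', '-11.608845-53.775398161.302640')
                        # should give ['-11.608845', '-53.775398', '161.302640'].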
                        coeffs = re.findall(r'-?\d+\.\d{6}', line)
# Something is very wrong if this does not hold.
assert len(coeffs) <= 6
mocoeffs[spin][i:i+len(coeffs), j] = [float(c) for c in coeffs]
self.set_attribute('aonames', aonames)
self.set_attribute('atombasis', atombasis)
self.set_attribute("mocoeffs", mocoeffs)
# Basis set information
# ORCA prints this out in a somewhat indirect fashion.
# Therefore, parsing occurs in several steps:
# 1. read which atom belongs to which basis set group
if line[0:21] == "BASIS SET INFORMATION":
line = next(inputfile)
line = next(inputfile)
self.tmp_atnames = [] # temporary attribute, needed later
while(not line[0:5] == '-----'):
if line[0:4] == "Atom":
self.tmp_atnames.append(line[8:12].strip())
line = next(inputfile)
# 2. Read information for the basis set groups
if line[0:25] == "BASIS SET IN INPUT FORMAT":
line = next(inputfile)
line = next(inputfile)
# loop over basis set groups
gbasis_tmp = {}
while(not line[0:5] == '-----'):
if line[1:7] == 'NewGTO':
bas_atname = line.split()[1]
gbasis_tmp[bas_atname] = []
line = next(inputfile)
# loop over contracted GTOs
while(not line[0:6] == ' end;'):
words = line.split()
ang = words[0]
nprim = int(words[1])
# loop over primitives
coeff = []
for iprim in range(nprim):
words = next(inputfile).split()
coeff.append( (float(words[1]), float(words[2])) )
gbasis_tmp[bas_atname].append((ang, coeff))
line = next(inputfile)
line = next(inputfile)
# 3. Assign the basis sets to gbasis
self.gbasis = []
for bas_atname in self.tmp_atnames:
self.gbasis.append(gbasis_tmp[bas_atname])
del self.tmp_atnames
""" Banner announcing Thermochemistry
--------------------------
THERMOCHEMISTRY AT 298.15K
--------------------------
"""
if 'THERMOCHEMISTRY AT' == line[:18]:
next(inputfile)
next(inputfile)
self.temperature = float(next(inputfile).split()[2])
self.pressure = float(next(inputfile).split()[2])
total_mass = float(next(inputfile).split()[3])
# Vibrations, rotations, and translations
line = next(inputfile)
while line[:17] != 'Electronic energy':
line = next(inputfile)
self.zpe = next(inputfile).split()[4]
thermal_vibrational_correction = float(next(inputfile).split()[4])
            thermal_rotational_correction = float(next(inputfile).split()[4])
thermal_translational_correction = float(next(inputfile).split()[4])
next(inputfile)
total_thermal_energy = float(next(inputfile).split()[3])
# Enthalpy
line = next(inputfile)
while line[:17] != 'Total free energy':
line = next(inputfile)
thermal_enthalpy_correction = float(next(inputfile).split()[4])
next(inputfile)
self.enthalpy = float(next(inputfile).split()[3])
# Entropy
line = next(inputfile)
while line[:18] != 'Electronic entropy':
line = next(inputfile)
electronic_entropy = float(line.split()[3])
vibrational_entropy = float(next(inputfile).split()[3])
rotational_entropy = float(next(inputfile).split()[3])
translational_entropy = float(next(inputfile).split()[3])
next(inputfile)
self.entropy = float(next(inputfile).split()[4])
line = next(inputfile)
while line[:25] != 'Final Gibbs free enthalpy':
line = next(inputfile)
self.freeenergy = float(line.split()[5])
# Read TDDFT information
if any(x in line for x in ("TD-DFT/TDA EXCITED", "TD-DFT EXCITED")):
# Could be singlets or triplets
if line.find("SINGLETS") >= 0:
sym = "Singlet"
elif line.find("TRIPLETS") >= 0:
sym = "Triplet"
else:
sym = "Not specified"
etsecs = []
etenergies = []
etsyms = []
lookup = {'a': 0, 'b': 1}
line = next(inputfile)
while line.find("STATE") < 0:
line = next(inputfile)
# Contains STATE or is blank
while line.find("STATE") >= 0:
broken = line.split()
etenergies.append(float(broken[-2]))
etsyms.append(sym)
line = next(inputfile)
sec = []
# Contains SEC or is blank
while line.strip():
start = line[0:8].strip()
start = (int(start[:-1]), lookup[start[-1]])
end = line[10:17].strip()
end = (int(end[:-1]), lookup[end[-1]])
                    # Coefficients are not printed for RPA, only
# TDA/CIS.
contrib = line[35:47].strip()
try:
contrib = float(contrib)
except ValueError:
contrib = numpy.nan
sec.append([start, end, contrib])
line = next(inputfile)
etsecs.append(sec)
line = next(inputfile)
self.extend_attribute('etenergies', etenergies)
self.extend_attribute('etsecs', etsecs)
self.extend_attribute('etsyms', etsyms)
# Parse the various absorption spectra for TDDFT and ROCIS.
if 'ABSORPTION SPECTRUM' in line or 'ELECTRIC DIPOLE' in line:
# CASSCF has an anomalous printing of ABSORPTION SPECTRUM.
if line[:-1] == 'ABSORPTION SPECTRUM':
return
line = line.strip()
# Standard header, occasionally changes
header = ['d', 'header', 'header', 'd']
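            # The default energy_intensity() defined below handles the standard TDDFT table;
            # it is redefined further down for each recognized spectrum variant.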
def energy_intensity(line):
""" TDDFT and related methods standard method of output
-----------------------------------------------------------------------------
ABSORPTION SPECTRUM VIA TRANSITION ELECTRIC DIPOLE MOMENTS
-----------------------------------------------------------------------------
State Energy Wavelength fosc T2 TX TY TZ
(cm-1) (nm) (au**2) (au) (au) (au)
-----------------------------------------------------------------------------
1 5184116.7 1.9 0.040578220 0.00258 -0.05076 -0.00000 -0.00000
"""
try:
state, energy, wavelength, intensity, t2, tx, ty, tz = line.split()
except ValueError as e:
# Must be spin forbidden and thus no intensity
energy = line.split()[1]
intensity = 0
return energy, intensity
# Check for variations
if line == 'COMBINED ELECTRIC DIPOLE + MAGNETIC DIPOLE + ELECTRIC QUADRUPOLE SPECTRUM' or \
line == 'COMBINED ELECTRIC DIPOLE + MAGNETIC DIPOLE + ELECTRIC QUADRUPOLE SPECTRUM (origin adjusted)':
def energy_intensity(line):
""" TDDFT with DoQuad == True
------------------------------------------------------------------------------------------------------
COMBINED ELECTRIC DIPOLE + MAGNETIC DIPOLE + ELECTRIC QUADRUPOLE SPECTRUM
------------------------------------------------------------------------------------------------------
State Energy Wavelength D2 m2 Q2 D2+m2+Q2 D2/TOT m2/TOT Q2/TOT
(cm-1) (nm) (*1e6) (*1e6)
------------------------------------------------------------------------------------------------------
1 61784150.6 0.2 0.00000 0.00000 3.23572 0.00000323571519 0.00000 0.00000 1.00000
"""
state, energy, wavelength, d2, m2, q2, intensity, d2_contrib, m2_contrib, q2_contrib = line.split()
return energy, intensity
elif line == 'COMBINED ELECTRIC DIPOLE + MAGNETIC DIPOLE + ELECTRIC QUADRUPOLE SPECTRUM (Origin Independent, Length Representation)':
def energy_intensity(line):
""" TDDFT with doQuad == True (Origin Independent Length Representation)
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
COMBINED ELECTRIC DIPOLE + MAGNETIC DIPOLE + ELECTRIC QUADRUPOLE SPECTRUM (Origin Independent, Length Representation)
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
State Energy Wavelength D2 m2 Q2 DM DO D2+m2+Q2+DM+DO D2/TOT m2/TOT Q2/TOT DM/TOT DO/TOT
(cm-1) (nm) (*1e6) (*1e6) (*1e6) (*1e6)
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 61784150.6 0.2 0.00000 0.00000 3.23572 0.00000 0.00000 0.00000323571519 0.00000 0.00000 1.00000 0.00000 0.00000
2 61793079.3 0.2 0.00000 0.00000 2.85949 0.00000 -0.00000 0.00000285948800 0.00000 0.00000 1.00000 0.00000 -0.00000
"""
vals = line.split()
if len(vals) < 14:
return vals[1], 0
return vals[1], vals[8]
elif line[:5] == 'X-RAY' and \
(line[6:23] == 'EMISSION SPECTRUM' or line[6:25] == 'ABSORPTION SPECTRUM'):
def energy_intensity(line):
""" X-Ray from XES (emission or absorption, electric or velocity dipole moments)
-------------------------------------------------------------------------------------
X-RAY ABSORPTION SPECTRUM VIA TRANSITION ELECTRIC DIPOLE MOMENTS
-------------------------------------------------------------------------------------
Transition Energy INT TX TY TZ
(eV) (normalized) (au) (au) (au)
-------------------------------------------------------------------------------------
1 90a -> 0a 8748.824 0.000002678629 0.00004 -0.00001 0.00003
"""
state, start, arrow, end, energy, intensity, tx, ty, tz = line.split()
return energy, intensity
elif line[:70] == 'COMBINED ELECTRIC DIPOLE + MAGNETIC DIPOLE + ELECTRIC QUADRUPOLE X-RAY':
header = ['header', 'd', 'header', 'd', 'header', 'header', 'd']
def energy_intensity(line):
""" XAS with quadrupole (origin adjusted)
-------------------------------------------------------------------------------------------------------------------------------
COMBINED ELECTRIC DIPOLE + MAGNETIC DIPOLE + ELECTRIC QUADRUPOLE X-RAY ABSORPTION SPECTRUM
(origin adjusted)
-------------------------------------------------------------------------------------------------------------------------------
INT (normalized)
---------------------------------------------------------
Transition Energy D2 M2 Q2 D2+M2+Q2 D2/TOT M2/TOT Q2/TOT
(eV) (*1e6) (*1e6)
-------------------------------------------------------------------------------------------------------------------------------
1 90a -> 0a 8748.824 0.000000 0.000292 0.003615 0.000000027512 0.858012 0.010602 0.131386
"""
state, start, arrow, end, energy, d2, m2, q2, intensity, d2_contrib, m2_contrib, q2_contrib = line.split()
return energy, intensity
elif line[:55] == 'SPIN ORBIT CORRECTED ABSORPTION SPECTRUM VIA TRANSITION':
def energy_intensity(line):
""" ROCIS dipole approximation with SOC == True (electric or velocity dipole moments)
-------------------------------------------------------------------------------
SPIN ORBIT CORRECTED ABSORPTION SPECTRUM VIA TRANSITION ELECTRIC DIPOLE MOMENTS
-------------------------------------------------------------------------------
States Energy Wavelength fosc T2 TX TY TZ
(cm-1) (nm) (au**2) (au) (au) (au)
-------------------------------------------------------------------------------
0 1 0.0 0.0 0.000000000 0.00000 0.00000 0.00000 0.00000
0 2 5184116.4 1.9 0.020288451 0.00258 0.05076 0.00003 0.00000
"""
state, state2, energy, wavelength, intensity, t2, tx, ty, tz = line.split()
return energy, intensity
elif line[:79] == 'ROCIS COMBINED ELECTRIC DIPOLE + MAGNETIC DIPOLE + ELECTRIC QUADRUPOLE SPECTRUM' \
or line[:87] == 'SOC CORRECTED COMBINED ELECTRIC DIPOLE + MAGNETIC DIPOLE + ELECTRIC QUADRUPOLE SPECTRUM':
def energy_intensity(line):
""" ROCIS with DoQuad = True and SOC = True (also does origin adjusted)
------------------------------------------------------------------------------------------------------
ROCIS COMBINED ELECTRIC DIPOLE + MAGNETIC DIPOLE + ELECTRIC QUADRUPOLE SPECTRUM
------------------------------------------------------------------------------------------------------
States Energy Wavelength D2 m2 Q2 D2+m2+Q2 D2/TOT m2/TOT Q2/TOT
(cm-1) (nm) (*1e6) (*1e6) (*population)
------------------------------------------------------------------------------------------------------
0 1 0.0 0.0 0.00000 0.00000 0.00000 0.00000000000000 0.00000 0.00000 0.00000
0 2 669388066.6 0.0 0.00000 0.00000 0.00876 0.00000000437784 0.00000 0.00000 1.00000
"""
state, state2, energy, wavelength, d2, m2, q2, intensity, d2_contrib, m2_contrib, q2_contrib = line.split()
return energy, intensity
# Clashes with Orca 2.6 (and presumably before) TDDFT absorption spectrum printing
elif line == 'ABSORPTION SPECTRUM' and float(self.metadata['package_version']) > 2.6:
def energy_intensity(line):
""" CASSCF absorption spectrum
------------------------------------------------------------------------------------------
ABSORPTION SPECTRUM
------------------------------------------------------------------------------------------
States Energy Wavelength fosc T2 TX TY TZ
(cm-1) (nm) (D**2) (D) (D) (D)
------------------------------------------------------------------------------------------
0( 0)-> 1( 0) 1 83163.2 120.2 0.088250385 2.25340 0.00000 0.00000 1.50113
"""
                    reg = r'(\d+)\( ?(\d+)\)-> ?(\d+)\( ?(\d+)\) (\d+)' + r'\s+(\d+\.\d+)' * 4 + r'\s+(-?\d+\.\d+)' * 3
res = re.search(reg, line)
jstate, jblock, istate, iblock, mult, energy, wavelength, intensity, t2, tx, ty, tz = res.groups()
return energy, intensity
name = line
self.skip_lines(inputfile, header)
if not hasattr(self, 'transprop'):
self.transprop = {}
etenergies = []
etoscs = []
line = next(inputfile)
# The sections are occasionally ended with dashed lines
# other times they are blank (other than a new line)
while len(line.strip('-')) > 2:
energy, intensity = energy_intensity(line)
etenergies.append(float(energy))
etoscs.append(float(intensity))
line = next(inputfile)
self.set_attribute('etenergies', etenergies)
self.set_attribute('etoscs', etoscs)
self.transprop[name] = (numpy.asarray(etenergies), numpy.asarray(etoscs))
if line.strip() == "CD SPECTRUM":
# -------------------------------------------------------------------
# CD SPECTRUM
# -------------------------------------------------------------------
# State Energy Wavelength R MX MY MZ
# (cm-1) (nm) (1e40*cgs) (au) (au) (au)
# -------------------------------------------------------------------
# 1 43167.6 231.7 0.00000 0.00000 -0.00000 0.00000
#
etenergies = []
etrotats = []
self.skip_lines(inputfile, ["d", "State Energy Wavelength", "(cm-1) (nm)", "d"])
line = next(inputfile)
while line.strip():
tokens = line.split()
if "spin forbidden" in line:
etrotat, mx, my, mz = 0.0, 0.0, 0.0, 0.0
else:
etrotat, mx, my, mz = [self.float(t) for t in tokens[3:]]
etenergies.append(self.float(tokens[1]))
etrotats.append(etrotat)
line = next(inputfile)
self.set_attribute("etrotats", etrotats)
if not hasattr(self, "etenergies"):
self.logger.warning("etenergies not parsed before ECD section, "
"the output file may be malformed")
self.set_attribute("etenergies", etenergies)
if line[:23] == "VIBRATIONAL FREQUENCIES":
self.skip_lines(inputfile, ['d', 'b'])
# Starting with 4.1, a scaling factor for frequencies is printed
if float(self.metadata["package_version"][:3]) > 4.0:
self.skip_lines(inputfile, ['Scaling factor for frequencies', 'b'])
vibfreqs = numpy.zeros(3 * self.natom)
for i, line in zip(range(3 * self.natom), inputfile):
vibfreqs[i] = float(line.split()[1])
nonzero = numpy.nonzero(vibfreqs)[0]
self.first_mode = nonzero[0]
# Take all modes after first
# Mode between imaginary and real modes could be 0
self.num_modes = 3*self.natom - self.first_mode
if self.num_modes > 3*self.natom - 6:
msg = "Modes corresponding to rotations/translations may be non-zero."
                if self.num_modes == 3*self.natom - 5:
                    msg += '\n You can ignore this if the molecule is linear.'
                self.logger.warning(msg)
self.set_attribute('vibfreqs', vibfreqs[self.first_mode:])
# NORMAL MODES
# ------------
#
# These modes are the cartesian displacements weighted by the diagonal matrix
# M(i,i)=1/sqrt(m[i]) where m[i] is the mass of the displaced atom
# Thus, these vectors are normalized but *not* orthogonal
#
# 0 1 2 3 4 5
# 0 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
# 1 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
# 2 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
# ...
if line[:12] == "NORMAL MODES":
all_vibdisps = numpy.zeros((3 * self.natom, self.natom, 3), "d")
self.skip_lines(inputfile, ['d', 'b', 'text', 'text', 'text', 'b'])
for mode in range(0, 3 * self.natom, 6):
header = next(inputfile)
for atom in range(self.natom):
all_vibdisps[mode:mode + 6, atom, 0] = next(inputfile).split()[1:]
all_vibdisps[mode:mode + 6, atom, 1] = next(inputfile).split()[1:]
all_vibdisps[mode:mode + 6, atom, 2] = next(inputfile).split()[1:]
self.set_attribute('vibdisps', all_vibdisps[self.first_mode:])
# -----------
# IR SPECTRUM
# -----------
#
# Mode freq (cm**-1) T**2 TX TY TZ
# -------------------------------------------------------------------
# 6: 2069.36 1.674341 ( -0.000000 0.914970 -0.914970)
# 7: 3978.81 76.856228 ( 0.000000 6.199041 -6.199042)
# 8: 4113.34 61.077784 ( -0.000000 5.526201 5.526200)
if line[:11] == "IR SPECTRUM":
self.skip_lines(inputfile, ['d', 'b', 'header', 'd'])
all_vibirs = numpy.zeros((3 * self.natom,), "d")
line = next(inputfile)
while len(line) > 2:
num = int(line[0:4])
all_vibirs[num] = float(line.split()[2])
line = next(inputfile)
self.set_attribute('vibirs', all_vibirs[self.first_mode:])
# --------------
# RAMAN SPECTRUM
# --------------
#
# Mode freq (cm**-1) Activity Depolarization
# -------------------------------------------------------------------
# 6: 296.23 5.291229 0.399982
# 7: 356.70 0.000000 0.749764
# 8: 368.27 0.000000 0.202068
if line[:14] == "RAMAN SPECTRUM":
self.skip_lines(inputfile, ['d', 'b', 'header', 'd'])
all_vibramans = numpy.zeros(3 * self.natom)
line = next(inputfile)
while len(line) > 2:
num = int(line[0:4])
all_vibramans[num] = float(line.split()[2])
line = next(inputfile)
self.set_attribute('vibramans', all_vibramans[self.first_mode:])
# ORCA will print atomic charges along with the spin populations,
# so care must be taken about choosing the proper column.
# Population analyses are performed usually only at the end
# of a geometry optimization or other run, so we want to
# leave just the final atom charges.
# Here is an example for Mulliken charges:
# --------------------------------------------
# MULLIKEN ATOMIC CHARGES AND SPIN POPULATIONS
# --------------------------------------------
# 0 H : 0.126447 0.002622
# 1 C : -0.613018 -0.029484
# 2 H : 0.189146 0.015452
# 3 H : 0.320041 0.037434
# ...
# Sum of atomic charges : -0.0000000
# Sum of atomic spin populations: 1.0000000
if line[:23] == "MULLIKEN ATOMIC CHARGES":
self.parse_charge_section(line, inputfile, 'mulliken')
# Things are the same for Lowdin populations, except that the sums
# are not printed (there is a blank line at the end).
if line[:22] == "LOEWDIN ATOMIC CHARGES":
self.parse_charge_section(line, inputfile, 'lowdin')
#CHELPG Charges
#--------------------------------
# 0 C : 0.363939
# 1 H : 0.025695
# ...
#--------------------------------
#Total charge: -0.000000
#--------------------------------
if line.startswith('CHELPG Charges'):
self.parse_charge_section(line, inputfile, 'chelpg')
        # It is not stated explicitly, but the dipole moment components printed by ORCA
        # seem to be in atomic units, so they will need to be converted. Also, they
        # are most probably calculated with respect to the origin.
#
# -------------
# DIPOLE MOMENT
# -------------
# X Y Z
# Electronic contribution: 0.00000 -0.00000 -0.00000
# Nuclear contribution : 0.00000 0.00000 0.00000
# -----------------------------------------
# Total Dipole Moment : 0.00000 -0.00000 -0.00000
# -----------------------------------------
# Magnitude (a.u.) : 0.00000
# Magnitude (Debye) : 0.00000
#
if line.strip() == "DIPOLE MOMENT":
self.skip_lines(inputfile, ['d', 'XYZ', 'electronic', 'nuclear', 'd'])
total = next(inputfile)
assert "Total Dipole Moment" in total
reference = [0.0, 0.0, 0.0]
dipole = numpy.array([float(d) for d in total.split()[-3:]])
dipole = utils.convertor(dipole, "ebohr", "Debye")
if not hasattr(self, 'moments'):
self.set_attribute('moments', [reference, dipole])
else:
try:
assert numpy.all(self.moments[1] == dipole)
except AssertionError:
self.logger.warning('Overwriting previous multipole moments with new values')
self.set_attribute('moments', [reference, dipole])
if "Molecular Dynamics Iteration" in line:
self.skip_lines(inputfile, ['d', 'ORCA MD', 'd', 'New Coordinates'])
line = next(inputfile)
tokens = line.split()
assert tokens[0] == "time"
time = utils.convertor(float(tokens[2]), "time_au", "fs")
self.append_attribute('time', time)
# Static polarizability.
if line.strip() == "THE POLARIZABILITY TENSOR":
if not hasattr(self, 'polarizabilities'):
self.polarizabilities = []
self.skip_lines(inputfile, ['d', 'b'])
line = next(inputfile)
assert line.strip() == "The raw cartesian tensor (atomic units):"
polarizability = []
for _ in range(3):
line = next(inputfile)
polarizability.append(line.split())
self.polarizabilities.append(numpy.array(polarizability))
if line.strip() == 'ORCA-CASSCF':
# -------------------------------------------------------------------------------
# ORCA-CASSCF
# -------------------------------------------------------------------------------
#
# Symmetry handling UseSym ... ON
# Point group ... C2
# Used point group ... C2
# Number of irreps ... 2
# Irrep A has 10 SALCs (ofs= 0) #(closed)= 0 #(active)= 2
# Irrep B has 10 SALCs (ofs= 10) #(closed)= 0 #(active)= 2
# Symmetries of active orbitals:
# MO = 0 IRREP= 0 (A)
# MO = 1 IRREP= 1 (B)
self.skip_lines(inputfile, ['d', 'b'])
vals = next(inputfile).split()
# Symmetry section is only printed if symmetry is used.
if vals[0] == 'Symmetry':
assert vals[-1] == 'ON'
point_group = next(inputfile).split()[-1]
used_point_group = next(inputfile).split()[-1]
num_irreps = int(next(inputfile).split()[-1])
num_active = 0
# Parse the irreps.
for i, line in zip(range(num_irreps), inputfile):
reg = r'Irrep\s+(\w+) has\s+(\d+) SALCs \(ofs=\s*(\d+)\) #\(closed\)=\s*(\d+) #\(active\)=\s*(\d+)'
groups = re.search(reg, line).groups()
irrep = groups[0]
salcs, ofs, closed, active = map(int, groups[1:])
num_active += active
self.skip_line(inputfile, 'Symmetries')
# Parse the symmetries of the active orbitals.
for i, line in zip(range(num_active), inputfile):
reg = r'(\d+) IRREP= (\d+) \((\w+)\)'
groups = re.search(reg, line).groups()
mo, irrep_idx, irrep = groups
# Skip until the system specific settings.
# This will align the cases of symmetry on and off.
line = next(inputfile)
while line[:25] != 'SYSTEM-SPECIFIC SETTINGS:':
line = next(inputfile)
# SYSTEM-SPECIFIC SETTINGS:
# Number of active electrons ... 4
# Number of active orbitals ... 4
# Total number of electrons ... 4
# Total number of orbitals ... 20
num_el = int(next(inputfile).split()[-1])
num_orbs = int(next(inputfile).split()[-1])
total_el = int(next(inputfile).split()[-1])
total_orbs = int(next(inputfile).split()[-1])
# Determined orbital ranges:
# Internal 0 - -1 ( 0 orbitals)
# Active 0 - 3 ( 4 orbitals)
# External 4 - 19 ( 16 orbitals)
self.skip_lines(inputfile, ['b', 'Determined'])
orbital_ranges = []
# Parse the orbital ranges for: Internal, Active, and External orbitals.
for i in range(3):
vals = next(inputfile).split()
start, end, num = int(vals[1]), int(vals[3]), int(vals[5])
# Change from inclusive to exclusive in order to match python.
end = end + 1
assert end - start == num
orbital_ranges.append((start, end, num))
line = next(inputfile)
while line[:8] != 'CI-STEP:':
line = next(inputfile)
# CI-STEP:
# CI strategy ... General CI
# Number of symmetry/multplity blocks ... 1
# BLOCK 1 WEIGHT= 1.0000
# Multiplicity ... 1
# Irrep ... 0 (A)
# #(Configurations) ... 11
# #(CSFs) ... 12
# #(Roots) ... 1
# ROOT=0 WEIGHT= 1.000000
self.skip_line(inputfile, 'CI strategy')
num_blocks = int(next(inputfile).split()[-1])
for b in range(1, num_blocks + 1):
vals = next(inputfile).split()
block = int(vals[1])
weight = float(vals[3])
assert b == block
mult = int(next(inputfile).split()[-1])
vals = next(inputfile).split()
# The irrep will only be printed if using symmetry.
if vals[0] == 'Irrep':
irrep_idx = int(vals[-2])
irrep = vals[-1].strip('()')
vals = next(inputfile).split()
num_confs = int(vals[-1])
num_csfs = int(next(inputfile).split()[-1])
num_roots = int(next(inputfile).split()[-1])
# Parse the roots.
for r, line in zip(range(num_roots), inputfile):
reg = r'=(\d+) WEIGHT=\s*(\d\.\d+)'
groups = re.search(reg, line).groups()
root = int(groups[0])
weight = float(groups[1])
assert r == root
# Skip additional setup printing and CASSCF iterations.
line = next(inputfile).strip()
while line != 'CASSCF RESULTS':
line = next(inputfile).strip()
# --------------
# CASSCF RESULTS
# --------------
#
# Final CASSCF energy : -14.597120777 Eh -397.2078 eV
self.skip_lines(inputfile, ['d', 'b'])
casscf_energy = float(next(inputfile).split()[4])
# This is only printed for first and last step of geometry optimization.
# ----------------
# ORBITAL ENERGIES
# ----------------
#
# NO OCC E(Eh) E(eV) Irrep
# 0 0.0868 0.257841 7.0162 1-A
self.skip_lines(inputfile, ['b', 'd'])
if next(inputfile).strip() == 'ORBITAL ENERGIES':
self.skip_lines(inputfile, ['d', 'b', 'NO'])
orbitals = []
vals = next(inputfile).split()
while vals:
occ, eh, ev = map(float, vals[1:4])
# The irrep will only be printed if using symmetry.
if len(vals) == 5:
idx, irrep = vals[4].split('-')
orbitals.append((occ, ev, int(idx), irrep))
else:
orbitals.append((occ, ev))
vals = next(inputfile).split()
self.skip_lines(inputfile, ['b', 'd'])
# Orbital Compositions
# ---------------------------------------------
# CAS-SCF STATES FOR BLOCK 1 MULT= 1 IRREP= Ag NROOTS= 2
# ---------------------------------------------
#
# ROOT 0: E= -14.5950507665 Eh
# 0.89724 [ 0]: 2000
for b in range(num_blocks):
# Parse the block data.
reg = r'BLOCK\s+(\d+) MULT=\s*(\d+) (IRREP=\s*\w+ )?(NROOTS=\s*(\d+))?'
groups = re.search(reg, next(inputfile)).groups()
block = int(groups[0])
mult = int(groups[1])
# The irrep will only be printed if using symmetry.
if groups[2] is not None:
irrep = groups[2].split('=')[1].strip()
nroots = int(groups[3].split('=')[1])
self.skip_lines(inputfile, ['d', 'b'])
line = next(inputfile).strip()
while line:
if line[:4] == 'ROOT':
# Parse the root section.
reg = r'(\d+):\s*E=\s*(-?\d+.\d+) Eh(\s+\d+\.\d+ eV)?(\s+\d+\.\d+)?'
groups = re.search(reg, line).groups()
root = int(groups[0])
energy = float(groups[1])
# Excitation energies are only printed for excited state roots.
if groups[2] is not None:
excitation_energy_ev = float(groups[2].split()[0])
excitation_energy_cm = float(groups[3])
else:
# Parse the occupations section.
reg = r'(\d+\.\d+) \[\s*(\d+)\]: (\d+)'
groups = re.search(reg, line).groups()
coeff = float(groups[0])
number = float(groups[1])
occupations = list(map(int, groups[2]))
line = next(inputfile).strip()
# Skip any extended wavefunction printing.
while line != 'DENSITY MATRIX':
line = next(inputfile).strip()
self.skip_lines(inputfile, ['d', 'b'])
# --------------
# DENSITY MATRIX
# --------------
#
# 0 1 2 3
# 0 0.897244 0.000000 0.000000 0.000000
# 1 0.000000 0.533964 0.000000 0.000000
density = numpy.zeros((num_orbs, num_orbs))
for i in range(0, num_orbs, 6):
next(inputfile)
for j, line in zip(range(num_orbs), inputfile):
density[j][i:i + 6] = list(map(float, line.split()[1:]))
self.skip_lines(inputfile, ['Trace', 'b', 'd'])
# This is only printed for open-shells.
# -------------------
# SPIN-DENSITY MATRIX
# -------------------
#
# 0 1 2 3 4 5
# 0 -0.003709 0.001410 0.000074 -0.000564 -0.007978 0.000735
# 1 0.001410 -0.001750 -0.000544 -0.003815 0.008462 -0.004529
if next(inputfile).strip() == 'SPIN-DENSITY MATRIX':
self.skip_lines(inputfile, ['d', 'b'])
spin_density = numpy.zeros((num_orbs, num_orbs))
for i in range(0, num_orbs, 6):
next(inputfile)
for j, line in zip(range(num_orbs), inputfile):
spin_density[j][i:i + 6] = list(map(float, line.split()[1:]))
self.skip_lines(inputfile, ['Trace', 'b', 'd', 'ENERGY'])
self.skip_lines(inputfile, ['d', 'b'])
# -----------------
# ENERGY COMPONENTS
# -----------------
#
# One electron energy : -18.811767801 Eh -511.8942 eV
# Two electron energy : 4.367616074 Eh 118.8489 eV
# Nuclear repuslion energy : 0.000000000 Eh 0.0000 eV
# ----------------
# -14.444151727
#
# Kinetic energy : 14.371970266 Eh 391.0812 eV
# Potential energy : -28.816121993 Eh -784.1265 eV
# Virial ratio : -2.005022378
# ----------------
# -14.444151727
#
# Core energy : -13.604678408 Eh -370.2021 eV
one_el_energy = float(next(inputfile).split()[4])
two_el_energy = float(next(inputfile).split()[4])
nuclear_repulsion_energy = float(next(inputfile).split()[4])
self.skip_line(inputfile, 'dashes')
energy = float(next(inputfile).strip())
self.skip_line(inputfile, 'blank')
kinetic_energy = float(next(inputfile).split()[3])
potential_energy = float(next(inputfile).split()[3])
virial_ratio = float(next(inputfile).split()[3])
self.skip_line(inputfile, 'dashes')
energy = float(next(inputfile).strip())
self.skip_line(inputfile, 'blank')
core_energy = float(next(inputfile).split()[3])
if line[:15] == 'TOTAL RUN TIME:':
self.metadata['success'] = True
def parse_charge_section(self, line, inputfile, chargestype):
"""Parse a charge section, modifies class in place
Parameters
----------
line : str
the line which triggered entry here
inputfile : file
handle to file object
chargestype : str
what type of charge we're dealing with, must be one of
'mulliken', 'lowdin' or 'chelpg'
"""
has_spins = 'AND SPIN POPULATIONS' in line
if not hasattr(self, "atomcharges"):
self.atomcharges = {}
if has_spins and not hasattr(self, "atomspins"):
self.atomspins = {}
self.skip_line(inputfile, 'dashes')
# depending on chargestype, decide when to stop parsing lines
# start, stop - indices for slicing lines and grabbing values
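        # e.g. for Mulliken/Loewdin output the charge occupies columns 8:20 and any spin
        # population follows from column 20 on, while CHELPG charges sit in columns 11:26.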
if chargestype == 'mulliken':
should_stop = lambda x: x.startswith('Sum of atomic charges')
start, stop = 8, 20
elif chargestype == 'lowdin':
# stops when blank line encountered
should_stop = lambda x: not bool(x.strip())
start, stop = 8, 20
elif chargestype == 'chelpg':
should_stop = lambda x: x.startswith('---')
start, stop = 11, 26
charges = []
if has_spins:
spins = []
line = next(inputfile)
while not should_stop(line):
# Don't add point charges or embedding potentials.
if "Q :" not in line:
charges.append(float(line[start:stop]))
if has_spins:
spins.append(float(line[stop:]))
line = next(inputfile)
self.atomcharges[chargestype] = charges
if has_spins:
self.atomspins[chargestype] = spins
def parse_scf_condensed_format(self, inputfile, line):
""" Parse the SCF convergence information in condensed format """
# This is what it looks like
# ITER Energy Delta-E Max-DP RMS-DP [F,P] Damp
# *** Starting incremental Fock matrix formation ***
# 0 -384.5203638934 0.000000000000 0.03375012 0.00223249 0.1351565 0.7000
# 1 -384.5792776162 -0.058913722842 0.02841696 0.00175952 0.0734529 0.7000
# ***Turning on DIIS***
# 2 -384.6074211837 -0.028143567475 0.04968025 0.00326114 0.0310435 0.0000
# 3 -384.6479682063 -0.040547022616 0.02097477 0.00121132 0.0361982 0.0000
# 4 -384.6571124353 -0.009144228947 0.00576471 0.00035160 0.0061205 0.0000
# 5 -384.6574659959 -0.000353560584 0.00191156 0.00010160 0.0025838 0.0000
# 6 -384.6574990782 -0.000033082375 0.00052492 0.00003800 0.0002061 0.0000
# 7 -384.6575005762 -0.000001497987 0.00020257 0.00001146 0.0001652 0.0000
# 8 -384.6575007321 -0.000000155848 0.00008572 0.00000435 0.0000745 0.0000
# **** Energy Check signals convergence ****
assert line[2] == "Delta-E"
assert line[3] == "Max-DP"
if not hasattr(self, "scfvalues"):
self.scfvalues = []
self.scfvalues.append([])
# Try to keep track of the converger (NR, DIIS, SOSCF, etc.).
diis_active = True
while line:
maxDP = None
if 'Newton-Raphson' in line:
diis_active = False
elif 'SOSCF' in line:
diis_active = False
elif line[0].isdigit():
try:
energy = float(line[1])
deltaE = float(line[2])
maxDP = float(line[3 + int(not diis_active)])
rmsDP = float(line[4 + int(not diis_active)])
except ValueError as e:
                    # Someone in ORCA forgot to properly add spaces in the SCF printing
                    # code; the format looks like:
# %3i %17.10f%12.12f%11.8f %11.8f
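                    # As a (hypothetical) illustration: a token such as
                    # '-384.5203638934-0.0589137228' contains two decimal points,
                    # so it is split on '.' and reassembled into the energy
                    # (-384.5203638934) and the energy change (-0.0589137228).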
if line[1].count('.') == 2:
integer1, decimal1_integer2, decimal2 = line[1].split('.')
decimal1, integer2 = decimal1_integer2[:10], decimal1_integer2[10:]
energy = float(integer1 + '.' + decimal1)
deltaE = float(integer2 + '.' + decimal2)
maxDP = float(line[2 + int(not diis_active)])
rmsDP = float(line[3 + int(not diis_active)])
elif line[1].count('.') == 3:
integer1, decimal1_integer2, decimal2_integer3, decimal3 = line[1].split('.')
decimal1, integer2 = decimal1_integer2[:10], decimal1_integer2[10:]
decimal2, integer3 = decimal2_integer3[:12], decimal2_integer3[12:]
energy = float(integer1 + '.' + decimal1)
deltaE = float(integer2 + '.' + decimal2)
maxDP = float(integer3 + '.' + decimal3)
rmsDP = float(line[2 + int(not diis_active)])
elif line[2].count('.') == 2:
integer1, decimal1_integer2, decimal2 = line[2].split('.')
decimal1, integer2 = decimal1_integer2[:12], decimal1_integer2[12:]
deltaE = float(integer1 + '.' + decimal1)
maxDP = float(integer2 + '.' + decimal2)
rmsDP = float(line[3 + int(not diis_active)])
else:
raise e
self.scfvalues[-1].append([deltaE, maxDP, rmsDP])
try:
line = next(inputfile).split()
except StopIteration:
self.logger.warning('File terminated before end of last SCF! Last Max-DP: {}'.format(maxDP))
break
def parse_scf_expanded_format(self, inputfile, line):
""" Parse SCF convergence when in expanded format. """
# The following is an example of the format
# -----------------------------------------
#
# *** Starting incremental Fock matrix formation ***
#
# ----------------------------
# ! ITERATION 0 !
# ----------------------------
# Total Energy : -377.960836651297 Eh
# Energy Change : -377.960836651297 Eh
# MAX-DP : 0.100175793695
# RMS-DP : 0.004437973661
# Actual Damping : 0.7000
# Actual Level Shift : 0.2500 Eh
# Int. Num. El. : 43.99982197 (UP= 21.99991099 DN= 21.99991099)
# Exchange : -34.27550826
# Correlation : -2.02540957
#
#
# ----------------------------
# ! ITERATION 1 !
# ----------------------------
# Total Energy : -378.118458080109 Eh
# Energy Change : -0.157621428812 Eh
# MAX-DP : 0.053240648588
# RMS-DP : 0.002375092508
# Actual Damping : 0.7000
# Actual Level Shift : 0.2500 Eh
# Int. Num. El. : 43.99994143 (UP= 21.99997071 DN= 21.99997071)
# Exchange : -34.00291075
# Correlation : -2.01607243
#
# ***Turning on DIIS***
#
# ----------------------------
# ! ITERATION 2 !
# ----------------------------
# ....
#
if not hasattr(self, "scfvalues"):
self.scfvalues = []
self.scfvalues.append([])
line = "Foo" # dummy argument to enter loop
while line.find("******") < 0:
try:
line = next(inputfile)
except StopIteration:
self.logger.warning('File terminated before end of last SCF!')
break
info = line.split()
if len(info) > 1 and info[1] == "ITERATION":
dashes = next(inputfile)
energy_line = next(inputfile).split()
energy = float(energy_line[3])
deltaE_line = next(inputfile).split()
deltaE = float(deltaE_line[3])
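                # On the first iteration the reported energy change equals the
                # total energy (see ITERATION 0 above), so treat it as zero.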
if energy == deltaE:
deltaE = 0
maxDP_line = next(inputfile).split()
maxDP = float(maxDP_line[2])
rmsDP_line = next(inputfile).split()
rmsDP = float(rmsDP_line[2])
self.scfvalues[-1].append([deltaE, maxDP, rmsDP])
return
# end of parse_scf_expanded_format
def _append_scfvalues_scftargets(self, inputfile, line):
# The SCF convergence targets are always printed after this, but apparently
        # not all of them are always present -- for example, the RMS Density is missing for geometry
# optimization steps. So, assume the previous value is still valid if it is
# not found. For additional certainty, assert that the other targets are unchanged.
while not "Last Energy change" in line:
line = next(inputfile)
deltaE_value = float(line.split()[4])
deltaE_target = float(line.split()[7])
line = next(inputfile)
if "Last MAX-Density change" in line:
maxDP_value = float(line.split()[4])
maxDP_target = float(line.split()[7])
line = next(inputfile)
if "Last RMS-Density change" in line:
rmsDP_value = float(line.split()[4])
rmsDP_target = float(line.split()[7])
else:
rmsDP_value = self.scfvalues[-1][-1][2]
rmsDP_target = self.scftargets[-1][2]
assert deltaE_target == self.scftargets[-1][0]
assert maxDP_target == self.scftargets[-1][1]
self.scfvalues[-1].append([deltaE_value, maxDP_value, rmsDP_value])
self.scftargets.append([deltaE_target, maxDP_target, rmsDP_target])
cclib-1.6.2/cclib/parser/psi3parser.py 0000664 0000000 0000000 00000032473 13535330462 0017626 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Parser for Psi3 output files."""
import numpy
from cclib.parser import logfileparser
from cclib.parser import utils
class Psi3(logfileparser.Logfile):
"""A Psi3 log file."""
def __init__(self, *args, **kwargs):
# Call the __init__ method of the superclass
super(Psi3, self).__init__(logname="Psi3", *args, **kwargs)
def __str__(self):
"""Return a string representation of the object."""
return "Psi3 log file %s" % (self.filename)
def __repr__(self):
"""Return a representation of the object."""
return 'Psi3("%s")' % (self.filename)
def normalisesym(self, label):
"""Psi3 does not require normalizing symmetry labels."""
return label
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
if "Version" in line:
self.metadata["package_version"] = ' '.join(line.split()[1:])
        # Psi3 prints the coordinates in several configurations, and we will parse
        # the canonical coordinate system in Angstroms as the first coordinate set,
        # although it is actually somewhere later in the file, after the basis set, etc.
# We can also get or verify the number of atoms and atomic numbers from this block.
if line.strip() == "-Geometry in the canonical coordinate system (Angstrom):":
self.skip_lines(inputfile, ['header', 'd'])
coords = []
numbers = []
line = next(inputfile)
while line.strip():
tokens = line.split()
element = tokens[0]
numbers.append(self.table.number[element])
x = float(tokens[1])
y = float(tokens[2])
z = float(tokens[3])
coords.append([x, y, z])
line = next(inputfile)
self.set_attribute('natom', len(coords))
self.set_attribute('atomnos', numbers)
if not hasattr(self, 'atomcoords'):
self.atomcoords = []
self.atomcoords.append(coords)
if line.strip() == '-SYMMETRY INFORMATION:':
line = next(inputfile)
while line.strip():
if "Number of atoms" in line:
self.set_attribute('natom', int(line.split()[-1]))
line = next(inputfile)
if line.strip() == "-BASIS SET INFORMATION:":
line = next(inputfile)
while line.strip():
if "Number of SO" in line:
self.set_attribute('nbasis', int(line.split()[-1]))
line = next(inputfile)
# In Psi3, the section with the contraction scheme can be used to infer atombasis.
if line.strip() == "-Contraction Scheme:":
self.skip_lines(inputfile, ['header', 'd'])
indices = []
line = next(inputfile)
while line.strip():
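                # Convert a shell string such as '2s 1p' into the arithmetic
                # expression '2*1+1*3' (s, p and d shells contribute 1, 3 and 6
                # Cartesian functions, respectively) and evaluate it to count
                # the basis functions on this atom.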
shells = line.split('//')[-1]
expression = shells.strip().replace(' ', '+')
expression = expression.replace('s', '*1')
expression = expression.replace('p', '*3')
expression = expression.replace('d', '*6')
nfuncs = eval(expression)
if len(indices) == 0:
indices.append(range(nfuncs))
else:
start = indices[-1][-1] + 1
indices.append(range(start, start+nfuncs))
line = next(inputfile)
self.set_attribute('atombasis', indices)
if line.strip() == "CINTS: An integrals program written in C":
self.skip_lines(inputfile, ['authors', 'd', 'b', 'b'])
line = next(inputfile)
assert line.strip() == "-OPTIONS:"
while line.strip():
line = next(inputfile)
line = next(inputfile)
assert line.strip() == "-CALCULATION CONSTANTS:"
while line.strip():
if "Number of atoms" in line:
natom = int(line.split()[-1])
self.set_attribute('natom', natom)
if "Number of symmetry orbitals" in line:
nbasis = int(line.split()[-1])
self.set_attribute('nbasis', nbasis)
line = next(inputfile)
if line.strip() == "CSCF3.0: An SCF program written in C":
self.skip_lines(inputfile, ['b', 'authors', 'b', 'd', 'b',
'mult', 'mult_comment', 'b'])
line = next(inputfile)
while line.strip():
if line.split()[0] == "multiplicity":
mult = int(line.split()[-1])
self.set_attribute('mult', mult)
if line.split()[0] == "charge":
charge = int(line.split()[-1])
self.set_attribute('charge', charge)
if line.split()[0] == "convergence":
conv = float(line.split()[-1])
if line.split()[0] == "reference":
self.reference = line.split()[-1]
line = next(inputfile)
if not hasattr(self, 'scftargets'):
self.scftargets = []
self.scftargets.append([conv])
# ==> Iterations <==
        # Psi3 converges just the density elements, although in the iterations it also
        # reports the changes in the energy as well as the DIIS error.
psi3_iterations_header = "iter total energy delta E delta P diiser"
if line.strip() == psi3_iterations_header:
if not hasattr(self, 'scfvalues'):
self.scfvalues = []
self.scfvalues.append([])
line = next(inputfile)
while line.strip():
ddensity = float(line.split()[-2])
self.scfvalues[-1].append([ddensity])
line = next(inputfile)
# This section, from which we parse molecular orbital symmetries and
# orbital energies, is quite similar for both Psi3 and Psi4, and in fact
        # the format for orbitals is the same, although the headers and spacers
# are a bit different. Let's try to get both parsed with one code block.
#
        # Here is what the block looks like for Psi4:
#
# Orbital Energies (a.u.)
# -----------------------
#
# Doubly Occupied:
#
# 1Bu -11.040586 1Ag -11.040524 2Bu -11.031589
# 2Ag -11.031589 3Bu -11.028950 3Ag -11.028820
# (...)
# 15Ag -0.415620 1Bg -0.376962 2Au -0.315126
# 2Bg -0.278361 3Bg -0.222189
#
# Virtual:
#
# 3Au 0.198995 4Au 0.268517 4Bg 0.308826
# 5Au 0.397078 5Bg 0.521759 16Ag 0.565017
# (...)
# 24Ag 0.990287 24Bu 1.027266 25Ag 1.107702
# 25Bu 1.124938
#
# The case is different in the trigger string.
if "orbital energies (a.u.)" in line.lower():
self.moenergies = [[]]
self.mosyms = [[]]
self.skip_line(inputfile, 'blank')
occupied = next(inputfile)
if self.reference[0:2] == 'RO' or self.reference[0:1] == 'R':
assert 'doubly occupied' in occupied.lower()
elif self.reference[0:1] == 'U':
assert 'alpha occupied' in occupied.lower()
# Parse the occupied MO symmetries and energies.
self._parse_mosyms_moenergies(inputfile, 0)
# The last orbital energy here represents the HOMO.
self.homos = [len(self.moenergies[0])-1]
# For a restricted open-shell calculation, this is the
# beta HOMO, and we assume the singly-occupied orbitals
# are all alpha, which are handled next.
if self.reference[0:2] == 'RO':
self.homos.append(self.homos[0])
self.skip_line(inputfile, 'blank')
unoccupied = next(inputfile)
if self.reference[0:2] == 'RO':
assert unoccupied.strip() == 'Singly Occupied:'
elif self.reference[0:1] == 'R':
assert unoccupied.strip() == 'Unoccupied orbitals'
elif self.reference[0:1] == 'U':
assert unoccupied.strip() == 'Alpha Virtual:'
# Parse the unoccupied MO symmetries and energies.
self._parse_mosyms_moenergies(inputfile, 0)
# Here is where we handle the Beta or Singly occupied orbitals.
if self.reference[0:1] == 'U':
self.mosyms.append([])
self.moenergies.append([])
line = next(inputfile)
assert line.strip() == 'Beta Occupied:'
self.skip_line(inputfile, 'blank')
self._parse_mosyms_moenergies(inputfile, 1)
self.homos.append(len(self.moenergies[1])-1)
line = next(inputfile)
assert line.strip() == 'Beta Virtual:'
self.skip_line(inputfile, 'blank')
self._parse_mosyms_moenergies(inputfile, 1)
elif self.reference[0:2] == 'RO':
line = next(inputfile)
assert line.strip() == 'Virtual:'
self.skip_line(inputfile, 'blank')
self._parse_mosyms_moenergies(inputfile, 0)
# Both Psi3 and Psi4 print the final SCF energy right after
# the orbital energies, but the label is different. Psi4 also
# does DFT, and the label is also different in that case.
if "* SCF total energy" in line:
e = float(line.split()[-1])
if not hasattr(self, 'scfenergies'):
self.scfenergies = []
self.scfenergies.append(utils.convertor(e, 'hartree', 'eV'))
# We can also get some higher moments in Psi3, although here the dipole is not printed
# separately and the order is not lexicographical. However, the numbers seem
# kind of strange -- the quadrupole seems to be traceless, although I'm not sure
# whether the standard transformation has been used. So, until we know what kind
# of moment these are and how to make them raw again, we will only parse the dipole.
#
# --------------------------------------------------------------
# *** Electric multipole moments ***
# --------------------------------------------------------------
#
# CAUTION : The system has non-vanishing dipole moment, therefore
# quadrupole and higher moments depend on the reference point.
#
# -Coordinates of the reference point (a.u.) :
# x y z
# -------------------- -------------------- --------------------
# 0.0000000000 0.0000000000 0.0000000000
#
# -Electric dipole moment (expectation values) :
#
# mu(X) = -0.00000 D = -1.26132433e-43 C*m = -0.00000000 a.u.
# mu(Y) = 0.00000 D = 3.97987832e-44 C*m = 0.00000000 a.u.
# mu(Z) = 0.00000 D = 0.00000000e+00 C*m = 0.00000000 a.u.
# |mu| = 0.00000 D = 1.32262368e-43 C*m = 0.00000000 a.u.
#
# -Components of electric quadrupole moment (expectation values) (a.u.) :
#
# Q(XX) = 10.62340220 Q(YY) = 1.11816843 Q(ZZ) = -11.74157063
# Q(XY) = 3.64633112 Q(XZ) = 0.00000000 Q(YZ) = 0.00000000
#
if line.strip() == "*** Electric multipole moments ***":
self.skip_lines(inputfile, ['d', 'b', 'caution1', 'caution2', 'b'])
coordinates = next(inputfile)
assert coordinates.split()[-2] == "(a.u.)"
self.skip_lines(inputfile, ['xyz', 'd'])
line = next(inputfile)
self.origin = numpy.array([float(x) for x in line.split()])
self.origin = utils.convertor(self.origin, 'bohr', 'Angstrom')
self.skip_line(inputfile, "blank")
line = next(inputfile)
assert "Electric dipole moment" in line
self.skip_line(inputfile, "blank")
# Make sure to use the column that has the value in Debyes.
dipole = []
for i in range(3):
line = next(inputfile)
dipole.append(float(line.split()[2]))
if not hasattr(self, 'moments'):
self.moments = [self.origin, dipole]
else:
assert self.moments[1] == dipole
def _parse_mosyms_moenergies(self, inputfile, spinidx):
"""Parse molecular orbital symmetries and energies from the
'Post-Iterations' section.
"""
line = next(inputfile)
while line.strip():
for i in range(len(line.split()) // 2):
self.mosyms[spinidx].append(line.split()[i*2][-2:])
moenergy = utils.convertor(float(line.split()[i*2+1]), "hartree", "eV")
self.moenergies[spinidx].append(moenergy)
line = next(inputfile)
return
cclib-1.6.2/cclib/parser/psi4parser.py 0000664 0000000 0000000 00000140756 13535330462 0017633 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Parser for Psi4 output files."""
from collections import namedtuple
import numpy
from cclib.parser import data
from cclib.parser import logfileparser
from cclib.parser import utils
class Psi4(logfileparser.Logfile):
"""A Psi4 log file."""
def __init__(self, *args, **kwargs):
# Call the __init__ method of the superclass
super(Psi4, self).__init__(logname="Psi4", *args, **kwargs)
def __str__(self):
"""Return a string representation of the object."""
return "Psi4 log file %s" % (self.filename)
def __repr__(self):
"""Return a representation of the object."""
return 'Psi4("%s")' % (self.filename)
def before_parsing(self):
# Early beta versions of Psi4 normalize basis function
# coefficients when printing.
self.version_4_beta = False
# This is just used to track which part of the output we are in for Psi4,
# with changes triggered by ==> things like this <== (Psi3 does not have this)
self.section = None
def after_parsing(self):
# Newer versions of Psi4 don't explicitly print the number of atoms.
if not hasattr(self, 'natom'):
if hasattr(self, 'atomnos'):
self.set_attribute('natom', len(self.atomnos))
def normalisesym(self, label):
"""Psi4 does not require normalizing symmetry labels."""
return label
# Match the number of skipped lines required based on the type of
# gradient present (determined from the header), as otherwise the
# parsing is identical.
GradientInfo = namedtuple('GradientInfo', ['gradient_type', 'header', 'skip_lines'])
GRADIENT_TYPES = {
'analytic': GradientInfo('analytic',
'-Total Gradient:',
['header', 'dash header']),
'numerical': GradientInfo('numerical',
'## F-D gradient (Symmetry 0) ##',
['Irrep num and total size', 'b', '123', 'b']),
}
GRADIENT_HEADERS = set([gradient_type.header
for gradient_type in GRADIENT_TYPES.values()])
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
# Extract the version number and the version control
# information, if it exists.
if "Driver" in line:
tokens = line.split()
package_version = tokens[1].split("-")[-1]
self.metadata["package_version"] = package_version
# Keep track of early versions of Psi4.
if "beta" in package_version:
self.version_4_beta = True
# Don't add revision information to the main package version for now.
if "Git:" in line:
tokens = line.split()
revision = '-'.join(tokens[2:])
# This will automatically change the section attribute for Psi4, when encountering
# a line that <== looks like this ==>, to whatever is in between.
if (line.strip()[:3] == "==>") and (line.strip()[-3:] == "<=="):
self.section = line.strip()[4:-4]
if self.section == "DFT Potential":
self.metadata["methods"].append("DFT")
# Determine whether or not the reference wavefunction is
# restricted, unrestricted, or restricted open-shell.
if line.strip() == "SCF":
self.skip_line(inputfile, 'author list')
line = next(inputfile)
self.reference = line.split()[0]
# Work with a complex reference as if it's real.
if self.reference[0] == 'C':
self.reference = self.reference[1:]
# Parse the XC density functional
# => Composite Functional: B3LYP <=
if self.section == "DFT Potential" and "composite functional" in line.lower():
chomp = line.split()
functional = chomp[-2]
self.metadata["functional"] = functional
# ==> Geometry <==
#
# Molecular point group: c2h
# Full point group: C2h
#
# Geometry (in Angstrom), charge = 0, multiplicity = 1:
#
# Center X Y Z
# ------------ ----------------- ----------------- -----------------
# C -1.415253322400 0.230221785400 0.000000000000
# C 1.415253322400 -0.230221785400 0.000000000000
# ...
#
if (self.section == "Geometry") and ("Geometry (in Angstrom), charge" in line):
assert line.split()[3] == "charge"
charge = int(line.split()[5].strip(','))
self.set_attribute('charge', charge)
assert line.split()[6] == "multiplicity"
mult = int(line.split()[8].strip(':'))
self.set_attribute('mult', mult)
self.skip_line(inputfile, "blank")
line = next(inputfile)
            # Usually there are a header and dashes, but, for example, the coordinates
            # printed when a geometry optimization finishes do not have them.
if line.split()[0] == "Center":
self.skip_line(inputfile, "dashes")
line = next(inputfile)
elements = []
coords = []
atommasses = []
while line.strip():
chomp = line.split()
el, x, y, z = chomp[:4]
if len(el) > 1:
el = el[0] + el[1:].lower()
elements.append(el)
coords.append([float(x), float(y), float(z)])
# Newer versions of Psi4 print atomic masses.
if len(chomp) == 5:
atommasses.append(float(chomp[4]))
line = next(inputfile)
# The 0 is to handle the presence of ghost atoms.
self.set_attribute('atomnos', [self.table.number.get(el, 0) for el in elements])
if not hasattr(self, 'atomcoords'):
self.atomcoords = []
            # This condition discards any repeated coordinates that Psi prints. For example,
            # geometry optimizations will print the coordinates at the beginning of an SCF
# section and also at the start of the gradient calculation.
if len(self.atomcoords) == 0 \
or (self.atomcoords[-1] != coords and not hasattr(self, 'finite_difference')):
self.atomcoords.append(coords)
if len(atommasses) > 0:
if not hasattr(self, 'atommasses'):
self.atommasses = atommasses
# Psi4 repeats the charge and multiplicity after the geometry.
if (self.section == "Geometry") and (line[2:16].lower() == "charge ="):
charge = int(line.split()[-1])
self.set_attribute('charge', charge)
if (self.section == "Geometry") and (line[2:16].lower() == "multiplicity ="):
mult = int(line.split()[-1])
self.set_attribute('mult', mult)
# The printout for Psi4 has a more obvious trigger for the SCF parameter printout.
if (self.section == "Algorithm") and (line.strip() == "==> Algorithm <==") \
and not hasattr(self, 'finite_difference'):
self.skip_line(inputfile, 'blank')
line = next(inputfile)
while line.strip():
if "Energy threshold" in line:
etarget = float(line.split()[-1])
if "Density threshold" in line:
dtarget = float(line.split()[-1])
line = next(inputfile)
if not hasattr(self, "scftargets"):
self.scftargets = []
self.scftargets.append([etarget, dtarget])
# This section prints contraction information before the atomic basis set functions and
# is a good place to parse atombasis indices as well as atomnos. However, the section this line
# is in differs between HF and DFT outputs.
#
# -Contraction Scheme:
# Atom Type All Primitives // Shells:
# ------ ------ --------------------------
# 1 C 6s 3p // 2s 1p
# 2 C 6s 3p // 2s 1p
# 3 C 6s 3p // 2s 1p
# ...
if self.section == "Primary Basis":
if line[2:12] == "Basis Set:":
self.metadata["basis_set"] = line.split()[2]
if (self.section == "Primary Basis" or self.section == "DFT Potential") and line.strip() == "-Contraction Scheme:":
self.skip_lines(inputfile, ['headers', 'd'])
atomnos = []
atombasis = []
atombasis_pos = 0
line = next(inputfile)
while line.strip():
element = line.split()[1]
if len(element) > 1:
element = element[0] + element[1:].lower()
atomnos.append(self.table.number[element])
# To count the number of atomic orbitals for the atom, sum up the orbitals
# in each type of shell, times the numbers of shells. Currently, we assume
# the multiplier is a single digit and that there are only s and p shells,
# which will need to be extended later when considering larger basis sets,
# with corrections for the cartesian/spherical cases.
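                # For example, a contraction scheme of '6s 3p // 2s 1p' gives
                # 2*1 + 1*3 = 5 atomic orbitals for the atom.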
ao_count = 0
shells = line.split('//')[1].split()
for s in shells:
count, type = s
multiplier = 3*(type == 'p') or 1
ao_count += multiplier*int(count)
if len(atombasis) > 0:
atombasis_pos = atombasis[-1][-1] + 1
atombasis.append(list(range(atombasis_pos, atombasis_pos+ao_count)))
line = next(inputfile)
self.set_attribute('natom', len(atomnos))
self.set_attribute('atomnos', atomnos)
self.set_attribute('atombasis', atombasis)
# The atomic basis set is straightforward to parse, but there are some complications
        # when symmetry is used, because in that case Psi4 only prints the symmetry-unique atoms,
        # and the list of symmetry-equivalent ones is not printed. Therefore, for simplicity,
        # when an atom is missing (atom indices are printed) we assume it has the atomic orbitals
        # of the last preceding atom of the same element. This might not work if a mixture of basis sets
# is used somehow... but it should cover almost all cases for now.
#
        # Note that Psi also prints normalized coefficients (details below).
#
# ==> AO Basis Functions <==
#
# [ STO-3G ]
# spherical
# ****
# C 1
# S 3 1.00
# 71.61683700 2.70781445
# 13.04509600 2.61888016
# ...
if (self.section == "AO Basis Functions") and (line.strip() == "==> AO Basis Functions <=="):
def get_symmetry_atom_basis(gbasis):
"""Get symmetry atom by replicating the last atom in gbasis of the same element."""
missing_index = len(gbasis)
missing_atomno = self.atomnos[missing_index]
ngbasis = len(gbasis)
last_same = ngbasis - self.atomnos[:ngbasis][::-1].index(missing_atomno) - 1
return gbasis[last_same]
dfact = lambda n: (n <= 0) or n * dfact(n-2)
# Early beta versions of Psi4 normalize basis function
# coefficients when printing.
if self.version_4_beta:
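                # Standard normalization factor of a primitive Cartesian Gaussian,
                # N = (2a/pi)^(3/4) * (4a)^(l/2) / sqrt((2lx-1)!!(2ly-1)!!(2lz-1)!!),
                # where l = lx + ly + lz.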
def get_normalization_factor(exp, lx, ly, lz):
norm_s = (2*exp/numpy.pi)**0.75
if lx + ly + lz > 0:
nom = (4*exp)**((lx+ly+lz)/2.0)
den = numpy.sqrt(dfact(2*lx-1) * dfact(2*ly-1) * dfact(2*lz-1))
return norm_s * nom / den
else:
return norm_s
else:
get_normalization_factor = lambda exp, lx, ly, lz: 1
self.skip_lines(inputfile, ['b', 'basisname'])
line = next(inputfile)
spherical = line.strip() == "spherical"
if hasattr(self, 'spherical_basis'):
assert self.spherical_basis == spherical
else:
self.spherical_basis = spherical
gbasis = []
self.skip_line(inputfile, 'stars')
line = next(inputfile)
while line.strip():
element, index = line.split()
if len(element) > 1:
element = element[0] + element[1:].lower()
index = int(index)
# This is the code that adds missing atoms when symmetry atoms are excluded
# from the basis set printout. Again, this will work only if all atoms of
# the same element use the same basis set.
while index > len(gbasis) + 1:
gbasis.append(get_symmetry_atom_basis(gbasis))
gbasis.append([])
line = next(inputfile)
while line.find("*") == -1:
# The shell type and primitive count is in the first line.
shell_type, nprimitives, _ = line.split()
nprimitives = int(nprimitives)
# Get the angular momentum for this shell type.
momentum = {'S': 0, 'P': 1, 'D': 2, 'F': 3, 'G': 4, 'H': 5, 'I': 6}[shell_type.upper()]
# Read in the primitives.
primitives_lines = [next(inputfile) for i in range(nprimitives)]
primitives = [list(map(float, pl.split())) for pl in primitives_lines]
# Un-normalize the coefficients. Psi prints the normalized coefficient
# of the highest polynomial, namely XX for D orbitals, XXX for F, and so on.
for iprim, prim in enumerate(primitives):
exp, coef = prim
coef = coef / get_normalization_factor(exp, momentum, 0, 0)
primitives[iprim] = [exp, coef]
primitives = [tuple(p) for p in primitives]
shell = [shell_type, primitives]
gbasis[-1].append(shell)
line = next(inputfile)
line = next(inputfile)
# We will also need to add symmetry atoms that are missing from the input
# at the end of this block, if the symmetry atoms are last.
while len(gbasis) < self.natom:
gbasis.append(get_symmetry_atom_basis(gbasis))
self.gbasis = gbasis
# A block called 'Calculation Information' prints these before starting the SCF.
if (self.section == "Pre-Iterations") and ("Number of atoms" in line):
natom = int(line.split()[-1])
self.set_attribute('natom', natom)
if (self.section == "Pre-Iterations") and ("Number of atomic orbitals" in line):
nbasis = int(line.split()[-1])
self.set_attribute('nbasis', nbasis)
if (self.section == "Pre-Iterations") and ("Total" in line):
chomp = line.split()
nbasis = int(chomp[1])
self.set_attribute('nbasis', nbasis)
# ==> Iterations <==
# Psi4 converges both the SCF energy and density elements and reports both in the
# iterations printout. However, the default convergence scheme involves a density-fitted
        # algorithm for efficiency, and this is often followed by a calculation with exact electron
# repulsion integrals. In that case, there are actually two convergence cycles performed,
# one for the density-fitted algorithm and one for the exact one, and the iterations are
# printed in two blocks separated by some set-up information.
if (self.section == "Iterations") and (line.strip() == "==> Iterations <==") \
and not hasattr(self, 'finite_difference'):
if not hasattr(self, 'scfvalues'):
self.scfvalues = []
scfvals = []
self.skip_lines(inputfile, ['b', 'header', 'b'])
line = next(inputfile)
# Read each SCF iteration.
while line.strip() != "==> Post-Iterations <==":
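                # Iteration lines are prefixed with the converger name, e.g.
                # '@DF-RHF iter   1: ...', hence the check for a leading '@'.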
if line.strip() and line.split()[0][0] == '@':
denergy = float(line.split()[4])
ddensity = float(line.split()[5])
scfvals.append([denergy, ddensity])
try:
line = next(inputfile)
except StopIteration:
self.logger.warning('File terminated before end of last SCF! Last density err: {}'.format(ddensity))
break
self.section = "Post-Iterations"
self.scfvalues.append(scfvals)
# This section, from which we parse molecular orbital symmetries and
# orbital energies, is quite similar for both Psi3 and Psi4, and in fact
        # the format for orbitals is the same, although the headers and spacers
# are a bit different. Let's try to get both parsed with one code block.
#
        # Here is what the block looks like for Psi4:
#
# Orbital Energies (a.u.)
# -----------------------
#
# Doubly Occupied:
#
# 1Bu -11.040586 1Ag -11.040524 2Bu -11.031589
# 2Ag -11.031589 3Bu -11.028950 3Ag -11.028820
# (...)
# 15Ag -0.415620 1Bg -0.376962 2Au -0.315126
# 2Bg -0.278361 3Bg -0.222189
#
# Virtual:
#
# 3Au 0.198995 4Au 0.268517 4Bg 0.308826
# 5Au 0.397078 5Bg 0.521759 16Ag 0.565017
# (...)
# 24Ag 0.990287 24Bu 1.027266 25Ag 1.107702
# 25Bu 1.124938
#
# The case is different in the trigger string.
if ("orbital energies (a.u.)" in line.lower() or "orbital energies [eh]" in line.lower()) \
and not hasattr(self, 'finite_difference'):
# If this is Psi4, we will be in the appropriate section.
assert self.section == "Post-Iterations"
self.moenergies = [[]]
self.mosyms = [[]]
# Psi4 has dashes under the trigger line, but Psi3 did not.
self.skip_line(inputfile, 'dashes')
self.skip_line(inputfile, 'blank')
# Both versions have this case-insensitive substring.
occupied = next(inputfile)
if self.reference[0:2] == 'RO' or self.reference[0:1] == 'R':
assert 'doubly occupied' in occupied.lower()
elif self.reference[0:1] == 'U':
assert 'alpha occupied' in occupied.lower()
self.skip_line(inputfile, 'blank')
# Parse the occupied MO symmetries and energies.
self._parse_mosyms_moenergies(inputfile, 0)
# The last orbital energy here represents the HOMO.
self.homos = [len(self.moenergies[0])-1]
# For a restricted open-shell calculation, this is the
# beta HOMO, and we assume the singly-occupied orbitals
# are all alpha, which are handled next.
if self.reference[0:2] == 'RO':
self.homos.append(self.homos[0])
unoccupied = next(inputfile)
if self.reference[0:2] == 'RO':
assert unoccupied.strip() == 'Singly Occupied:'
elif self.reference[0:1] == 'R':
assert unoccupied.strip() == 'Virtual:'
elif self.reference[0:1] == 'U':
assert unoccupied.strip() == 'Alpha Virtual:'
# Psi4 now has a blank line, Psi3 does not.
self.skip_line(inputfile, 'blank')
# Parse the unoccupied MO symmetries and energies.
self._parse_mosyms_moenergies(inputfile, 0)
# Here is where we handle the Beta or Singly occupied orbitals.
if self.reference[0:1] == 'U':
self.mosyms.append([])
self.moenergies.append([])
line = next(inputfile)
assert line.strip() == 'Beta Occupied:'
self.skip_line(inputfile, 'blank')
self._parse_mosyms_moenergies(inputfile, 1)
self.homos.append(len(self.moenergies[1])-1)
line = next(inputfile)
assert line.strip() == 'Beta Virtual:'
self.skip_line(inputfile, 'blank')
self._parse_mosyms_moenergies(inputfile, 1)
elif self.reference[0:2] == 'RO':
line = next(inputfile)
assert line.strip() == 'Virtual:'
self.skip_line(inputfile, 'blank')
self._parse_mosyms_moenergies(inputfile, 0)
line = next(inputfile)
assert line.strip() == 'Final Occupation by Irrep:'
line = next(inputfile)
irreps = line.split()
line = next(inputfile)
tokens = line.split()
assert tokens[0] == 'DOCC'
docc = sum([int(x.replace(',', '')) for x in tokens[2:-1]])
line = next(inputfile)
if line.strip():
tokens = line.split()
assert tokens[0] in ('SOCC', 'NA')
socc = sum([int(x.replace(',', '')) for x in tokens[2:-1]])
# Fix up the restricted open-shell alpha HOMO.
if self.reference[0:2] == 'RO':
self.homos[0] += socc
# Both Psi3 and Psi4 print the final SCF energy right after the orbital energies,
# but the label is different. Psi4 also does DFT, and the label is also different in that case.
if self.section == "Post-Iterations" and "Final Energy:" in line \
and not hasattr(self, 'finite_difference'):
e = float(line.split()[3])
if not hasattr(self, 'scfenergies'):
self.scfenergies = []
self.scfenergies.append(utils.convertor(e, 'hartree', 'eV'))
# ==> Molecular Orbitals <==
#
# 1 2 3 4 5
#
# 1 H1 s0 0.1610392 0.1040990 0.0453848 0.0978665 1.0863246
# 2 H1 s0 0.3066996 0.0742959 0.8227318 1.3460922 -1.6429494
# 3 H1 s0 0.1669296 1.5494169 -0.8885631 -1.8689490 1.0473633
# 4 H2 s0 0.1610392 -0.1040990 0.0453848 -0.0978665 -1.0863246
# 5 H2 s0 0.3066996 -0.0742959 0.8227318 -1.3460922 1.6429494
# 6 H2 s0 0.1669296 -1.5494169 -0.8885631 1.8689490 -1.0473633
#
# Ene -0.5279195 0.1235556 0.3277474 0.5523654 2.5371710
# Sym Ag B3u Ag B3u B3u
# Occ 2 0 0 0 0
#
#
# 6
#
# 1 H1 s0 1.1331221
# 2 H1 s0 -1.2163107
# 3 H1 s0 0.4695317
# 4 H2 s0 1.1331221
# 5 H2 s0 -1.2163107
# 6 H2 s0 0.4695317
#
# Ene 2.6515637
# Sym Ag
# Occ 0
if (self.section) and ("Molecular Orbitals" in self.section) \
and ("Molecular Orbitals" in line) and not hasattr(self, 'finite_difference'):
self.skip_line(inputfile, 'blank')
mocoeffs = []
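            # The coefficients are printed in blocks of (at most) five orbitals;
            # each block is headed by the orbital indices and followed by the
            # Ene/Sym/Occ summary lines, which are skipped below.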
indices = next(inputfile)
while indices.strip():
if indices[:3] == '***':
break
indices = [int(i) for i in indices.split()]
if len(mocoeffs) < indices[-1]:
for i in range(len(indices)):
mocoeffs.append([])
else:
assert len(mocoeffs) == indices[-1]
self.skip_line(inputfile, 'blank')
n = len(indices)
line = next(inputfile)
while line.strip():
chomp = line.split()
m = len(chomp)
iao = int(chomp[0])
coeffs = [float(c) for c in chomp[m - n:]]
for i, c in enumerate(coeffs):
mocoeffs[indices[i]-1].append(c)
line = next(inputfile)
energies = next(inputfile)
symmetries = next(inputfile)
occupancies = next(inputfile)
self.skip_lines(inputfile, ['b', 'b'])
indices = next(inputfile)
if not hasattr(self, 'mocoeffs'):
self.mocoeffs = []
self.mocoeffs.append(mocoeffs)
# The formats for Mulliken and Lowdin atomic charges are the same, just with
# the name changes, so use the same code for both.
#
# Properties computed using the SCF density density matrix
# Mulliken Charges: (a.u.)
# Center Symbol Alpha Beta Spin Total
# 1 C 2.99909 2.99909 0.00000 0.00182
# 2 C 2.99909 2.99909 0.00000 0.00182
# ...
for pop_type in ["Mulliken", "Lowdin"]:
if line.strip() == "%s Charges: (a.u.)" % pop_type:
if not hasattr(self, 'atomcharges'):
self.atomcharges = {}
header = next(inputfile)
line = next(inputfile)
while not line.strip():
line = next(inputfile)
charges = []
while line.strip():
ch = float(line.split()[-1])
charges.append(ch)
line = next(inputfile)
self.atomcharges[pop_type.lower()] = charges
# This is for the older conventional MP2 code in 4.0b5.
mp_trigger = "MP2 Total Energy (a.u.)"
if line.strip()[:len(mp_trigger)] == mp_trigger:
self.metadata["methods"].append("MP2")
mpenergy = utils.convertor(float(line.split()[-1]), 'hartree', 'eV')
if not hasattr(self, 'mpenergies'):
self.mpenergies = []
self.mpenergies.append([mpenergy])
# This is for the newer DF-MP2 code in 4.0.
if 'DF-MP2 Energies' in line:
self.metadata["methods"].append("DF-MP2")
while 'Total Energy' not in line:
line = next(inputfile)
mpenergy = utils.convertor(float(line.split()[3]), 'hartree', 'eV')
if not hasattr(self, 'mpenergies'):
self.mpenergies = []
self.mpenergies.append([mpenergy])
# Note this is just a start and needs to be modified for CCSD(T), etc.
ccsd_trigger = "* CCSD total energy"
if line.strip()[:len(ccsd_trigger)] == ccsd_trigger:
self.metadata["methods"].append("CCSD")
ccsd_energy = utils.convertor(float(line.split()[-1]), 'hartree', 'eV')
if not hasattr(self, "ccenergis"):
self.ccenergies = []
self.ccenergies.append(ccsd_energy)
# The geometry convergence targets and values are printed in a table, with the legends
        # describing the convergence annotation. Exact slicing of the line probably needs
        # to be done in order to extract the numbers correctly. If there are no values for
        # a particular target, it means that target is not used (marked also with an 'o'), and in this case
# we will set a value of numpy.inf so that any value will be smaller.
#
# ==> Convergence Check <==
#
# Measures of convergence in internal coordinates in au.
# Criteria marked as inactive (o), active & met (*), and active & unmet ( ).
# ---------------------------------------------------------------------------------------------
# Step Total Energy Delta E MAX Force RMS Force MAX Disp RMS Disp
# ---------------------------------------------------------------------------------------------
# Convergence Criteria 1.00e-06 * 3.00e-04 * o 1.20e-03 * o
# ---------------------------------------------------------------------------------------------
# 2 -379.77675264 -7.79e-03 1.88e-02 4.37e-03 o 2.29e-02 6.76e-03 o ~
# ---------------------------------------------------------------------------------------------
#
if (self.section == "Convergence Check") and line.strip() == "==> Convergence Check <==" \
and not hasattr(self, 'finite_difference'):
if not hasattr(self, "optstatus"):
self.optstatus = []
self.optstatus.append(data.ccData.OPT_UNKNOWN)
self.skip_lines(inputfile, ['b', 'units', 'comment', 'dash+tilde', 'header', 'dash+tilde'])
# These are the position in the line at which numbers should start.
starts = [27, 41, 55, 69, 83]
criteria = next(inputfile)
geotargets = []
for istart in starts:
if criteria[istart:istart+9].strip():
geotargets.append(float(criteria[istart:istart+9]))
else:
geotargets.append(numpy.inf)
self.skip_line(inputfile, 'dashes')
values = next(inputfile)
step = int(values.split()[0])
geovalues = []
for istart in starts:
if values[istart:istart+9].strip():
geovalues.append(float(values[istart:istart+9]))
if step == 1:
self.optstatus[-1] += data.ccData.OPT_NEW
# This assertion may be too restrictive, but we haven't seen the geotargets change.
# If such an example comes up, update the value since we're interested in the last ones.
if not hasattr(self, 'geotargets'):
self.geotargets = geotargets
else:
assert self.geotargets == geotargets
if not hasattr(self, 'geovalues'):
self.geovalues = []
self.geovalues.append(geovalues)
# This message signals a converged optimization, in which case we want
# to append the index for this step to optdone, which should be equal
# to the number of geovalues gathered so far.
if "Optimization is complete!" in line:
# This is a workaround for Psi4.0/sample_opt-irc-2.out;
# IRC calculations currently aren't parsed properly for
# optimization parameters.
if hasattr(self, 'geovalues'):
if not hasattr(self, 'optdone'):
self.optdone = []
self.optdone.append(len(self.geovalues))
assert hasattr(self, "optstatus") and len(self.optstatus) > 0
self.optstatus[-1] += data.ccData.OPT_DONE
# This message means that optimization has stopped for some reason, but we
# still want optdone to exist in this case, although it will be an empty list.
if line.strip() == "Optimizer: Did not converge!":
if not hasattr(self, 'optdone'):
self.optdone = []
assert hasattr(self, "optstatus") and len(self.optstatus) > 0
self.optstatus[-1] += data.ccData.OPT_UNCONVERGED
        # The reference point at which properties are evaluated in Psi4 is explicitly stated,
        # so we can save it for later. It is not, however, a part of the Properties section,
        # but it appears before it and also in other places where properties that might depend
# on it are printed.
#
# Properties will be evaluated at 0.000000, 0.000000, 0.000000 Bohr
#
# OR
#
# Properties will be evaluated at 0.000000, 0.000000, 0.000000 [a0]
#
if "Properties will be evaluated at" in line.strip():
self.origin = numpy.array([float(x.strip(',')) for x in line.split()[-4:-1]])
assert line.split()[-1] in ["Bohr", "[a0]"]
self.origin = utils.convertor(self.origin, 'bohr', 'Angstrom')
# The properties section print the molecular dipole moment:
#
# ==> Properties <==
#
#
#Properties computed using the SCF density density matrix
# Nuclear Dipole Moment: (a.u.)
# X: 0.0000 Y: 0.0000 Z: 0.0000
#
# Electronic Dipole Moment: (a.u.)
# X: 0.0000 Y: 0.0000 Z: 0.0000
#
# Dipole Moment: (a.u.)
# X: 0.0000 Y: 0.0000 Z: 0.0000 Total: 0.0000
#
if (self.section == "Properties") and line.strip() == "Dipole Moment: (a.u.)":
line = next(inputfile)
dipole = numpy.array([float(line.split()[1]), float(line.split()[3]), float(line.split()[5])])
dipole = utils.convertor(dipole, "ebohr", "Debye")
if not hasattr(self, 'moments'):
# Old versions of Psi4 don't print the origin; assume
# it's at zero.
if not hasattr(self, 'origin'):
self.origin = numpy.array([0.0, 0.0, 0.0])
self.moments = [self.origin, dipole]
else:
try:
assert numpy.all(self.moments[1] == dipole)
except AssertionError:
self.logger.warning('Overwriting previous multipole moments with new values')
self.logger.warning('This could be from post-HF properties or geometry optimization')
self.moments = [self.origin, dipole]
# Higher multipole moments are printed separately, on demand, in lexicographical order.
#
# Multipole Moments:
#
# ------------------------------------------------------------------------------------
# Multipole Electric (a.u.) Nuclear (a.u.) Total (a.u.)
# ------------------------------------------------------------------------------------
#
# L = 1. Multiply by 2.5417462300 to convert to Debye
# Dipole X : 0.0000000 0.0000000 0.0000000
# Dipole Y : 0.0000000 0.0000000 0.0000000
# Dipole Z : 0.0000000 0.0000000 0.0000000
#
# L = 2. Multiply by 1.3450341749 to convert to Debye.ang
# Quadrupole XX : -1535.8888701 1496.8839996 -39.0048704
# Quadrupole XY : -11.5262958 11.4580038 -0.0682920
# ...
#
if line.strip() == "Multipole Moments:":
self.skip_lines(inputfile, ['b', 'd', 'header', 'd', 'b'])
# The reference used here should have been printed somewhere
# before the properties and parsed above.
moments = [self.origin]
line = next(inputfile)
while "----------" not in line.strip():
rank = int(line.split()[2].strip('.'))
multipole = []
line = next(inputfile)
while line.strip():
value = float(line.split()[-1])
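                    # Build the unit names expected by convertor: 'ebohr',
                    # 'ebohr2', ... and 'Debye', 'Debye.ang', 'Debye.ang2', ...
                    # depending on the multipole rank.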
fromunits = "ebohr" + (rank > 1)*("%i" % rank)
tounits = "Debye" + (rank > 1)*".ang" + (rank > 2)*("%i" % (rank-1))
value = utils.convertor(value, fromunits, tounits)
multipole.append(value)
line = next(inputfile)
multipole = numpy.array(multipole)
moments.append(multipole)
line = next(inputfile)
if not hasattr(self, 'moments'):
self.moments = moments
else:
for im, m in enumerate(moments):
if len(self.moments) <= im:
self.moments.append(m)
else:
assert numpy.allclose(self.moments[im], m, atol=1.0e4)
## Analytic Gradient
# -Total Gradient:
# Atom X Y Z
# ------ ----------------- ----------------- -----------------
# 1 -0.000000000000 0.000000000000 -0.064527252292
# 2 0.000000000000 -0.028380539652 0.032263626146
# 3 -0.000000000000 0.028380539652 0.032263626146
## Finite Differences Gradient
# -------------------------------------------------------------
# ## F-D gradient (Symmetry 0) ##
# Irrep: 1 Size: 3 x 3
#
# 1 2 3
#
# 1 0.00000000000000 0.00000000000000 -0.02921303282515
# 2 0.00000000000000 -0.00979709321487 0.01460651641258
# 3 0.00000000000000 0.00979709321487 0.01460651641258
if line.strip() in Psi4.GRADIENT_HEADERS:
# Handle the different header lines between analytic and
# numerical gradients.
gradient_skip_lines = [
info.skip_lines
for info in Psi4.GRADIENT_TYPES.values()
if info.header == line.strip()
][0]
gradient = self.parse_gradient(inputfile, gradient_skip_lines)
if not hasattr(self, 'grads'):
self.grads = []
self.grads.append(gradient)
# OLD Normal mode output parser (PSI4 < 1)
## Harmonic frequencies.
# -------------------------------------------------------------
# Computing second-derivative from gradients using projected,
# symmetry-adapted, cartesian coordinates (fd_freq_1).
# 74 gradients passed in, including the reference geometry.
# Generating complete list of displacements from unique ones.
# Operation 2 takes plus displacements of irrep Bg to minus ones.
# Operation 3 takes plus displacements of irrep Au to minus ones.
# Operation 2 takes plus displacements of irrep Bu to minus ones.
# Irrep Harmonic Frequency
# (cm-1)
# -----------------------------------------------
# Au 137.2883
if line.strip() == 'Irrep Harmonic Frequency':
vibsyms = []
vibfreqs = []
self.skip_lines(inputfile, ['(cm-1)', 'dashes'])
## The first section contains the symmetry of each normal
## mode and its frequency.
line = next(inputfile)
while '---' not in line:
chomp = line.split()
vibsym = chomp[0]
vibfreq = Psi4.parse_vibfreq(chomp[1])
vibsyms.append(vibsym)
vibfreqs.append(vibfreq)
line = next(inputfile)
self.set_attribute('vibsyms', vibsyms)
self.set_attribute('vibfreqs', vibfreqs)
line = next(inputfile)
assert line.strip() == ''
line = next(inputfile)
assert 'Normal Modes' in line
line = next(inputfile)
assert 'Molecular mass is' in line
if hasattr(self, 'atommasses'):
assert abs(float(line.split()[3]) - sum(self.atommasses)) < 1.0e-4
line = next(inputfile)
assert line.strip() == 'Frequencies in cm^-1; force constants in au.'
line = next(inputfile)
assert line.strip() == ''
line = next(inputfile)
## The second section contains the frequency, force
## constant, and displacement for each normal mode, along
## with the atomic masses.
# Normal Modes (non-mass-weighted).
# Molecular mass is 130.07825 amu.
# Frequencies in cm^-1; force constants in au.
# Frequency: 137.29
# Force constant: 0.0007
# X Y Z mass
# C 0.000 0.000 0.050 12.000000
# C 0.000 0.000 0.050 12.000000
for vibfreq in self.vibfreqs:
_vibfreq = Psi4.parse_vibfreq(line[13:].strip())
assert abs(vibfreq - _vibfreq) < 1.0e-2
line = next(inputfile)
# Can't do anything with this for now.
assert 'Force constant:' in line
line = next(inputfile)
assert 'X Y Z mass' in line
line = next(inputfile)
if not hasattr(self, 'vibdisps'):
self.vibdisps = []
normal_mode_disps = []
# for k in range(self.natom):
while line.strip():
chomp = line.split()
# Do nothing with this for now.
atomsym = chomp[0]
atomcoords = [float(x) for x in chomp[1:4]]
# Do nothing with this for now.
atommass = float(chomp[4])
normal_mode_disps.append(atomcoords)
line = next(inputfile)
self.vibdisps.append(normal_mode_disps)
line = next(inputfile)
# NEW Normal mode output parser (PSI4 >= 1)
# ==> Harmonic Vibrational Analysis <==
# ...
# Vibration 7 8 9
# ...
#
# Vibration 10 11 12
# ...
if line.strip() == '==> Harmonic Vibrational Analysis <==':
vibsyms = []
vibfreqs = []
vibdisps = []
# Skip lines till the first Vibration block
while not line.strip().startswith('Vibration'):
line = next(inputfile)
n_modes = 0
# Parse all the Vibration blocks
while line.strip().startswith('Vibration'):
n = len(line.split()) - 1
n_modes += n
vibfreqs_, vibsyms_, vibdisps_ = self.parse_vibration(n, inputfile)
vibfreqs.extend(vibfreqs_)
vibsyms.extend(vibsyms_)
vibdisps.extend(vibdisps_)
line = next(inputfile)
# It looks like the symmetry of the normal mode may be missing
            # from some or most outputs. Only include them if they are present for all modes.
if len(vibfreqs) == n_modes:
self.set_attribute('vibfreqs', vibfreqs)
if len(vibsyms) == n_modes:
self.set_attribute('vibsyms', vibsyms)
if len(vibdisps) == n_modes:
self.set_attribute('vibdisps', vibdisps)
        # If finite differences are used to compute forces (i.e. by slightly
        # displacing all the atoms), a series of additional SCF calculations is
        # performed. Orbitals, geometries, energies, etc. for these shouldn't be
# included in the parsed data.
if line.strip().startswith('Using finite-differences of gradients'):
self.set_attribute('finite_difference', True)
if line[:54] == '*** Psi4 exiting successfully. Buy a developer a beer!'\
or line[:54] == '*** PSI4 exiting successfully. Buy a developer a beer!':
self.metadata['success'] = True
def _parse_mosyms_moenergies(self, inputfile, spinidx):
"""Parse molecular orbital symmetries and energies from the
'Post-Iterations' section.
"""
line = next(inputfile)
while line.strip():
for i in range(len(line.split()) // 2):
self.mosyms[spinidx].append(line.split()[i*2][-2:])
moenergy = utils.convertor(float(line.split()[i*2+1]), "hartree", "eV")
self.moenergies[spinidx].append(moenergy)
line = next(inputfile)
return
def parse_gradient(self, inputfile, skip_lines):
"""Parse the nuclear gradient section into a list of lists with shape
[natom, 3].
"""
self.skip_lines(inputfile, skip_lines)
line = next(inputfile)
gradient = []
while line.strip():
idx, x, y, z = line.split()
gradient.append((float(x), float(y), float(z)))
line = next(inputfile)
return gradient
@staticmethod
def parse_vibration(n, inputfile):
# Freq [cm^-1] 1501.9533 1501.9533 1501.9533
# Irrep
# Reduced mass [u] 1.1820 1.1820 1.1820
# Force const [mDyne/A] 1.5710 1.5710 1.5710
# Turning point v=0 [a0] 0.2604 0.2604 0.2604
# RMS dev v=0 [a0 u^1/2] 0.2002 0.2002 0.2002
# Char temp [K] 2160.9731 2160.9731 2160.9731
# ----------------------------------------------------------------------------------
# 1 C -0.00 0.01 0.13 -0.00 -0.13 0.01 -0.13 0.00 -0.00
# 2 H 0.33 -0.03 -0.38 0.02 0.60 -0.02 0.14 -0.01 -0.32
# 3 H -0.32 -0.03 -0.37 -0.01 0.60 -0.01 0.15 -0.01 0.33
# 4 H 0.02 0.32 -0.36 0.01 0.16 -0.34 0.60 -0.01 0.01
# 5 H 0.02 -0.33 -0.39 0.01 0.13 0.31 0.60 0.01 0.01
line = next(inputfile)
assert 'Freq' in line
chomp = line.split()
vibfreqs = [Psi4.parse_vibfreq(x) for x in chomp[-n:]]
line = next(inputfile)
assert 'Irrep' in line
chomp = line.split()
vibsyms = [irrep for irrep in chomp[1:]]
line = next(inputfile)
assert 'Reduced mass' in line
line = next(inputfile)
assert 'Force const' in line
line = next(inputfile)
assert 'Turning point' in line
line = next(inputfile)
assert 'RMS dev' in line
line = next(inputfile)
assert 'Char temp' in line
line = next(inputfile)
assert '---' in line
line = next(inputfile)
vibdisps = [ [] for i in range(n)]
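        # Each displacement row lists the x/y/z components for every mode in
        # this block; the rightmost 3*n values on the line belong to the n modes.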
while len(line.strip()) > 0:
chomp = line.split()
for i in range(n):
start = len(chomp) - (n - i) * 3
stop = start + 3
mode_disps = [float(c) for c in chomp[start:stop]]
vibdisps[i].append(mode_disps)
line = next(inputfile)
return vibfreqs, vibsyms, vibdisps
@staticmethod
def parse_vibfreq(vibfreq):
"""Imaginary frequencies are printed as '12.34i', rather than
'-12.34'.
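        For example, '137.29i' is returned as -137.29, while '137.29' is
        returned unchanged as a float.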
"""
is_imag = vibfreq[-1] == 'i'
if is_imag:
return -float(vibfreq[:-1])
else:
return float(vibfreq)
cclib-1.6.2/cclib/parser/qchemparser.py 0000664 0000000 0000000 00000222056 13535330462 0020043 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2018, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Parser for Q-Chem output files"""
from __future__ import division
from __future__ import print_function
import itertools
import re
import numpy
from cclib.parser import logfileparser
from cclib.parser import utils
class QChem(logfileparser.Logfile):
"""A Q-Chem log file."""
def __init__(self, *args, **kwargs):
# Call the __init__ method of the superclass
super(QChem, self).__init__(logname="QChem", *args, **kwargs)
def __str__(self):
"""Return a string representation of the object."""
return "QChem log file %s" % (self.filename)
def __repr__(self):
"""Return a representation of the object."""
return 'QChem("%s")' % (self.filename)
def normalisesym(self, label):
"""Q-Chem does not require normalizing symmetry labels."""
return label
def before_parsing(self):
# Keep track of whether or not we're performing an
# (un)restricted calculation.
self.unrestricted = False
self.is_rohf = False
# Keep track of whether or not this is a fragment calculation,
# so that only the supersystem is parsed.
self.is_fragment_section = False
# These headers identify when a fragment section is
# entered/exited.
self.fragment_section_headers = (
'Guess MOs from converged MOs on fragments',
'CP correction for fragment',
)
self.supersystem_section_headers = (
'Done with SCF on isolated fragments',
'Done with counterpoise correction on fragments',
)
# Compile the dashes-and-or-spaces-only regex.
self.re_dashes_and_spaces = re.compile('^[\s-]+$')
# Compile the regex for extracting the atomic index from an
# aoname.
self.re_atomindex = re.compile('(\d+)_')
# A maximum of 6 columns per block when printing matrices. The
# Fock matrix is 4.
self.ncolsblock = 6
# By default, when asked to print orbitals via
# `scf_print`/`scf_final_print` and/or `print_orbitals`,
# Q-Chem will print all occupieds and the first 5 virtuals.
#
# When the number is set for `print_orbitals`, that section of
# the output will display (NOcc + that many virtual) MOs, but
# any other sections present due to
# `scf_print`/`scf_final_print` will still only display (NOcc
# + 5) MOs. It is the `print_orbitals` section that `aonames`
# is parsed from.
#
# Note that the (AO basis) density matrix is always (NBasis *
# NBasis)!
self.norbdisp_alpha = self.norbdisp_beta = 5
self.norbdisp_alpha_aonames = self.norbdisp_beta_aonames = 5
self.norbdisp_set = False
self.alpha_mo_coefficient_headers = (
'RESTRICTED (RHF) MOLECULAR ORBITAL COEFFICIENTS',
'ALPHA MOLECULAR ORBITAL COEFFICIENTS'
)
self.gradient_headers = (
'Full Analytical Gradient',
'Gradient of SCF Energy',
'Gradient of MP2 Energy',
)
self.hessian_headers = (
'Hessian of the SCF Energy',
'Final Hessian.',
)
self.wfn_method = [
'HF',
'MP2', 'RI-MP2', 'LOCAL_MP2', 'MP4',
'CCD', 'CCSD', 'CCSD(T)',
'QCISD', 'QCISD(T)'
]
def after_parsing(self):
# If parsing a fragment job, each of the geometries appended to
# `atomcoords` may be of different lengths, which will prevent
# conversion from a list to NumPy array.
# Take the length of the first geometry as correct, and remove
# all others with different lengths.
if len(self.atomcoords) > 1:
correctlen = len(self.atomcoords[0])
self.atomcoords[:] = [coords for coords in self.atomcoords
if len(coords) == correctlen]
# At the moment, there is no similar correction for other properties!
# QChem does not print all MO coefficients by default, but rather
# up to HOMO+5. So, fill up the missing values with NaNs. If there are
        # other cases where coefficients are missing, but different ones, this
# general afterthought might not be appropriate and the fix will
# need to be done while parsing.
if hasattr(self, 'mocoeffs'):
for im in range(len(self.mocoeffs)):
_nmo, _nbasis = self.mocoeffs[im].shape
if (_nmo, _nbasis) != (self.nmo, self.nbasis):
coeffs = numpy.empty((self.nmo, self.nbasis))
coeffs[:] = numpy.nan
coeffs[0:_nmo, 0:_nbasis] = self.mocoeffs[im]
self.mocoeffs[im] = coeffs
# When parsing the 'MOLECULAR ORBITAL COEFFICIENTS' block for
# `aonames`, Q-Chem doesn't print the principal quantum number
# for each shell; this needs to be added.
if hasattr(self, 'aonames') and hasattr(self, 'atombasis'):
angmom = ('', 'S', 'P', 'D', 'F', 'G', 'H', 'I')
for atom in self.atombasis:
bfcounts = dict()
for bfindex in atom:
atomname, bfname = self.aonames[bfindex].split('_')
# Keep track of how many times each shell type has
# appeared.
if bfname in bfcounts:
bfcounts[bfname] += 1
else:
# Make sure the starting number for type of
# angular momentum begins at the appropriate
# principal quantum number (1S, 2P, 3D, 4F,
# ...).
bfcounts[bfname] = angmom.index(bfname[0])
newbfname = '{}{}'.format(bfcounts[bfname], bfname)
self.aonames[bfindex] = '_'.join([atomname, newbfname])
# Assign the number of core electrons replaced by ECPs.
if hasattr(self, 'user_input') and self.user_input.get('rem') is not None:
if self.user_input['rem'].get('ecp') is not None:
ecp_is_gen = (self.user_input['rem']['ecp'] == 'gen')
if ecp_is_gen:
assert 'ecp' in self.user_input
has_iprint = hasattr(self, 'possible_ecps')
if not ecp_is_gen and not has_iprint:
msg = """ECPs are present, but the number of core \
electrons isn't printed at all. Rerun with "iprint >= 100" to get \
coreelectrons."""
self.logger.warning(msg)
self.incorrect_coreelectrons = True
elif ecp_is_gen and not has_iprint:
nmissing = sum(ncore == 0
for (_, _, ncore) in self.user_input['ecp'])
if nmissing > 1:
msg = """ECPs are present, but coreelectrons can only \
be guessed for one element at most. Rerun with "iprint >= 100" to get \
coreelectrons."""
self.logger.warning(msg)
self.incorrect_coreelectrons = True
elif self.user_input['molecule'].get('charge') is None:
msg = """ECPs are present, but the total charge \
cannot be determined. Rerun without `$molecule read`."""
self.logger.warning(msg)
self.incorrect_coreelectrons = True
else:
user_charge = self.user_input['molecule']['charge']
# First, assign the entries given
# explicitly.
for entry in self.user_input['ecp']:
element, _, ncore = entry
if ncore > 0:
self._assign_coreelectrons_to_element(element, ncore)
# Because of how the charge is calculated
# during extract(), this is the number of
# remaining core electrons that need to be
# assigned ECP centers. Filter out the
# remaining entries, of which there should
# only be one.
core_sum = self.coreelectrons.sum() if hasattr(self, 'coreelectrons') else 0
remainder = self.charge - user_charge - core_sum
entries = [entry
for entry in self.user_input['ecp']
if entry[2] == 0]
if len(entries) != 0:
assert len(entries) == 1
element, _, ncore = entries[0]
assert ncore == 0
self._assign_coreelectrons_to_element(
element, remainder, ncore_is_total_count=True)
elif not ecp_is_gen and has_iprint:
atomsymbols = [self.table.element[atomno] for atomno in self.atomnos]
for i in range(self.natom):
if atomsymbols[i] in self.possible_ecps:
self.coreelectrons[i] = self.possible_ecps[atomsymbols[i]]
else:
assert ecp_is_gen and has_iprint
for entry in self.user_input['ecp']:
element, _, ncore = entry
# If ncore is non-zero, then it must be
# user-defined, and we take that
# value. Otherwise, look it up.
if ncore == 0:
ncore = self.possible_ecps[element]
self._assign_coreelectrons_to_element(element, ncore)
# Check to see if the charge is consistent with the input
# section. It may not be if using an ECP.
if hasattr(self, 'user_input'):
if self.user_input.get('molecule') is not None:
user_charge = self.user_input['molecule'].get('charge')
if user_charge is not None:
self.set_attribute('charge', user_charge)
def parse_charge_section(self, inputfile, chargetype):
"""Parse the population analysis charge block."""
self.skip_line(inputfile, 'blank')
line = next(inputfile)
has_spins = False
if 'Spin' in line:
if not hasattr(self, 'atomspins'):
self.atomspins = dict()
has_spins = True
spins = []
self.skip_line(inputfile, 'dashes')
if not hasattr(self, 'atomcharges'):
self.atomcharges = dict()
charges = []
line = next(inputfile)
while list(set(line.strip())) != ['-']:
elements = line.split()
charge = self.float(elements[2])
charges.append(charge)
if has_spins:
spin = self.float(elements[3])
spins.append(spin)
line = next(inputfile)
self.atomcharges[chargetype] = numpy.array(charges)
if has_spins:
self.atomspins[chargetype] = numpy.array(spins)
@staticmethod
def parse_matrix(inputfile, nrows, ncols, ncolsblock):
"""Q-Chem prints most matrices in a standard format; parse the matrix
into a NumPy array of the appropriate shape.
"""
nparray = numpy.empty(shape=(nrows, ncols))
line = next(inputfile)
assert len(line.split()) == min(ncolsblock, ncols)
colcounter = 0
while colcounter < ncols:
# If the line is just the column header (indices)...
if line[:5].strip() == '':
line = next(inputfile)
rowcounter = 0
while rowcounter < nrows:
row = list(map(float, line.split()[1:]))
assert len(row) == min(ncolsblock, (ncols - colcounter))
nparray[rowcounter][colcounter:colcounter + ncolsblock] = row
line = next(inputfile)
rowcounter += 1
colcounter += ncolsblock
return nparray
def parse_matrix_aonames(self, inputfile, nrows, ncols):
"""Q-Chem prints most matrices in a standard format; parse the matrix
into a preallocated NumPy array of the appropriate shape.
Rather than have one routine for parsing all general matrices
and the 'MOLECULAR ORBITAL COEFFICIENTS' block, use a second
which handles `aonames`.
"""
bigmom = ('d', 'f', 'g', 'h')
nparray = numpy.empty(shape=(nrows, ncols))
line = next(inputfile)
assert len(line.split()) == min(self.ncolsblock, ncols)
colcounter = 0
split_fixed = utils.WidthSplitter((4, 3, 5, 6, 10, 10, 10, 10, 10, 10))
while colcounter < ncols:
# If the line is just the column header (indices)...
if line[:5].strip() == '':
line = next(inputfile)
# Do nothing for now.
if 'eigenvalues' in line:
line = next(inputfile)
rowcounter = 0
while rowcounter < nrows:
row = split_fixed.split(line)
# Only take the AO names on the first time through.
if colcounter == 0:
if len(self.aonames) != self.nbasis:
# Apply the offset for rows where there is
# more than one atom of any element in the
# molecule.
offset = 1
if row[2] != '':
name = self.atommap.get(row[1] + str(row[2]))
else:
name = self.atommap.get(row[1] + '1')
# For l > 1, there is a space between l and
# m_l when using spherical functions.
shell = row[2 + offset]
if shell in bigmom:
shell = ''.join([shell, row[3 + offset]])
aoname = ''.join([name, '_', shell.upper()])
self.aonames.append(aoname)
row = list(map(float, row[-min(self.ncolsblock, (ncols - colcounter)):]))
nparray[rowcounter][colcounter:colcounter + self.ncolsblock] = row
line = next(inputfile)
rowcounter += 1
colcounter += self.ncolsblock
return nparray
def parse_orbital_energies_and_symmetries(self, inputfile):
"""Parse the 'Orbital Energies (a.u.)' block appearing after SCF converges,
which optionally includes MO symmetries. Based upon the
Occupied/Virtual labeling, the HOMO is also parsed.
"""
energies = []
symbols = []
line = next(inputfile)
# Sometimes Q-Chem gets a little confused...
while "MOs" not in line:
line = next(inputfile)
line = next(inputfile)
# The end of the block is either a blank line or only dashes.
while not self.re_dashes_and_spaces.search(line) \
and not 'Warning : Irrep of orbital' in line:
if 'Occupied' in line or 'Virtual' in line:
# A nice trick to find where the HOMO is.
if 'Virtual' in line:
homo = len(energies) - 1
line = next(inputfile)
tokens = line.split()
# If the line contains letters, it must be the MO
# symmetries. Otherwise, it's the energies.
if re.search("[a-zA-Z]", line):
symbols.extend(tokens[1::2])
else:
for e in tokens:
try:
energy = utils.convertor(self.float(e), 'hartree', 'eV')
except ValueError:
energy = numpy.nan
energies.append(energy)
line = next(inputfile)
# MO symmetries are either not present or there is one for each MO
# (energy).
assert len(symbols) in (0, len(energies))
return energies, symbols, homo
def generate_atom_map(self):
"""Generate the map to go from Q-Chem atom numbering:
'C1', 'C2', 'C3', 'C4', 'C5', 'C6', 'H1', 'H2', 'H3', 'H4', 'C7', ...
to cclib atom numbering:
'C1', 'C2', 'C3', 'C4', 'C5', 'C6', 'H7', 'H8', 'H9', 'H10', 'C11', ...
for later use.
"""
# Generate the desired order.
order_proper = [element + str(num)
for element, num in zip(self.atomelements,
itertools.count(start=1))]
# We need separate counters for each element.
element_counters = {element: itertools.count(start=1)
for element in set(self.atomelements)}
# Generate the Q-Chem printed order.
order_qchem = [element + str(next(element_counters[element]))
for element in self.atomelements]
# Combine the orders into a mapping.
atommap = {k: v for k, v in zip(order_qchem, order_proper)}
return atommap
def generate_formula_histogram(self):
"""From the atomnos, generate a histogram that represents the
molecular formula.
"""
histogram = dict()
for element in self.atomelements:
if element in histogram.keys():
histogram[element] += 1
else:
histogram[element] = 1
return histogram
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
# Extract the version number and optionally the version
# control info.
if "Q-Chem" in line:
match = re.search(r"Q-Chem\s([0-9\.]*)\sfor", line)
if match:
self.metadata["package_version"] = match.groups()[0]
# Don't add revision information to the main package version for now.
if "SVN revision" in line:
revision = line.split()[3]
# Disable/enable parsing for fragment sections.
if any(message in line for message in self.fragment_section_headers):
self.is_fragment_section = True
if any(message in line for message in self.supersystem_section_headers):
self.is_fragment_section = False
if not self.is_fragment_section:
# If the input section is repeated back, parse the $rem and
# $molecule sections.
if line[0:11] == 'User input:':
self.user_input = dict()
self.skip_line(inputfile, 'd')
while list(set(line.strip())) != ['-']:
if line.strip().lower() == '$rem':
self.user_input['rem'] = dict()
while line.strip().lower() != '$end':
line = next(inputfile).lower()
if line.strip() == '$end':
break
# Apparently calculations can run without
# a matching $end...this terminates the
# user input section no matter what.
if line.strip() == ('-' * 62):
break
tokens = line.split()
# Allow blank lines.
if len(tokens) == 0:
continue
# Entries may be separated by an equals
# sign, and can have comments, for example:
# ecp gen
# ecp = gen
# ecp gen ! only on first chlorine
# ecp = gen only on first chlorine
assert len(tokens) >= 2
keyword = tokens[0]
if tokens[1] == '=':
option = tokens[2]
else:
option = tokens[1]
self.user_input['rem'][keyword] = option
if keyword == 'method':
method = option.upper()
if method in self.wfn_method:
self.metadata["methods"].append(method)
else:
self.metadata["methods"].append('DFT')
self.metadata["functional"] = method
if keyword == 'exchange':
self.metadata["methods"].append('DFT')
self.metadata["functional"] = option
if keyword == 'print_orbitals':
# Stay with the default value if a number isn't
# specified.
if option in ('true', 'false'):
continue
else:
norbdisp_aonames = int(option)
self.norbdisp_alpha_aonames = norbdisp_aonames
self.norbdisp_beta_aonames = norbdisp_aonames
self.norbdisp_set = True
if line.strip().lower() == '$ecp':
self.user_input['ecp'] = []
line = next(inputfile)
while line.strip().lower() != '$end':
while list(set(line.strip())) != ['*']:
# Parse the element for this ECP
# entry. If only the element is on
# this line, or the 2nd token is 0, it
# applies to all atoms; if it's > 0,
# then it indexes (1-based) that
# specific atom in the whole molecule.
tokens = line.split()
assert len(tokens) > 0
element = tokens[0][0].upper() + tokens[0][1:].lower()
assert element in self.table.element
if len(tokens) > 1:
assert len(tokens) == 2
index = int(tokens[1]) - 1
else:
index = -1
line = next(inputfile)
# Next comes the ECP definition. If
# the line contains only a single
# item, it's a built-in ECP, otherwise
# it's a full definition.
tokens = line.split()
if len(tokens) == 1:
ncore = 0
line = next(inputfile)
else:
assert len(tokens) == 3
ncore = int(tokens[2])
# Don't parse the remainder of the
# ECP definition.
while list(set(line.strip())) != ['*']:
line = next(inputfile)
entry = (element, index, ncore)
self.user_input['ecp'].append(entry)
line = next(inputfile)
if line.strip().lower() == '$end':
break
if line.strip().lower() == '$molecule':
self.user_input['molecule'] = dict()
line = next(inputfile)
# Don't read the molecule, only the
# supersystem charge and multiplicity.
if line.split()[0].lower() == 'read':
pass
else:
charge, mult = [int(x) for x in line.split()]
self.user_input['molecule']['charge'] = charge
self.user_input['molecule']['mult'] = mult
line = next(inputfile).lower()
# Parse the basis set name
if 'Requested basis set' in line:
self.metadata["basis_set"] = line.split()[-1]
# Parse the general basis for `gbasis`, in the style used by
# Gaussian.
if 'Basis set in general basis input format:' in line:
self.skip_lines(inputfile, ['d', '$basis'])
line = next(inputfile)
if not hasattr(self, 'gbasis'):
self.gbasis = []
# The end of the general basis block.
while '$end' not in line:
atom = []
# 1. Contains element symbol and atomic index of
# basis functions; if 0, applies to all atoms of
# same element.
assert len(line.split()) == 2
line = next(inputfile)
# The end of each atomic block.
while '****' not in line:
# 2. Contains the type of basis function {S, SP,
# P, D, F, G, H, ...}, the number of primitives,
# and the weight of the final contracted function.
bfsplitline = line.split()
assert len(bfsplitline) == 3
bftype = bfsplitline[0]
nprim = int(bfsplitline[1])
line = next(inputfile)
# 3. The primitive basis functions that compose
# the contracted basis function; there are `nprim`
# of them. The first value is the exponent, and
# the second value is the contraction
# coefficient. If `bftype == 'SP'`, the primitives
# are for both S- and P-type basis functions but
# with separate contraction coefficients,
# resulting in three columns.
if bftype == 'SP':
primitives_S = []
primitives_P = []
else:
primitives = []
# For each primitive in the contracted basis
# function...
for iprim in range(nprim):
primsplitline = line.split()
exponent = float(primsplitline[0])
if bftype == 'SP':
assert len(primsplitline) == 3
coefficient_S = float(primsplitline[1])
coefficient_P = float(primsplitline[2])
primitives_S.append((exponent, coefficient_S))
primitives_P.append((exponent, coefficient_P))
else:
assert len(primsplitline) == 2
coefficient = float(primsplitline[1])
primitives.append((exponent, coefficient))
line = next(inputfile)
if bftype == 'SP':
bf_S = ('S', primitives_S)
bf_P = ('P', primitives_P)
atom.append(bf_S)
atom.append(bf_P)
else:
bf = (bftype, primitives)
atom.append(bf)
# Move to the next contracted basis function
# as long as we don't hit the '****' atom
# delimiter.
self.gbasis.append(atom)
line = next(inputfile)
if line.strip() == 'The following effective core potentials will be applied':
# Keep track of all elements that may have an ECP on
# them. *Which* centers have an ECP can't be
# determined here, so just take the number of valence
# electrons, then later figure out the centers
# and do core = Z - valence.
self.possible_ecps = dict()
# This will fail if an element has more than one kind
# of ECP.
split_fixed = utils.WidthSplitter((4, 13, 20, 2, 14, 14))
self.skip_lines(inputfile, ['d', 'header', 'header', 'd'])
line = next(inputfile)
while list(set(line.strip())) != ['-']:
tokens = split_fixed.split(line)
if tokens[0] != '':
element = tokens[0]
valence = int(tokens[1])
ncore = self.table.number[element] - valence
self.possible_ecps[element] = ncore
line = next(inputfile)
if 'TIME STEP #' in line:
tokens = line.split()
self.append_attribute('time', float(tokens[8]))
# Extract the atomic numbers and coordinates of the atoms.
if 'Standard Nuclear Orientation (Angstroms)' in line:
if not hasattr(self, 'atomcoords'):
self.atomcoords = []
self.skip_lines(inputfile, ['cols', 'dashes'])
atomelements = []
atomcoords = []
line = next(inputfile)
while list(set(line.strip())) != ['-']:
entry = line.split()
atomelements.append(entry[1])
atomcoords.append(list(map(float, entry[2:])))
line = next(inputfile)
self.atomcoords.append(atomcoords)
# We calculate and handle atomnos no matter what, since in
# the case of fragment calculations the atoms may change,
# along with the charge and spin multiplicity.
self.atomnos = []
self.atomelements = []
for atomelement in atomelements:
self.atomelements.append(atomelement)
if atomelement == 'GH':
self.atomnos.append(0)
else:
self.atomnos.append(self.table.number[atomelement])
self.natom = len(self.atomnos)
self.atommap = self.generate_atom_map()
self.formula_histogram = self.generate_formula_histogram()
# Number of electrons.
# Useful for determining the number of occupied/virtual orbitals.
if 'Nuclear Repulsion Energy' in line:
line = next(inputfile)
nelec_re_string = r'There are(\s+[0-9]+) alpha and(\s+[0-9]+) beta electrons'
match = re.findall(nelec_re_string, line.strip())
self.set_attribute('nalpha', int(match[0][0].strip()))
self.set_attribute('nbeta', int(match[0][1].strip()))
self.norbdisp_alpha += self.nalpha
self.norbdisp_alpha_aonames += self.nalpha
self.norbdisp_beta += self.nbeta
self.norbdisp_beta_aonames += self.nbeta
# Calculate the spin multiplicity (2S + 1), where S is the
# total spin of the system.
S = (self.nalpha - self.nbeta) / 2
mult = int(2 * S + 1)
self.set_attribute('mult', mult)
# Calculate the molecular charge as the difference between
# the atomic numbers and the number of electrons.
if hasattr(self, 'atomnos'):
charge = sum(self.atomnos) - (self.nalpha + self.nbeta)
self.set_attribute('charge', charge)
# Number of basis functions.
if 'basis functions' in line:
if not hasattr(self, 'nbasis'):
self.set_attribute('nbasis', int(line.split()[-3]))
# In the case that there are fewer basis functions
# (and therefore MOs) than the default number of MOs
# displayed, reset the display values.
self.norbdisp_alpha = min(self.norbdisp_alpha, self.nbasis)
self.norbdisp_alpha_aonames = min(self.norbdisp_alpha_aonames, self.nbasis)
self.norbdisp_beta = min(self.norbdisp_beta, self.nbasis)
self.norbdisp_beta_aonames = min(self.norbdisp_beta_aonames, self.nbasis)
# Check whether we're performing a restricted or
# unrestricted calculation.
if 'calculation will be' in line:
if ' restricted' in line:
self.unrestricted = False
if 'unrestricted' in line:
self.unrestricted = True
if hasattr(self, 'nalpha') and hasattr(self, 'nbeta'):
if self.nalpha != self.nbeta:
self.unrestricted = True
self.is_rohf = True
# Section with SCF iterations goes like this:
#
# SCF converges when DIIS error is below 1.0E-05
# ---------------------------------------
# Cycle Energy DIIS Error
# ---------------------------------------
# 1 -381.9238072190 1.39E-01
# 2 -382.2937212775 3.10E-03
# 3 -382.2939780242 3.37E-03
# ...
#
scf_success_messages = (
'Convergence criterion met',
'corrected energy'
)
scf_failure_messages = (
'SCF failed to converge',
'Convergence failure'
)
if 'SCF converges when ' in line:
if not hasattr(self, 'scftargets'):
self.scftargets = []
target = float(line.split()[-1])
self.scftargets.append([target])
# We should have the header between dashes now,
# but sometimes there are lines before the first dashes.
while not 'Cycle Energy' in line:
line = next(inputfile)
self.skip_line(inputfile, 'd')
values = []
iter_counter = 1
line = next(inputfile)
while not any(message in line for message in scf_success_messages):
# Some trickery to avoid a lot of printing that can occur
# between each SCF iteration.
entry = line.split()
if len(entry) > 0:
if entry[0] == str(iter_counter):
# Q-Chem only outputs one error metric.
error = float(entry[2])
values.append([error])
iter_counter += 1
try:
line = next(inputfile)
# Is this the end of the file for some reason?
except StopIteration:
self.logger.warning('File terminated before end of last SCF! Last error: {}'.format(error))
break
# We've converged, but still need the last iteration.
if any(message in line for message in scf_success_messages):
entry = line.split()
error = float(entry[2])
values.append([error])
iter_counter += 1
# This is printed in regression QChem4.2/dvb_sp_unconverged.out
# so use it to bail out when convergence fails.
if any(message in line for message in scf_failure_messages):
break
if not hasattr(self, 'scfvalues'):
self.scfvalues = []
self.scfvalues.append(numpy.array(values))
# Molecular orbital coefficients.
# Try parsing them from this block (which comes from
# `scf_final_print = 2``) rather than the combined
# aonames/mocoeffs/moenergies block (which comes from
# `print_orbitals = true`).
if 'Final Alpha MO Coefficients' in line:
if not hasattr(self, 'mocoeffs'):
self.mocoeffs = []
mocoeffs = QChem.parse_matrix(inputfile, self.nbasis, self.norbdisp_alpha, self.ncolsblock)
self.mocoeffs.append(mocoeffs.transpose())
if 'Final Beta MO Coefficients' in line:
mocoeffs = QChem.parse_matrix(inputfile, self.nbasis, self.norbdisp_beta, self.ncolsblock)
self.mocoeffs.append(mocoeffs.transpose())
if 'Total energy in the final basis set' in line:
if not hasattr(self, 'scfenergies'):
self.scfenergies = []
scfenergy = float(line.split()[-1])
self.scfenergies.append(utils.convertor(scfenergy, 'hartree', 'eV'))
# Geometry optimization.
if 'Maximum Tolerance Cnvgd?' in line:
line_g = next(inputfile).split()[1:3]
line_d = next(inputfile).split()[1:3]
line_e = next(inputfile).split()[2:4]
if not hasattr(self, 'geotargets'):
self.geotargets = [self.float(line_g[1]), self.float(line_d[1]), self.float(line_e[1])]
if not hasattr(self, 'geovalues'):
self.geovalues = []
maxg = self.float(line_g[0])
maxd = self.float(line_d[0])
ediff = self.float(line_e[0])
geovalues = [maxg, maxd, ediff]
self.geovalues.append(geovalues)
if '** OPTIMIZATION CONVERGED **' in line:
if not hasattr(self, 'optdone'):
self.optdone = []
self.optdone.append(len(self.atomcoords))
if '** MAXIMUM OPTIMIZATION CYCLES REACHED **' in line:
if not hasattr(self, 'optdone'):
self.optdone = []
# Moller-Plesset corrections.
# There are multiple modules in Q-Chem for calculating MPn energies:
# cdman, ccman, and ccman2, all with different output.
#
# MP2, RI-MP2, and local MP2 all default to cdman, which has a simple
# block of output after the regular SCF iterations.
#
# MP3 is handled by ccman2.
#
# MP4 and variants are handled by ccman.
# This is the MP2/cdman case.
if 'MP2 total energy' in line:
if not hasattr(self, 'mpenergies'):
self.mpenergies = []
mp2energy = float(line.split()[4])
mp2energy = utils.convertor(mp2energy, 'hartree', 'eV')
self.mpenergies.append([mp2energy])
# This is the MP3/ccman2 case.
if line[1:11] == 'MP2 energy' and line[12:19] != 'read as':
if not hasattr(self, 'mpenergies'):
self.mpenergies = []
mpenergies = []
mp2energy = float(line.split()[3])
mpenergies.append(mp2energy)
line = next(inputfile)
line = next(inputfile)
# Just a sanity check.
if 'MP3 energy' in line:
mp3energy = float(line.split()[3])
mpenergies.append(mp3energy)
mpenergies = [utils.convertor(mpe, 'hartree', 'eV')
for mpe in mpenergies]
self.mpenergies.append(mpenergies)
# This is the MP4/ccman case.
if 'EHF' in line:
if not hasattr(self, 'mpenergies'):
self.mpenergies = []
mpenergies = []
while list(set(line.strip())) != ['-']:
if 'EMP2' in line:
mp2energy = float(line.split()[2])
mpenergies.append(mp2energy)
if 'EMP3' in line:
mp3energy = float(line.split()[2])
mpenergies.append(mp3energy)
if 'EMP4SDQ' in line:
mp4sdqenergy = float(line.split()[2])
mpenergies.append(mp4sdqenergy)
# This is really MP4SD(T)Q.
if 'EMP4 ' in line:
mp4sdtqenergy = float(line.split()[2])
mpenergies.append(mp4sdtqenergy)
line = next(inputfile)
mpenergies = [utils.convertor(mpe, 'hartree', 'eV')
for mpe in mpenergies]
self.mpenergies.append(mpenergies)
# Coupled cluster corrections.
# Hopefully we only have to deal with ccman2 here.
if 'CCD total energy' in line:
if not hasattr(self, 'ccenergies'):
self.ccenergies = []
ccdenergy = float(line.split()[-1])
ccdenergy = utils.convertor(ccdenergy, 'hartree', 'eV')
self.ccenergies.append(ccdenergy)
if 'CCSD total energy' in line:
has_triples = False
if not hasattr(self, 'ccenergies'):
self.ccenergies = []
ccsdenergy = float(line.split()[-1])
# Make sure we aren't actually doing CCSD(T).
line = next(inputfile)
line = next(inputfile)
if 'CCSD(T) total energy' in line:
has_triples = True
ccsdtenergy = float(line.split()[-1])
ccsdtenergy = utils.convertor(ccsdtenergy, 'hartree', 'eV')
self.ccenergies.append(ccsdtenergy)
if not has_triples:
ccsdenergy = utils.convertor(ccsdenergy, 'hartree', 'eV')
self.ccenergies.append(ccsdenergy)
# Electronic transitions. Works for both CIS and TDDFT.
if 'Excitation Energies' in line:
# Restricted:
# ---------------------------------------------------
# TDDFT/TDA Excitation Energies
# ---------------------------------------------------
#
# Excited state 1: excitation energy (eV) = 3.6052
# Total energy for state 1: -382.167872200685
# Multiplicity: Triplet
# Trans. Mom.: 0.0000 X 0.0000 Y 0.0000 Z
# Strength : 0.0000
# D( 33) --> V( 3) amplitude = 0.2618
# D( 34) --> V( 2) amplitude = 0.2125
# D( 35) --> V( 1) amplitude = 0.9266
#
# Unrestricted:
# Excited state 2: excitation energy (eV) = 2.3156
# Total energy for state 2: -381.980177630969
# : 0.7674
# Trans. Mom.: -2.7680 X -0.1089 Y 0.0000 Z
# Strength : 0.4353
# S( 1) --> V( 1) amplitude = -0.3105 alpha
# D( 34) --> S( 1) amplitude = 0.9322 beta
self.skip_lines(inputfile, ['dashes', 'blank'])
line = next(inputfile)
etenergies = []
etsyms = []
etoscs = []
etsecs = []
spinmap = {'alpha': 0, 'beta': 1}
while list(set(line.strip())) != ['-']:
# Take the total energy for the state and subtract from the
# ground state energy, rather than just the EE;
# this will be more accurate.
if 'Total energy for state' in line:
energy = utils.convertor(float(line.split()[5]), 'hartree', 'wavenumber')
etenergy = energy - utils.convertor(self.scfenergies[-1], 'eV', 'wavenumber')
etenergies.append(etenergy)
# if 'excitation energy' in line:
# etenergy = utils.convertor(float(line.split()[-1]), 'eV', 'wavenumber')
# etenergies.append(etenergy)
if 'Multiplicity' in line:
etsym = line.split()[1]
etsyms.append(etsym)
if 'Strength' in line:
strength = float(line.split()[-1])
etoscs.append(strength)
# This is the list of transitions.
if 'amplitude' in line:
sec = []
while line.strip() != '':
if self.unrestricted:
spin = spinmap[line[42:47].strip()]
else:
spin = 0
# There is a subtle difference between TDA and RPA calcs,
# because in the latter case each transition line is
# preceded by the type of vector: X or Y, namely excitation
# or deexcitation (see #154 for details). For deexcitations,
# we will need to reverse the MO indices. Note also that Q-Chem
# starts reindexing virtual orbitals at 1.
if line[5] == '(':
ttype = 'X'
startidx = int(line[6:9]) - 1
endidx = int(line[17:20]) - 1 + self.nalpha
contrib = float(line[34:41].strip())
else:
assert line[5] == ":"
ttype = line[4]
startidx = int(line[9:12]) - 1
endidx = int(line[20:23]) - 1 + self.nalpha
contrib = float(line[37:44].strip())
start = (startidx, spin)
end = (endidx, spin)
if ttype == 'X':
sec.append([start, end, contrib])
elif ttype == 'Y':
sec.append([end, start, contrib])
else:
raise ValueError('Unknown transition type: %s' % ttype)
line = next(inputfile)
etsecs.append(sec)
line = next(inputfile)
self.set_attribute('etenergies', etenergies)
self.set_attribute('etsyms', etsyms)
self.set_attribute('etoscs', etoscs)
self.set_attribute('etsecs', etsecs)
# Static and dynamic polarizability from mopropman.
if 'Polarizability (a.u.)' in line:
if not hasattr(self, 'polarizabilities'):
self.polarizabilities = []
while 'Full Tensor' not in line:
line = next(inputfile)
self.skip_line(inputfile, 'blank')
polarizability = [next(inputfile).split() for _ in range(3)]
self.polarizabilities.append(numpy.array(polarizability))
# Static polarizability from finite difference or
# responseman.
if line.strip() in ('Static polarizability tensor [a.u.]',
'Polarizability tensor [a.u.]'):
if not hasattr(self, 'polarizabilities'):
self.polarizabilities = []
polarizability = [next(inputfile).split() for _ in range(3)]
self.polarizabilities.append(numpy.array(polarizability))
# Molecular orbital energies and symmetries.
if line.strip() == 'Orbital Energies (a.u.) and Symmetries':
# --------------------------------------------------------------
# Orbital Energies (a.u.) and Symmetries
# --------------------------------------------------------------
#
# Alpha MOs, Restricted
# -- Occupied --
# -10.018 -10.018 -10.008 -10.008 -10.007 -10.007 -10.006 -10.005
# 1 Bu 1 Ag 2 Bu 2 Ag 3 Bu 3 Ag 4 Bu 4 Ag
# -9.992 -9.992 -0.818 -0.755 -0.721 -0.704 -0.670 -0.585
# 5 Ag 5 Bu 6 Ag 6 Bu 7 Ag 7 Bu 8 Bu 8 Ag
# -0.561 -0.532 -0.512 -0.462 -0.439 -0.410 -0.400 -0.397
# 9 Ag 9 Bu 10 Ag 11 Ag 10 Bu 11 Bu 12 Bu 12 Ag
# -0.376 -0.358 -0.349 -0.330 -0.305 -0.295 -0.281 -0.263
# 13 Bu 14 Bu 13 Ag 1 Au 15 Bu 14 Ag 15 Ag 1 Bg
# -0.216 -0.198 -0.160
# 2 Au 2 Bg 3 Bg
# -- Virtual --
# 0.050 0.091 0.116 0.181 0.280 0.319 0.330 0.365
# 3 Au 4 Au 4 Bg 5 Au 5 Bg 16 Ag 16 Bu 17 Bu
# 0.370 0.413 0.416 0.422 0.446 0.469 0.496 0.539
# 17 Ag 18 Bu 18 Ag 19 Bu 19 Ag 20 Bu 20 Ag 21 Ag
# 0.571 0.587 0.610 0.627 0.646 0.693 0.743 0.806
# 21 Bu 22 Ag 22 Bu 23 Bu 23 Ag 24 Ag 24 Bu 25 Ag
# 0.816
# 25 Bu
#
# Beta MOs, Restricted
# -- Occupied --
# -10.018 -10.018 -10.008 -10.008 -10.007 -10.007 -10.006 -10.005
# 1 Bu 1 Ag 2 Bu 2 Ag 3 Bu 3 Ag 4 Bu 4 Ag
# -9.992 -9.992 -0.818 -0.755 -0.721 -0.704 -0.670 -0.585
# 5 Ag 5 Bu 6 Ag 6 Bu 7 Ag 7 Bu 8 Bu 8 Ag
# -0.561 -0.532 -0.512 -0.462 -0.439 -0.410 -0.400 -0.397
# 9 Ag 9 Bu 10 Ag 11 Ag 10 Bu 11 Bu 12 Bu 12 Ag
# -0.376 -0.358 -0.349 -0.330 -0.305 -0.295 -0.281 -0.263
# 13 Bu 14 Bu 13 Ag 1 Au 15 Bu 14 Ag 15 Ag 1 Bg
# -0.216 -0.198 -0.160
# 2 Au 2 Bg 3 Bg
# -- Virtual --
# 0.050 0.091 0.116 0.181 0.280 0.319 0.330 0.365
# 3 Au 4 Au 4 Bg 5 Au 5 Bg 16 Ag 16 Bu 17 Bu
# 0.370 0.413 0.416 0.422 0.446 0.469 0.496 0.539
# 17 Ag 18 Bu 18 Ag 19 Bu 19 Ag 20 Bu 20 Ag 21 Ag
# 0.571 0.587 0.610 0.627 0.646 0.693 0.743 0.806
# 21 Bu 22 Ag 22 Bu 23 Bu 23 Ag 24 Ag 24 Bu 25 Ag
# 0.816
# 25 Bu
# --------------------------------------------------------------
self.skip_line(inputfile, 'dashes')
line = next(inputfile)
energies_alpha, symbols_alpha, homo_alpha = self.parse_orbital_energies_and_symmetries(inputfile)
# Only look at the second block if doing an unrestricted calculation.
# This might be a problem for ROHF/ROKS.
if self.unrestricted:
energies_beta, symbols_beta, homo_beta = self.parse_orbital_energies_and_symmetries(inputfile)
# For now, only keep the last set of MO energies, even though it is
# printed at every step of geometry optimizations and fragment jobs.
self.set_attribute('moenergies', [numpy.array(energies_alpha)])
self.set_attribute('homos', [homo_alpha])
self.set_attribute('mosyms', [symbols_alpha])
if self.unrestricted:
self.moenergies.append(numpy.array(energies_beta))
self.homos.append(homo_beta)
self.mosyms.append(symbols_beta)
self.set_attribute('nmo', len(self.moenergies[0]))
# Molecular orbital energies, no symmetries.
if line.strip() == 'Orbital Energies (a.u.)':
# In the case of no orbital symmetries, the beta spin block is not
# present for restricted calculations.
# --------------------------------------------------------------
# Orbital Energies (a.u.)
# --------------------------------------------------------------
#
# Alpha MOs
# -- Occupied --
# ******* -38.595 -34.580 -34.579 -34.578 -19.372 -19.372 -19.364
# -19.363 -19.362 -19.362 -4.738 -3.252 -3.250 -3.250 -1.379
# -1.371 -1.369 -1.365 -1.364 -1.362 -0.859 -0.855 -0.849
# -0.846 -0.840 -0.836 -0.810 -0.759 -0.732 -0.729 -0.704
# -0.701 -0.621 -0.610 -0.595 -0.587 -0.584 -0.578 -0.411
# -0.403 -0.355 -0.354 -0.352
# -- Virtual --
# -0.201 -0.117 -0.099 -0.086 0.020 0.031 0.055 0.067
# 0.075 0.082 0.086 0.092 0.096 0.105 0.114 0.148
#
# Beta MOs
# -- Occupied --
# ******* -38.561 -34.550 -34.549 -34.549 -19.375 -19.375 -19.367
# -19.367 -19.365 -19.365 -4.605 -3.105 -3.103 -3.102 -1.385
# -1.376 -1.376 -1.371 -1.370 -1.368 -0.863 -0.858 -0.853
# -0.849 -0.843 -0.839 -0.818 -0.765 -0.738 -0.737 -0.706
# -0.702 -0.624 -0.613 -0.600 -0.591 -0.588 -0.585 -0.291
# -0.291 -0.288 -0.275
# -- Virtual --
# -0.139 -0.122 -0.103 0.003 0.014 0.049 0.049 0.059
# 0.061 0.070 0.076 0.081 0.086 0.090 0.098 0.106
# 0.138
# --------------------------------------------------------------
self.skip_line(inputfile, 'dashes')
line = next(inputfile)
energies_alpha, _, homo_alpha = self.parse_orbital_energies_and_symmetries(inputfile)
# Only look at the second block if doing an unrestricted calculation.
# This might be a problem for ROHF/ROKS.
if self.unrestricted:
energies_beta, _, homo_beta = self.parse_orbital_energies_and_symmetries(inputfile)
# For now, only keep the last set of MO energies, even though it is
# printed at every step of geometry optimizations and fragment jobs.
self.set_attribute('moenergies', [numpy.array(energies_alpha)])
self.set_attribute('homos', [homo_alpha])
if self.unrestricted:
self.moenergies.append(numpy.array(energies_beta))
self.homos.append(homo_beta)
self.set_attribute('nmo', len(self.moenergies[0]))
# Molecular orbital coefficients.
# This block comes from `print_orbitals = true/{int}`. Less
# precision than `scf_final_print >= 2` for `mocoeffs`, but
# important for `aonames` and `atombasis`.
if any(header in line
for header in self.alpha_mo_coefficient_headers):
# If we've asked to display more virtual orbitals than
# there are MOs present in the molecule, fix that now.
if hasattr(self, 'nmo') and hasattr(self, 'nalpha') and hasattr(self, 'nbeta'):
self.norbdisp_alpha_aonames = min(self.norbdisp_alpha_aonames, self.nmo)
self.norbdisp_beta_aonames = min(self.norbdisp_beta_aonames, self.nmo)
if not hasattr(self, 'mocoeffs'):
self.mocoeffs = []
if not hasattr(self, 'atombasis'):
self.atombasis = []
for n in range(self.natom):
self.atombasis.append([])
if not hasattr(self, 'aonames'):
self.aonames = []
# We could also attempt to parse `moenergies` here, but
# nothing is gained by it.
mocoeffs = self.parse_matrix_aonames(inputfile, self.nbasis, self.norbdisp_alpha_aonames)
# Only use these MO coefficients if we don't have them
# from `scf_final_print`.
if len(self.mocoeffs) == 0:
self.mocoeffs.append(mocoeffs.transpose())
# Go back through `aonames` to create `atombasis`.
assert len(self.aonames) == self.nbasis
for aoindex, aoname in enumerate(self.aonames):
atomindex = int(self.re_atomindex.search(aoname).groups()[0]) - 1
self.atombasis[atomindex].append(aoindex)
assert len(self.atombasis) == len(self.atomnos)
if 'BETA MOLECULAR ORBITAL COEFFICIENTS' in line:
mocoeffs = self.parse_matrix_aonames(inputfile, self.nbasis, self.norbdisp_beta_aonames)
if len(self.mocoeffs) == 1:
self.mocoeffs.append(mocoeffs.transpose())
# Population analysis.
if 'Ground-State Mulliken Net Atomic Charges' in line:
self.parse_charge_section(inputfile, 'mulliken')
if 'Hirshfeld Atomic Charges' in line:
self.parse_charge_section(inputfile, 'hirshfeld')
if 'Ground-State ChElPG Net Atomic Charges' in line:
self.parse_charge_section(inputfile, 'chelpg')
# Multipole moments are not printed in lexicographical order,
# so we need to parse and sort them. The units seem OK, but there
# is some uncertainty about the reference point and whether it
# can be changed.
#
# Notice how the letter/coordinate labels change to coordinate ranks
# after hexadecapole moments, and need to be translated. Additionally,
# after 9-th order moments the ranks are not necessarily single digits
# and so there are spaces between them.
#
# -----------------------------------------------------------------
# Cartesian Multipole Moments
# LMN = < X^L Y^M Z^N >
# -----------------------------------------------------------------
# Charge (ESU x 10^10)
# 0.0000
# Dipole Moment (Debye)
# X 0.0000 Y 0.0000 Z 0.0000
# Tot 0.0000
# Quadrupole Moments (Debye-Ang)
# XX -50.9647 XY -0.1100 YY -50.1441
# XZ 0.0000 YZ 0.0000 ZZ -58.5742
# ...
# 5th-Order Moments (Debye-Ang^4)
# 500 0.0159 410 -0.0010 320 0.0005
# 230 0.0000 140 0.0005 050 0.0012
# ...
# -----------------------------------------------------------------
#
if "Cartesian Multipole Moments" in line:
# This line does not appear by default, but only when
# `multipole_order` > 4:
line = next(inputfile)
if 'LMN = < X^L Y^M Z^N >' in line:
line = next(inputfile)
# The reference point is always the origin, although normally the molecule
# is moved so that the center of charge is at the origin.
self.reference = [0.0, 0.0, 0.0]
self.moments = [self.reference]
# Watch out! This charge is in statcoulombs without the exponent!
# We should expect very good agreement; however, Q-Chem prints
# the charge only with 5 digits, so expect 1e-4 accuracy.
charge_header = next(inputfile)
assert charge_header.split()[0] == "Charge"
charge = float(next(inputfile).strip())
charge = utils.convertor(charge, 'statcoulomb', 'e') * 1e-10
# Allow this to change until fragment jobs are properly implemented.
# assert abs(charge - self.charge) < 1e-4
# This will make sure Debyes are used (not sure if it can be changed).
line = next(inputfile)
assert line.strip() == "Dipole Moment (Debye)"
while "-----" not in line:
# The current multipole element will be gathered here.
multipole = []
line = next(inputfile)
while ("-----" not in line) and ("Moment" not in line):
cols = line.split()
# The total (norm) is printed for dipole but not other multipoles.
if cols[0] == 'Tot':
line = next(inputfile)
continue
# Find and replace any 'stars' with NaN before moving on.
for i in range(len(cols)):
if '***' in cols[i]:
cols[i] = numpy.nan
# The moments come in pairs (label followed by value) up to the 9-th order,
# although above hexadecapoles the labels are digits representing the rank
# in each coordinate. Above the 9-th order, ranks are not always single digits,
# so there are spaces between them, which means moments come in quartets.
if len(self.moments) < 5:
for i in range(len(cols)//2):
lbl = cols[2*i]
m = cols[2*i + 1]
multipole.append([lbl, m])
elif len(self.moments) < 10:
for i in range(len(cols)//2):
lbl = cols[2*i]
lbl = 'X'*int(lbl[0]) + 'Y'*int(lbl[1]) + 'Z'*int(lbl[2])
m = cols[2*i + 1]
multipole.append([lbl, m])
else:
for i in range(len(cols)//4):
lbl = 'X'*int(cols[4*i]) + 'Y'*int(cols[4*i + 1]) + 'Z'*int(cols[4*i + 2])
m = cols[4*i + 3]
multipole.append([lbl, m])
line = next(inputfile)
# Sort should use the first element when sorting lists,
# so this should simply work, and afterwards we just need
# to extract the second element in each list (the actual moment).
multipole.sort()
multipole = [m[1] for m in multipole]
self.moments.append(multipole)
# For `method = force` or geometry optimizations,
# the gradient is printed.
if any(header in line for header in self.gradient_headers):
if not hasattr(self, 'grads'):
self.grads = []
if 'SCF' in line:
ncolsblock = self.ncolsblock
else:
ncolsblock = 5
grad = QChem.parse_matrix(inputfile, 3, self.natom, ncolsblock)
self.grads.append(grad.T)
# (Static) polarizability from frequency calculations.
if 'Polarizability Matrix (a.u.)' in line:
if not hasattr(self, 'polarizabilities'):
self.polarizabilities = []
polarizability = []
self.skip_line(inputfile, 'index header')
for _ in range(3):
line = next(inputfile)
ss = line.strip()[1:]
polarizability.append([ss[0:12], ss[13:24], ss[25:36]])
# For some reason the sign is inverted.
self.polarizabilities.append(-numpy.array(polarizability, dtype=float))
# For IR-related jobs, the Hessian is printed (dim: 3*natom, 3*natom).
# Note that this is *not* the mass-weighted Hessian.
if any(header in line for header in self.hessian_headers):
if not hasattr(self, 'hessian'):
dim = 3*self.natom
self.hessian = QChem.parse_matrix(inputfile, dim, dim, self.ncolsblock)
# Start of the IR/Raman frequency section.
if 'VIBRATIONAL ANALYSIS' in line:
while 'STANDARD THERMODYNAMIC QUANTITIES' not in line:
## IR, optional Raman:
#
# **********************************************************************
# ** **
# ** VIBRATIONAL ANALYSIS **
# ** -------------------- **
# ** **
# ** VIBRATIONAL FREQUENCIES (CM**-1) AND NORMAL MODES **
# ** FORCE CONSTANTS (mDYN/ANGSTROM) AND REDUCED MASSES (AMU) **
# ** INFRARED INTENSITIES (KM/MOL) **
##** RAMAN SCATTERING ACTIVITIES (A**4/AMU) AND DEPOLARIZATION RATIOS **
# ** **
# **********************************************************************
#
#
# Mode: 1 2 3
# Frequency: -106.88 -102.91 161.77
# Force Cnst: 0.0185 0.0178 0.0380
# Red. Mass: 2.7502 2.8542 2.4660
# IR Active: NO YES YES
# IR Intens: 0.000 0.000 0.419
# Raman Active: YES NO NO
##Raman Intens: 2.048 0.000 0.000
##Depolar: 0.750 0.000 0.000
# X Y Z X Y Z X Y Z
# C 0.000 0.000 -0.100 -0.000 0.000 -0.070 -0.000 -0.000 -0.027
# C 0.000 0.000 0.045 -0.000 0.000 -0.074 0.000 -0.000 -0.109
# C 0.000 0.000 0.148 -0.000 -0.000 -0.074 0.000 0.000 -0.121
# (...)
# H -0.000 -0.000 0.422 -0.000 -0.000 0.499 0.000 0.000 -0.285
# TransDip 0.000 -0.000 -0.000 0.000 -0.000 -0.000 -0.000 0.000 0.021
#
# Mode: 4 5 6
# ...
#
# There isn't any symmetry information for normal modes present
# in Q-Chem.
# if not hasattr(self, 'vibsyms'):
# self.vibsyms = []
if 'Frequency:' in line:
if not hasattr(self, 'vibfreqs'):
self.vibfreqs = []
vibfreqs = map(float, line.split()[1:])
self.vibfreqs.extend(vibfreqs)
if 'IR Intens:' in line:
if not hasattr(self, 'vibirs'):
self.vibirs = []
vibirs = map(float, line.split()[2:])
self.vibirs.extend(vibirs)
if 'Raman Intens:' in line:
if not hasattr(self, 'vibramans'):
self.vibramans = []
vibramans = map(float, line.split()[2:])
self.vibramans.extend(vibramans)
# This is the start of the displacement block.
if line.split()[0:3] == ['X', 'Y', 'Z']:
if not hasattr(self, 'vibdisps'):
self.vibdisps = []
disps = []
for k in range(self.natom):
line = next(inputfile)
numbers = list(map(float, line.split()[1:]))
N = len(numbers) // 3
if not disps:
for n in range(N):
disps.append([])
for n in range(N):
disps[n].append(numbers[3*n:(3*n)+3])
self.vibdisps.extend(disps)
line = next(inputfile)
# Anharmonic vibrational analysis.
# Q-Chem includes 3 theories: VPT2, TOSH, and VCI.
# For now, just take the VPT2 results.
# if 'VIBRATIONAL ANHARMONIC ANALYSIS' in line:
# while list(set(line.strip())) != ['=']:
# if 'VPT2' in line:
# if not hasattr(self, 'vibanharms'):
# self.vibanharms = []
# self.vibanharms.append(float(line.split()[-1]))
# line = next(inputfile)
if 'STANDARD THERMODYNAMIC QUANTITIES AT' in line:
if not hasattr(self, 'temperature'):
self.temperature = float(line.split()[4])
# Not supported yet.
if not hasattr(self, 'pressure'):
self.pressure = float(line.split()[7])
self.skip_line(inputfile, 'blank')
line = next(inputfile)
if self.natom == 1:
assert 'Translational Enthalpy' in line
else:
assert 'Imaginary Frequencies' in line
line = next(inputfile)
# Not supported yet.
assert 'Zero point vibrational energy' in line
if not hasattr(self, 'zpe'):
# Convert from kcal/mol to Hartree/particle.
self.zpe = utils.convertor(float(line.split()[4]),
'kcal/mol', 'hartree')
atommasses = []
while 'Translational Enthalpy' not in line:
if 'Has Mass' in line:
atommass = float(line.split()[6])
atommasses.append(atommass)
line = next(inputfile)
if not hasattr(self, 'atommasses'):
self.atommasses = numpy.array(atommasses)
while line.strip():
line = next(inputfile)
line = next(inputfile)
assert 'Total Enthalpy' in line
if not hasattr(self, 'enthalpy'):
enthalpy = float(line.split()[2])
self.enthalpy = utils.convertor(enthalpy,
'kcal/mol', 'hartree')
line = next(inputfile)
assert 'Total Entropy' in line
if not hasattr(self, 'entropy'):
entropy = float(line.split()[2]) * self.temperature / 1000
# This is the *temperature dependent* entropy.
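# The printed value is presumably in cal/(mol K), so multiplying by the
# temperature and dividing by 1000 gives T*S in kcal/mol before the
# conversion below (an assumption inferred from the arithmetic, not
# verified against the Q-Chem manual).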
self.entropy = utils.convertor(entropy,
'kcal/mol', 'hartree')
if not hasattr(self, 'freeenergy'):
self.freeenergy = self.enthalpy - self.entropy
if line[:16] == ' Total job time:':
self.metadata['success'] = True
# TODO:
# 'enthalpy' (incorrect)
# 'entropy' (incorrect)
# 'freeenergy' (incorrect)
# 'nocoeffs'
# 'nooccnos'
# 'vibanharms'
cclib-1.6.2/cclib/parser/turbomoleparser.py 0000664 0000000 0000000 00000111270 13535330462 0020751 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2018, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Parser for Turbomole output files."""
from __future__ import print_function
import re
import numpy
from cclib.parser import logfileparser
from cclib.parser import utils
class AtomBasis:
def __init__(self, atname, basis_name, inputfile):
self.symmetries=[]
self.coefficients=[]
self.atname=atname
self.basis_name=basis_name
self.parse_basis(inputfile)
def parse_basis(self, inputfile):
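# A sketch of the block layout this loop assumes (illustrative): each
# shell starts with a line of the form '<nprim> <symmetry>', followed by
# <nprim> lines each holding an exponent and a contraction coefficient;
# the whole basis entry is terminated by a line starting with '*'.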
i=0
line=inputfile.next()
while(line[0]!="*"):
(nbasis_text, symm)=line.split()
self.symmetries.append(symm)
nbasis=int(nbasis_text)
coeff_arr=numpy.zeros((nbasis, 2), float)
for j in range(0, nbasis, 1):
line=inputfile.next()
(e1_text, e2_text)=line.split()
coeff_arr[j][0]=float(e1_text)
coeff_arr[j][1]=float(e2_text)
self.coefficients.append(coeff_arr)
line=inputfile.next()
class Turbomole(logfileparser.Logfile):
"""A Turbomole log file."""
def __init__(self, *args, **kwargs):
super(Turbomole, self).__init__(logname="Turbomole", *args, **kwargs)
def __str__(self):
"""Return a string representation of the object."""
return "Turbomole output file %s" % (self.filename)
def __repr__(self):
"""Return a representation of the object."""
return 'Turbomole("%s")' % (self.filename)
def normalisesym(self, label):
"""Normalise the symmetries used by Turbomole."""
raise NotImplementedError('Not yet implemented for Turbomole.')
def before_parsing(self):
self.geoopt = False # Is this a GeoOpt? Needed for SCF targets/values.
self.periodic_table = utils.PeriodicTable()
@staticmethod
def split_molines(inline):
"""Splits the lines containing mocoeffs (each of length 20)
and converts them to float correctly.
"""
line = inline.replace("D", "E")
f1 = line[0:20]
f2 = line[20:40]
f3 = line[40:60]
f4 = line[60:80]
if(len(f4) > 1):
return [float(f1), float(f2), float(f3), float(f4)]
if(len(f3) > 1):
return [float(f1), float(f2), float(f3)]
if(len(f2) > 1):
return [float(f1), float(f2)]
if(len(f1) > 1):
return [float(f1)]
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
## This information is in the control file.
# $rundimensions
# dim(fock,dens)=1860
# natoms=20
# nshell=40
# nbf(CAO)=60
# nbf(AO)=60
# dim(trafo[SAO<-->AO/CAO])=60
# rhfshells=1
if line[3:10]=="natoms=":
self.natom=int(line[10:])
if line[3:11] == "nbf(AO)=":
nmo = int(line.split('=')[1])
self.set_attribute('nbasis', nmo)
self.set_attribute('nmo', nmo)
# Extract the version number and optionally the build number.
searchstr = ": TURBOMOLE"
index = line.find(searchstr)
if index > -1:
line = line[index + len(searchstr):]
tokens = line.split()
self.metadata["package_version"] = tokens[0][1:].replace("-", ".")
# Don't add revision information to the main package version for now.
if tokens[1] == "(":
revision = tokens[2]
## Atomic coordinates in job.last:
# +--------------------------------------------------+
# | Atomic coordinate, charge and isotop information |
# +--------------------------------------------------+
#
#
# atomic coordinates atom shells charge pseudo isotop
# -2.69176330 -0.00007129 -0.44712612 c 3 6.000 0 0
# -1.69851645 -0.00007332 2.06488947 c 3 6.000 0 0
# 0.92683848 -0.00007460 2.49592179 c 3 6.000 0 0
# 2.69176331 -0.00007127 0.44712612 c 3 6.000 0 0
# 1.69851645 -0.00007331 -2.06488947 c 3 6.000 0 0
#...
# -7.04373606 0.00092244 2.74543891 h 1 1.000 0 0
# -9.36352819 0.00017229 0.07445322 h 1 1.000 0 0
# -0.92683849 -0.00007461 -2.49592179 c 3 6.000 0 0
# -1.65164853 -0.00009927 -4.45456858 h 1 1.000 0 0
if 'Atomic coordinate, charge and isotop information' in line:
while 'atomic coordinates' not in line:
line = next(inputfile)
atomcoords = []
atomnos = []
line = next(inputfile)
while len(line) > 2:
atomnos.append(self.periodic_table.number[line.split()[3].upper()])
atomcoords.append([utils.convertor(float(x), "bohr", "Angstrom")
for x in line.split()[:3]])
line = next(inputfile)
self.append_attribute('atomcoords', atomcoords)
self.set_attribute('atomnos', atomnos)
self.set_attribute('natom', len(atomcoords))
# Frequency values in aoforce.out
# mode 7 8 9 10 11 12
#
# frequency 53.33 88.32 146.85 171.70 251.75 289.44
#
# symmetry a a a a a a
#
# IR YES YES YES YES YES YES
# |dDIP/dQ| (a.u.) 0.0002 0.0000 0.0005 0.0004 0.0000 0.0000
# intensity (km/mol) 0.05 0.00 0.39 0.28 0.00 0.00
# intensity ( % ) 0.05 0.00 0.40 0.28 0.00 0.00
#
# RAMAN YES YES YES YES YES YES
#
# 1 c x 0.00000 0.00001 0.00000 -0.01968 -0.04257 0.00001
# y -0.08246 -0.08792 0.02675 -0.00010 0.00000 0.17930
# z 0.00001 0.00003 0.00004 -0.10350 0.11992 -0.00003
if 'NORMAL MODES and VIBRATIONAL FREQUENCIES (cm**(-1))' in line:
vibfreqs, vibsyms, vibirs, vibdisps = [], [], [], []
while '**** force : all done ****' not in line:
if line.strip().startswith('frequency'):
freqs = [float(i.replace('i', '-')) for i in line.split()[1:]]
vibfreqs.extend(freqs)
self.skip_line(inputfile, ['b'])
line = next(inputfile)
if line.strip().startswith('symmetry'):
syms = line.split()[1:]
vibsyms.extend(syms)
self.skip_lines(inputfile, ['b', 'IR', 'dQIP'])
line = next(inputfile)
if line.strip().startswith('intensity (km/mol)'):
irs = [self.float(f) for f in line.split()[2:]]
vibirs.extend(irs)
self.skip_lines(inputfile, ['intensity', 'b', 'raman', 'b'])
line = next(inputfile)
x, y, z = [], [], []
while line.split():
x.append([float(i) for i in line.split()[3:]])
line = next(inputfile)
y.append([float(i) for i in line.split()[1:]])
line = next(inputfile)
z.append([float(i) for i in line.split()[1:]])
line = next(inputfile)
for j in range(len(x[0])):
disps = []
for i in range(len(x)):
disps.append([x[i][j], y[i][j], z[i][j]])
vibdisps.append(disps)
line = next(inputfile)
self.set_attribute('vibfreqs', vibfreqs)
self.set_attribute('vibsyms', vibsyms)
self.set_attribute('vibirs', vibirs)
self.set_attribute('vibdisps', vibdisps)
# In this section we are parsing mocoeffs and moenergies from
# the files like: mos, alpha and beta.
# $scfmo scfconv=6 format(4d20.14)
# # SCF total energy is -382.3457535740 a.u.
# #
# 1 a eigenvalue=-.97461484059799D+01 nsaos=60
# 0.69876828353937D+000.32405121159405D-010.87670894913921D-03-.85232349313288D-07
# 0.19361534257922D-04-.23841194890166D-01-.81711001390807D-020.13626356942047D-02
# ...
# ...
# $end
if (line.startswith('$scfmo') or line.startswith('$uhfmo')) and line.find('scfconv') > 0:
if line.strip().startswith('$uhfmo_alpha'):
self.unrestricted = True
# Need to skip the first line to start with lines starting with '#'.
line = next(inputfile)
while line.strip().startswith('#') and not line.find('eigenvalue') > 0:
line = next(inputfile)
moenergies = []
mocoeffs = []
while not line.strip().startswith('$'):
info = re.match(".*eigenvalue=(?P[0-9D\.+-]{20})\s+nsaos=(?P\d+).*", line)
eigenvalue = self.float(info.group('moenergy'))
orbital_energy = utils.convertor(eigenvalue, 'hartree', 'eV')
moenergies.append(orbital_energy)
single_coeffs = []
nsaos = int(info.group('count'))
while(len(single_coeffs) < nsaos):
line = next(inputfile)
single_coeffs.extend(Turbomole.split_molines(line))
mocoeffs.append(single_coeffs)
line = next(inputfile)
max_nsaos = max([len(i) for i in mocoeffs])
for i in mocoeffs:
while len(i) < max_nsaos:
i.append(numpy.nan)
if not hasattr(self, 'mocoeffs'):
self.mocoeffs = []
if not hasattr(self, 'moenergies'):
self.moenergies = []
self.mocoeffs.append(mocoeffs)
self.moenergies.append(moenergies)
# Parsing the scfenergies, scfvalues and scftargets from job.last file.
# scf convergence criterion : increment of total energy < .1000000D-05
# and increment of one-electron energy < .1000000D-02
#
# ...
# ...
# current damping : 0.700
# ITERATION ENERGY 1e-ENERGY 2e-ENERGY NORM[dD(SAO)] TOL
# 1 -382.34543727790 -1396.8009423 570.56292464 0.000D+00 0.556D-09
# Exc = -57.835278090846 N = 69.997494722
# max. resid. norm for Fia-block= 2.782D-05 for orbital 33a
# ...
# ...
# current damping : 0.750
# ITERATION ENERGY 1e-ENERGY 2e-ENERGY NORM[dD(SAO)] TOL
# 3 -382.34575357399 -1396.8009739 570.56263988 0.117D-03 0.319D-09
# Exc = -57.835593208072 N = 69.999813370
# max. resid. norm for Fia-block= 7.932D-06 for orbital 33a
# max. resid. fock norm = 8.105D-06 for orbital 33a
#
# convergence criteria satisfied after 3 iterations
#
#
# ------------------------------------------
# | total energy = -382.34575357399 |
# ------------------------------------------
# : kinetic energy = 375.67398458525 :
# : potential energy = -758.01973815924 :
# : virial theorem = 1.98255043001 :
# : wavefunction norm = 1.00000000000 :
# ..........................................
if 'scf convergence criterion' in line:
total_energy_threshold = self.float(line.split()[-1])
one_electron_energy_threshold = self.float(next(inputfile).split()[-1])
scftargets = [total_energy_threshold, one_electron_energy_threshold]
self.append_attribute('scftargets', scftargets)
iter_energy = []
iter_one_elec_energy = []
while 'convergence criteria satisfied' not in line:
if 'ITERATION ENERGY' in line:
line = next(inputfile)
info = line.split()
iter_energy.append(self.float(info[1]))
iter_one_elec_energy.append(self.float(info[2]))
line = next(inputfile)
assert len(iter_energy) == len(iter_one_elec_energy), \
'Different number of values found for total energy and one electron energy.'
scfvalues = [[x - y, a - b] for x, y, a, b in
zip(iter_energy[1:], iter_energy[:-1], iter_one_elec_energy[1:], iter_one_elec_energy[:-1])]
self.append_attribute('scfvalues', scfvalues)
while 'total energy' not in line:
line = next(inputfile)
scfenergy = utils.convertor(self.float(line.split()[4]), 'hartree', 'eV')
self.append_attribute('scfenergies', scfenergy)
# **********************************************************************
# * *
# * RHF energy : -74.9644564256 *
# * MP2 correlation energy (doubles) : -0.0365225363 *
# * *
# * Final MP2 energy : -75.0009789619 *
# ...
# * Norm of MP1 T2 amplitudes : 0.0673494687 *
# * *
# **********************************************************************
# OR
# **********************************************************************
# * *
# * RHF energy : -74.9644564256 *
# * correlation energy : -0.0507799360 *
# * *
# * Final CCSD energy : -75.0152363616 *
# * *
# * D1 diagnostic : 0.0132 *
# * *
# **********************************************************************
if 'C C S D F 1 2 P R O G R A M' in line:
while 'ccsdf12 : all done' not in line:
if 'Final MP2 energy' in line:
mp2energy = [utils.convertor(self.float(line.split()[5]), 'hartree', 'eV')]
self.append_attribute('mpenergies', mp2energy)
if 'Final CCSD energy' in line:
ccenergy = [utils.convertor(self.float(line.split()[5]), 'hartree', 'eV')]
self.append_attribute('ccenergies', ccenergy)
line = next(inputfile)
# *****************************************************
# * *
# * SCF-energy : -74.49827196840999 *
# * MP2-energy : -0.19254365976227 *
# * total : -74.69081562817226 *
# * *
# * (MP2-energy evaluated from T2 amplitudes) *
# * *
# *****************************************************
if 'm p g r a d - program' in line:
while 'ccsdf12 : all done' not in line:
if 'MP2-energy' in line:
line = next(inputfile)
if 'total' in line:
mp2energy = [utils.convertor(self.float(line.split()[3]), 'hartree', 'eV')]
self.append_attribute('mpenergies', mp2energy)
line = next(inputfile)
def deleting_modes(self, vibfreqs, vibdisps, vibirs):
"""Deleting frequencies relating to translations or rotations"""
i = 0
while i < len(vibfreqs):
if vibfreqs[i] == 0.0:
# Deleting frequencies that have value 0 since they
# do not correspond to vibrations.
del vibfreqs[i], vibdisps[i], vibirs[i]
i -= 1
i += 1
def after_parsing(self):
if hasattr(self, 'vibfreqs'):
self.deleting_modes(self.vibfreqs, self.vibdisps, self.vibirs)
class OldTurbomole(logfileparser.Logfile):
"""A Turbomole output file. Code is outdated and is not being used."""
def __init__(self, *args):
# Call the __init__ method of the superclass
super(OldTurbomole, self).__init__(logname="Turbomole", *args)
def __str__(self):
"""Return a string representation of the object."""
return "Turbomole output file %s" % (self.filename)
def __repr__(self):
"""Return a representation of the object."""
return 'Turbomole("%s")' % (self.filename)
def atlist(self, atstr):
# turn atstr from atoms section into array
fields=atstr.split(',')
list=[]
for f in fields:
if(f.find('-')!=-1):
rangefields=f.split('-')
start=int(rangefields[0])
end=int(rangefields[1])
for j in range(start, end+1, 1):
list.append(j-1)
else:
list.append(int(f)-1)
return(list)
def normalisesym(self, label):
"""Normalise the symmetries used by Turbomole."""
# This legacy class performs no real normalisation; return the label
# unchanged so the method at least does not raise a NameError.
return label
def before_parsing(self):
self.geoopt = False # Is this a GeoOpt? Needed for SCF targets/values.
def split_molines(self, inline):
line=inline.replace("D", "E")
f1=line[0:20]
f2=line[20:40]
f3=line[40:60]
f4=line[60:80]
if(len(f4)>1):
return( (float(f1), float(f2), float(f3), float(f4)) )
if(len(f3)>1):
return( (float(f1), float(f2), float(f3)) )
if(len(f2)>1):
return( (float(f1), float(f2)) )
if(len(f1)>1):
return([float(f1)])
return
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
if line[3:11]=="nbf(AO)=":
nmo=int(line[11:])
self.nbasis=nmo
self.nmo=nmo
if line[3:9]=="nshell":
temp=line.split('=')
homos=int(temp[1])
if line[0:6] == "$basis":
print("Found basis")
self.basis_lib=[]
line = inputfile.next()
line = inputfile.next()
while line[0] != '*' and line[0] != '$':
temp=line.split()
line = inputfile.next()
while line[0]=="#":
line = inputfile.next()
self.basis_lib.append(AtomBasis(temp[0], temp[1], inputfile))
line = inputfile.next()
if line == "$ecp\n":
self.ecp_lib=[]
line = inputfile.next()
line = inputfile.next()
while line[0] != '*' and line[0] != '$':
fields=line.split()
atname=fields[0]
ecpname=fields[1]
line = inputfile.next()
line = inputfile.next()
fields=line.split()
ncore = int(fields[2])
while line[0] != '*':
line = inputfile.next()
self.ecp_lib.append([atname, ecpname, ncore])
if line[0:6] == "$coord":
if line[0:11] == "$coordinate":
# print "Breaking"
return
# print "Found coords"
self.atomcoords = []
self.atomnos = []
atomcoords = []
atomnos = []
line = inputfile.next()
if line[0:5] == "$user":
# print "Breaking"
return
while line[0] != "$":
temp = line.split()
atsym=temp[3].capitalize()
atomnos.append(self.table.number[atsym])
atomcoords.append([utils.convertor(float(x), "bohr", "Angstrom")
for x in temp[0:3]])
line = inputfile.next()
self.atomcoords.append(atomcoords)
self.atomnos = numpy.array(atomnos, "i")
if line[14:32] == "atomic coordinates":
atomcoords = []
atomnos = []
line = inputfile.next()
while len(line) > 2:
temp = line.split()
atsym = temp[3].capitalize()
atomnos.append(self.table.number[atsym])
atomcoords.append([utils.convertor(float(x), "bohr", "Angstrom")
for x in temp[0:3]])
line = inputfile.next()
if not hasattr(self,"atomcoords"):
self.atomcoords = []
self.atomcoords.append(atomcoords)
self.atomnos = numpy.array(atomnos, "i")
if line[0:6] == "$atoms":
print("parsing atoms")
line = inputfile.next()
self.atomlist=[]
while line[0]!="$":
temp=line.split()
at=temp[0]
atnosstr=temp[1]
while atnosstr[-1] == ",":
line = inputfile.next()
temp=line.split()
atnosstr=atnosstr+temp[0]
# print "Debug:", atnosstr
atlist=self.atlist(atnosstr)
line = inputfile.next()
temp=line.split()
# print "Debug basisname (temp):",temp
basisname=temp[2]
ecpname=''
line = inputfile.next()
while(line.find('jbas')!=-1 or line.find('ecp')!=-1 or
line.find('jkbas')!=-1):
if line.find('ecp')!=-1:
temp=line.split()
ecpname=temp[2]
line = inputfile.next()
self.atomlist.append( (at, basisname, ecpname, atlist))
# I have no idea what this does, so "comment" out
if line[3:10]=="natoms=":
# if 0:
self.natom=int(line[10:])
basistable=[]
for i in range(0, self.natom, 1):
for j in range(0, len(self.atomlist), 1):
for k in range(0, len(self.atomlist[j][3]), 1):
if self.atomlist[j][3][k]==i:
basistable.append((self.atomlist[j][0],
self.atomlist[j][1],
self.atomlist[j][2]))
self.aonames=[]
counter=1
for a, b, c in basistable:
ncore=0
if len(c) > 0:
for i in range(0, len(self.ecp_lib), 1):
if self.ecp_lib[i][0]==a and \
self.ecp_lib[i][1]==c:
ncore=self.ecp_lib[i][2]
for i in range(0, len(self.basis_lib), 1):
if self.basis_lib[i].atname==a and self.basis_lib[i].basis_name==b:
pa=a.capitalize()
basis=self.basis_lib[i]
s_counter=1
p_counter=2
d_counter=3
f_counter=4
g_counter=5
# this is a really ugly piece of code to assign the right labels to
# basis functions on atoms with an ecp
if ncore == 2:
s_counter=2
elif ncore == 10:
s_counter=3
p_counter=3
elif ncore == 18:
s_counter=4
p_counter=4
elif ncore == 28:
s_counter=4
p_counter=4
d_counter=4
elif ncore == 36:
s_counter=5
p_counter=5
d_counter=5
elif ncore == 46:
s_counter=5
p_counter=5
d_counter=6
for j in range(0, len(basis.symmetries), 1):
if basis.symmetries[j]=='s':
self.aonames.append("%s%d_%d%s" % \
(pa, counter, s_counter, "S"))
s_counter=s_counter+1
elif basis.symmetries[j]=='p':
self.aonames.append("%s%d_%d%s" % \
(pa, counter, p_counter, "PX"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, p_counter, "PY"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, p_counter, "PZ"))
p_counter=p_counter+1
elif basis.symmetries[j]=='d':
self.aonames.append("%s%d_%d%s" % \
(pa, counter, d_counter, "D 0"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, d_counter, "D+1"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, d_counter, "D-1"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, d_counter, "D+2"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, d_counter, "D-2"))
d_counter=d_counter+1
elif basis.symmetries[j]=='f':
self.aonames.append("%s%d_%d%s" % \
(pa, counter, f_counter, "F 0"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, f_counter, "F+1"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, f_counter, "F-1"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, f_counter, "F+2"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, f_counter, "F-2"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, f_counter, "F+3"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, f_counter, "F-3"))
elif basis.symmetries[j]=='g':
self.aonames.append("%s%d_%d%s" % \
                                                (pa, counter, g_counter, "G 0"))
self.aonames.append("%s%d_%d%s" % \
                                                (pa, counter, g_counter, "G+1"))
self.aonames.append("%s%d_%d%s" % \
                                                (pa, counter, g_counter, "G-1"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, g_counter, "G+2"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, g_counter, "G-2"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, g_counter, "G+3"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, g_counter, "G-3"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, g_counter, "G+4"))
self.aonames.append("%s%d_%d%s" % \
(pa, counter, g_counter, "G-4"))
break
counter=counter+1
if line=="$closed shells\n":
line = inputfile.next()
temp = line.split()
occs = int(temp[1][2:])
self.homos = numpy.array([occs-1], "i")
if line == "$alpha shells\n":
line = inputfile.next()
temp = line.split()
occ_a = int(temp[1][2:])
line = inputfile.next() # should be $beta shells
line = inputfile.next() # the beta occs
temp = line.split()
occ_b = int(temp[1][2:])
self.homos = numpy.array([occ_a-1,occ_b-1], "i")
if line[12:24]=="OVERLAP(CAO)":
line = inputfile.next()
line = inputfile.next()
overlaparray=[]
self.aooverlaps=numpy.zeros( (self.nbasis, self.nbasis), "d")
while line != " ----------------------\n":
temp=line.split()
overlaparray.extend(map(float, temp))
line = inputfile.next()
counter=0
for i in range(0, self.nbasis, 1):
for j in range(0, i+1, 1):
self.aooverlaps[i][j]=overlaparray[counter]
self.aooverlaps[j][i]=overlaparray[counter]
counter=counter+1
if ( line[0:6] == "$scfmo" or line[0:12] == "$uhfmo_alpha" ) and line.find("scf") > 0:
temp = line.split()
if temp[1][0:7] == "scfdump":
# self.logger.warning("SCF not converged?")
print("SCF not converged?!")
if line[0:12] == "$uhfmo_alpha": # if unrestricted, create flag saying so
unrestricted = 1
else:
unrestricted = 0
self.moenergies=[]
self.mocoeffs=[]
for spin in range(unrestricted + 1): # make sure we cover all instances
title = inputfile.next()
while(title[0] == "#"):
title = inputfile.next()
# mocoeffs = numpy.zeros((self.nbasis, self.nbasis), "d")
moenergies = []
moarray=[]
if spin == 1 and title[0:11] == "$uhfmo_beta":
title = inputfile.next()
while title[0] == "#":
title = inputfile.next()
while(title[0] != '$'):
temp=title.split()
orb_symm=temp[1]
try:
energy = float(temp[2][11:].replace("D", "E"))
except ValueError:
print(spin, ": ", title)
orb_en = utils.convertor(energy,"hartree","eV")
moenergies.append(orb_en)
single_mo = []
                    while len(single_mo) < self.nbasis:
                        title = inputfile.next()
                        single_mo.extend(self.split_molines(title))
                    moarray.append(single_mo)
                    title = inputfile.next()
                self.moenergies.append(moenergies)
                self.mocoeffs.append(moarray)
        # Atomic weights are stored as vibmasses for the frequency output below.
        if line[12:26] == "ATOMIC WEIGHTS":
            self.vibmasses = []
            line = inputfile.next()  # separator line
            line = inputfile.next()  # column headers
            line = inputfile.next()  # first data row
            temp = line.split()
            while len(temp) > 0:
                self.vibmasses.append(float(temp[2]))
                line = inputfile.next()
                temp = line.split()
        if line[5:14] == "frequency":
            # Initialise the vibration attributes only once; subsequent
            # frequency blocks extend these lists.
            if not hasattr(self, "vibfreqs"):
                self.vibfreqs = []
                self.vibsyms = []
                self.vibdisps = []
                self.vibirs = []
temp=line.replace("i","-").split()
freqs = [self.float(f) for f in temp[1:]]
self.vibfreqs.extend(freqs)
line=inputfile.next()
line=inputfile.next()
syms=line.split()
self.vibsyms.extend(syms[1:])
line=inputfile.next()
line=inputfile.next()
line=inputfile.next()
line=inputfile.next()
temp=line.split()
irs = [self.float(f) for f in temp[2:]]
self.vibirs.extend(irs)
line=inputfile.next()
line=inputfile.next()
line=inputfile.next()
line=inputfile.next()
x=[]
y=[]
z=[]
line=inputfile.next()
while len(line) > 1:
temp=line.split()
x.append(map(float, temp[3:]))
line=inputfile.next()
temp=line.split()
y.append(map(float, temp[1:]))
line=inputfile.next()
temp=line.split()
z.append(map(float, temp[1:]))
line=inputfile.next()
# build xyz vectors for each mode
for i in range(0, len(x[0]), 1):
disp=[]
for j in range(0, len(x), 1):
disp.append( [x[j][i], y[j][i], z[j][i]])
self.vibdisps.append(disp)
# line=inputfile.next()
def after_parsing(self):
# delete all frequencies that correspond to translations or rotations
if hasattr(self,"vibfreqs"):
i = 0
while i < len(self.vibfreqs):
if self.vibfreqs[i]==0.0:
del self.vibfreqs[i]
del self.vibdisps[i]
del self.vibirs[i]
del self.vibsyms[i]
i -= 1
i += 1
cclib-1.6.2/cclib/parser/utils.py 0000664 0000000 0000000 00000015225 13535330462 0016667 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2018, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Utilities often used by cclib parsers and scripts"""
import sys
import numpy
# See https://github.com/kachayev/fn.py/commit/391824c43fb388e0eca94e568ff62cc35b543ecb
if sys.version_info <= (3, 3):
import operator
def accumulate(iterable, func=operator.add):
"""Return running totals"""
# accumulate([1,2,3,4,5]) --> 1 3 6 10 15
# accumulate([1,2,3,4,5], operator.mul) --> 1 2 6 24 120
it = iter(iterable)
try:
total = next(it)
except StopIteration:
return
yield total
for element in it:
total = func(total, element)
yield total
else:
from itertools import accumulate
def find_package(package):
"""Check if a package exists without importing it.
Derived from https://stackoverflow.com/a/14050282
"""
if sys.version_info.major == 2:
import pkgutil
return pkgutil.find_loader(package) is not None
else:
import importlib
module_spec = importlib.util.find_spec(package)
return module_spec is not None and module_spec.loader is not None
def symmetrize(m, use_triangle='lower'):
"""Symmetrize a square NumPy array by reflecting one triangular
section across the diagonal to the other.
"""
if use_triangle not in ('lower', 'upper'):
raise ValueError
if not len(m.shape) == 2:
raise ValueError
if not (m.shape[0] == m.shape[1]):
raise ValueError
dim = m.shape[0]
lower_indices = numpy.tril_indices(dim, k=-1)
upper_indices = numpy.triu_indices(dim, k=1)
ms = m.copy()
if use_triangle == 'lower':
ms[upper_indices] = ms[lower_indices]
if use_triangle == 'upper':
ms[lower_indices] = ms[upper_indices]
return ms
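# A short sketch of symmetrize on a toy 2x2 array: with the default 'lower'
# triangle, the element below the diagonal is copied above it.
#
#     symmetrize(numpy.array([[1.0, 0.0],
#                             [2.0, 3.0]]))
#     # -> [[1.0, 2.0],
#     #     [2.0, 3.0]]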
def convertor(value, fromunits, tounits):
"""Convert from one set of units to another.
Sources:
NIST 2010 CODATA (http://physics.nist.gov/cuu/Constants/index.html)
Documentation of GAMESS-US or other programs as noted
"""
_convertor = {
"time_au_to_fs": lambda x: x * 0.02418884,
"fs_to_time_au": lambda x: x / 0.02418884,
"Angstrom_to_bohr": lambda x: x * 1.8897261245,
"bohr_to_Angstrom": lambda x: x * 0.5291772109,
"wavenumber_to_eV": lambda x: x / 8065.54429,
"wavenumber_to_hartree": lambda x: x / 219474.6313708,
"wavenumber_to_kcal/mol": lambda x: x / 349.7550112,
"wavenumber_to_kJ/mol": lambda x: x / 83.5934722814,
"wavenumber_to_nm": lambda x: 1e7 / x,
"wavenumber_to_Hz": lambda x: x * 29.9792458,
"eV_to_wavenumber": lambda x: x * 8065.54429,
"eV_to_hartree": lambda x: x / 27.21138505,
"eV_to_kcal/mol": lambda x: x * 23.060548867,
"eV_to_kJ/mol": lambda x: x * 96.4853364596,
"hartree_to_wavenumber": lambda x: x * 219474.6313708,
"hartree_to_eV": lambda x: x * 27.21138505,
"hartree_to_kcal/mol": lambda x: x * 627.50947414,
"hartree_to_kJ/mol": lambda x: x * 2625.4996398,
"kcal/mol_to_wavenumber": lambda x: x * 349.7550112,
"kcal/mol_to_eV": lambda x: x / 23.060548867,
"kcal/mol_to_hartree": lambda x: x / 627.50947414,
"kcal/mol_to_kJ/mol": lambda x: x * 4.184,
"kJ/mol_to_wavenumber": lambda x: x * 83.5934722814,
"kJ/mol_to_eV": lambda x: x / 96.4853364596,
"kJ/mol_to_hartree": lambda x: x / 2625.49963978,
"kJ/mol_to_kcal/mol": lambda x: x / 4.184,
"nm_to_wavenumber": lambda x: 1e7 / x,
# Taken from GAMESS docs, "Further information",
# "Molecular Properties and Conversion Factors"
"Debye^2/amu-Angstrom^2_to_km/mol": lambda x: x * 42.255,
# Conversion for charges and multipole moments.
"e_to_coulomb": lambda x: x * 1.602176565 * 1e-19,
"e_to_statcoulomb": lambda x: x * 4.80320425 * 1e-10,
"coulomb_to_e": lambda x: x * 0.6241509343 * 1e19,
"statcoulomb_to_e": lambda x: x * 0.2081943527 * 1e10,
"ebohr_to_Debye": lambda x: x * 2.5417462300,
"ebohr2_to_Buckingham": lambda x: x * 1.3450341749,
"ebohr2_to_Debye.ang": lambda x: x * 1.3450341749,
"ebohr3_to_Debye.ang2": lambda x: x * 0.7117614302,
"ebohr4_to_Debye.ang3": lambda x: x * 0.3766479268,
"ebohr5_to_Debye.ang4": lambda x: x * 0.1993134985,
}
return _convertor["%s_to_%s" % (fromunits, tounits)](value)
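# Usage sketch for convertor, using factors from the table above:
#
#     convertor(1.0, "hartree", "eV")      # -> 27.21138505
#     convertor(1.0, "bohr", "Angstrom")   # -> 0.5291772109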
class PeriodicTable(object):
"""Allows conversion between element name and atomic no."""
def __init__(self):
self.element = [
None,
'H', 'He',
'Li', 'Be',
'B', 'C', 'N', 'O', 'F', 'Ne',
'Na', 'Mg',
'Al', 'Si', 'P', 'S', 'Cl', 'Ar',
'K', 'Ca',
'Sc', 'Ti', 'V', 'Cr', 'Mn', 'Fe', 'Co', 'Ni', 'Cu', 'Zn',
'Ga', 'Ge', 'As', 'Se', 'Br', 'Kr',
'Rb', 'Sr',
'Y', 'Zr', 'Nb', 'Mo', 'Tc', 'Ru', 'Rh', 'Pd', 'Ag', 'Cd',
'In', 'Sn', 'Sb', 'Te', 'I', 'Xe',
'Cs', 'Ba',
'La', 'Ce', 'Pr', 'Nd', 'Pm', 'Sm', 'Eu', 'Gd', 'Tb', 'Dy', 'Ho', 'Er', 'Tm', 'Yb',
'Lu', 'Hf', 'Ta', 'W', 'Re', 'Os', 'Ir', 'Pt', 'Au', 'Hg',
'Tl', 'Pb', 'Bi', 'Po', 'At', 'Rn',
'Fr', 'Ra',
'Ac', 'Th', 'Pa', 'U', 'Np', 'Pu', 'Am', 'Cm', 'Bk', 'Cf', 'Es', 'Fm', 'Md', 'No',
'Lr', 'Rf', 'Db', 'Sg', 'Bh', 'Hs', 'Mt', 'Ds', 'Rg', 'Cn',
'Uut', 'Fl', 'Uup', 'Lv', 'Uus', 'Uuo']
self.number = {}
for i in range(1, len(self.element)):
self.number[self.element[i]] = i
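# A quick sketch of PeriodicTable lookups in both directions:
#
#     pt = PeriodicTable()
#     pt.element[6]     # -> 'C'
#     pt.number['C']    # -> 6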
class WidthSplitter:
"""Split a line based not on a character, but a given number of field
widths.
"""
def __init__(self, widths):
self.start_indices = [0] + list(accumulate(widths))[:-1]
self.end_indices = list(accumulate(widths))
def split(self, line, truncate=True):
"""Split the given line using the field widths passed in on class
initialization.
"""
elements = [line[start:end].strip()
for (start, end) in zip(self.start_indices, self.end_indices)]
# Handle lines that contain fewer fields than specified in the
# widths; they are added as empty strings, so remove them.
if truncate:
while len(elements) and elements[-1] == '':
elements.pop()
return elements
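# A minimal sketch of WidthSplitter on a made-up fixed-width line:
#
#     splitter = WidthSplitter([4, 6, 6])
#     splitter.split("   1   0.5   1.5")   # -> ['1', '0.5', '1.5']
#     splitter.split("   1")               # -> ['1'] (trailing empties dropped)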
cclib-1.6.2/cclib/progress/ 0000775 0000000 0000000 00000000000 13535330462 0015520 5 ustar 00root root 0000000 0000000 cclib-1.6.2/cclib/progress/__init__.py 0000664 0000000 0000000 00000000550 13535330462 0017631 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
import sys
if 'PyQt4' in list(sys.modules.keys()):
from cclib.progress.qt4progress import Qt4Progress
from cclib.progress.textprogress import TextProgress
cclib-1.6.2/cclib/progress/qt4progress.py 0000664 0000000 0000000 00000001765 13535330462 0020400 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
from PyQt4 import QtGui, QtCore
class Qt4Progress(QtGui.QProgressDialog):
def __init__(self, title, parent=None):
QtGui.QProgressDialog.__init__(self, parent)
self.nstep = 0
self.text = None
self.oldprogress = 0
self.progress = 0
self.calls = 0
self.loop=QtCore.QEventLoop(self)
self.setWindowTitle(title)
def initialize(self, nstep, text=None):
self.nstep = nstep
self.text = text
self.setRange(0,nstep)
if text:
self.setLabelText(text)
self.setValue(1)
#sys.stdout.write("\n")
def update(self, step, text=None):
if text:
self.setLabelText(text)
self.setValue(step)
self.loop.processEvents(QtCore.QEventLoop.ExcludeUserInputEvents)
cclib-1.6.2/cclib/progress/textprogress.py 0000664 0000000 0000000 00000002502 13535330462 0020642 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
from __future__ import print_function
import sys
class TextProgress:
def __init__(self):
self.nstep = 0
self.text = None
self.oldprogress = 0
self.progress = 0
self.calls = 0
def initialize(self, nstep, text=None):
self.nstep = float(nstep)
self.text = text
#sys.stdout.write("\n")
def update(self, step, text=None):
self.progress = int(step * 100 / self.nstep)
if self.progress/2 >= self.oldprogress/2 + 1 or self.text != text:
            # progress has advanced by at least two percentage points
            # (e.g. from 39 to 41) or the text changed, so redraw the bar
mystr = "\r["
prog = int(self.progress / 10)
mystr += prog * "=" + (10-prog) * "-"
mystr += "] %3i" % self.progress + "%"
if text:
mystr += " "+text
sys.stdout.write("\r" + 70 * " ")
sys.stdout.flush()
sys.stdout.write(mystr)
sys.stdout.flush()
self.oldprogress = self.progress
if self.progress >= 100 and text == "Done":
print(" ")
return
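# A minimal usage sketch (the step counts here are arbitrary):
#
#     progress = TextProgress()
#     progress.initialize(nstep=200, text="Parsing")
#     progress.update(100, "Parsing")   # redraws the bar at 50%
#     progress.update(200, "Done")      # completes the bar and prints a newline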
cclib-1.6.2/cclib/scripts/ 0000775 0000000 0000000 00000000000 13535330462 0015343 5 ustar 00root root 0000000 0000000 cclib-1.6.2/cclib/scripts/__init__.py 0000664 0000000 0000000 00000000402 13535330462 0017450 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2018, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
from . import ccget
from . import ccwrite
from . import cda
cclib-1.6.2/cclib/scripts/ccframe.py 0000664 0000000 0000000 00000004772 13535330462 0017327 0 ustar 00root root 0000000 0000000 #!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Copyright (c) 2019, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Script for writing data tables from computational chemistry files."""
import argparse
import os.path
import sys
from cclib.io import ccopen
from cclib.io import ccframe
from cclib.parser.utils import find_package
_has_pandas = find_package("pandas")
if _has_pandas:
import pandas as pd
def process_logfiles(filenames, output, identifier):
df = ccframe([ccopen(path) for path in filenames])
if output is not None:
outputtype = os.path.splitext(os.path.basename(output))[1][1:]
if not outputtype:
raise RuntimeWarning(
"The output type could not be determined from the given path, "
"not writing DataFrame to disk"
)
if outputtype in {'csv'}:
df.to_csv(output)
elif outputtype in {'h5', 'hdf', 'hdf5'}:
df.to_hdf(output, key=identifier)
elif outputtype in {'json'}:
df.to_json(output)
elif outputtype in {'pickle', 'pkl'}:
df.to_pickle(output)
elif outputtype in {'xlsx'}:
writer = pd.ExcelWriter(output)
# This overwrites previous sheets
# (see https://stackoverflow.com/a/42375263/4039050)
df.to_excel(writer, sheet_name=identifier)
writer.save()
else:
print(df)
def main():
parser = argparse.ArgumentParser()
parser.add_argument('-O', '--output',
help=('the output document to write, including an '
'extension supported by pandas '
'(csv, h5/hdf/hdf5, json, pickle/pkl, xlsx)'))
parser.add_argument('compchemlogfiles', metavar='compchemlogfile',
nargs='+',
help=('one or more computational chemistry output '
'files to parse and convert'))
parser.add_argument('--identifier',
default='logfiles',
help=('name of sheet which will contain DataFrame, if '
'writing to an Excel file, or identifier for '
'the group in HDFStore, if writing a HDF file'))
args = parser.parse_args()
process_logfiles(args.compchemlogfiles, args.output, args.identifier)
if __name__ == "__main__":
main()
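# Example invocation, assuming the ccframe entry point is installed and the
# logfile names below are placeholders:
#
#     ccframe water1.out water2.out -O frames.csv
#
# Without -O/--output the assembled DataFrame is simply printed.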
cclib-1.6.2/cclib/scripts/ccget.py 0000664 0000000 0000000 00000015551 13535330462 0017011 0 ustar 00root root 0000000 0000000 #!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Script for loading data from computational chemistry files."""
from __future__ import print_function
import glob
import logging
import os.path
import sys
from functools import partial
from pprint import pprint
import numpy
from cclib.parser import ccData
from cclib.io import ccread, URL_PATTERN
# Set up options for pretty-printing output.
if sys.version_info < (3, 4):
pprint = partial(pprint, width=120)
else:
pprint = partial(pprint, width=120, compact=True)
numpy.set_printoptions(linewidth=120)
def ccget():
"""Parse files with cclib based on command line arguments."""
import argparse
parser = argparse.ArgumentParser()
parser.add_argument(
"attribute_or_compchemlogfile", nargs="+",
help="one or more attributes to be parsed from one ore more logfiles",
)
group = parser.add_mutually_exclusive_group()
group.add_argument(
"--list", "-l",
action="store_true",
help="print a list of attributes available in each file",
)
group.add_argument(
"--json", "-j",
action="store_true",
help="the given logfile is in CJSON format",
)
group.add_argument(
"--multi", "-m",
action="store_true",
help="parse multiple input files as one input stream",
)
parser.add_argument(
"--verbose", "-v",
action="store_true",
help="more verbose parsing output (only errors by default)",
)
parser.add_argument(
"--future", "-u",
action="store_true",
help="use experimental features (currently optdone_as_list)",
)
parser.add_argument(
"--full", "-f",
action="store_true",
help="toggle full print behaviour for attributes",
)
args = parser.parse_args()
arglist = args.attribute_or_compchemlogfile
showattr = args.list
cjsonfile = args.json
multifile = args.multi
verbose = args.verbose
future = args.future
full = args.full
# Toggle full print behaviour for numpy arrays.
if full:
        numpy.set_printoptions(threshold=sys.maxsize)
# We need at least one attribute and the filename, so two arguments, or
# just one filename if we want to list attributes that can be extracted.
# In multifile mode, we generally want at least two filenames, so the
# expected number of arguments is a bit different.
if not multifile:
correct_number = (not showattr and len(arglist) > 1) or (showattr and len(arglist) > 0)
else:
correct_number = (not showattr and len(arglist) > 2) or (showattr and len(arglist) > 1)
if not correct_number:
print("The number of arguments does not seem to be correct.")
parser.print_usage()
parser.exit(1)
# Figure out which are the attribute names and which are the filenames or links.
# Note that in Linux, the shell expands wild cards, but not so in Windows,
# so try to do that here using glob.
attrnames = []
filenames = []
for arg in arglist:
if arg in ccData._attrlist:
attrnames.append(arg)
elif URL_PATTERN.match(arg) or os.path.isfile(arg):
filenames.append(arg)
else:
wildcardmatches = glob.glob(arg)
if wildcardmatches:
filenames.extend(wildcardmatches)
else:
print("%s is neither a filename nor an attribute name." % arg)
parser.print_usage()
parser.exit(1)
# Since there is some ambiguity to the correct number of arguments, check
# that there is at least one filename (or two in multifile mode), and also
# at least one attribute to parse if the -l option was not passed.
if len(filenames) == 0:
print("No logfiles given")
parser.exit(1)
if multifile and len(filenames) == 1:
print("Expecting at least two logfiles in multifile mode")
parser.exit(1)
if not showattr and len(attrnames) == 0:
print("No attributes given")
parser.exit(1)
# This should be sufficient to correctly handle multiple files, that is to
# run the loop below only once with all logfiles in the variable `filename`.
# Although, perhaps it would be clearer to abstract the contents of the loop
# into another function.
if multifile:
filenames = [filenames]
# Now parse each file and print out the requested attributes.
for filename in filenames:
if multifile:
name = ", ".join(filename[:-1]) + " and " + filename[-1]
else:
name = filename
        # The keyword dictionary is not used much, but could be useful for
# passing options downstream. For example, we might use --future for
# triggering experimental or alternative behavior (as with optdone).
kwargs = {}
if verbose:
kwargs['verbose'] = True
kwargs['loglevel'] = logging.INFO
else:
kwargs['verbose'] = False
kwargs['loglevel'] = logging.ERROR
if future:
kwargs['future'] = True
if cjsonfile:
kwargs['cjson'] = True
print("Attempting to read %s" % name)
data = ccread(filename, **kwargs)
if data is None:
print("Cannot figure out the format of '%s'" % name)
print("Report this to the cclib development team if you think it is an error.")
print("\n" + parser.format_usage())
parser.exit(1)
if showattr:
print("cclib can parse the following attributes from %s:" % name)
if cjsonfile:
for key in data:
print(key)
break
for attr in data._attrlist:
if hasattr(data, attr):
print(" %s" % attr)
else:
invalid = False
for attr in attrnames:
if cjsonfile:
if attr in data:
print("%s:\n%s" % (attr, data[attr]))
continue
else:
if hasattr(data, attr):
print(attr)
attr_val = getattr(data, attr)
# List of attributes to be printed with new lines
if attr in data._listsofarrays and full:
for val in attr_val:
pprint(val)
else:
pprint(attr_val)
continue
print("Could not parse %s from this file." % attr)
invalid = True
if invalid:
parser.print_help()
if __name__ == "__main__":
ccget()
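# Example invocations, assuming the ccget entry point is installed and
# "calc.log" is a placeholder logfile name:
#
#     ccget --list calc.log              # show which attributes can be parsed
#     ccget scfenergies homos calc.log   # print the selected attributes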
cclib-1.6.2/cclib/scripts/ccwrite.py 0000775 0000000 0000000 00000007055 13535330462 0017367 0 ustar 00root root 0000000 0000000 #!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
from __future__ import print_function
import argparse
import logging
import os.path
import sys
from cclib.parser import ccData
from cclib.io import ccopen
from cclib.io import ccwrite
def main():
parser = argparse.ArgumentParser()
parser.add_argument('outputtype',
choices=('json', 'cjson', 'cml', 'xyz', 'molden', 'wfx'),
help='the output format to write (json/cjson are identical)')
parser.add_argument('compchemlogfile',
nargs='+',
help='one or more computational chemistry output files to parse and convert')
parser.add_argument('-v', '--verbose',
action='store_true',
help='more verbose parsing output (only errors by default)')
parser.add_argument('-g', '--ghost',
type=str,
default=None,
help='Symbol to use for ghost atoms')
parser.add_argument('-t', '--terse',
action='store_true',
                        help='write CJSON/JSON without indentation to save space (the output is indented for readability by default)')
parser.add_argument('-u', '--future',
action='store_true',
help='use experimental features (currently optdone_as_list)')
parser.add_argument('-i', '--index',
type=int,
default=None,
help='optional zero-based index for which structure to extract')
args = parser.parse_args()
outputtype = args.outputtype
filenames = args.compchemlogfile
verbose = args.verbose
terse = args.terse
future = args.future
index = args.index
ghost = args.ghost
for filename in filenames:
# We might want to use this option in the near future.
ccopen_kwargs = dict()
if future:
ccopen_kwargs['future'] = True
print("Attempting to parse {}".format(filename))
log = ccopen(filename, **ccopen_kwargs)
if not log:
print("Cannot figure out what type of computational chemistry output file '{}' is.".format(filename))
print("Report this to the cclib development team if you think this is an error.")
sys.exit()
if verbose:
log.logger.setLevel(logging.INFO)
else:
log.logger.setLevel(logging.ERROR)
data = log.parse()
print("cclib can parse the following attributes from {}:".format(filename))
hasattrs = [' {}'.format(attr) for attr in ccData._attrlist if hasattr(data, attr)]
print('\n'.join(hasattrs))
# Write out to disk.
outputdest = '.'.join([os.path.splitext(os.path.basename(filename))[0], outputtype])
ccwrite_kwargs = dict()
if future:
ccwrite_kwargs['future'] = True
if ghost:
ccwrite_kwargs['ghost'] = ghost
# For XYZ files, write the last geometry unless otherwise
# specified.
        if index is None:
index = -1
ccwrite_kwargs['jobfilename'] = filename
# The argument terse presently is only applicable to
# CJSON/JSON formats
ccwrite(data, outputtype, outputdest, indices=index, terse=terse,
**ccwrite_kwargs)
if __name__ == "__main__":
main()
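# Example invocation, assuming the ccwrite entry point is installed and
# "calc.log" is a placeholder logfile name:
#
#     ccwrite xyz calc.log    # parses calc.log and writes calc.xyz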
cclib-1.6.2/cclib/scripts/cda.py 0000664 0000000 0000000 00000004230 13535330462 0016443 0 ustar 00root root 0000000 0000000 #!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
from __future__ import print_function
import logging
from argparse import ArgumentParser
from cclib.io import ccread
from cclib.method import CDA
def main():
parser = ArgumentParser()
parser.add_argument("file1", help="logfile containing the supermolecule")
parser.add_argument("file2", help="logfile containing the first fragment")
parser.add_argument("file3", help="logfile containing the second fragment")
args = parser.parse_args()
loglevel = logging.ERROR
data1 = ccread(args.file1, loglevel=loglevel)
data2 = ccread(args.file2, loglevel=loglevel)
data3 = ccread(args.file3, loglevel=loglevel)
fa = CDA(data1, None, loglevel)
retval = fa.calculate([data2, data3])
if retval:
print("Charge decomposition analysis of {}\n".format(args.file1))
if len(data1.homos) == 2:
print("ALPHA SPIN:")
print("===========")
print(" MO# d b r s")
print("-------------------------------------")
for spin in range(len(data1.homos)):
if spin == 1:
print("\nBETA SPIN:")
print("==========")
for i in range(len(fa.donations[spin])):
print("%4i: %7.3f %7.3f %7.3f %7.3f" % \
(i + 1, fa.donations[spin][i],
fa.bdonations[spin][i],
fa.repulsions[spin][i],
fa.residuals[spin][i]))
if i == data1.homos[spin]:
print("------ HOMO - LUMO gap ------")
print("-------------------------------------")
print(" T: %7.3f %7.3f %7.3f %7.3f" % \
(fa.donations[spin].sum(),
fa.bdonations[spin].sum(),
fa.repulsions[spin].sum(),
fa.residuals[spin].sum()))
if __name__ == '__main__':
main()
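# Example invocation, assuming the cda entry point is installed; the three
# placeholder logfiles are the supermolecule followed by its two fragments:
#
#     cda complex.log fragment1.log fragment2.log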
cclib-1.6.2/doc/ 0000775 0000000 0000000 00000000000 13535330462 0013345 5 ustar 00root root 0000000 0000000 cclib-1.6.2/doc/Makefile 0000664 0000000 0000000 00000000103 13535330462 0014777 0 ustar 00root root 0000000 0000000 default: sphinx
.PHONY: sphinx
sphinx:
$(MAKE) -C sphinx default
cclib-1.6.2/doc/README.md 0000664 0000000 0000000 00000001311 13535330462 0014620 0 ustar 00root root 0000000 0000000 # Documentation for cclib
This directory contains the source of the current official website and documentation for [cclib](https://github.com/cclib/cclib), available on GitHub pages at http://cclib.github.io.
## How to update documentation
The website is generated using [Sphinx](http://sphinx-doc.org/) with some [custom adjustments](https://github.com/cclib/sphinx_rtd_theme/tree/cclib). The [reStructuredText](http://sphinx-doc.org/rest.html) sources are in the `sphinx` subdirectory, and executing `make` places the built website in `sphinx/_build/html`. Some of the content is generated automatically from the cclib code using autodoc or custom Python scripts, and this should be handled by the Makefile.
cclib-1.6.2/doc/sphinx/ 0000775 0000000 0000000 00000000000 13535330462 0014656 5 ustar 00root root 0000000 0000000 cclib-1.6.2/doc/sphinx/.gitignore 0000664 0000000 0000000 00000000176 13535330462 0016652 0 ustar 00root root 0000000 0000000 __pycache__
_build
cclib
*.log
*.pyc
# auto-generated files
attributes.rst
attributes_dev.rst
coverage.rst
coverage_dev.rst
cclib-1.6.2/doc/sphinx/Makefile 0000664 0000000 0000000 00000007275 13535330462 0016331 0 ustar 00root root 0000000 0000000 # Makefile for Sphinx documentation
#
# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = python -msphinx
SPHINXPROJ = cclib
SOURCEDIR = .
BUILDDIR = _build
# Put it first so that "make" without argument is like "make help".
.PHONY: help Makefile
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) && exit 2
.PHONY: default
default: cclib update attributes coverage html
clean:
-rm -rf $(BUILDDIR)/*
-rm -rf cclib*
-rm -rf attributes*.rst
-rm -rf coverage*.rst
PRODUCTION_VERSION = v1.6.2
CCLIB_PROD = $(BUILDDIR)/cclib_prod
CCLIB_DEV = $(BUILDDIR)/cclib_dev
SRC_PROD = $(CCLIB_PROD)/cclib
SRC_DEV = $(CCLIB_DEV)/cclib
# Have the production and master in two separate directories so that
# dependencies are triggered correctly. If we had one, it would not
# work correctly, since git updates timestamps on each checkout, which
# would trigger some targets unnecessarily.
.PHONY: cclib
cclib: $(CCLIB_PROD) $(CCLIB_DEV)
$(CCLIB_PROD):
@echo "Checking out prod..."
git clone https://github.com/cclib/cclib.git $@
cd $(CCLIB_PROD); git checkout $(PRODUCTION_VERSION)
$(CCLIB_DEV):
@echo "Checking out dev..."
git clone -b master https://github.com/cclib/cclib.git $@
.PHONY: update update_prod update_dev
update: update_prod update_dev
update_prod:
@echo "Updating prod..."
cd $(CCLIB_PROD); git checkout $(PRODUCTION_VERSION); git fetch --all --tags
cd $(CCLIB_PROD); find . -name "*.pyc" -delete
update_dev:
@echo "Updating dev..."
cd $(CCLIB_DEV); git pull origin; git fetch --all --tags
cd $(CCLIB_DEV); find . -name "*.pyc" -delete
# Since we have two separate directories, 'checking out' a branch actually
# means making a symlink to the appropriate directory, because scripts should be
# impervious to this Makefile and will import from always the same name (cclib).
.PHONY: checkout_prod checkout_dev
checkout_prod:
-rm -rf cclib
ln -s $(SRC_PROD) cclib
checkout_dev:
-rm -rf cclib
ln -s $(SRC_DEV) cclib
# We need three layers of targets in order to always get the checkout
# to execute, but the actual file-generating target to run only if the
# appropriate files have been updated (resolved via the right symlink).
.PHONY: attributes
attributes: attributes.rst attributes_dev.rst
attributes.rst: attributes.py $(SRC_PROD)/parser/data.py
@echo "Generating prod attributes..."
@$(MAKE) --no-print-directory checkout_prod
python attributes.py > $@
attributes_dev.rst: attributes.py $(SRC_DEV)/parser/data.py
@echo "Generating dev attributes..."
@$(MAKE) --no-print-directory checkout_dev
python attributes.py > $@
# Same as above, we need three layers of targets, since this script
# especially takes some time as it performs all the tests.
.PHONY: coverage
coverage: coverage.rst coverage_dev.rst
coverage.rst: coverage.py $(wildcard $(SRC_PROD)/parser/*parser.py) $(wildcard $(CCLIB_PROD)/test/*.py)
@echo "Generating prod coverage..."
@$(MAKE) --no-print-directory checkout_prod
python coverage.py $(CCLIB_PROD) > $@
coverage_dev.rst: coverage.py $(wildcard $(SRC_DEV)/parser/*parser.py) $(wildcard $(CCLIB_DEV)/test/*.py)
@echo "Generating dev coverage..."
@$(MAKE) --no-print-directory checkout_dev
python coverage.py $(CCLIB_DEV) > $@
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
# %: Makefile
# @$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
# We can't use the default catch-all target, because it traps our
# custom targets that don't use Sphinx, like `attributes`, but it's
# still needed for our specific Sphinx build type.
.PHONY: html
html: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
cclib-1.6.2/doc/sphinx/attributes.py 0000664 0000000 0000000 00000005250 13535330462 0017420 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
"""Generate the attributes.rst and attributes_dev.rst files from the
ccData docstring that describes attributes."""
from __future__ import print_function
from docs_common import check_cclib
import cclib
check_cclib(cclib)
def generate_attributes():
"""Generate a string containing a reStructuredText table
representation of the ccData docstring, which contains a list of
all supported attributes with
1. the name of each attribute,
2. the text definition of each attribute,
3. the attribute's container data type, shape (if relevant), and
4. the physical units for each attribute.
"""
lines = []
# Need to parse the ccData docstring, since only that currently
# contains all the information needed for this table.
data_doc = cclib.parser.data.ccData.__doc__
attributes = [line for line in data_doc.split('\n') if line[:8].strip() == '']
attributes = [line for line in attributes if "--" in line]
# These are the widths of the columns in the table
wattr = 20
wdesc = 65
wunit = 28
wtype = 32
dashes = " "
for w in [wattr, wdesc, wunit, wtype]:
dashes += "="*(w-1) + " "
header = " "
header += "Name".ljust(wattr)
header += "Description".ljust(wdesc)
header += "Units".ljust(wunit)
header += "Data type".ljust(wtype)
lines.append(dashes)
lines.append(header)
lines.append(dashes)
names = []
for line in attributes:
# There is always a double dash after the name.
attr, desc = line.strip().split(' -- ')
names.append(attr)
# The type and unit are in parentheses, but these
# are not always the only parentheses on the line.
other = desc.split('(')[-1]
desc = desc[:-len(other)-1].strip()
other = other.split(')')[0]
# Furthermore, the unit is not always there.
if "," in other:
atype, aunit = other.split(", ")
else:
atype = other
aunit = ''
# Print the line with columns align to the table. Note that
# the description sometimes contain Unicode characters, so
# decode-encode when justifying to get the correct length.
attr = ("`%s`_" % attr).ljust(wattr)
desc = desc.ljust(wdesc)
aunit = aunit.ljust(wunit)
for i in range(1, 4):
atype = atype.replace('[%i]' % i, ' of rank %i' % i)
lines.append(" " + attr + desc + aunit + atype)
lines.append(dashes)
lines.append("")
for n in names:
lines.append(".. _`%s`: data_notes.html#%s" % (n, n))
return "\n".join(lines)
if __name__ == "__main__":
print(generate_attributes())
cclib-1.6.2/doc/sphinx/changelog.rst 0000664 0000000 0000000 00000046521 13535330462 0017347 0 ustar 00root root 0000000 0000000 .. index::
single: development; changelog
Changelog
=========
Changes in cclib-1.6.2
----------------------
**Features**
* Molden writer now supports ghost atoms (Shiv Upadhyay)
* Handle comments in XYZ files when reading and writing
* Updated regression testing framework (Amanda Dumi, Shiv Upadhyay)
* Updated test file versions to GAMESS-US 2018 (Shiv Upadhyay)
**Bugfixes**
* Fixed parsing ORCA output with user comments in coordinates (Felix Plasser)
* Fixed parsing ORCA output with embedding potentials
* Fixed parsing ORCA output with ROCIS in version 4.1
* Fixed parsing etenergies and similar attribute in ORCA for excited states
* Fixed parsing of vibfreqs for ORCA for linear molecules
* Parsing of geometry optimizations in ORCA is more robust with respect to line endings
Changes in cclib-1.6.1
----------------------
**Features**
* New attribute nsocoeffs for natural spin orbital coefficients (Shiv Upadhyay)
* New attribute nsooccnos for natural spin orbital occupation numbers (Shiv Upadhyay)
* New methods: alpha and beta electron counts (Jaime Rodríguez-Guerra)
* Support coreelectrons attribute in Molcas (Kunal Sharma)
* Support etoscs for response calculations in Dalton (Peter Reinholdt)
* Support etenergies for TDDFT in GAMESS
* Support etrotats attribute in ORCA
* Support functional name in metadata for Psi4 (Alessandro Genova)
* Updated testing framework (Jaime Rodríguez-Guerra, Maxim Stolyarchuk and others)
* Updated test file version to QChem 5.1
**Bugfixes**
* Fixed parsing GAMESS output for EOM-CC output
* Fixed parsing Gaussian output for G3 jobs
* Fixed parsing ORCA output for certain invalid inputs (Felipe S. S. Schneider)
* Fixed parsing of mocoeffs in ORCA when they are glued together (Felipe S. S. Schneider)
* Fixed parsing of mocoeffs and vibfreqs in Psi4 (Alessandro Genova)
* Fixed parsing of mocoeffs in Molcas for some files (Shiv Upadhyay)
* Fixed parsing of etsecs in Dalton
* Fixed bond atom indices in CJSON output (Alessandro Genova)
Changes in cclib-1.6
--------------------
**Features**
* New parser: cclib can now parse Molcas files (Kunal Sharma)
* New parser: cclib can now parse Turbomole files (Christopher Rowley, Kunal Sharma)
* New script: ccframe writes data table files from logfiles (Felipe Schneider)
* New method: stoichiometry builds the chemical formula of a system (Jaime Rodríguez-Guerra)
* Support package version in metadata for most parsers
* Support time attribute and BOMD output in Gaussian, NWChem, ORCA and QChem
* Support grads and metadata attributes in ORCA (Jonathon Vandezande)
* Experimental support for CASSCF output in ORCA (Jonathon Vandezande)
* Added entry in metadata for successful completion of jobs
* Updated test file versions to ORCA 4.0
* Update minimum Python3 version to 3.4
**Bugfixes**
* Fixed parsing ORCA output with linear molecules (Jonathon Vandezande)
* Fixed parsing NWChem output with incomplete SCF
Changes in cclib-1.5.3
----------------------
**Features**
* New attribute transprop for electronic transitions (Jonathon Vandezande)
* Support grads attribute in Psi4 (Adam Abbott)
* Support grads attribute in Molpro (Oskar Weser)
* Support optstatus for IRCs and in Psi4 (Emmanuel LaTruelle)
* Updated test file versions to Gaussian16 (Andrew S. Rosen)
* Add ability to write XYZ coordinates for arbitrary indices
**Bugfixes**
* Fixed ccwrite script and added unit tests (Georgy Frolov)
* Fixed closed shell determination for Gaussian (Jaime Rodríguez-Guerra)
* Fixed parsing of natom for >9999 atoms in Gaussian (Jaime Rodríguez-Guerra)
* Fixed parsing of ADF jobs with no title
* Fixed parsing of charge and core electrons when using ECPs in QChem
* Fixed parsing of scfvalues for malformed output in Gaussian
Changes in cclib-1.5.2
----------------------
**Features**
* Support for writing Molden and WFX files (Sagar Gaur)
* Support for thermochemistry attributes in ORCA (Jonathon Vandezande)
* Support for chelpg atomic charges in ORCA (Richard Gowers)
* Updated test file versions to GAMESS-US 2017 (Sagar Gaur)
* Added option to print full arrays with ccget (Sagar Gaur)
**Bugfixes**
* Fixed polarizability parsing bug in DALTON (Maxim Stolyarchuk)
* Fixed IRC parsing in Gaussian for large trajectories (Dénes Berta, LaTruelle)
* Fixed coordinate parsing for heavy elements in ORCA (Jonathon Vandezande)
* Fixed parsing of large mocoeffs in fixed width format for QChem (srtlg)
* Fixed parsing of large polarizabilities in fixed width format for DALTON (Maxim Stolyarchuk)
* Fixed parsing of molecular orbitals in QChem when there are more molecular orbitals than basis functions
Changes in cclib-1.5.1
----------------------
**Features**
* New attribute polarizabilities for static or dynamic dipole polarizability
* New attribute pressure for thermochemistry (renpj)
* Add property to detect closed shells in parsed data
* Handle RPA excited state calculation in ORCA, in addition to TDA
* Support for Python 3.6
**Bugfixes**
* Restore alias cclib.parser.ccopen for backwards compatibility
* Fixed parsing thermochemistry for single atoms in QChem
* Fixed handling of URLs (Alexey Alnatanov)
* Fixed Atom object creation in Biopython bridge (Nitish Garg)
* Fixed ccopen when working with multiple files
Changes in cclib-1.5
--------------------
**Features**
* Support for both reading and writing CJSON (Sanjeed Schamnad)
* New parser: cclib can now parse MOPAC files (Geoff Hutchison)
* New attribute time tracks the time steps of dynamics jobs (Ramon Crehuet)
* New attribute metadata holds miscellaneous information not in other attributes (bwang2453)
* Extract moments attribute for Gaussian (Geoff Hutchison)
* Extract atombasis for ADF in simple cases (Felix Plasser)
* License change to BSD 3-Clause License
**Bugfixes**
* Correct parsing of several attributes for ROHF calculations
* Fixed precision of scfvalues in ORCA
* Fixed MO parsing from older versions of Firefly (mkrompiec)
Changes in cclib-1.4.1
----------------------
**Features**
* Preliminary support for writing CJSON (Sanjeed Schamnad)
* Tentative support for BOMD trajectories in Gaussian (Ramon Crehuet)
* Support for atombasis in ADF (Felix Plasser)
* Support for nocoeffs and nooccnos in Molpro
**Bugfixes**
* Fix for non-standard basis sets in DALTON
* Fix for non-standard MO coefficient printing in GAMESS
Changes in cclib-1.4
--------------------
**Features**
* New parser: cclib can now parse DALTON files
* New parser: cclib can now parse ORCA files
* New attribute optstatus for status during geometry optimizations and scans
* Extract atommasses for GAMESS-US (Sagar Gaur)
* Extract atombasis, gbasis and mocoeffs for QChem
* Extract gbasis for ORCA (Felix Plasser)
* Handle multi-step jobs by parsing only the supersystem
* Improve parsing vibrational symmetries and displacements for Gaussian (mwykes)
* Improve support for compressed files (mwykes)
* Improve and update unit test and regression suites
* Support for Python 3.5
**Bugfixes**
* Fix StopIteration crashes for most parsers
* Fix parsing basis section for Molpro job generated by Avogadro
* Fix parsing multi-job Gaussian output with different orbitals (Geoff Hutchison)
* Fix parsing ORCA geometry optimization with improper internal coordinates (glideht)
* Fix units in atom coordinates parsed from GAMESS-UK files (mwykes)
* Fix test for vibrational frequencies in Turbomole (mwykes)
* Fix parsing vibration symmetries for Molpro (mwykes)
* Fix parsing eigenvectors in GAMESS-US (Alexis Otero-Calvis)
* Fix duplicate parsing of symmetry labels for Gaussian (Martin Peeks)
Changes in cclib-1.3.2
----------------------
**Features**
* New attribute nooccnos for natural orbital occupation numbers
* Read data from XYZ files using Open Babel bridge
* Start basic tests for bridge functionality
**Bugfixes**
* Better handling of ONIOM logfiles in Gaussian (Clyde Fare)
* Fix IR intensity bug in Gaussian parser (Clyde Fare)
* Fix QChem parser for OpenMP output
* Fix parsing TDDFT/RPA transitions (Felix Plasser)
* Fix encoding issues for UTF-8 symbols in parsers and bridges
Changes in cclib-1.3.1
----------------------
**Features**
* New attribute nooccnos for natural orbital occupation numbers
* Read data from XYZ files using Open Babel bridge
* Start basic tests for bridge functionality
**Bugfixes**
* Better handling of ONIOM logfiles in Gaussian (Clyde Fare)
* Fix IR intensity bug in Gaussian parser (Clyde Fare)
* Fix QChem parser for OpenMP output
* Fix parsing TDDFT/RPA transitions (Felix Plasser)
* Fix encoding issues for UTF-8 symbols in parsers and bridges
Changes in cclib-1.3
--------------------
**Features**
* New parser: cclib can now parse NWChem files
* New parser: cclib can now parse Psi (versions 3 and 4) files
* New parser: cclib can now parse QChem files (by Eric Berquist)
* New method: Nuclear (currently calculates the repulsion energy)
* Handle Gaussian basis set output with GFPRINT keyword
* Attribute optdone reverted to single Boolean value by default
* Add --verbose and --future options to ccget and parsers
* Replaced PC-GAMESS test files with newer Firefly versions
* Updated test file versions to GAMESS-UK 8.0
**Bugfixes**
* Handle GAMESS-US file with LZ value analysis (Martin Rahm)
* Handle Gaussian jobs with stars in output (Russell Johnson, NIST)
* Handle ORCA singlet-only TD calculations (May A.)
* Fix parsing of Gaussian jobs with fragments and ONIOM output
* Use UTF-8 encodings for files that need them (Matt Ernst)
Changes in cclib-1.2
--------------------
**Features**
* Move project to GitHub
* Transition to Python 3 (Python 2.7 will still work)
* Add a multifile mode to ccget script
* Extract vibrational displacements for ORCA
* Extract natural atom charges for Gaussian (Fedor Zhuravlev)
* Updated test file versions to ADF2013.01, GAMESS-US 2012, Gaussian09, Molpro 2012 and ORCA 3.0.1
**Bugfixes**
* Ignore Unicode errors in logfiles
* Handle Gaussian jobs with terse output (basis set count not reported)
* Handle Gaussian jobs using IndoGuess (Scott McKechnie)
* Handle Gaussian file with irregular ONIOM gradients (Tamilmani S)
* Handle ORCA file with SCF convergence issue (Melchor Sanchez)
* Handle Gaussian file with problematic IRC output (Clyde Fare)
* Handle ORCA file with AM1 output (Julien Idé)
* Handle GAMESS-US output with irregular frequency format (Andrew Warden)
Changes in cclib-1.1
--------------------
**Features**
* Add progress info for all parsers
* Support ONIOM calculations in Gaussian (Karen Hemelsoet)
* New attribute atomcharges extracts Mulliken and Löwdin atomic charges if present
* New attribute atomspins extracts Mulliken and Löwdin atomic spin densities if present
* New thermodynamic attributes: freeenergy, temperature, enthalpy (Edward Holland)
* Extract PES information: scanenergies, scancoords, scanparm, scannames (Edward Holland)
**Bugfixes**
* Handle coupled cluster energies in Gaussian 09 (Björn Dahlgren)
* Vibrational displacement vectors missing for Gaussian 09 (Björn Dahlgren)
* Fix problem parsing vibrational frequencies in some GAMESS-US files
* Fix missing final scfenergy in ADF geometry optimisations
* Fix missing final scfenergy for ORCA where a specific number of SCF cycles has been specified
* ORCA scfenergies not parsed if COSMO solvent effects included
* Allow spin unrestricted calculations to use the fragment MO overlaps correctly for the MPA and CDA calculations
* Handle Gaussian MO energies that are printed as a row of asterisks (Jerome Kieffer)
* Add more explicit license notices, and allow LGPL versions after 2.1
* Support Firefly calculations where nmo != nbasis (Pavel Solntsev)
* Fix problem parsing vibrational frequency information in recent GAMESS (US) files (Chengju Wang)
* Apply patch from Chengju Wang to handle GAMESS calculations with more than 99 atoms
* Handle Gaussian files with more than 99 atoms having pseudopotentials (Björn Baumeier)
Changes in cclib-1.0.1
----------------------
**Features**
* New attribute atommasses - atomic masses in Dalton
* Added support for Gaussian geometry optimisations that change the number of linearly independent basis functions over the course of the calculation
**Bugfixes**
* Handle triplet PM3 calculations in Gaussian03 (Greg Magoon)
* Some Gaussian09 calculations were missing atomnos (Marius Retegan)
* Handle multiple pseudopotentials in Gaussian03 (Tiago Silva)
* Handle Gaussian calculations with >999 basis functions
* ADF versions > 2007 no longer print overlap info by default
* Handle parsing Firefly calculations that fail
* Fix parsing of ORCA calculation (Marius Retegan)
Changes in cclib-1.0
--------------------
**Features**
* Handle PBC calculations from Gaussian
* Updates to handle Gaussian09
* Support TDDFT calculations from ADF
* A number of improvements for GAMESS support
* ccopen now supports any file-like object with a read() method, so it can parse across HTTP
**Bugfixes**
* Many many additional files parsed thanks to bugs reported by users
Changes in cclib-0.9
--------------------
**Features**
* New parser: cclib can now parse ORCA files
* Added option to use setuptools instead of distutils.core for installing
* Improved handling of CI and TD-DFT data: TD-DFT data extracted from GAMESS and etsecs standardised across all parsers
* Test suite changed to include output from only the newest program versions
**Bugfixes**
* A small number of parsing errors were fixed
Changes in cclib-0.8
--------------------
**Features**
* New parser: cclib can now parse Molpro files
* Separation of parser and data objects: Parsed data is now returned is a ccData object that can be pickled, and converted to and from JSON
* Parsers: multiple files can be parsed with one parse command
* NumPy support: Dropped Numeric support in favour of NumPy
* API addition: 'charge' for molecular charge
* API addition: 'mult' for spin multiplicity
* API addition: 'atombasis' for indices of atom orbitals on each atom
* API addition: 'nocoeffs' for Natural Orbital (NO) coefficients
* GAMESS-US parser: added 'etoscs' (CIS calculations)
* Jaguar parser: added 'mpenergies' (LMP2 calculations)
* Jaguar parser: added 'etenergies' and 'etoscs' (CIS calculations)
* New method: Lowdin Population Analysis (LPA)
* Tests: unittests can be run from the Python interpreter, and for a single parser; the number of "passed" tests is also counted and shown
**Bugfixes**
* Several parsing errors were fixed
* Fixed some methods to work with different numbers of alpha and beta MO coefficients in mocoeffs (MPA, CSPA, OPA)
Changes in cclib-0.7
--------------------
**Features**
* New parser: cclib can now parse Jaguar files
* ccopen: Can handle log files which have been compressed into .zip, .bz2 or .gz files.
* API addition: 'gbasis' holds the Gaussian basis set
* API addition: 'coreelectrons' contains the number of core electrons in each atom's pseudopotential
* API addition: 'mpenergies' holds the Moller-Plesset corrected molecular electronic energies
* API addition: 'vibdisps' holds the Cartesian displacement vectors
* API change: 'mocoeffs' is now a list of rank 2 arrays, rather than a rank 3 array
* API change: 'moenergies' is now a list of rank 1 arrays, rather than rank 2 array
* GAMESS-UK parser: added 'vibramans'
* New method: Charge Decomposition Analysis (CDA) for studying electron donation, back donation, and repulsion between fragments in a molecule
* New method: Fragment Analysis for studying bonding interactions between two or more fragments in a molecule
* New method: Ability to calculate the electron density or wavefunction
**Bugfixes**
* GAMESS parser:
- Failed to parse frequency calculation with imaginary frequencies
- Rotations and translations now not included in frequencies
- Failed to parse a DFT calculation
* GAMESS-UK parser:
- 'atomnos' not being extracted
- Rotations and translations now not included in frequencies
* bridge to Open Babel: No longer dependent on pyopenbabel
Changes in cclib-0.6.1
----------------------
**Bugfixes**
* cclib: The "import cclib.parsers" statement failed due to references to Molpro and Jaguar parsers which are not present
* Gaussian parser: Failed to parse single point calculations where the input coords are a z-matrix, and symmetry is turned off.
Changes in cclib-0.6.0
----------------------
**Features**
* ADF parser: If some MO eigenvalues are not present, the parser does not fail, but uses values of 99999 instead and A symmetry
**Bugfixes**
* ADF parser: The following bugs have been fixed
  - P/D orbitals for single atoms not handled correctly
  - Problem parsing homos in unrestricted calculations
  - Problem skipping the Create sections in certain calculations
* Gaussian parser: The following bugs have been fixed
  - Parser failed if standard orientation not found
* ccget: aooverlaps not included when using --list option
Changes in cclib-0.6b
---------------------
**Features**
* New parser: GAMESS-UK parser
* API addition: the .clean() method; the .clean() method of a parser clears all of the parsed attributes. This is useful if you need to reparse during the course of a calculation.
* Function rename: guesstype() has been renamed to ccopen()
* Speed up: Calculation of Overlap Density of States has been sped up by two orders of magnitude
**Bugfixes**
* ccopen: Minor problems fixed with identification of log files
* ccget: Passing multiple filenames now works on Windows too
* ADF parser: The following bugs have been fixed
- Problem with parsing SFOs in certain log files
- Handling of molecules with orbitals of E symmetry
- Couldn't find the HOMO in log files from new versions of ADF
- Parser used to miss attributes if SCF not converged
- For a symmetrical molecule, mocoeffs were in the wrong order and the homo was not identified correctly if degenerate
* Gaussian parser: The following bugs have been fixed
- SCF values was not extracting the dEnergy value
- Was extracting Depolar P instead of Raman activity
Changes in cclib-0.5
--------------------
**Features**
* (src/scripts/ccget): Added handling of multiple filenames. It's now possible to use ccget as follows: ``ccget *.log``. This is a good way of checking out whether cclib is able to parse all of the files in a given directory. Also possible is: ``ccget homos *.log``.
* Change of license: Changed license from GPL to LGPL
**Bugfixes**
* src/cclib/parser/gamessparser.py: gamessparser was dying on GAMESS VERSION = 12 DEC 2003 gopts, as it was unable to parse the scftargets.
* src/cclib/parser/gamessparser.py: Remove assertion to catch instances where scftargets is unset. This occurs in the case of failed calculations (e.g. wrong multiplicity).
* src/cclib/parser/adfparser.py: Fixed one of the errors with the Mo5Obdt2-c2v-opt.adfout example, which had to do with the SFOs being made of more than two combinations of atoms (4, because of rotation in c2v point group). At least one error is still present with atomcoords. It looks like non-coordinate integers are being parsed as well, which makes some of the atomcoords list have more than the 3 values for x,y,z.
* src/cclib/parser/adfparser.py: Hopefully fixed the last error in Mo5Obdt2-c2v-opt. Problem was that it was adding line.split()[5:], but sometimes there was more than 3 fields left, so it was changed to [5:8]. Need to check actual parsed values to make sure it is parsed correctly.
* data/Gaussian, logfiledist, src/cclib/parser/gaussianparser.py, test/regression.py: Bug fix: Mo4OSibdt2-opt.log has no atomcoords despite being a geo-opt. This was due to the fact that the parser was extracting "Input orientation" and not "Standard orientation". It's now changed to "Standard orientation" which works for all of the files in the repository.
cclib-1.6.2/doc/sphinx/conf.py 0000664 0000000 0000000 00000017761 13535330462 0016171 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# cclib documentation build configuration file, created by
# sphinx-quickstart on Thu Jan 30 14:44:27 2014.
#
# This file is execfile()d with the current directory set to its containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.
import sys, os
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#sys.path.insert(0, os.path.abspath('.'))
# -- General configuration -----------------------------------------------------
# If your documentation needs a minimal Sphinx version, state it here.
#needs_sphinx = '1.0'
# Add any Sphinx extension module names here, as strings. They can be extensions
# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
extensions = ['sphinx.ext.mathjax']
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix of source filenames.
source_suffix = '.rst'
# The encoding of source files.
#source_encoding = 'utf-8-sig'
# The master toctree document.
master_doc = 'contents'
# General information about the project.
project = u'cclib'
copyright = u'2014-2018, cclib Development Team'
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = '1.6'
# The full version, including alpha/beta/rc tags.
release = '1.6.2'
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#language = None
# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
#today = ''
# Else, today_fmt is used as the format for a strftime call.
#today_fmt = '%B %d, %Y'
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
exclude_patterns = ['_build', '_themes']
# The reST default role (used for this markup: `text`) to use for all documents.
#default_role = None
# If true, '()' will be appended to :func: etc. cross-reference text.
#add_function_parentheses = True
# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
#add_module_names = True
# If true, sectionauthor and moduleauthor directives will be shown in the
# output. They are ignored by default.
#show_authors = False
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'
# A list of ignored prefixes for module index sorting.
#modindex_common_prefix = []
# -- Options for HTML output ---------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
html_theme = 'sphinx_rtd_theme'
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#html_theme_options = {}
# Add any paths that contain custom themes here, relative to this directory.
html_theme_path = ['_themes/sphinx_rtd_theme']
# The name for this set of Sphinx documents. If None, it defaults to
# " v documentation".
#html_title = None
# A shorter title for the navigation bar. Default is the same as html_title.
#html_short_title = None
# The name of an image file (relative to this directory) to place at the top
# of the sidebar.
#html_logo = None
# The name of an image file (within the static path) to use as favicon of the
# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
# pixels large.
#html_favicon = None
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
#html_last_updated_fmt = '%b %d, %Y'
# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
#html_use_smartypants = True
# Custom sidebar templates, maps document names to template names.
#html_sidebars = {}
# Additional templates that should be rendered to pages, maps page names to
# template names.
#html_additional_pages = {}
# If false, no module index is generated.
#html_domain_indices = True
# If false, no index is generated.
#html_use_index = True
# If true, the index is split into individual pages for each letter.
#html_split_index = False
# If true, links to the reST sources are added to the pages.
#html_show_sourcelink = True
# If true, "Created using Sphinx" is shown in the HTML footer. Default is True.
#html_show_sphinx = True
# If true, "(C) Copyright ..." is shown in the HTML footer. Default is True.
#html_show_copyright = True
# If true, an OpenSearch description file will be output, and all pages will
# contain a tag referring to it. The value of this option must be the
# base URL from which the finished HTML is served.
#html_use_opensearch = ''
# This is the file name suffix for HTML files (e.g. ".xhtml").
#html_file_suffix = None
# Output file base name for HTML help builder.
htmlhelp_basename = 'cclibdoc'
# -- Options for LaTeX output --------------------------------------------------
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#'papersize': 'letterpaper',
# The font size ('10pt', '11pt' or '12pt').
#'pointsize': '10pt',
# Additional stuff for the LaTeX preamble.
#'preamble': '',
}
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title, author, documentclass [howto/manual]).
latex_documents = [
('index', 'cclib.tex', u'cclib Documentation',
u'cclib Development Team', 'manual'),
]
# The name of an image file (relative to this directory) to place at the top of
# the title page.
#latex_logo = None
# For "manual" documents, if this is true, then toplevel headings are parts,
# not chapters.
#latex_use_parts = False
# If true, show page references after internal links.
#latex_show_pagerefs = False
# If true, show URL addresses after external links.
#latex_show_urls = False
# Documents to append as an appendix to all manuals.
#latex_appendices = []
# If false, no module index is generated.
#latex_domain_indices = True
# -- Options for manual page output --------------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
('index', 'cclib', u'cclib Documentation',
[u'cclib Development Team'], 1)
]
# If true, show URL addresses after external links.
#man_show_urls = False
# -- Options for Texinfo output ------------------------------------------------
# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
('index', 'cclib', u'cclib Documentation',
u'cclib Development Team', 'cclib', 'One line description of project.',
'Miscellaneous'),
]
# Documents to append as an appendix to all manuals.
#texinfo_appendices = []
# If false, no module index is generated.
#texinfo_domain_indices = True
# How to display URL addresses: 'footnote', 'no', or 'inline'.
#texinfo_show_urls = 'footnote'
# Update template context with project information.
context = {
'conf_py_path': '/sphinx/',
'github_user': 'cclib',
'github_repo': 'cclib.github.io',
'github_version': 'master',
'display_github': True,
'source_suffix': source_suffix,
}
if 'html_context' in globals():
html_context.update(context)
else:
html_context = context
cclib-1.6.2/doc/sphinx/contents.rst 0000664 0000000 0000000 00000000412 13535330462 0017242 0 ustar 00root root 0000000 0000000 .. _contents:
Table of Contents
=================
.. toctree::
:maxdepth: 2
index
how_to_install
how_to_parse
data
data_notes
methods
development
data_dev
changelog
Indices and tables
==================
* :ref:`genindex`
cclib-1.6.2/doc/sphinx/coverage.py 0000664 0000000 0000000 00000010547 13535330462 0017032 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
#
# Copyright (c) 2019, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Generate the coverage.rst and coverage.rst files from test results."""
from __future__ import print_function
import os
import sys
from docs_common import check_cclib
# Import cclib and check we are using the version from a subdirectory.
import cclib
check_cclib(cclib)
def generate_coverage():
"""Generate a string containing a reStructuredTest table
representation of which parsers support which attributes, based on
test results.
"""
lines = []
# Change directory to where tests are and add it to the path. Because there are
# separate directories for different branches/versions, and we use a symlink to
# point to the one we want, we need to test the real path this link resolves to.
if "cclib_prod" in os.path.realpath('cclib'):
testpath = "_build/cclib_prod"
else:
assert "cclib_dev" in os.path.realpath('cclib')
testpath = "_build/cclib_dev"
os.chdir(testpath)
thispath = os.path.dirname(os.path.realpath(__file__))
sys.path.insert(1, thispath)
from test.test_data import (all_modules, all_parsers, parser_names, DataSuite)
import inspect
ds_args = inspect.getargspec(DataSuite.__init__).args
logpath = thispath + "/coverage.tests.log"
try:
with open(logpath, "w") as flog:
stdout_backup = sys.stdout
sys.stdout = flog
alltests = {}
for p in parser_names:
assert 'parsers' in ds_args
suite = DataSuite(parsers={p: all_parsers[p]}, modules=all_modules, stream=flog)
suite.testall()
alltests[p] = [{'data': t.data} for t in suite.alltests]
sys.stdout = stdout_backup
except Exception as e:
print("Unit tests did not run correctly. Check log file for errors:")
with open(logpath) as fh:
print(fh.read())
print(e)
sys.exit(1)
ncols = len(parser_names) + 1
colwidth = 20
colfmt = "%%-%is" % colwidth
dashes = ("=" * (colwidth - 1) + " ") * ncols
lines.append(dashes)
lines.append(colfmt * ncols % tuple(["attributes"] + parser_names))
lines.append(dashes)
# Eventually we want to move this to cclib, too.
not_applicable = {
'ADF' : ['aonames', 'ccenergies', 'mpenergies'],
'DALTON' : ['fonames', 'fooverlaps', 'fragnames', 'frags'],
'GAMESS' : ['fonames', 'fooverlaps', 'fragnames', 'frags'],
'GAMESSUK' : ['fonames', 'fooverlaps', 'fragnames', 'frags'],
'Gaussian' : ['fonames', 'fooverlaps', 'fragnames', 'frags'],
'Jaguar' : ['fonames', 'fooverlaps', 'fragnames', 'frags'],
'Molpro' : ['fonames', 'fooverlaps', 'fragnames', 'frags'],
'NWChem' : ['fonames', 'fooverlaps', 'fragnames', 'frags'],
'ORCA' : ['fonames', 'fooverlaps', 'fragnames', 'frags'],
'Psi' : ['fonames', 'fooverlaps', 'fragnames', 'frags'],
'QChem' : ['fonames', 'fooverlaps', 'fragnames', 'frags'],
}
not_possible = {
'Psi' : ['aooverlaps', 'vibirs'],
'QChem' : ['aooverlaps', 'etrotats'],
}
# For each attribute, get a list of Boolean values for each parser that flags
# if it has been parsed by at least one unit test. Substitute an OK sign or
    # T/D appropriately, with the exception of attributes that have been explicitly
# designated as N/A.
attributes = sorted(cclib.parser.data.ccData._attrlist)
for attr in attributes:
parsed = [any([attr in t['data'].__dict__ for t in alltests[p]]) for p in parser_names]
for ip, p in enumerate(parsed):
if p:
parsed[ip] = "√"
else:
if attr in not_applicable.get(parser_names[ip], []):
parsed[ip] = "N/A"
elif attr in not_possible.get(parser_names[ip], []):
parsed[ip] = "N/P"
else:
parsed[ip] = "T/D"
lines.append(colfmt*ncols % tuple(["`%s`_" % attr] + parsed))
lines.append(dashes)
lines.append("")
for attr in attributes:
lines.append(".. _`%s`: data_notes.html#%s" % (attr, attr))
return "\n".join(lines)
if __name__ == "__main__":
print(generate_coverage())
cclib-1.6.2/doc/sphinx/data.rst 0000664 0000000 0000000 00000002300 13535330462 0016314 0 ustar 00root root 0000000 0000000 Parsed data (version |release|)
===============================
This is a list of all the data parsed by the current official release of cclib (namely version |release|). For the same list for the development version, see `development parsed data`_. For details and miscellaneous notes on these attributes, see the `data notes`_ page.
.. _`development parsed data`: data_dev.html
.. _`data notes`: data_notes.html
Description of parsed data
--------------------------
Click the attribute name in the table below to go to the notes and specifications for a particular attribute. All arrays are Numpy arrays of type 'd' (if containing floats) or 'i' (if containing integers).
.. include:: attributes.rst
Details of current implementation
---------------------------------
The autogenerated table below details which attributes are supported by which parsers based on unit tests. Note that only actively maintained parsers are listed here, although legacy parsers are still tested against old data as regressions. To see the full list of parsers, see the `summary`_ page.
**N/A** = not applicable, **N/P** = applicable, but not possible, **T/D** = to do
.. include:: coverage.rst
.. _`summary`: index.html#summary
cclib-1.6.2/doc/sphinx/data_dev.rst 0000664 0000000 0000000 00000001564 13535330462 0017165 0 ustar 00root root 0000000 0000000 Development parsed data
=======================
This is a list of all the data parsed by the `current development code of cclib`_. For the same information for the current official release (version |release|), see the regular `parsed data`_ page. Note that the information on this page may be outdated.
.. _`current development code of cclib`: https://github.com/cclib/cclib
.. _`parsed data`: data.html
Description of parsed data
--------------------------
Click the attribute name in the table below to go to the notes and specifications for a particular attribute. All arrays are Numpy arrays of type 'd' (if containing floats) or 'i' (if containing integers).
.. include:: attributes_dev.rst
Details of current implementation
---------------------------------
**N/A** = not applicable, **N/P** = applicable, but not possible, **T/D** = to do
.. include:: coverage_dev.rst
cclib-1.6.2/doc/sphinx/data_notes.rst 0000664 0000000 0000000 00000113640 13535330462 0017536 0 ustar 00root root 0000000 0000000 .. index::
module: data_notes
Parsed data notes
=================
This is a list of descriptions and notes for all the data attributes currently parsed by cclib, either in the official release (|release|) or development branch. In particular, this page contains technical details about the interpretation of attributes, how to produce them in the various programs and examples in some cases. For a summary and details of the current implementation by the different parsers, please see the `extracted data`_ page and its `development`_ version.
.. _`extracted data`: data.html
.. _`development`: data_dev.html
aonames
-------
This attribute contains the atomic orbital names. These are not normalised as the following examples show, although a reasonable attempt is made to get them close to each other. Users will need to know what each orbital is by knowing the basis set inside out, rather than relying on this data. Such is life, as GAMESS does not provide enough information.
* Gaussian gives names of the form::
['C1_1S', 'C1_2S', 'C1_2PX', 'C1_2PY', 'C1_2PZ', 'C2_1S', 'C2_2S', 'C2_2PX', 'C2_2PY', 'C2_2PZ', 'C3_1S', 'C3_2S', 'C3_2PX', 'C3_2PY', 'C3_2PZ', 'C4_1S', 'C4_2S', 'C4_2PX', 'C4_2PY', 'C4_2PZ', 'C5_1S', 'C5_2S', 'C5_2PX', 'C5_2PY', 'C5_2PZ', 'H6_1S', 'H7_1S', 'H8_1S', 'C9_1S', 'C9_2S', 'C9_2PX', 'C9_2PY', 'C9_2PZ', 'C10_1S', 'C10_2S', 'C10_2PX', 'C10_2PY', 'C10_2PZ', 'H11_1S', 'H12_1S', 'H13_1S', 'C14_1S', 'C14_2S', 'C14_2PX', 'C14_2PY', 'C14_2PZ', 'H15_1S', 'C16_1S', 'C16_2S', 'C16_2PX', 'C16_2PY', 'C16_2PZ', 'H17_1S', 'H18_1S', 'C19_1S', 'C19_2S', 'C19_2PX', 'C19_2PY', 'C19_2PZ', 'H20_1S']
* GAMESS gives names of the form::
['C1_1S', 'C1_2S', 'C1_3X', 'C1_3Y', 'C1_3Z', 'C2_1S', 'C2_2S', 'C2_3X', 'C2_3Y', 'C2_3Z', 'C3_1S', 'C3_2S', 'C3_3X', 'C3_3Y', 'C3_3Z', 'C4_1S', 'C4_2S', 'C4_3X', 'C4_3Y', 'C4_3Z', 'C5_1S', 'C5_2S', 'C5_3X', 'C5_3Y', 'C5_3Z', 'C6_1S', 'C6_2S', 'C6_3X', 'C6_3Y', 'C6_3Z', 'H7_1S', 'H8_1S', 'H9_1S', 'H10_1S', 'C11_1S', 'C11_2S', 'C11_3X', 'C11_3Y', 'C11_3Z', 'C12_1S', 'C12_2S', 'C12_3X', 'C12_3Y', 'C12_3Z', 'H13_1S', 'H14_1S', 'C15_1S', 'C15_2S', 'C15_3X', 'C15_3Y', 'C15_3Z', 'C16_1S', 'C16_2S', 'C16_3X', 'C16_3Y', 'C16_3Z', 'H17_1S', 'H18_1S', 'H19_1S', 'H20_1S']
And for a large basis set calculation on a single C atom:
* Gaussian::
['C1_1S', 'C1_2S', 'C1_3S', 'C1_4S', 'C1_5S', 'C1_6PX', 'C1_6PY', 'C1_6PZ', 'C1_7PX', 'C1_7PY', 'C1_7PZ', 'C1_8PX', 'C1_8PY', 'C1_8PZ', 'C1_9PX', 'C1_9PY', 'C1_9PZ', 'C1_10D 0', 'C1_10D+1', 'C1_10D-1', 'C1_10D+2', 'C1_10D-2', 'C1_11D 0', 'C1_11D+1', 'C1_11D-1', 'C1_11D+2', 'C1_11D-2', 'C1_12D 0', 'C1_12D+1', 'C1_12D-1', 'C1_12D+2', 'C1_12D-2', 'C1_13F 0', 'C1_13F+1', 'C1_13F-1', 'C1_13F+2', 'C1_13F-2', 'C1_13F+3', 'C1_13F-3', 'C1_14F 0', 'C1_14F+1', 'C1_14F-1', 'C1_14F+2', 'C1_14F-2', 'C1_14F+3', 'C1_14F-3', 'C1_15G 0', 'C1_15G+1', 'C1_15G-1', 'C1_15G+2', 'C1_15G-2', 'C1_15G+3', 'C1_15G-3', 'C1_15G+4', 'C1_15G-4', 'C1_16S', 'C1_17PX', 'C1_17PY', 'C1_17PZ', 'C1_18D 0', 'C1_18D+1', 'C1_18D-1', 'C1_18D+2', 'C1_18D-2', 'C1_19F 0', 'C1_19F+1', 'C1_19F-1', 'C1_19F+2', 'C1_19F-2', 'C1_19F+3', 'C1_19F-3', 'C1_20G 0', 'C1_20G+1', 'C1_20G-1', 'C1_20G+2', 'C1_20G-2', 'C1_20G+3', 'C1_20G-3', 'C1_20G+4', 'C1_20G-4']
* GAMESS::
['C1_1S', 'C1_2S', 'C1_3S', 'C1_4S', 'C1_5S', 'C1_6X', 'C1_6Y', 'C1_6Z', 'C1_7X', 'C1_7Y', 'C1_7Z', 'C1_8X', 'C1_8Y', 'C1_8Z', 'C1_9X', 'C1_9Y', 'C1_9Z', 'C1_10XX', 'C1_10YY', 'C1_10ZZ', 'C1_10XY', 'C1_10XZ', 'C1_10YZ', 'C1_11XX', 'C1_11YY', 'C1_11ZZ', 'C1_11XY', 'C1_11XZ', 'C1_11YZ', 'C1_12XX', 'C1_12YY', 'C1_12ZZ', 'C1_12XY', 'C1_12XZ', 'C1_12YZ', 'C1_13XXX', 'C1_13YYY', 'C1_13ZZZ', 'C1_13XXY','C1_13XXZ', 'C1_13YYX', 'C1_13YYZ', 'C1_13ZZX', 'C1_13ZZY', 'C1_13XYZ', 'C1_14XXX', 'C1_14YYY', 'C1_14ZZZ', 'C1_14XXY', 'C1_14XXZ', 'C1_14YYX', 'C1_14YYZ', 'C1_14ZZX', 'C1_14ZZY', 'C1_14XYZ', 'C1_15XXXX', 'C1_15YYYY', 'C1_15ZZZZ', 'C1_15XXXY', 'C1_15XXXZ', 'C1_15YYYX', 'C1_15YYYZ', 'C1_15ZZZX', 'C1_15ZZZY', 'C1_15XXYY', 'C1_15XXZZ', 'C1_15YYZZ', 'C1_15XXYZ', 'C1_15YYXZ', 'C1_15ZZXY', 'C1_16S', 'C1_17S', 'C1_18S', 'C1_19X', 'C1_19Y', 'C1_19Z', 'C1_20X', 'C1_20Y', 'C1_20Z', 'C1_21X', 'C1_21Y', 'C1_21Z', 'C1_22XX', 'C1_22YY', 'C1_22ZZ', 'C1_22XY', 'C1_22XZ', 'C1_22YZ', 'C1_23XX', 'C1_23YY', 'C1_23ZZ', 'C1_23XY', 'C1_23XZ', 'C1_23YZ', 'C1_24XXX', 'C1_24YYY', 'C1_24ZZZ', 'C1_24XXY', 'C1_24XXZ', 'C1_24YYX', 'C1_24YYZ', 'C1_24ZZX', 'C1_24ZZY', 'C1_24XYZ', 'C1_25S', 'C1_26X', 'C1_26Y', 'C1_26Z', 'C1_27XX', 'C1_27YY', 'C1_27ZZ', 'C1_27XY', 'C1_27XZ', 'C1_27YZ', 'C1_28XXX', 'C1_28YYY', 'C1_28ZZZ', 'C1_28XXY', 'C1_28XXZ', 'C1_28YYX', 'C1_28YYZ', 'C1_28ZZX', 'C1_28ZZY', 'C1_28XYZ', 'C1_29XXXX', 'C1_29YYYY', 'C1_29ZZZZ', 'C1_29XXXY', 'C1_29XXXZ', 'C1_29YYYX', 'C1_29YYYZ', 'C1_29ZZZX', 'C1_29ZZZY', 'C1_29XXYY', 'C1_29XXZZ', 'C1_29YYZZ', 'C1_29XXYZ', 'C1_29YYXZ', 'C1_29ZZXY']
aooverlaps
----------
This is a 2-dimensional array which holds the numerical values of the overlap between basis functions (also called atomic orbitals). It is needed for most analyses like `Mulliken`_, `C squared`_, and `Mayer's Bond Orders`_. The indices of the matrix correspond to the basis functions of interest. This matrix is symmetric, so ``aooverlaps[i,j]`` is the same as ``aooverlaps[j,i]``.
Some examples:
* ``aooverlaps[0,3]`` is the overlap between the 1st and 4th basis function
* ``aooverlaps[2,:]`` is a 1-dimensional array containing the overlap between every basis function and the 3rd basis function
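As a minimal sketch (the filename below is hypothetical, and any logfile that actually prints the overlap matrix will do), the attribute can be inspected with plain NumPy indexing:

.. code-block:: python

    import numpy
    from cclib.parser import ccopen

    data = ccopen("myjob.out").parse()  # hypothetical logfile that provides aooverlaps

    # The overlap matrix is symmetric.
    assert numpy.isclose(data.aooverlaps[0, 3], data.aooverlaps[3, 0])

    # Overlap of every basis function with the 3rd basis function.
    third = data.aooverlaps[2, :]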
**ADF**: not present by default, printed when `PRINT Smat` is in the input; do not mistake with `fooverlaps`_.
**DALTON**: no option to print as of version 2013.
**Gaussian**: ``iop(3/33=1)`` must be specified in the input file.
.. _`Mulliken`: methods.html#mulliken-population-analysis-mpa
.. _`C squared`: methods.html#c-squared-population-analysis-cspa
.. _`Mayer's Bond Orders`: methods.html#mayer-s-bond-orders
atombasis
---------
The attribute ``atombasis`` is a list, each element being a list that contains the atomic orbital indices on the respective atom. For example, ``atombasis[1]`` will contain the indices of atomic orbitals on the second atom of the molecule.
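For instance, the indices in ``atombasis`` can be used to slice other attributes that run over basis functions. This is only a sketch, assuming a parsed job that also provides ``mocoeffs`` (the filename is hypothetical):

.. code-block:: python

    from cclib.parser import ccopen

    data = ccopen("myjob.out").parse()  # hypothetical logfile

    # Atomic orbital indices centred on the second atom...
    indices = data.atombasis[1]

    # ...pick out the corresponding columns of the alpha MO coefficients.
    coeffs_on_atom = data.mocoeffs[0][:, indices]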
.. index::
single: properties; atomcharges (attribute)
atomcharges
-----------
The attribute ``atomcharges`` contains the atomic partial charges as taken from the output file. Since these charges are arbitrary and depend on the details of a population analysis, this attribute is a dictionary containing any number of various atomic charges. The keys in this dictionary are strings naming the population analysis, and the values are arrays of rank 1 and contain the actual charges.
Currently, cclib parses Mulliken, Löwdin, NPA and CHELPG charges, whose respective dictionary keys are ``mulliken``, ``lowdin``, ``natural`` and ``chelpg``.
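A minimal sketch of accessing these charges (the filename is hypothetical, and the keys present depend on what the logfile actually contains):

.. code-block:: python

    from cclib.parser import ccopen

    data = ccopen("myjob.out").parse()  # hypothetical logfile

    print(data.atomcharges.keys())           # e.g. dict_keys(['mulliken', 'lowdin'])
    mulliken = data.atomcharges["mulliken"]  # one partial charge per atom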
In practice, these may differ somewhat from the values cclib calculates in the various `calculation methods`_.
**Molpro**: use the ``pop`` command (see http://www.molpro.net/info/2015.1/doc/manual/node515.html).
.. _`calculation methods`: methods.html
atomcoords
----------
The attribute ``atomcoords`` contains the atomic coordinates as taken from the output file. This is an array of rank 3, with a shape (n,m,3) where n is 1 for a single point calculation and >=1 for a geometry optimisation and m is the number of atoms.
**Gaussian**: for geometry optimisations, the "Standard orientation" sections are extracted.
**Molpro**: typically prints output about geometry optimisation in a separate logfile. So, both that and the initial output need to be passed to the cclib parser.
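Given the shape described above, the final geometry is simply the last frame of the array; a short sketch (hypothetical filename):

.. code-block:: python

    from cclib.parser import ccopen

    data = ccopen("myjob.out").parse()  # hypothetical logfile

    last_geometry = data.atomcoords[-1]            # natom x 3 array
    assert last_geometry.shape == (data.natom, 3)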
atommasses
----------
The attribute ``atommasses`` contains the masses of all atoms in unified atomic mass units, or Daltons (Da). This is an array of rank 1.
atomnos
-------
An array of integers for the atomic numbers, or the number of protons in the atom nuclei.
atomspins
---------
The attribute ``atomspins`` contains the atomic spin densities as calculated in a population analysis and taken from the output file. Since these densities are arbitrary and depend on the particular population analysis, this attribute is a dictionary. In analogy to `atomcharges`_, the keys in this dictionary are strings naming the population analysis, and the values are arrays of rank 1 and contain the actual spin densities.
Currently, cclib parses Mulliken and Löwdin spin densities, whose respective dictionary keys are ``mulliken`` and ``lowdin``.
.. index::
single: energy; ccenergies (attribute)
ccenergies
----------
A one-dimensional array holds the total molecule energies including Coupled Cluster corrections. The array's length is 1 for single point calculations and larger for optimisations. Only the highest theory level is parsed into this attribute (for example, CCSD energies as opposed to CCD energies, or CCSD(T) as opposed to CCSD energies).
charge
------
Net charge of the calculated system, in units of ``e``.
coreelectrons
-------------
The attribute ``coreelectrons`` contains the number of core electrons in each atom's pseudopotentials. It is an array of rank 1, with as many integer elements as there are atoms.
etenergies
----------
This is a rank 1 array that contains the energies of electronic transitions from a reference state to the excited states of the molecule, in ``cm-1``. There should be as many elements to this array as there are excited states calculated. Any type of excited state calculation should provide output that can be parsed into this attribute.
etoscs
------
The attribute ``etoscs`` is a rank 1 array that contains the oscillator strengths of transitions from the reference (ground) state to the excited electronic states of the molecule. As for `etenergies`_ and other attributes related to excited states, there should be as many elements in this array as there are excited states in the calculation.
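Since `etenergies`_ is stored in cm-1, a quick conversion to wavelengths together with ``etoscs`` gives a rough picture of the spectrum. This is only a sketch with a hypothetical filename:

.. code-block:: python

    import numpy
    from cclib.parser import ccopen

    data = ccopen("myjob.out").parse()  # hypothetical excited-state logfile

    wavelengths_nm = 1.0e7 / numpy.asarray(data.etenergies)  # cm-1 -> nm
    strongest = numpy.argmax(data.etoscs)                    # most intense transition
    print(wavelengths_nm[strongest], data.etoscs[strongest])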
etsecs
------
The singly-excited configurations that contribute to electronic transitions are stored in ``etsecs``. It is a list (for each electronic transition from the reference ground state) of lists (for each singly-excited configuration) with three members each:
* a tuple (moindex, alpha/beta), which indicates the MO where the transition begins
* a tuple (moindex, alpha/beta), which indicates the MO where the transition ends
* a float (which can be negative), the coefficient of this singly-excited configuration
In these tuples, the value of alpha/beta is 0 or 1, respectively. For a restricted calculation, this value is always 0, although some programs (GAMESS) sometimes print coefficients for both alpha and beta electrons.
The excitation coefficient is always converted to its normalised value by cclib - so the sum of the squared coefficients of all alpha and beta excitations should be unity. It is important to keep in mind, however, that only the square of the excitation coefficient has a physical meaning, and its sign depends on the numerical procedures used by each program.
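The nesting described above can be unpacked directly. The sketch below assumes a parsed excited-state job (hypothetical filename) and simply checks the normalisation of the first transition:

.. code-block:: python

    from cclib.parser import ccopen

    data = ccopen("myjob.out").parse()  # hypothetical logfile with excited states

    # First singly-excited configuration of the first transition.
    (start_mo, start_spin), (end_mo, end_spin), coeff = data.etsecs[0][0]

    # The squared coefficients of one transition should sum to roughly one.
    total = sum(c ** 2 for _, _, c in data.etsecs[0])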
etsyms
------
The attribute ``etsyms`` is a list containing the symmetries (strings) of the excited states found in the calculation. As for `etenergies`_ and other attributes related to excited states, there should be as many elements in this list as there are excited states in the calculation.
Note that while the symmetry descriptions start with the string ``Singlet`` or ``Triplet``, the exact format differs between programs.
fonames
-------
ADF uses symmetry-adapted fragment orbitals (SFOs) as its basis. These SFOs are generally orthonormal linear combinations of atomic orbitals. This makes it difficult to determine which individual atomic orbitals form the basis in calculations that have any symmetry. In addition, ADF allows "fragment" calculations which use the molecular orbitals of the fragments (FOs, or fragment orbitals) for building up the calculated molecular orbitals.
The difficulty in handling the basis for a molecule with symmetry and the availability of extra information in the fragment calculations makes using `aonames`_ (as specified for the other formats) inappropriate, except in certain circumstances. Therefore, an extra attribute called ``fonames`` is available for the ADF parser.
Some examples:
``C1+C4_1S+1S`` - Orbitals from carbon 1 and carbon 4 can interact, and their ``1S`` orbitals mix in a positive manner
``C1+C4_1Px-1Px`` - Orbitals from carbon 1 and carbon 4 can interact, and their ``1Px`` orbitals mix in a negative manner
``bdt1_37A`` - Molecular orbital 37A from the fragment bdt1
**ADF**: There are no required inputfile options for fonames to be supported; however, if one wishes to have SFOs map directly to atomic basis functions, there are two requirements. First, the ``Symmetry NOSYM`` option must be given to force ADF to not linearly combine atomic orbitals into SFOs. Second, fragment calculations cannot be done (for obvious reasons). Also, it is suggested that ``Eigval 99999 99999`` be put into an ``Eprint`` block of the input file of a spin-restricted calculation so that every molecular orbital energy will be printed.
fooverlaps
----------
This is a 2-dimensional array that holds numerical values for the spatial overlap between basis functions. It is very similar to `aooverlaps`_, but differs because of the way ADF performs the calculation (see below for more details). The matrix indices correspond to the fragment orbitals; see the examples listed for `aonames`_.
**Background**
ADF uses symmetry-adapted fragment orbitals (SFOs) as its basis. These SFOs are generally orthonormal linear combinations of atomic orbitals. This makes it difficult to determine which individual atomic orbitals form the basis in calculations that have any symmetry. In addition, ADF allows "fragment" calculations which use the molecular orbitals of the fragments (FOs, or fragment orbitals) for building up the calculated molecular orbitals.
The difficulty in handling the basis for a molecule with symmetry and the availability of extra information in the fragment calculations makes using aooverlaps (as specified for the other formats) inappropriate, except for certain circumstances. Therefore, an extra member called fooverlaps is available for the ADF parser.
**ADF**: There are no required inputfile options for fooverlaps to be supported; however, if one wishes to have SFOs map directly to atomic basis functions, there are two requirements. First, the ``Symmetry NOSYM`` option must be given to force ADF to not linearly combine atomic orbitals into SFOs. Second, fragment calculations cannot be done (for obvious reasons). Also, it is suggested that ``Eigval 99999 99999`` be put into an ``Eprint`` block of the input file of a spin-restricted calculation so that every molecular orbital energy will be printed.
.. index::
single: basis sets; gbasis (attribute)
gbasis
------
This attribute stores information about the Gaussian basis functions that were used in the calculation, per atom, using the same conventions as PyQuante. Specifically, ``gbasis`` is a list of lists iterating over atoms and Gaussian basis functions. The elements (basis functions) are tuples of length 2 consisting of the orbital type (e.g. 'S', 'P' or 'D') and a list (per contracted GTO) of tuples of size 2 consisting of the exponent and coefficient. Confused? Well, here's ``gbasis`` for a molecule consisting of a single C atom with an STO-3G basis:
.. code-block:: python
[ # per atom
[
('S', [
(71.616837, 0.154329),
(13.045096, 0.535328),
(3.530512, 0.444635),
]),
('S', [
(2.941249, -0.099967),
(0.683483, 0.399513),
(0.222290, 0.700115),
]),
('P', [
(2.941249, 0.155916),
(0.683483, 0.607684),
(0.222290, 0.391957),
]),
]
]
For D and F functions there is an important distinction between pure (5D, 7F) and Cartesian (6D, 10F) functions. PyQuante can only handle Cartesian functions, but we should extract this information in any case, and perhaps work to extend the PyQuante basis set format to include this.
**Gaussian**: the `GFINPUT`_ keyword should normally be used (`GFPRINT`_ gives equivalent information in a different format and is supported in cclib after v1.2).
**GAMESS/GAMESS-UK**: no special keywords are required, but the basis is only available for symmetry inequivalent atoms. There does not seem to be any way to get GAMESS to say which atoms are related through symmetry. As a result, if you want to get basis set info for every atom, you need to reduce the symmetry to C1.
**Jaguar**: for more information see manual (for example at http://yfaat.ch.huji.ac.il/jaguar-help/mand.html#114223)
**ORCA**: include ``Print[ P_Basis ] 2`` in the ``output`` block
.. _`GFINPUT`: http://www.gaussian.com/g_tech/g_ur/k_gfinput.htm
.. _`GFPRINT`: http://www.gaussian.com/g_tech/g_ur/k_gfprint.htm
.. index::
single: geometry optimisation; geotargets (attribute)
geotargets
----------
Geotargets are the target values of the criteria used to determine whether a geometry optimisation has converged. The targets are stored in an array of length ``n``, where ``n`` is the number of targets, and the actual values of these criteria are stored for every optimisation step in the attribute `geovalues`_. Note that cclib does not carry information about the meaning of these criteria, and it is up to the user to interpret the values properly for a particular program. Below we provide some details for several parsers, but it is always a good idea to refer to the source documentation.
In some special cases, the values in ``geotargets`` will be `numpy.inf`_.
**GAMESS-UK**: the criteria used for geometry convergence are based on the ``TOL`` parameter, which can be set using the ``XTOLL`` directive. The default value of this parameter and the conditions required for convergence vary among the various optimisation strategies (see the `GAMESS-UK manual section on controlling optimisation`_ for details). In ``OPTIMIZE`` mode, ``TOL`` defaults to 0.003 and the conditions are,
- maximum change in variables below TOL,
- average change in variables smaller than TOL * 2/3,
- maximum gradient below TOL * 1/4,
- average gradient below TOL * 1/6.
.. _`GAMESS-UK manual section on controlling optimisation`: http://www.cfs.dl.ac.uk/docs/html/part4/node14.html
**Jaguar** has several geometry convergence criteria,
* gconv1: maximum element of gradient (4.5E-04)
* gconv2: rms of gradient elements (3.0E-04)
* gconv5: maximum element of nuclear displacement (1.8E-03)
* gconv6: rms of nuclear displacement elements (1.2E-03)
* gconv7: difference between final energies from previous and current geometry optimisation iterations (5.0E-05)
Note that a value for gconv7 is not available until the second iteration, so it is set to zero in the first element of `geovalues`_.
**Molpro** has custom convergence criteria, as described in the `Molpro manual convergence`_ page:
The standard MOLPRO convergence criterion requires the maximum component of the gradient to be less than :math:`3 \cdot 10^{-4}` [a.u.] and the maximum energy change to be less than :math:`1 \cdot 10^{-6}` [H], or the maximum component of the gradient to be less than :math:`3 \cdot 10^{-4}` [a.u.] and the maximum component of the step to be less than :math:`3 \cdot 10^{-4}` [a.u.].
.. _Molpro manual convergence: https://www.molpro.net/info/2012.1/doc/manual/node592.html
**ORCA** tracks the change in energy as well as RMS and maximum gradients and displacements. As of version 3.0, an optimisation is considered converged when all the tolerances are met, with four exceptions:
* the energy is within 25x the tolerance and all other criteria are met
* the gradients are overachieved (1/3 of the tolerance) and displacements are reasonable (at most 3x the tolerance)
* the displacements are overachieved (1/3 of the tolerance) and the gradients are reasonable (at most 3x the tolerance)
* the energy gradients and internal coordinates are converged (bond distances, angles, dihedrals and impropers)
**Psi** normally tracks five different values, as described `in the documentation`_, but their use varies depending on the strategy employed. The default strategy (QCHEM) checks whether the maximum force is converged and whether the maximum energy change or displacement is converged. Additionally, to aid with flat potential energy surfaces, convergence is assumed when the root mean square force has converged to 0.01 of its default target. Note that Psi prints values even for targets that are not being used -- in these cases the targets are parsed as `numpy.inf`_ so that they can still be used (any value will be converged).
.. _`in the documentation`: http://sirius.chem.vt.edu/psi4manual/latest/optking.html
.. _`numpy.inf`: http://docs.scipy.org/doc/numpy-1.8.1/user/misc.html#ieee-754-floating-point-special-values
.. index::
    single: geometry optimisation; geovalues (attribute)
geovalues
---------
These are the current values for the criteria used to determine whether a geometry has converged in the course of a geometry optimisation. It is an array of dimensions ``m x n``, where ``m`` is the number of geometry optimisation iterations and ``n`` the number of target criteria.
Note that many programs print atomic coordinates before and after a geometry optimisation, which means that there will not necessarily be ``m`` elements in `atomcoords`_.
If the optimisation has finished successfully, the values in the last row should be smaller than the values in geotargets_ (unless the convergence criteria require otherwise).
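A rough convergence check can therefore be written by comparing the last row of ``geovalues`` against ``geotargets``. This is only a sketch (hypothetical filename), and as noted above some programs apply more permissive rules:

.. code-block:: python

    import numpy
    from cclib.parser import ccopen

    data = ccopen("myjob.out").parse()  # hypothetical geometry optimisation logfile

    last_step = numpy.abs(numpy.asarray(data.geovalues[-1]))
    converged = numpy.all(last_step <= data.geotargets)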
hessian
-------
An array of rank 1 that contains the elements of the hessian (force constant) matrix. Only the lower triangular part of the 3Nx3N matrix is stored (this may change in the future, and perhaps only the unweighted matrix will be parsed).
.. index::
single: molecular orbitals; homos (attribute)
homos
-----
A 1D array that holds the indexes of the highest occupied molecular orbitals (HOMOs), which contains one element for restricted and two elements for unrestricted calculations. These indexes can be applied to other attributes describing molecular orbitals, such as `moenergies`_ and `mocoeffs`_.
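For example, the HOMO-LUMO gap for the alpha electrons can be estimated as follows; this is a sketch with a hypothetical filename, using ``moenergies`` for the corresponding orbital energies in eV:

.. code-block:: python

    from cclib.parser import ccopen

    data = ccopen("myjob.out").parse()  # hypothetical logfile

    homo = data.homos[0]  # index of the alpha HOMO
    gap = data.moenergies[0][homo + 1] - data.moenergies[0][homo]  # in eV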
.. index::
single: molecular orbitals; mocoeffs (attribute)
metadata
--------
A dictionary containing metadata_ (data about data) for the calculation. Currently, it can contain the following possible attributes, not all of which are implemented for each parser.
* ``basis_set``: A string with the name of the basis set, if it is printed anywhere as a standard name.
* ``coord_type``: For the ``coords`` field, a string for the representation of stored coordinates. Currently, it is one of ``xyz``, ``int``/``internal``, or ``gzmat``.
* ``coords``: A list of lists with shape ``[natoms, 4]`` which contains the input coordinates (those found in the input file). The first column is the atomic symbol as a string, and the next three columns are floats. This is useful as many programs reorient coordinates for symmetry reasons.
* ``functional``: A string with the name of the density functional used.
* ``info``: A list of strings, each of which is an information or log message produced during a calculation.
* ``input_file_contents``: A string containing the entire input file, if it is echoed back during the calculation.
* ``input_file_name``: A string containing the name of the input file, with file extension. It may not contain the entire path to the file.
* ``keywords``: A list of strings corresponding to the keywords used in the input file, in the loose format used by ORCA.
* ``methods``: A list of strings containing each method used in order. Currently, the list may contain ``HF``, ``DFT``, ``LMP2``/``DF-MP2``/``MP2``, ``MP3``, ``MP4``, ``CCSD``, and/or ``CCSD(T)``/``CCSD-T``.
* ``package``: A string with the name of the quantum chemistry program used.
* ``package_version``: A string representation of the package version. It is formatted to allow comparison using relational operators.
* ``success``: A boolean for whether or not the calculation completed properly.
* ``unrestricted``: A boolean for whether or not the calculation was performed with an unrestricted wavefunction.
* ``warnings``: A list of strings, each of which is a warning produced during a calculation.
The implementation and coverage of metadata is currently inconsistent. In the future, metadata may receive its own page similar to `extracted data`_.
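Since coverage varies between parsers, it is safest to treat ``metadata`` as an ordinary dictionary and use ``get``; a minimal sketch (hypothetical filename):

.. code-block:: python

    from cclib.parser import ccopen

    data = ccopen("myjob.out").parse()  # hypothetical logfile

    package = data.metadata.get("package")          # e.g. "Gaussian", if parsed
    finished = data.metadata.get("success", False)  # the key may be absent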
.. _metadata: https://en.wikipedia.org/wiki/Metadata
mocoeffs
--------
A list of rank 2 arrays containing the molecular orbital (MO) coefficients. The list is of length 1 for restricted calculations, but length 2 for unrestricted calculations. For the array(s) in the list, the first axis corresponds to molecular orbitals, and the second corresponds to basis functions.
Examples:
* ``mocoeffs[0][2,5]`` -- The coefficient of the 6th basis function of the 3rd alpha molecular orbital
* ``mocoeffs[1][:,0]`` -- An array of the 1st basis function coefficients for every beta molecular orbital
Note: For a restricted calculation, ``mocoeffs`` is still a list, but it only contains a single rank 2 array, so you access the matrix with ``mocoeffs[0]``.
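The two axes follow the convention above; the sketch below (hypothetical filename) just makes the shapes explicit:

.. code-block:: python

    from cclib.parser import ccopen

    data = ccopen("myjob.out").parse()  # hypothetical logfile

    alpha = data.mocoeffs[0]
    # Rows are molecular orbitals, columns are basis functions
    # (typically nmo x nbasis, although some programs print fewer orbitals).
    print(alpha.shape)
    coeff = alpha[2, 5]  # 6th basis function of the 3rd alpha MO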
**GAMESS-UK** - the `FORMAT HIGH`_ directive needs to be included if you want information on all of the eigenvalues to be available. In versions before 8.0, for unrestricted calculations ``FORMAT HIGH`` does not increase the number of orbitals for which the molecular orbital coefficients are printed, so there may be more orbital information for the alpha orbitals than for the beta orbitals; the extra beta molecular orbital coefficients for which information is not available are padded out with zeros by cclib.
**Molpro** - does not print MO coefficients at all by default; you must add ``GPRINT,ORBITALS`` to the input. Furthermore, this prints only the occupied orbitals; to also get virtual orbitals, add ``ORBPRINT,NVIRT``, where ``NVIRT`` is the number of virtual orbitals to print (it can be a large number like 99999 to print all of them).
.. index::
single: molecular orbitals; moenergies (attribute)
moenergies
----------
A list of rank 1 arrays containing the molecular orbital energies in eV. The list is of length 1 for restricted calculations, but length 2 for unrestricted calculations.
**GAMESS-UK**: similar to `mocoeffs`_, the directive `FORMAT HIGH`_ needs to be used if you want all of the eigenvalues printed.
**Jaguar**: the first ten virtual orbitals are printed by default. In order to print more, use the ``ipvirt`` keyword, with ``ipvirt=-1`` printing all virtual orbitals.
.. _`FORMAT HIGH`: http://www.cfs.dl.ac.uk/docs/html/part3/node8.html#SECTION00083000000000000000
.. index::
single: properties; moments (attribute)
moments
-------
This attribute contains the dipole moment vector and any higher electrostatic multipole moments for the whole molecule. It comprises a list of one dimensional arrays,
* the first is the reference point used in the multipole expansion, which is normally the center of mass,
* the second is the dipole moment vector, in Debyes (:math:`\mathbf{\mathrm{D}}`),
* the third array contains the raw molecular quadrupole moments in lexicographical order, that is the XX, XY, XZ, YY, YZ and ZZ moments, in Buckinghams (:math:`\mathbf{\mathrm{B}}`),
* any further arrays contain the raw molecular multipole moments of higher rank, in lexicographical order and in units of :math:`\mathbf{\mathrm{D}} \cdot Å^{L-1} = 10^{-10} \mathrm{esu} \cdot Å^L`
Note that by default cclib will provide the last moments printed, if several are printed in the course of a geometry optimisation or another job type involving more than one geometry. For post-Hartree-Fock calculations, such as MP2 or coupled cluster, the uncorrelated moments are reported if none are printed for the final wavefunction.
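For instance, the magnitude of the dipole moment is just the norm of the second array; a sketch with a hypothetical filename:

.. code-block:: python

    import numpy
    from cclib.parser import ccopen

    data = ccopen("myjob.out").parse()  # hypothetical logfile

    reference, dipole = data.moments[0], data.moments[1]
    total_dipole = numpy.linalg.norm(dipole)  # in Debye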
.. index::
single: molecular orbitals; mosyms (attribute)
mosyms
------
For unrestricted calculations, this is a list of two lists containing alpha and beta symmetries (i.e. ``[[alpha_syms],[beta_syms]]``) containing strings for the orbital symmetries, arranged in order of energy. In a restricted calculation, there is only one nested list (``[[syms]]``).
The symmetry labels are normalised and cclib reports standard symmetry names:
======= ======= ======= ========== ================== ======
cclib   ADF     GAMESS  GAMESS-UK  Gaussian           Jaguar
======= ======= ======= ========== ================== ======
A       A       A       a          A                  A
A1      A1      A1      a1         A1                 A1
Ag      A.g     AG      ag         AG                 Ag
A'      AA      A'      a'         A'                 Ap
A"      AAA     A' '    a" or a' ' A"                 App
A1'     AA1     A1'     a1'        A1'                A1p
A1"     AAA1    A1"     a1"        A1"                A1pp
sigma   Sigma                      SG
pi      Pi                         PI
phi     Phi                        PHI (inferred)
delta   Delta                      DLTA but DLTU/DLTG
sigma.g Sigma.g                    SGG
======= ======= ======= ========== ================== ======
* ADF - the full list can be found `here <http://www.scm.com/Doc/Doc2005.01/ADF/ADFUsersGuide/page339.html>`_.
* GAMESS-UK - to get the list, 'grep "data yr" input.m' if you have access to the source. Note that for E, it's split into "e1+" and "e1-" for instance.
* Jaguar - to get the list, look at the examples in schrodinger/jaguar-whatever/samples if you have access to Jaguar. Note that for E, it's written as E1pp/Ap, for instance.
* NWChem - if molecular symmetry is turned off or set to C1, symmetry adaption for orbitals is also deactivated, and can be explicitly turned on with `adapt on` in the SCF block
Developers:
* The use of a function with doctests for each of these cases is recommended, to make sure that the conversion is robust. There is a prototype called ``normalisesym()`` in logfileparser.py which should be overridden in the subclasses if necessary (there is a unit test to make sure that this has been done).
* Published character tables may be useful in determining the correspondence between the labels used by the comp chem package and the commonly-used symbols.
.. index::
single: energy; mpenergies (attribute)
mpenergies
----------
The attribute ``mpenergies`` holds the total molecule energies including Møller-Plesset correlation energy corrections in a two-dimensional array. The array's shape is (n,L), where ``n`` is 1 for single point calculations and larger for optimisations, and ``L`` is the order at which the correction is truncated. The order of elements is ascending, so a single point MP5 calculation will yield mpenergies as :math:`E_{MP2}, E_{MP3}, E_{MP4}, E_{MP5}`.
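With that ordering, the correction of highest order at the final geometry is simply the last column of the last row; a sketch (hypothetical filename):

.. code-block:: python

    from cclib.parser import ccopen

    data = ccopen("myjob.out").parse()  # hypothetical logfile with MP corrections

    mp2_final = data.mpenergies[-1, 0]       # MP2 energy at the final geometry, in eV
    highest_order = data.mpenergies[-1, -1]  # highest-order correction parsed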
**ADF**: does not perform such calculations.
**GAMESS**: second-order corrections (MP2) are available in GAMESS-US, and MP2 through MP3 calculations in PC-GAMESS (use ``mplevl=n`` in the ``$contrl`` section).
**GAMESS-UK**: MP2 through MP3 corrections are available.
**Gaussian**: MP2 through MP5 energies are available using the ``MP`` keyword. For MP4 corrections, the energy with the most substitutions is used (SDTQ by default).
**Jaguar**: LMP2 energies are available.
mult
----
The attribute ``mult`` is an integer and represents the spin multiplicity of the calculated system, which equals twice the total spin plus one (2S + 1).
natom
-----
The attribute ``natom`` is an integer, the number of atoms treated in the calculation.
.. index::
single: basis sets; nbasis (attribute)
nbasis
------
An integer representing the number of basis functions used in the calculation.
.. index::
single: basis sets; nmo (attribute)
nmo
---
The number of molecular orbitals in the calculation. It is an integer and is typically equal to `nbasis`_, but may be less than this if a linear dependency was identified between the basis functions.
Commands to get information on all orbitals:
**GAMESS-UK**: only usually prints information on the 5 lowest virtual orbitals. ``FORMAT HIGH`` should make it print this information for all of the orbitals, although GAMESS-UK 7.0 has a bug that means that this only works for restricted calculations.
**Jaguar**: the first ten virtual orbitals are printed by default; in order to print more of them, use the ``ipvirt`` keyword in the input file, with ``ipvirt=-1`` printing all virtual orbitals (see the `Jaguar manual nmo`_ for more information).
.. _Jaguar manual nmo: http://www.pdc.kth.se/doc/jaguar4.1/html/manual/mang.html#644675
optdone
-------
Flags whether a geometry optimisation has completed. Currently this attribute is a single Boolean value, which is set to True when the final `atomcoords`_ represent a converged geometry optimisation. In the future, ``optdone`` will be a list that indexes which elements of `atomcoords`_ represent converged geometries. This functionality can be used starting from version 1.3, from the command line by passing the ``--future`` option to ``ccget``,
.. code-block:: bash
$ ccget optdone data/Gaussian/basicGaussian09/dvb_gopt.out
Attempting to parse data/Gaussian/basicGaussian09/dvb_gopt.out
optdone:
True
$ ccget --future optdone data/Gaussian/basicGaussian09/dvb_gopt.out
Attempting to parse data/Gaussian/basicGaussian09/dvb_gopt.out
optdone:
[4]
or by providing the corresponding argument to ``ccopen``,
.. code-block:: python
from cclib.parser import ccopen
parser = ccopen("filename", optdone_as_list=True) # could also do future=True instead of optdone_as_list
data = parser.parse()
scfenergies
-----------
An array containing the converged SCF energies of the calculation, in eV. For an optimisation log file, there will be as many elements in this array as there were optimisation steps.
**Molpro**: typically prints output about geometry optimisation in a separate logfile. So, both that and the initial output need to be passed to the cclib parser.
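The final converged value is the last element, and the ``convertor`` helper in ``cclib.parser.utils`` can be used to change units if needed; a sketch (hypothetical filename):

.. code-block:: python

    from cclib.parser import ccopen
    from cclib.parser.utils import convertor

    data = ccopen("myjob.out").parse()  # hypothetical logfile

    final_ev = data.scfenergies[-1]
    final_hartree = convertor(final_ev, "eV", "hartree")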
scftargets
----------
Target thresholds for determining whether the current SCF run has converged, stored in an ``n x m`` array, where ``n`` is the number of geometry optimisation steps (1 for a single point calculation) and ``m`` is the number of criteria. The criteria vary between programs, and depending on the program they may be constant for the whole of a geometry optimisation or they may change between optimisation steps. A more detailed description for each program follows.
**ADF**: There are two convergence criteria which are controlled by ``SCFcnv`` in the `CONVERGE subkey of the SCF block`_.
* The maximum element of the commutator of the Fock matrix and P-matrix needs to be below ``SCFcnv``.
* The norm of the same matrix needs to be below ``10*SCFcnv``.
This hard target is normally used for single point calculations and the last step of geometry optimisations, and it defaults to 1.0E-6. There is also a soft target ``scfconv2`` that defaults to 1.0E-3, which can be switched on and is used by ADF automatically in some cases such as the first step in a geometry optimization.
For intermediate steps in a geometry optimisation the situation is more complicated and depends on the gradient and the integration accuracy. A post on the ADF user's forum revealed that it is calculated as follows:
.. math:: \mathrm{new\,criteria} = \max(\mathrm{SCFcnv}, \, \min(\mathrm{old\,criteria}, \, \mathrm{grdmax}/30, \, 10^{-\mathrm{accint}})),
where ``old criteria`` is the initial value or from the previous geometry cycle, ``grdmax`` is the maximum gradient from the last geometry step and ``accint`` is the current integration accuracy.
.. _`CONVERGE subkey of the SCF block`: http://www.scm.com/Doc/Doc2014/ADF/ADFUsersGuide/page235.html#keyscheme%20INTEGRATION
**GAMESS**: Two criteria, the maximum and root-mean-square (RMS) density matrix change, are used with a default starting value of 5.0E-05. It seems these values can change over the course of a geometry optimisation. ROHF calculations use SQCDF instead of the standard RMS change.
**GAMESS-UK**: According to `the manual `_, convergence is determined by convergence of density matrix elements. The default value for SCF is 1E-5, but it appears to be 1E-7 for geoopts.
.. _`GAMESS-UK manual convergence`: http://www.cfs.dl.ac.uk/docs/html/part4/node6.html
**Gaussian**: normally three criteria are used.
* The RMS change in the density matrix elements, with a default of 1.0E-4 (1.0E-8 for geo opts).
* Maximum change in the density matrix elements, with a default of 1.0E-2 (1.0E-6 for geo opts).
* The change in energy, with a default threshold of 5.0E-05 (1.0E-06 for geo opts).
**Jaguar 4.2**: The targets in Jaguar 4.2 (based on the manual) depend on whether the job is a geometry optimisation or not. For geometry optimisations and hyper/polarisability calculations, the RMS change in the density matrix elements is used as a criterion (controlled by the ``dconv`` keyword), with a default of 5.0E-6.
The energy convergence criterion (keyword ``econv``) is ignored for geometry optimisation calculations but is used for SCF calculations, and the default in this case is 5.0E-5, except for hyper/polarisability calculations where it is 1.0E-6.
scfvalues
---------
The attribute ``scfvalues`` is a list of arrays of dimension ``n x m`` (one element for each step in a geometry optimisation), where ``n`` is the number of SCF cycles required for convergence and ``m`` is the number of SCF convergence target criteria. For some packages, you may need to include a directive to make sure that SCF convergence information is printed to the log file.
**Gaussian**: requires the `route section`_ to start with #P
.. _`route section`: http://www.gaussian.com/g_tech/g_ur/k_route.htm
**GAMESS-UK**: convergence information is printed only for the first optimisation step by default, but can be forced at all steps by adding ``IPRINT SCF`` to the input file.
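The last SCF cycle of the last geometry step can then be compared against the corresponding row of ``scftargets``; again only a sketch (hypothetical filename):

.. code-block:: python

    import numpy
    from cclib.parser import ccopen

    data = ccopen("myjob.out").parse()  # hypothetical logfile

    final_cycle = numpy.abs(numpy.asarray(data.scfvalues[-1][-1]))
    converged = numpy.all(final_cycle <= data.scftargets[-1])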
vibdisps
--------
The attribute ``vibdisps`` stores the Cartesian displacement vectors from the output of a vibrational frequency calculation. It is a rank 3 array having dimensions ``M x N x 3``, where ``M`` is the number of normal modes and ``N`` is the number of atoms. ``M`` is typically ``3N-6`` (``3N-5`` for linear molecules).
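The shape can be checked against the number of frequencies and atoms; a sketch assuming a parsed frequency job (hypothetical filename):

.. code-block:: python

    from cclib.parser import ccopen

    data = ccopen("myjob.out").parse()  # hypothetical frequency logfile

    nmodes = len(data.vibfreqs)  # typically 3N-6 (3N-5 for linear molecules)
    assert data.vibdisps.shape == (nmodes, data.natom, 3)
    first_mode = data.vibdisps[0]  # displacement vectors for the first normal mode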
cclib-1.6.2/doc/sphinx/development.rst 0000664 0000000 0000000 00000024105 13535330462 0017734 0 ustar 00root root 0000000 0000000 ===========
Development
===========
Basic instructions
==================
The default cclib files distributed with a release, as described in `How to install`_, do not include the unit tests or the logfiles necessary to run them. This section covers how to download the full source along with all test data and scripts, and how to use these for development and testing.
.. _`How to install`: how_to_install.html
Cloning cclib from GitHub
~~~~~~~~~~~~~~~~~~~~~~~~~
cclib is hosted by the fantastic people at `GitHub`_ (previously at `Sourceforge`_) in a `git`_ repository. You can download a `zipped archive`_ of the current development version (called `master`) for installation and testing or browse the available `releases`_. In order to contribute any changes, however, you will need to create a local copy of the repository:
.. code-block:: bash
git clone https://github.com/cclib/cclib.git
.. _`GitHub`: https://github.com
.. _`Sourceforge`: https://sourceforge.net
.. _`git`: https://git-scm.com
.. _`zipped archive`: https://github.com/cclib/cclib/archive/master.zip
.. _`releases`: https://github.com/cclib/cclib/releases
Guidelines
~~~~~~~~~~~~~~~~
We follow a typical GitHub collaborative model, relying on `forks and pull requests`_. In short, the development process consists of:
* `Creating your own fork`_ of cclib in order to develop
* `Creating a pull request`_ to contribute your changes
* Reviewing and merging open pull requests (by someone else)
* Using `issues`_ to plan and prioritize future work
.. _`creating your own fork`: https://help.github.com/articles/fork-a-repo
.. _`creating a pull request`: https://help.github.com/articles/creating-a-pull-request
.. _`forks and pull requests`: https://help.github.com/articles/using-pull-requests
.. _`issues`: https://github.com/cclib/cclib/issues
Here are some general guidelines for developers who are contributing code:
* Run and review the unit tests (see below) before submitting a pull request.
* There should normally not be more failed tests than before your changes.
* For larger changes or features that take some time to implement, `using branches`_ is recommended.
.. _`using branches`: https://help.github.com/articles/branching-out
Releasing a new version
~~~~~~~~~~~~~~~~~~~~~~~
The release cycle of cclib is irregular, with new versions being created as deemed necessary after significant changes or new features. We roughly follow semantic versioning with respect to the `parsed attributes`_.
When creating a new release on GitHub, the typical procedure might include the following steps:
* Update the `CHANGELOG`_, `ANNOUNCE`_ and any other files that might change content with the new version
* Make sure that `setup.py`_ has the right version number, as well as __version__ in `__init__.py`_ and any other relevant files
* Update the download and install instructions in the documentation, if appropriate
* Create a branch for the release, so that development can continue
* Run all tests for a final time and fix any remaining issues
* Tag the release (make sure to use an annotated tag using ``git tag -a``) and upload it (``git push --tags``)
* Run `manifest.py`_ to update the MANIFEST file
* Create the source distributions (``python setup.py sdist --formats=gztar,zip``) and Windows binary installers (``python setup.py bdist_wininst``)
* Create a release on GitHub using the created tag (see `Creating releases`_) and upload the source distributions and Windows binaries
* Email the users and developers mailing list with the message in `ANNOUNCE`_
* Update the Python package index (https://pypi.python.org/pypi/cclib), normally done by ``python setup.py register``
* For significant releases, if appropriate, send an email to the `CCL list`_ and any mailing lists for computational chemistry packages supported by cclib
.. _`parsed attributes`: data.html
.. _`ANNOUNCE`: https://github.com/cclib/cclib/blob/master/ANNOUNCE
.. _`CHANGELOG`: https://github.com/cclib/cclib/blob/master/CHANGELOG
.. _`setup.py`: https://github.com/cclib/cclib/blob/master/setup.py
.. _`__init__.py`: https://github.com/cclib/cclib/blob/master/src/cclib/__init__.py
.. _`manifest.py`: https://github.com/cclib/cclib/blob/master/manifest.py
.. _`Creating releases`: https://help.github.com/articles/creating-releases
.. _`CCL list`: http://www.ccl.net
Testing
=======
.. index::
single: testing; unit tests
The `test directory`_, which is not included in the default download, contains the test scripts that keep cclib reliable, and keep the developers sane. With any new commit or pull request to cclib on GitHub the tests are triggered and run with `Travis CI`_, for both the current production version |release| (|travis_prod|) and master (|travis_master|).
The input files for tests, which are logfiles from computational chemistry programs, are located in the `data directory`_. These are a central part of cclib, and any progress should always be supported by corresponding tests. When a user opens an issue or reports a bug, it is prudent to write a test that reproduces the bug in addition to fixing it. This ensures it will remain fixed in the future. Likewise, extending the coverage of data attributes to more programs should proceed in parallel with the growth of unit tests.
.. _`Travis CI`: https://travis-ci.org/cclib/cclib
.. |travis_prod| image:: https://travis-ci.org/cclib/cclib.svg?branch=v1.6.2
.. |travis_master| image:: https://travis-ci.org/cclib/cclib.svg?branch=master
.. _`data directory`: https://github.com/cclib/cclib/tree/master/data
.. _`test directory`: https://github.com/cclib/cclib/tree/master/test
.. index::
single: testing; unit tests
Unit tests
~~~~~~~~~~
Unit tests check that the parsers work correctly for typical calculation types on small molecules, usually water or 1,4-divinylbenzene (dvb) with :math:`C_{\mathrm{2h}}` symmetry. The corresponding logfiles stored in folders like ``data/NWChem/basicNWChem6.0`` are intended to test logfiles for an approximate major version of a program, and are standardized for all supported programs to the extent possible. They are located alongside the code in the repository, but are not normally distributed with the source. Attributes are considered supported only if they are checked by at least one test, and the `table of attribute coverage`_ is generated automatically using this criterion.
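As a schematic illustration only (the real unit tests live in the `test directory`_ and are organised differently; the file path below is made up for the example), every unit test ultimately reduces to assertions against a parsed ``ccData`` object for one of these small standard molecules:
.. code-block:: python

    # Schematic only: the actual unit tests are structured differently, and the
    # path below is illustrative rather than a real test file name.
    from cclib.io import ccread

    data = ccread("data/NWChem/basicNWChem6.0/dvb_sp.out")
    assert data.natom == 20            # 1,4-divinylbenzene (dvb) has 20 atoms
    assert len(data.scfenergies) >= 1  # at least one SCF energy was parsed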
The job types currently included as unit tests:
* restricted and unrestricted single point energies for dvb (RHF/STO-3G **and** B3LYP/STO-3G)
* geometry optimization and scan for dvb (RHF/STO-3G and/or B3LYP/STO-3G)
* frequency calculation with IR and Raman intensities for dvb (RHF/STO-3G or B3LYP/STO-3G)
* single point energy for carbon atom using a large basis set such as aug-cc-pCVQZ
* Møller–Plesset and coupled cluster energies for water (STO-3G or 6-31G basis set)
* static polarizabilities for tryptophan (RHF/STO-3G)
.. _`table of attribute coverage`: data_dev.html#details-of-current-implementation
Adding a new program version
----------------------------
There are a few conventions when adding a new supported program version to the unit tests:
* Two different recent versions are typically used in the unit tests. If there already are two, move the older version(s) to the regression suite (see below).
* When adding files for the new version, first copy the corresponding files for the last version already in cclib. Afterwards, check in files from the new program version as changes to the copied files. This procedure makes it easy to look at the differences introduced with the new version in git clients.
.. index::
single: testing; regressions
Regression tests
~~~~~~~~~~~~~~~~
Regression tests ensure that bugs, once fixed, stay fixed. These are real-life files that at some point broke a cclib parser, and are stored in folders like ``data/regression/Jaguar/Jaguar6.4``. The files associated with regression tests are not stored together with the source code, as they are often quite large. A separate repository on GitHub, `cclib-data`_, is used to track these files, and we do not distribute them with any releases.
For every bug found in the parsers, there should be a corresponding regression test that checks the bug stays fixed. The process is automated by `regression.py`_, which runs through all of our test data, both the basic data and regression files, opens and parses each one, and runs any relevant regression tests defined for that file. New regression tests are added by creating a function ``testMyFileName_out`` according to the examples at the start of `regression.py`_.
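As a rough sketch (the exact conventions, including how the parsed logfile is passed to each function, are those used by the existing tests at the top of `regression.py`_, so treat this as an illustration rather than the file's real contents), such a test simply asserts that the previously broken behaviour now works:
.. code-block:: python

    # Illustrative only: follow the naming convention testMyFileName_out; the
    # parsed logfile object is assumed to be passed in, as in regression.py.
    def testMyFileName_out(logfile):
        """The attribute that used to be missing should now be parsed."""
        assert hasattr(logfile.data, "scfenergies")
        assert len(logfile.data.scfenergies) > 0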
Using both the unit and regression tests, the line-by-line `test coverage`_ shows which parts of cclib are touched by at least one test. When adding new features and tests, the Travis CI `testing script`_ can be run locally to generate the HTML coverage pages and ensure that the tests exercise the feature code.
.. _`cclib-data`: https://github.com/cclib/cclib-data
.. _`regression.py`: https://github.com/cclib/cclib/blob/master/test/regression.py
.. _`test coverage`: coverage/index.html
.. _`testing script`: https://github.com/cclib/cclib/blob/master/travis/run_pytest.sh
Websites related to cclib
=========================
* The official `cclib organization on github`_
* The `cclib project page on Sourceforge`_ (inactive now)
* The `cclib page for Travis CI`_
* The `cclib entry on PyPI`_
* The `cclib entry on Ohloh`_
.. _`cclib organization on github`: https://github.com/cclib
.. _`cclib project page on Sourceforge`: http://sourceforge.net/projects/cclib/
.. _`cclib entry on PyPI`: http://www.python.org/pypi/cclib
.. _`cclib page for Travis CI`: https://travis-ci.org/cclib/cclib
.. _`cclib entry on Ohloh`: https://www.ohloh.net/p/cclib
Developers
==========
Besides input from a number of people `listed in the repository`_, the following developers have contributed code to cclib (in alphabetical order):
* `Eric Berquist`_
* `Karol M. Langner`_
* `Noel O'Boyle`_
* Christopher Rowley
* Adam Tenderholt
.. _`listed in the repository`: https://github.com/cclib/cclib/blob/master/THANKS
.. _`Eric Berquist`: https://github.com/berquist
.. _`Karol M. Langner`: https://github.com/langner
.. _`Noel O'Boyle`: https://www.redbrick.dcu.ie/~noel/
cclib-1.6.2/doc/sphinx/docs_common.py 0000664 0000000 0000000 00000001055 13535330462 0017531 0 ustar 00root root 0000000 0000000 # -*- coding: utf-8 -*-
from __future__ import print_function
import os
import sys
def check_cclib(cclib):
"""Make sure we are importing code from a subdirectory, which should exist
and should have been updated just before running this script. Note that
this script does not assume any version in the module and just takes
what it finds... so an appropriate checkout should be done first."""
if cclib.__file__[:len(os.getcwd())] != os.getcwd():
print("Do not seem to be importing from current directory")
sys.exit(1)
cclib-1.6.2/doc/sphinx/how_to_install.rst 0000664 0000000 0000000 00000013452 13535330462 0020442 0 ustar 00root root 0000000 0000000 How to install
==============
This page describes how to download, install and use the basic functionality of cclib.
Requirements
------------
Before you install cclib, you need to make sure that you have the following:
* Python (at least version 3.4 is recommended, although 2.7 is still tested)
* NumPy (at least version 1.5 is recommended)
Python is an open-source programming language available from https://www.python.org. It is available for Windows as well as being included in most Linux distributions. In Debian/Ubuntu it is installed as follows (as root):
.. code-block:: bash
apt-get install python python-dev
NumPy (Numerical Python) adds a fast array facility and linear algebra routines to Python. It is available from https://www.numpy.org. Windows users should use the most recent NumPy installation for the Python version they have (e.g. numpy-1.0.3.1.win32-py2.4.exe for Python 2.4). Linux users are recommended to find a binary package for their distribution rather than compiling it themselves. In Debian/Ubuntu it is installed as follows (as root):
.. code-block:: bash
apt-get install python-numpy
To test whether Python is on the ``PATH``, open a command prompt window and type:
.. code-block:: bash
python
If Python is not on the ``PATH`` and you use Windows, add the full path to the directory containing it to the end of the ``PATH`` variable under Control Panel/System/Advanced Settings/Environment Variables. If you use Linux and Python is not on the ``PATH``, put/edit the appropriate line in your .bashrc or similar startup file.
To test that NumPy is working, try importing it at the Python prompt. You should see something similar to the following::
$ python
Python 3.7.0 (default, Jul 15 2018, 10:44:58)
[GCC 8.1.1 20180531] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.__version__
'1.15.0'
>>>
(To exit, press CTRL+D in Linux or CTRL+Z,Enter in Windows)
Installing using pip
--------------------
pip_ is a command-line tool, often pre-packaged with Python distributions, that can download and install packages from the `Python Package Index`_ (PyPI). To see if it's installed, on Linux or macOS try::
$ which pip
and on Windows::
cmd> where.exe pip
The version of `cclib uploaded to PyPI`_ can then be installed globally using::
python -m pip install cclib
or to your home directory using::
python -m pip install --user cclib
.. _pip: https://pip.pypa.io/en/stable/
.. _`Python Package Index`: https://pypi.org/
.. _`cclib uploaded to PyPI`: https://pypi.python.org/pypi/cclib
Installing using a system package manager
-----------------------------------------
If you're using `Debian GNU/Linux`_, `Ubuntu`_, or a similar distribution, there are official `cclib packages`_ that you can install with any package manager (such as synaptic) or with one simple command:
.. code-block:: bash
aptitude install cclib
There are in fact two packages: `python-cclib`_, which contains the Python module, and `cclib`_, which installs just the user scripts. If you also need to install the unit tests and logfiles we distribute, you will need the `cclib-data`_ package from the non-free repositories (due to license issues). Because of distribution release cycles, package manager versions of cclib may be out of date compared to the PyPI version.
.. _`Debian GNU/Linux`: https://www.debian.org
.. _`Ubuntu`: https://www.ubuntu.com
.. _`cclib packages`: https://packages.debian.org/src:cclib
.. _`python-cclib`: https://packages.debian.org/stretch/python-cclib
.. _`cclib`: https://packages.debian.org/stretch/cclib
.. _`cclib-data`: https://packages.debian.org/stretch/cclib-data
Manual download and install
---------------------------
The source code of the newest release of cclib (version |release|) is distributed as:
* A .zip file: https://github.com/cclib/cclib/releases/download/v1.6/cclib-1.6.2.zip
* A .tar.gz file: https://github.com/cclib/cclib/releases/download/v1.6/cclib-1.6.2.tar.gz
* Windows binary installers (see the `newest release page`_)
On Windows, if you choose to download the .exe files instead, you can install simply by double-clicking on the file. To uninstall, use the "Add and Remove Programs" menu in the Control Panel.
None of these files include the tests and logfiles used for testing. In order to download all tests, we also provide source archives on the `newest release page`_.
If you are using the .zip or .tar.gz files, extract the contents of the file at an appropriate location, which we will call INSTALLDIR. Open a command prompt and change directory to INSTALLDIR. Next, run the following commands to install cclib:
.. code-block:: bash
python setup.py build
python setup.py install # (as root)
or, if pip_ is available::
python -m pip install .
To test, try importing ``cclib`` at the Python prompt. You should see something similar to the following::
$ python
Python 3.7.0 (default, Jul 15 2018, 10:44:58)
[GCC 8.1.1 20180531] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cclib
>>> cclib.__version__
'1.6.2'
>>>
.. _`newest release page`: https://github.com/cclib/cclib/releases/tag/v1.3.1
What next?
----------
* Read the list and specifications of the `parsed data`_ and related `data notes`_
* Test the program using the test data files included in the full source distribution
* Run the unit and regression tests in the test directory (``testall.py`` and ``regression.py``)
* Send any questions to the cclib-users mailing list at https://lists.sourceforge.net/lists/listinfo/cclib-users.
* Write some computational chemistry algorithms using information parsed from cclib and donate the code to the project
.. _`parsed data`: data.html
.. _`data notes`: data_notes.html
cclib-1.6.2/doc/sphinx/how_to_parse.rst 0000664 0000000 0000000 00000022576 13535330462 0020115 0 ustar 00root root 0000000 0000000 How to parse and write
======================
This page outlines the various ways cclib can be used to parse and write logfiles, and provides several examples to get you started.
From Python
+++++++++++
Importing cclib and parsing a file is a few lines of Python code, making it simple to access data from the output file of any supported computational chemistry program. For example:
.. code-block:: python
import cclib
filename = "logfile.out"
parser = cclib.io.ccopen(filename)
data = parser.parse()
print("There are %i atoms and %i MOs" % (data.natom, data.nmo))
A newer command, ``ccread``, combines both the format detection and parsing steps:
.. code-block:: python
import cclib
filename = "logfile.out"
data = cclib.io.ccread(filename)
print("There are %i atoms and %i MOs" % (data.natom, data.nmo))
From command line
+++++++++++++++++
The cclib package provides four scripts to parse and write data: ``ccget``, ``ccwrite``, ``cda``, and ``ccframe``.
1. **ccget** is used to parse attribute data from output files.
2. **ccwrite** lists all valid attributes that can be parsed from an output file. It can also write the parsed data to different formats, i.e. ``json``, ``cjson``, ``cml``, and ``xyz``.
3. **cda** is used for charge decomposition analysis of output files.
4. **ccframe** is used to write data tables from output files.
This page describes how to use the ccget, ccwrite and ccframe scripts to obtain data from output files.
ccget
-----
The data types that can be parsed from an output file depend on the type of computation that was run. The output file used to show example usage is ``Benzeneselenol.out``.
Data can be parsed from an output file using the following format::
    ccget <attribute> [attribute ...] <logfile> [logfile ...]
where ``attribute`` can be any one of the attribute names available `here`_.
.. _`here`: data_dev.html
1. Atomic Charges
The atomic charges are obtained by using the ``atomcharges`` attribute::
$ ccget atomcharges Benzeneselenol.out
Attempting to read Benzeneselenol.out
atomcharges:
{'mulliken': array([-0.49915 , 0.056965, 0.172161, 0.349794, -0.153072, 0.094583,
0.016487, 0.050249, 0.002149, 0.01161 , 0.053777, -0.173671,
0.018118])}
2. Electronic Energies
The molecular electronic energies after SCF (DFT) optimization of the input molecule are printed by using the ``scfenergies`` attribute::
$ ccget scfenergies Benzeneselenol.out
Attempting to read Benzeneselenol.out
scfenergies:
[-71671.43702915 -71671.4524142 -71671.4534768 -71671.45447492
-71671.4556548 -71671.45605671 -71671.43194906 -71671.45761021
-71671.45850275 -71671.39630296 -71671.45915119 -71671.45935854
-71671.4594614 -71671.45947338 -71671.45948807 -71671.4594946
-71671.4594946 ]
3. Geometry Targets
The targets for convergence of geometry optimization can be obtained by using the ``geotargets`` attribute::
$ ccget geotargets Benzeneselenol.out
Attempting to read Benzeneselenol.out
geotargets:
[ 0.00045 0.0003 0.0018 0.0012 ]
Chaining of attributes
^^^^^^^^^^^^^^^^^^^^^^
ccget allows attributes to be chained, so that more than one type of data is obtained with a single command. The attributes can be chained in any order. A few chained examples are provided below, and a Python equivalent is sketched after them.
1. Molecular Orbitals and Basis Functions
The number of molecular orbitals and the number of basis functions used to optimize the molecule can be obtained by running the following command::
$ ccget nmo nbasis Benzeneselenol.out
Attempting to read Benzeneselenol.out
nmo:
405
nbasis:
407
2. Enthalpy and Vibrational Frequencies
The enthalpy and the vibrational frequencies of the optimized molecule are obtained below::
$ ccget enthalpy vibfreqs Benzeneselenol.out
Attempting to read Benzeneselenol.out
enthalpy:
-2633.77264
vibfreqs:
[ 129.5512 170.6681 231.4278 304.8614 407.8299 472.5026
629.9087 679.9032 693.2509 746.7694 812.5113 850.2578
915.8742 987.1252 988.1785 1002.8922 1038.1073 1091.4005
1102.3417 1183.3857 1209.2727 1311.3497 1355.6441 1471.4447
1510.1919 1611.9088 1619.0156 2391.2487 3165.1596 3171.3909
3182.0753 3188.5786 3198.0359]
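The same chained lookups can be performed from Python with a single parse; here is a minimal sketch using ``ccread`` and the example file from above:
.. code-block:: python

    from cclib.io import ccread

    data = ccread("Benzeneselenol.out")
    # Attributes that could not be parsed from a given file are simply absent,
    # so check with hasattr() before reading them.
    for attribute in ("nmo", "nbasis", "enthalpy", "vibfreqs"):
        if hasattr(data, attribute):
            print(attribute, getattr(data, attribute))
        else:
            print(attribute, "was not parsed from this file")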
ccwrite
-------
The same Benzeneselenol.out file used in the previous examples will be used as the input file for ccwrite. When the ccwrite script is used with a valid input, it prints out the valid attributes that can be parsed from the file.
Command line format::
    ccwrite <format> <logfile> [logfile ...]
The valid output file formats are ``cjson``, ``cml``, and ``xyz``.
1. `Chemical markup language`_ (CML)::
$ ccwrite cml Benzeneselenol.out
Attempting to parse Benzeneselenol.out
cclib can parse the following attributes from Benzeneselenol.out:
atomcharges
atomcoords
atomnos
charge
coreelectrons
enthalpy
geotargets
geovalues
grads
homos
moenergies
mosyms
mult
natom
nbasis
nmo
optdone
optstatus
scfenergies
scftargets
temperature
vibdisps
vibfreqs
vibirs
vibsyms
.. _`chemical markup language`: http://www.xml-cml.org/
A ``Benzeneselenol.cml`` output file is generated in the same directory as the ``Benzeneselenol.out`` file.
2. XYZ_
Using ``xyz`` as the format with ``Benzeneselenol.out``, we obtain the following ``Benzeneselenol.xyz`` file::
13
Benzeneselenol.out: Geometry 17
C -2.8947620000 -0.0136420000 -0.0015280000
C -2.2062510000 1.1938510000 -0.0025210000
C -0.8164260000 1.2153020000 -0.0022010000
C -0.1033520000 0.0183920000 0.0031060000
C -0.7906630000 -1.1943840000 0.0058500000
C -2.1799570000 -1.2059710000 0.0017890000
H -3.9758430000 -0.0253010000 -0.0029040000
H -2.7502340000 2.1291370000 -0.0052760000
H -0.2961840000 2.1630180000 -0.0073260000
H -0.2474670000 -2.1302310000 0.0132260000
H -2.7028960000 -2.1530750000 0.0036640000
Se 1.8210800000 -0.0433780000 -0.0038760000
H 2.0043580000 1.4100070000 0.1034490000
.. _XYZ: https://en.wikipedia.org/wiki/XYZ_file_format
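The same geometry can also be written out from Python using only documented attributes (``natom``, ``atomnos``, ``atomcoords``); the following is a sketch of the idea, not what the ccwrite script does internally:
.. code-block:: python

    from cclib.io import ccread
    from cclib.parser.utils import PeriodicTable

    data = ccread("Benzeneselenol.out")
    pt = PeriodicTable()

    # Write the last parsed geometry (e.g. the end of an optimization) as XYZ.
    coords = data.atomcoords[-1]
    with open("Benzeneselenol.xyz", "w") as handle:
        handle.write("%d\n" % data.natom)
        handle.write("Final geometry from Benzeneselenol.out\n")
        for atomno, (x, y, z) in zip(data.atomnos, coords):
            handle.write("%-2s %12.7f %12.7f %12.7f\n"
                         % (pt.element[atomno], x, y, z))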
ccframe
-------
This script creates complete tables of data from output files in some of the formats supported by pandas_.
Since the pandas_ library is not a dependency of cclib, it must be installed separately.
.. _pandas: https://pandas.pydata.org/
A complete data table can be built from one or more output files using the following format::
    ccframe -O <outputfile> <logfile> [logfile ...]
The argument for ``-O`` indicates the data file to be written and its extension specifies the filetype (e.g. csv, h5/hdf/hdf5, json, pickle/pkl, xlsx).
Since higher-dimensional attributes (e.g. ``atomcoords``) are handled as plain text in some file formats (such as Excel XLSX or CSV), we recommend storing JSON or HDF5 files.
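To give a rough idea of the result, the same kind of table can be assembled by hand with pandas; the sketch below only illustrates the concept (the file names are placeholders) and is not the implementation of ccframe:
.. code-block:: python

    import pandas as pd
    from cclib.io import ccread

    filenames = ["calc1.out", "calc2.out"]  # placeholder logfile names

    rows = []
    for filename in filenames:
        data = ccread(filename)
        rows.append({
            "filename": filename,
            "natom": getattr(data, "natom", None),
            # Final SCF energy (in eV), if the attribute was parsed at all.
            "scfenergy": data.scfenergies[-1] if hasattr(data, "scfenergies") else None,
        })

    frame = pd.DataFrame(rows)
    frame.to_csv("results.csv", index=False)  # or to_json(), to_excel(), ...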
cclib-1.6.2/doc/sphinx/index.rst 0000664 0000000 0000000 00000012277 13535330462 0016530 0 ustar 00root root 0000000 0000000 Overview
========
**cclib** is an `open source`_ library, written in Python_, for parsing and interpreting the results of computational chemistry packages.
The goals of cclib are centered around the reuse of data obtained from these programs and contained in output files, specifically:
- extract (parse) data from the output files generated by multiple programs
- provide a consistent interface to the results of computational chemistry calculations, particularly those results that are useful for algorithms or visualization
- facilitate the implementation of algorithms that are not specific to a particular computational chemistry package
- maximize interoperability with other open source computational chemistry and cheminformatic software libraries
Summary
-------
The current version is **cclib 1.6.2** (see the changelog_ for what's new). The following programs are supported and systematically tested at the versions given in parentheses:
- ADF_ (versions 2007 and 2013)
- DALTON_ (versions 2013 and 2015)
- Firefly_, formerly known as PC GAMESS (version 8.0)
- `GAMESS (US)`_ (versions 2014 and 2017)
- `GAMESS-UK`_ (versions 7.0 and 8.0)
- Gaussian_ (versions 09 and 16)
- Jaguar_ (versions 7.0 and 8.3)
- Molcas_ (version 18.0)
- Molpro_ (versions 2006 and 2012)
- MOPAC_ (version 2016)
- NWChem_ (versions 6.0, 6.1, 6.5 and 6.6)
- ORCA_ (versions 4.0 and 4.1)
- Psi4_ (versions 1.0 and 1.2.1)
- `Q-Chem`_ (versions 4.2 and 5.1)
- Turbomole_ (versions 5.9 and 7.2)
Output files from other versions of the above programs may still work, and regression tests are always welcome. The following legacy parsers are still tested as regressions, but not actively maintained:
- Psi3_ (version 3.4)
Many types of output data are parsed by cclib, including atom coordinates, orbital information, vibrational modes and TD-DFT calculations. See the page on `Extracted Data`_ for a complete list with coverage for the different programs. Several `calculation methods`_ are also provided for interpreting the electronic properties of molecules.
How to use cclib
----------------
You can download the `source package for cclib 1.6.2`_ or the `current development version`_ (from the `GitHub repository`_). For information on packages available in various Linux distributions, installing the source code and requirements, as well as basic usage, the `How to install`_ page is a good place to start.
If you need further help, find a bug, need new features or have any questions, please send an email to the `mailing list`_ or submit an issue to the `tracker`_.
About cclib
-----------
The code behind cclib was started as a collaboration between Noel O'Boyle, Adam Tenderholt and Karol M. Langner (see page about Development_ for details) and is licensed under the `BSD 3-clause license`_. Other developers are encouraged to contribute to this open source project -- send an email to the `developers mailing list`_. Applications that use cclib include GaussSum_ and QMForge_. It has also been used in the literature_.
If you use cclib in your scientific work, please support our work by adding a reference to the following article:
| N\. M\. O'Boyle, A\. L\. Tenderholt, K\. M\. Langner, *cclib: a library for package-independent computational chemistry algorithms*, J. Comp. Chem. 29 (5), pp. 839-845, **2008** (DOI_).
|
A record for the latest release is also available on Zenodo_.
.. _`open source`: http://en.wikipedia.org/wiki/Open_source
.. _Python: http://www.python.org
.. _`BSD 3-clause license`: https://en.wikipedia.org/wiki/BSD_licenses#3-clause_license_(%22BSD_License_2.0%22,_%22Revised_BSD_License%22,_%22New_BSD_License%22,_or_%22Modified_BSD_License%22)
.. _changelog: changelog.html
.. _`extracted data`: data.html
.. _`calculation methods`: methods.html
.. _`installation page`: installation.html
.. _`How to install`: how_to_install.html
.. _development: development.html
.. _ADF: https://www.scm.com/product/adf/
.. _DALTON: http://daltonprogram.org
.. _Firefly: http://classic.chem.msu.su/gran/gamess/
.. _`GAMESS (US)`: http://www.msg.ameslab.gov/GAMESS/GAMESS.html
.. _`GAMESS-UK`: http://www.cfs.dl.ac.uk
.. _`Gaussian`: http://www.gaussian.com
.. _Jaguar: https://www.schrodinger.com/jaguar
.. _Molcas: https://gitlab.com/Molcas/OpenMolcas
.. _Molpro: http://www.molpro.net/
.. _MOPAC: http://openmopac.net/
.. _NWChem: http://www.nwchem-sw.org/index.php/Main_Page
.. _ORCA: https://orcaforum.cec.mpg.de/
.. _Psi3: http://openscience.org/psi3/
.. _Psi4: http://psicode.org/
.. _`Q-Chem`: http://q-chem.com/
.. _Turbomole: http://www.turbomole-gmbh.com/
.. _`source package for cclib 1.6.2`: https://github.com/cclib/cclib/releases/download/v1.6/cclib-1.6.2.zip
.. _`current development version`: https://github.com/cclib/cclib/archive/master.zip
.. _`GitHub repository`: https://github.com/cclib/cclib
.. _`mailing list`: https://lists.sourceforge.net/lists/listinfo/cclib-users
.. _`developers mailing list`: https://lists.sourceforge.net/lists/listinfo/cclib-devel
.. _`tracker`: https://github.com/cclib/cclib/issues
.. _GaussSum: http://gausssum.sourceforge.net/
.. _QMForge: https://qmforge.net/
.. _literature: http://pubs.acs.org/doi/abs/10.1021/jacs.5b05600
.. _DOI: http://dx.doi.org/10.1002/jcc.20823
.. _Zenodo: http://dx.doi.org/10.5281/zenodo.1407790
cclib-1.6.2/doc/sphinx/make.bat 0000664 0000000 0000000 00000011746 13535330462 0016274 0 ustar 00root root 0000000 0000000 @ECHO OFF
REM Command file for Sphinx documentation
if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set BUILDDIR=_build
set ALLSPHINXOPTS=-d %BUILDDIR%/doctrees %SPHINXOPTS% .
set I18NSPHINXOPTS=%SPHINXOPTS% .
if NOT "%PAPER%" == "" (
set ALLSPHINXOPTS=-D latex_paper_size=%PAPER% %ALLSPHINXOPTS%
set I18NSPHINXOPTS=-D latex_paper_size=%PAPER% %I18NSPHINXOPTS%
)
if "%1" == "" goto help
if "%1" == "help" (
:help
echo.Please use `make ^` where ^ is one of
echo. html to make standalone HTML files
echo. dirhtml to make HTML files named index.html in directories
echo. singlehtml to make a single large HTML file
echo. pickle to make pickle files
echo. json to make JSON files
echo. htmlhelp to make HTML files and a HTML help project
echo. qthelp to make HTML files and a qthelp project
echo. devhelp to make HTML files and a Devhelp project
echo. epub to make an epub
echo. latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter
echo. text to make text files
echo. man to make manual pages
echo. texinfo to make Texinfo files
echo. gettext to make PO message catalogs
echo. changes to make an overview over all changed/added/deprecated items
echo. linkcheck to check all external links for integrity
echo. doctest to run all doctests embedded in the documentation if enabled
goto end
)
if "%1" == "clean" (
for /d %%i in (%BUILDDIR%\*) do rmdir /q /s %%i
del /q /s %BUILDDIR%\*
goto end
)
if "%1" == "html" (
%SPHINXBUILD% -b html %ALLSPHINXOPTS% %BUILDDIR%/html
if errorlevel 1 exit /b 1
echo.
echo.Build finished. The HTML pages are in %BUILDDIR%/html.
goto end
)
if "%1" == "dirhtml" (
%SPHINXBUILD% -b dirhtml %ALLSPHINXOPTS% %BUILDDIR%/dirhtml
if errorlevel 1 exit /b 1
echo.
echo.Build finished. The HTML pages are in %BUILDDIR%/dirhtml.
goto end
)
if "%1" == "singlehtml" (
%SPHINXBUILD% -b singlehtml %ALLSPHINXOPTS% %BUILDDIR%/singlehtml
if errorlevel 1 exit /b 1
echo.
echo.Build finished. The HTML pages are in %BUILDDIR%/singlehtml.
goto end
)
if "%1" == "pickle" (
%SPHINXBUILD% -b pickle %ALLSPHINXOPTS% %BUILDDIR%/pickle
if errorlevel 1 exit /b 1
echo.
echo.Build finished; now you can process the pickle files.
goto end
)
if "%1" == "json" (
%SPHINXBUILD% -b json %ALLSPHINXOPTS% %BUILDDIR%/json
if errorlevel 1 exit /b 1
echo.
echo.Build finished; now you can process the JSON files.
goto end
)
if "%1" == "htmlhelp" (
%SPHINXBUILD% -b htmlhelp %ALLSPHINXOPTS% %BUILDDIR%/htmlhelp
if errorlevel 1 exit /b 1
echo.
echo.Build finished; now you can run HTML Help Workshop with the ^
.hhp project file in %BUILDDIR%/htmlhelp.
goto end
)
if "%1" == "qthelp" (
%SPHINXBUILD% -b qthelp %ALLSPHINXOPTS% %BUILDDIR%/qthelp
if errorlevel 1 exit /b 1
echo.
echo.Build finished; now you can run "qcollectiongenerator" with the ^
.qhcp project file in %BUILDDIR%/qthelp, like this:
echo.^> qcollectiongenerator %BUILDDIR%\qthelp\cclib.qhcp
echo.To view the help file:
echo.^> assistant -collectionFile %BUILDDIR%\qthelp\cclib.qhc
goto end
)
if "%1" == "devhelp" (
%SPHINXBUILD% -b devhelp %ALLSPHINXOPTS% %BUILDDIR%/devhelp
if errorlevel 1 exit /b 1
echo.
echo.Build finished.
goto end
)
if "%1" == "epub" (
%SPHINXBUILD% -b epub %ALLSPHINXOPTS% %BUILDDIR%/epub
if errorlevel 1 exit /b 1
echo.
echo.Build finished. The epub file is in %BUILDDIR%/epub.
goto end
)
if "%1" == "latex" (
%SPHINXBUILD% -b latex %ALLSPHINXOPTS% %BUILDDIR%/latex
if errorlevel 1 exit /b 1
echo.
echo.Build finished; the LaTeX files are in %BUILDDIR%/latex.
goto end
)
if "%1" == "text" (
%SPHINXBUILD% -b text %ALLSPHINXOPTS% %BUILDDIR%/text
if errorlevel 1 exit /b 1
echo.
echo.Build finished. The text files are in %BUILDDIR%/text.
goto end
)
if "%1" == "man" (
%SPHINXBUILD% -b man %ALLSPHINXOPTS% %BUILDDIR%/man
if errorlevel 1 exit /b 1
echo.
echo.Build finished. The manual pages are in %BUILDDIR%/man.
goto end
)
if "%1" == "texinfo" (
%SPHINXBUILD% -b texinfo %ALLSPHINXOPTS% %BUILDDIR%/texinfo
if errorlevel 1 exit /b 1
echo.
echo.Build finished. The Texinfo files are in %BUILDDIR%/texinfo.
goto end
)
if "%1" == "gettext" (
%SPHINXBUILD% -b gettext %I18NSPHINXOPTS% %BUILDDIR%/locale
if errorlevel 1 exit /b 1
echo.
echo.Build finished. The message catalogs are in %BUILDDIR%/locale.
goto end
)
if "%1" == "changes" (
%SPHINXBUILD% -b changes %ALLSPHINXOPTS% %BUILDDIR%/changes
if errorlevel 1 exit /b 1
echo.
echo.The overview file is in %BUILDDIR%/changes.
goto end
)
if "%1" == "linkcheck" (
%SPHINXBUILD% -b linkcheck %ALLSPHINXOPTS% %BUILDDIR%/linkcheck
if errorlevel 1 exit /b 1
echo.
echo.Link check complete; look for any errors in the above output ^
or in %BUILDDIR%/linkcheck/output.txt.
goto end
)
if "%1" == "doctest" (
%SPHINXBUILD% -b doctest %ALLSPHINXOPTS% %BUILDDIR%/doctest
if errorlevel 1 exit /b 1
echo.
echo.Testing of doctests in the sources finished, look at the ^
results in %BUILDDIR%/doctest/output.txt.
goto end
)
:end
cclib-1.6.2/doc/sphinx/methods.rst 0000664 0000000 0000000 00000024016 13535330462 0017056 0 ustar 00root root 0000000 0000000 .. index::
module: methods
Calculation methods
===================
The following methods in cclib allow further analysis of calculation output.
.. _`methods in the development version`: methods_dev.html
.. index::
single: methods; C squared population analysis (CSPA)
C squared population analysis (CSPA)
------------------------------------
**CSPA** can be used to determine and interpret the electron density of a molecule. The contribution of the a-th atomic orbital to the i-th molecular orbital can be written in terms of the molecular orbital coefficients:
.. math:: \Phi_{ai} = \frac{c^2_{ai}}{\sum_k c^2_{ki}}
The CSPA class available from cclib.method performs C-squared population analysis and can be used as follows:
.. code-block:: python
from cclib.io import ccread
from cclib.method import CSPA
data = ccread("mycalc.out")
m = CSPA(data)
m.calculate()
After the ``calculate()`` method is called, the following attributes are available:
* ``aoresults`` is a three-dimensional NumPy array with spin, molecular orbital, and atomic/fragment orbitals as the axes (``aoresults[0][45][0]`` gives the contribution of the 1st atomic/fragment orbital to the 46th alpha/restricted molecular orbital)
* ``fragresults`` is a three-dimensional NumPy array with spin, molecular orbital, and atoms/fragments as the axes (``fragresults[1][23][4]`` gives the contribution of the 5th atom/fragment to the 24th beta molecular orbital)
* ``fragcharges`` is a one-dimensional NumPy array with the number of (partial) electrons in each atom/fragment (``fragcharges[2]`` gives the number of electrons on the 3rd atom/fragment)
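Since ``fragcharges`` holds the number of electrons assigned to each atom when the default one-fragment-per-atom behaviour is used, partial atomic charges follow by subtracting these populations from the nuclear charges (less any electrons absorbed into pseudopotentials); a sketch continuing from the example above:
.. code-block:: python

    import numpy as np

    # Continuing from the CSPA example above (m.calculate() has been called).
    # data.coreelectrons is all zeros for all-electron calculations.
    charges = np.asarray(data.atomnos) - np.asarray(data.coreelectrons) - m.fragcharges
    for i, charge in enumerate(charges):
        print("atom %2i: partial charge %+.3f" % (i + 1, charge))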
Custom fragments
~~~~~~~~~~~~~~~~
Calling the calculate method without an argument treats each atom as a fragment in the population analysis. An optional argument, a list of lists containing the atomic orbital numbers to be included in each fragment, can also be passed. Calling it with this additional argument is useful if one is more interested in the contributions of certain orbitals, such as metal d, to the molecular orbitals. For example:
.. code-block:: python
from cclib.io import ccread
from cclib.method import CSPA
data = ccread("mycalc.out")
m = CSPA(data)
m.calculate([[0, 1, 2, 3, 4], [5, 6], [7, 8, 9]]) # fragment one is made from basis functions 0 - 4
# fragment two is made from basis functions 5 & 6
# fragment three is made from basis functions 7 - 9
Custom progress
~~~~~~~~~~~~~~~
The CSPA class also can take a progress class as an argument so that the progress of the calculation can be monitored:
.. code-block:: python
from cclib.method import CSPA
from cclib.parser import Gaussian
from cclib.progress import TextProgress
import logging
progress = TextProgress()
p = Gaussian("mycalc.out", logging.ERROR)
d = p.parse(progress)
m = CSPA(d, progress, logging.ERROR)
m.calculate()
.. index::
single: methods; Mulliken population analysis (MPA)
Mulliken population analysis (MPA)
----------------------------------
MPA can be used to determine and interpret the electron density of a molecule. The contribution of the a-th atomic orbital to the i-th molecular orbital in this method is written in terms of the molecular orbital coefficients, c, and the overlap matrix, S:
.. math:: \Phi_{ai} = \sum_b c_{ai} c_{bi} S_{ab}
The MPA class available from cclib.method performs Mulliken population analysis and can be used as follows:
.. code-block:: python
import sys
from cclib.method import MPA
from cclib.parser import ccopen
d = ccopen(sys.argv[1]).parse()
m = MPA(d)
m.calculate()
After the calculate() method is called, the following attributes are available:
* ``aoresults``: a three dimensional array with spin, molecular orbital, and atomic orbitals as the axes, so that ``aoresults[0][45][0]`` gives the contribution of the 1st atomic orbital to the 46th alpha/restricted molecular orbital,
* ``fragresults``: a three dimensional array with spin, molecular orbital, and atoms as the axes, so that ``fragresults[1][23][4]`` gives the contribution of the 5th fragment orbitals to the 24th beta molecular orbital)
* ``fragcharges``: a vector with the number of (partial) electrons in each fragment, so that ``fragcharges[2]`` gives the number of electrons in the 3rd fragment.
Custom fragments
~~~~~~~~~~~~~~~~
The calculate method chooses atoms as the fragments by default, and optionally accepts a list of lists containing the atomic orbital numbers (e.g. ``[[0, 1, 2], [3, 4, 5, 6], ...]``) of arbitrary fragments. Calling it in this way is useful if one is more interested in the contributions of groups of atoms or even certain orbitals or orbital groups, such as metal d, to the molecular orbitals. In this case, fragresults and fragcharges reflect the chosen groups of atomic orbitals instead of atoms.
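A short sketch of such a call (the orbital groupings below are arbitrary and purely illustrative):
.. code-block:: python

    from cclib.io import ccread
    from cclib.method import MPA

    data = ccread("mycalc.out")
    m = MPA(data)
    # Group the basis functions into three custom fragments; the indices here
    # are only illustrative and should be chosen for your own system.
    m.calculate([[0, 1, 2, 3, 4], [5, 6], [7, 8, 9]])
    print(m.fragcharges)  # one (partial) electron count per fragment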
Custom progress
~~~~~~~~~~~~~~~
The Mulliken class also can take a progress class as an argument so that the progress of the calculation can be monitored:
.. code-block:: python
from cclib.method import MPA
from cclib.parser import ccopen
from cclib.progress import TextProgress
import logging
progress = TextProgress()
d = ccopen("mycalc.out", logging.ERROR).parse(progress)
m = MPA(d, progress, logging.ERROR)
m.calculate()
.. index::
single: methods; Löwdin Population Analysis
Löwdin Population Analysis
--------------------------
The LPA class available from cclib.method performs Löwdin population analysis and can be used as follows:
.. code-block:: python
import sys
from cclib.method import LPA
from cclib.parser import ccopen
d = ccopen(sys.argv[1]).parse()
m = LPA(d)
m.calculate()
..
Overlap Population Analysis
---------------------------
Density Matrix calculation
--------------------------
The Density class from cclib.method can be used to calculate the density matrix:
.. code-block:: python
from cclib.parser import ccopen
from cclib.method import Density
parser = ccopen("myfile.out")
data = parser.parse()
d = Density(data)
d.calculate()
After ``calculate()`` is called, the density attribute is available. It is simply a NumPy array with three axes. The first axis is for the spin contributions, and the second and third axes are for the density matrix, which follows the standard definition.
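As a quick sanity check (a sketch, assuming the atomic orbital overlap matrix was parsed into ``data.aooverlaps``), the trace of the density matrix contracted with the overlap matrix, :math:`\mathrm{Tr}(\mathbf{DS})`, is related to the electron count through Mulliken's relation; how each spin channel maps onto alpha/beta or total electrons depends on the restricted/unrestricted convention used:
.. code-block:: python

    import numpy as np
    from cclib.io import ccread
    from cclib.method import Density

    data = ccread("myfile.out")
    den = Density(data)
    den.calculate()

    # Tr(D S) for each spin channel; see the note above on how this relates
    # to the number of alpha/beta or total electrons.
    for spin, d_spin in enumerate(den.density):
        print("spin %i: Tr(DS) = %.3f" % (spin, np.trace(np.dot(d_spin, data.aooverlaps))))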
Mayer's Bond Orders
-------------------
This method calculates the Mayer's bond orders for a given molecule:
.. code-block:: python
import sys
from cclib.parser import ccopen
from cclib.method import MBO
parser = ccopen(sys.argv[1])
data = parser.parse()
d = MBO(data)
d.calculate()
After ``calculate()`` is called, the fragresults attribute is available, which is a NumPy array of rank 3. The first axis is for contributions of each spin to the MBO, while the second and third correspond to the indices of the atoms.
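For example (a minimal sketch with a placeholder file name and zero-based atom indices), the total bond order between two atoms is obtained by summing the spin contributions along the first axis:
.. code-block:: python

    from cclib.io import ccread
    from cclib.method import MBO

    data = ccread("myfile.out")
    mbo = MBO(data)
    mbo.calculate()

    # Sum the spin contributions (first axis) to obtain the full Mayer bond
    # order matrix, then look up the order of the bond between atoms 1 and 2.
    total = mbo.fragresults.sum(axis=0)
    print("Mayer bond order between atoms 1 and 2: %.3f" % total[0][1])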
Charge Decomposition Analysis
-----------------------------
The Charge Decomposition Analysis (CDA) as developed by Gernot Frenking et al. is used to study the donor-acceptor interactions of a molecule in terms of two user-specified fragments.
The CDA class available from cclib.method performs this analysis:
.. code-block:: python
from cclib.io import ccopen
from cclib.method import CDA
molecule = ccopen("molecule.log")
frag1 = ccopen("fragment1.log")
frag2 = ccopen("fragment2.log")
# if using CDA from an interactive session, it's best
# to parse the files at the same time in case they aren't
# parsed immediately---go get a drink!
m = molecule.parse()
f1 = frag1.parse()
f2 = frag2.parse()
cda = CDA(m)
cda.calculate([f1, f2])
After ``calculate()`` finishes, the ``donations``, ``bdonations`` (back donation), and ``repulsions`` attributes are available on the CDA instance. These attributes are simply lists of one-dimensional NumPy arrays corresponding to the restricted or alpha/beta molecular orbitals of the entire molecule. Additionally, the CDA method involves transforming the atomic basis functions of the molecule into a basis formed by the molecular orbitals of the fragments, so the attributes ``mocoeffs`` and ``fooverlaps`` are created and can be used in population analyses such as Mulliken or C-squared (see Fragment Analysis for more details).
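For instance (a sketch continuing from the example above), the three terms can be tabulated per molecular orbital for the first (restricted or alpha) spin channel, much like the output of the ``cda`` script shown below:
.. code-block:: python

    # Continuing from the example above: print donation, back-donation and
    # repulsion for each molecular orbital of the first spin channel.
    for i, (don, bdon, rep) in enumerate(zip(cda.donations[0],
                                             cda.bdonations[0],
                                             cda.repulsions[0])):
        print("MO %3i: d=%7.3f  b=%7.3f  r=%7.3f" % (i + 1, don, bdon, rep))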
There is also a script provided by cclib that performs the CDA from the command line::
$ cda molecule.log fragment1.log fragment2.log
Charge decomposition analysis of molecule.log
MO# d b r
-----------------------------
1: -0.000 -0.000 -0.000
2: -0.000 0.002 0.000
3: -0.001 -0.000 0.000
4: -0.001 -0.026 -0.006
5: -0.006 0.082 0.230
6: -0.040 0.075 0.214
7: 0.001 -0.001 0.022
8: 0.001 -0.001 0.022
9: 0.054 0.342 -0.740
10: 0.087 -0.001 -0.039
11: 0.087 -0.001 -0.039
------ HOMO - LUMO gap ------
12: 0.000 0.000 0.000
13: 0.000 0.000 0.000
......
Notes
~~~~~
* Only molecular orbitals with non-zero occupancy will have a non-zero value.
* The absolute values of the calculated terms have no physical meaning and only the relative magnitudes, especially for the donation and back donation terms, are of any real value (Frenking, et al.)
* The atom coordinates in molecules and fragments must be the same, which is usually accomplished with an argument in the QM program (the NoSymm keyword in Gaussian, for instance).
* The current implementation has some subtle differences from the code distributed by the Frenking group. The CDA class in cclib follows the formula outlined in one of Frenking's CDA papers, but contains an extra factor of 2 to give results that agree with those from the original CDA program. It also doesn't include negligible terms (on the order of 10^-6) resulting from overlap between MOs on the same fragment that appear to be included in the Frenking code. Contact atenderholt (at) gmail (dot) com for discussion and more information.
..
Electron Density Calculation
----------------------------
cclib-1.6.2/logo.png 0000664 0000000 0000000 00000040466 13535330462 0014260 0 ustar 00root root 0000000 0000000 PNG